Web Search privacy preprocessing

According to discussion on Discord, Web Search currently passes the whole prompt through to Brave Search, which returns relevant results for the model’s context. While Brave is a privacy-aligned company, having to trust a third party with the full prompt isn’t desirable.

As a pre-processing step to Web Search, it would be helpful to apply a summarization step, possibly with a small model, to generate a search query that encodes as much of the prompt’s “question” semantics as possible while removing the specifics of the prompt itself.
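To make the idea concrete, here is a rough sketch of where such a step could sit. Everything in it is an assumption for illustration: `build_search_query` and `naive_query` are hypothetical names, and `naive_query` is only a crude heuristic standing in for the small summarization model (it keeps the question sentences and caps the length). The point is just that the derived query, not the raw prompt, is what would be sent to the search provider.

```python
import re
from typing import Callable


def naive_query(prompt: str, max_words: int = 12) -> str:
    """Hypothetical stand-in for a small summarization model.

    Keeps only sentences ending in a question mark (or the last sentence
    if there are none) and caps the number of words.
    """
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    questions = [s for s in sentences if s.endswith("?")]
    chosen = questions or sentences[-1:]
    words = " ".join(chosen).split()
    return " ".join(words[:max_words])


def build_search_query(
    prompt: str, summarize: Callable[[str], str] = naive_query
) -> str:
    """Return the only text that should be sent to the search provider."""
    query = summarize(prompt).strip()
    # Fail closed: never fall back to sending the raw prompt.
    if not query:
        raise ValueError("no search query could be derived from the prompt")
    return query


prompt = (
    "My name is Alex and I was diagnosed last week. "
    "What are common side effects of metformin?"
)
print(build_search_query(prompt))
```

In this example the personal preamble stays local and only the question sentence is forwarded. A real model could be swapped in through the `summarize` parameter without changing the rest of the flow.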

Additionally, or instead, applying a PII redaction model might be enough. Removing identifying details from the prompt, such as names and medical information, would reduce the privacy impact of querying a third-party service with the prompt. That said, this might degrade the semantics of the prompt (e.g. “Tell me about Richard Feynman” includes a name that the search needs).
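As a minimal illustration of the redaction idea, the sketch below uses plain regular expressions rather than a PII model, so it only catches easily patterned identifiers (emails and phone-like numbers). The `PATTERNS` table and the `redact` function are hypothetical, and the patterns are deliberately simple rather than exhaustive.

```python
import re

# Illustrative patterns only; a real PII redaction model would cover far more.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Email jane.doe@example.com or call +1 415-555-0100 about Richard Feynman"))
```

Names are left untouched here, which mirrors the trade-off above: detecting them would need a model, and removing “Richard Feynman” would make the search useless.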

Status

Backlog

Board
💡

Feature Requests

Tags

Privacy

Date

About 1 year ago

Author

Justin Martin
