Intermittent context truncation / distillation

As I understand Venice, the full context of an inference request is sent to a given GPU inference provider, which returns a response. On subsequent queries within the same “chat”, the whole context is sent again, possibly to a different inference provider, and so on.
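A minimal sketch of this stateless pattern, assuming a generic OpenAI-style chat API (the `call_provider` function and message shape are illustrative assumptions, not Venice's actual internals):

```python
# Each request is stateless: the ENTIRE accumulated history is re-sent
# to whichever inference provider happens to handle that request.
history = []

def send(user_message, call_provider):
    """Append the new message, then ship the full history to a provider."""
    history.append({"role": "user", "content": user_message})
    # Full context -- including any earlier PDF contents -- goes out every time.
    reply = call_provider(history)
    history.append({"role": "assistant", "content": reply})
    return reply
```

This is why anything placed in the chat early on keeps travelling with every later request.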

If the initial prompt contains, say, a medical test results PDF that I ask the model to extract results from, and that document also contains my other personal information, is it safe to say that the PDF contents are re-sent to whichever inference provider handles each subsequent request?

It would be useful to be able to click a button that hides all context above that point from future requests. Additionally, another mode for that button could be “distillation”, where the model is prompted to produce a scrubbed summary of the context above, so that subsequent requests include only the distilled context.

The purpose of this is to limit the distribution of identifying information across different inference providers.
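The two proposed button modes could be sketched roughly as follows (function names and the `summarize` callback are hypothetical, not an existing Venice API):

```python
def truncate(history, cutoff_index):
    """'Hide above' mode: drop everything before the cutoff so it is
    never included in future requests."""
    return history[cutoff_index:]

def distill(history, cutoff_index, summarize):
    """'Distillation' mode: replace the hidden prefix with a scrubbed,
    model-produced summary, so later providers see only the distillate."""
    summary = summarize(history[:cutoff_index])  # e.g. "blood panel: all values normal"
    return [{"role": "system", "content": summary}] + history[cutoff_index:]
```

Either way, the original PDF contents would stop propagating to subsequent providers; distillation additionally preserves the medically relevant facts the conversation still needs.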

Status: Backlog
Board: 💡 Feature Requests
Tags: Privacy
Date: About 1 year ago
Author: Justin Martin
