I’d like to request the adoption of the Multi-Token Prediction (MTP) drafter for Gemma-4-31b, which can boost its throughput by up to 3x without any loss in output quality. It would be a game changer for a variety of use cases.
New Submission
Feature Requests
New Model
3 days ago

Dongwon