Multi-Token Prediction (MTP) drafter for Gemma-4-31b

I’d like to request the adoption of the Multi-Token Prediction (MTP) drafter for Gemma-4-31b, which can boost its throughput by up to 3x without any loss in output quality. It would be a game changer for a variety of use cases.

Status

New Submission

Board

Feature Requests

Tags

New Model

Date

3 days ago

Author

Dongwon
