I’m using gemma-4-uncensored via the API to control NPC behavior in a Skyrim mod called SkyrimNet. Its response quality, speed, and cost are perfect for what I want, but I’m frequently running into 429 “model overloaded” errors. Unlike coding or character text generation on Venice’s website, this is a use case where I can’t simply regenerate a response or retry the model several times. The only time I’ve had consistently good results without overloaded errors is during extreme off-peak hours (early mornings, Eastern Time). Would it be possible to direct more resources to gemma-4-uncensored so that it isn’t overloaded so often?
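To illustrate why retrying is a poor fit here: any client-side retry on a 429 has to be capped by a latency budget, because an NPC can't stall mid-conversation while the request backs off. Below is a minimal sketch of that idea, not anything taken from SkyrimNet or the Venice API. The `send` callable, the `(status_code, body)` return shape, and the default values are all assumptions made up for illustration.

```python
import random
import time


def call_with_backoff(send, max_attempts=4, base_delay=0.25, budget=3.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call send() until it stops returning HTTP 429 or the budget runs out.

    send is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) seen.
    """
    start = clock()
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Exponential backoff with full jitter:
        # wait a random time up to base_delay * 2^(attempt - 1) seconds.
        delay = random.uniform(0, base_delay * 2 ** (attempt - 1))
        # Give up instead of exceeding the latency budget (in seconds).
        if clock() - start + delay > budget:
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Even with a few seconds of budget, a model that stays overloaded through peak hours still ends in a 429, so this only smooths over short spikes.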
New Submission
Bugs
2 days ago

David Anderson