I tried looking and couldn't find a proper price per token for the chat model. It claims to be free in some places. I did find these prices for the other services:
Text to Speech (Bulbul v3): ₹30 per 10K characters
Text to Speech (Bulbul v2): ₹15 per 10K characters
Sarvam Vision: Free (billed per page)
Speech to Text: ₹30 per hour
Speech to Text with Diarization: ₹45 per hour
Speech to Text & Translate: ₹30 per hour
Speech to Text, Translate & Diarization: ₹45 per hour
Sarvam Translate V1: ₹20 per 10K characters
Translate Mayura V1: ₹20 per 10K characters
Transliterate: ₹20 per 10K characters
Language Identification: ₹3.5 per 10K characters
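To make the rate list concrete, here is a rough cost helper. The rates dict just restates the list above (in INR); the service names are my own labels, not official API identifiers.

```python
# Rates from the published list above (INR), per 10K characters or per hour.
RATES_PER_10K_CHARS = {
    "bulbul_v3_tts": 30,
    "bulbul_v2_tts": 15,
    "translate_v1": 20,
    "mayura_v1": 20,
    "transliterate": 20,
    "language_id": 3.5,
}
RATES_PER_HOUR = {
    "stt": 30,                # also STT & Translate
    "stt_diarization": 45,    # also STT, Translate & Diarization
}

def char_cost(service: str, n_chars: int) -> float:
    """Cost in INR for a character-priced service."""
    return RATES_PER_10K_CHARS[service] * n_chars / 10_000

def audio_cost(service: str, hours: float) -> float:
    """Cost in INR for an audio-priced service."""
    return RATES_PER_HOUR[service] * hours

# e.g. synthesizing a 50K-character script with Bulbul v3:
print(char_cost("bulbul_v3_tts", 50_000))  # → 150.0
```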
I do this with Claude frequently. The best tactic I have found is to include a finished file as a "style guide" alongside your content prompt.
To find an appropriate style guide for your subject matter, find a paper on arXiv that covers similar material and click the "TeX Source" link.
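The tactic above is just string assembly: paste the reference TeX file ahead of your content notes in one prompt. A minimal sketch, where the wrapper wording and filenames are my own invention, not anything Claude requires:

```python
# Sketch: pair a finished .tex file (the "style guide") with new content
# in a single prompt. The prompt wording here is illustrative only.
def build_prompt(style_guide_tex: str, content_notes: str) -> str:
    """Combine a reference TeX source and the new material into one prompt."""
    return (
        "Here is a LaTeX file whose style I want you to imitate:\n\n"
        f"```latex\n{style_guide_tex}\n```\n\n"
        "Now write a LaTeX document in the same style covering:\n\n"
        f"{content_notes}\n"
    )

# In practice, style would be the TeX source downloaded from arXiv:
style = r"\documentclass{article} \begin{document} ... \end{document}"
notes = "An overview of our measurement methodology and results."
prompt = build_prompt(style, notes)
```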
I do this with LaTeX beamer for presentations. The problem is exporting what I make there to .pptx so my coworkers can edit it. I think people use the templates so they can more easily convert to PowerPoint later.
You could set up a "team admin" user on Google Drive and then invite your individual team members to that user's files. No matter what service you use, you will always need to set up an admin account.
I like the idea. About question 2, I think you need some way to publicly benchmark your stripped-down models' performance. Your models probably won't perform well on the standard benchmarks, but the big labs' models should be able to handle your custom eval sets, such as those Hindi math problems.
I would publish:
1) your domain specific eval set
2) your model's results on that eval set
3) the big lab's model's results on that eval set
That would give users a way to determine whether your model is actually capable in that reduced domain.
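The comparison described above is easy to sketch: score both models on the same domain-specific eval set and publish both numbers. The models and eval items below are toy placeholders for illustration.

```python
# Sketch: score two models on one shared eval set, exact-match accuracy.
from typing import Callable

EvalItem = tuple[str, str]  # (question, expected answer)

def accuracy(model: Callable[[str], str], eval_set: list[EvalItem]) -> float:
    """Fraction of eval items the model answers exactly right."""
    correct = sum(1 for q, a in eval_set if model(q).strip() == a)
    return correct / len(eval_set)

# Toy stand-ins for "your model" and "the big lab's model":
eval_set = [("2+2?", "4"), ("3*3?", "9")]
small_model = lambda q: "4"                        # answers "4" to everything
big_model = lambda q: {"2+2?": "4", "3*3?": "9"}[q]  # answers everything right

print(accuracy(small_model, eval_set))  # → 0.5
print(accuracy(big_model, eval_set))    # → 1.0
```

Publishing the eval set itself alongside both score columns lets anyone rerun the comparison.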