2025-03-01 Dhwani work division

Dhwani Work Division

Sachin - Integration, Deployment, Research Plan, Demo
Sahana - Text to Speech - UX, benchmarks, model optimisation
Students
- Model Conversion
  - ASR - IndicConformer, based on NVIDIA NeMo
    - ONNX export
    - Triton server
    - raycast server
- Model tests and optimal GPU inference handling
- Re-training and evaluation
Identify - the lowest-compute GPU cloud option that can handle Voice mode for 1 / 10 / 100 / 1,000 / 10,000 concurrent users
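
A rough way to turn the concurrency tiers into GPU counts is sketched here. It assumes one number that still has to be measured: how many concurrent Voice mode streams a single GPU can sustain. The names `gpus_needed`, `plan` and `streams_per_gpu` are hypothetical and used only for illustration.

```python
import math

# Concurrency tiers taken from the note above.
USER_TIERS = [1, 10, 100, 1000, 10000]


def gpus_needed(concurrent_users: int, streams_per_gpu: int) -> int:
    """Smallest number of GPUs that covers the given concurrency.

    streams_per_gpu is an assumption: it must come from a benchmark
    of the real ASR + TTS + LLM + Translation pipeline.
    """
    if concurrent_users < 0 or streams_per_gpu <= 0:
        raise ValueError("concurrent_users must be >= 0 and streams_per_gpu > 0")
    return math.ceil(concurrent_users / streams_per_gpu)


def plan(streams_per_gpu: int) -> dict:
    """GPU count per tier for one measured streams-per-GPU value."""
    return {users: gpus_needed(users, streams_per_gpu) for users in USER_TIERS}


if __name__ == "__main__":
    # 8 is a placeholder, not a measurement.
    for users, gpus in plan(streams_per_gpu=8).items():
        print(f"{users:>6} users -> {gpus} GPU(s)")
```

Running the same plan for each candidate GPU type, each with its own measured `streams_per_gpu`, gives a like-for-like comparison across cloud options.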
Fit models - ASR + TTS + LLM + Translation
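
A minimal sketch of the fit check, assuming "fit" means all four models stay resident in one GPU's memory with some headroom for activations and caches. The function name `fits_on_gpu`, the headroom fraction and every size in the usage example are placeholders, not measurements.

```python
def fits_on_gpu(model_sizes_gb: dict, gpu_memory_gb: float,
                headroom_fraction: float = 0.2) -> bool:
    """Return True if all models fit in GPU memory with headroom left over.

    headroom_fraction reserves part of the memory for activations,
    KV cache and the runtime itself; 0.2 is an assumed default.
    """
    if not 0 <= headroom_fraction < 1:
        raise ValueError("headroom_fraction must be in [0, 1)")
    usable_gb = gpu_memory_gb * (1 - headroom_fraction)
    return sum(model_sizes_gb.values()) <= usable_gb


if __name__ == "__main__":
    # Placeholder footprints in GB; replace with measured values.
    sizes = {"asr": 1.0, "tts": 2.0, "llm": 8.0, "translation": 2.0}
    for memory_gb in (16, 24):
        print(memory_gb, "GB GPU:", fits_on_gpu(sizes, memory_gb))
```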
Lazy loading and pre-loading of models based on use case.
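
One possible shape for this is sketched here: a small registry where each model is registered with a loader function, and a `preload` flag decides whether it loads at startup or on first use. `ModelRegistry` and the loader functions are hypothetical; real loaders would build the ASR, TTS, LLM or Translation model.

```python
from typing import Any, Callable, Dict


class ModelRegistry:
    """Keeps one instance per model; loads at startup or on first use."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], Any]] = {}
        self._loaded: Dict[str, Any] = {}

    def register(self, name: str, loader: Callable[[], Any],
                 preload: bool = False) -> None:
        """Register a loader; preload=True loads the model immediately."""
        self._loaders[name] = loader
        if preload:
            self.get(name)

    def get(self, name: str) -> Any:
        """Return the model, calling its loader only the first time."""
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]

    def is_loaded(self, name: str) -> bool:
        return name in self._loaded


if __name__ == "__main__":
    registry = ModelRegistry()
    # Stand-in loaders that return strings instead of real models.
    registry.register("asr", lambda: "asr-model", preload=True)
    registry.register("translation", lambda: "translation-model")
    print(registry.is_loaded("asr"), registry.is_loaded("translation"))
    registry.get("translation")
    print(registry.is_loaded("translation"))
```

Which models are pre-loaded would follow the use case: a voice-only deployment might pre-load ASR and TTS and leave Translation lazy.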
Scaling / scheduling and observations