2025 03 01 dhwani work division

Dhwani Work Division

  • Sachin - Integration, Deployment, Research Plan, Demo
  • Sahana - Text to Speech - UX, benchmarks, model optimisation

  • Students

    • Model Conversion
      • ASR - IndicConformer, based on NVIDIA NeMo
        • ONNX export
        • Triton server
        • raycast server
    • Model tests and optimal GPU inference handling
    • Re-training and evaluation
  • Identify - the lowest-compute GPU cloud setup that can handle Voice mode for 1 / 10 / 100 / 1,000 / 10,000 concurrent users

  • Fit models - ASR + TTS + LLM + Translation

  • Lazy loading and pre-loading of models based on use-case
  • Scaling / scheduling and observations
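
A minimal sketch of the lazy-loading / pre-loading item above, assuming a single Python serving process. The names ModelRegistry and USE_CASE_PRELOAD, the use-case labels, and the placeholder loaders are all hypothetical and only illustrate the idea; real loaders would construct the ASR, TTS, LLM and Translation models.

```python
import threading
from typing import Any, Callable, Dict, Iterable, List


class ModelRegistry:
    """Keeps one loader per model and instantiates each model at most once."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], Any]] = {}
        self._models: Dict[str, Any] = {}
        # One coarse lock: simple, but it serialises concurrent loads.
        self._lock = threading.Lock()

    def register(self, name: str, loader: Callable[[], Any]) -> None:
        """Record how to build a model without building it yet."""
        self._loaders[name] = loader

    def get(self, name: str) -> Any:
        """Lazy path: build the model on first request, then reuse it."""
        with self._lock:
            if name not in self._models:
                self._models[name] = self._loaders[name]()
            return self._models[name]

    def preload(self, names: Iterable[str]) -> None:
        """Eager path: build the listed models before any request arrives."""
        for name in names:
            self.get(name)

    def loaded(self) -> List[str]:
        """Names of the models currently held in memory."""
        with self._lock:
            return sorted(self._models)


# Hypothetical mapping from use-case to the models worth pre-loading.
USE_CASE_PRELOAD = {
    "voice_mode": ["asr", "llm", "tts"],
    "translate_text": ["translation"],
}

registry = ModelRegistry()
for model_name in ("asr", "tts", "llm", "translation"):
    # Placeholder loader (assumption): stands in for a real model constructor.
    registry.register(model_name, lambda n=model_name: f"<{n} model placeholder>")

# Pre-load what Voice mode needs; translation stays unloaded until first use.
registry.preload(USE_CASE_PRELOAD["voice_mode"])
```

With this split, the models needed for a use-case are in memory before the first request, while the remaining models cost no GPU memory until someone calls registry.get() for them.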