2025 05 27 swara introduction

Swara - Introduction - 27 May, 2025

  • Resources/ GPU server requirements
  • Experiments / Inference
    • 1x H100 GPU : 2.49 $/hour
    • 2.49 * 24 : 59.76 $/day
    • 1852.56 $ /month
    • 22,230.72 $/year
  • Training

    • 10 X H100
    • 18525.6 $/month
  • Phase 1 : Text to Song

  • Use compositions/ varna's from purundaradasa.org and create simple songs without music

  • Phase 2 : Song + Music

  • Sync music to songs with Audiocraft

  • Phase 3 : Editable songs

  • Generate Music based on user requests in real-time

  • Phase 1 - Detail

  • Data collection - https://tkgovindarao.org/compositions.php
  • Data transformation
    • parse document with LLM to convert to Huggingface/dataset format
  • Literature survey and experiments
    • Text to Song models
    • Availability of Open source / Indian language models
  • Annotation and feedback loop software
    • System design, development and deploymeny
  • Inference server setup
    • llama.cpp server optimsation > CPU/GPU compatbility
  • tool_call with qwen3 for music embedings
  • IndicSeamless - Audio conversion to other languages