Projects

Smriti - Cultural Heritage Data Platform

AI4Bharat, IIT Madras | Large-Scale Multimodal Dataset Project

A comprehensive full-stack platform for collecting India’s largest multimodal cultural dataset, covering all 7,000 taluks of India. The project aims to capture approximately 10.5 million cultural items (images, videos, audio) with detailed local-language captions to build culturally-aware AI training datasets and preserve India’s living heritage.

My Role - Full-Stack Development:

  • Developed Flutter mobile app with offline-first architecture, enabling field data collection without internet connectivity
  • Built Flask REST API backend with Celery task processing, PostgreSQL database, and MinIO cloud storage
  • Implemented Vue.js web viewer for data visualization and quality assurance workflows
  • Created comprehensive sync system handling metadata, entities, and chunked resource uploads
  • Designed robust database schemas for cultural metadata (frequency, time periods, significance, keywords)
  • Integrated background processing for multimedia file handling and metadata extraction

Key Technical Features:

  • Offline-first sync: Multi-phase synchronization with automatic conflict resolution
  • Scalable architecture: Supports thousands of field collectors with reliable data sync
  • Quality assurance workflow: Complete QA system with resource review and approval
  • Rich metadata collection: Cultural context, historical significance, location data, and detailed captioning
  • Cross-platform support: Flutter app runs on Android, iOS, and web
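
The chunked resource upload mentioned above can be illustrated with a minimal sketch. It is not the platform's actual sync code: the `Chunk` type, the function names, and the per-chunk SHA-256 checksum are assumptions chosen for illustration. It shows how a large media file might be split so that an interrupted upload resumes with only the chunks the server has not yet acknowledged.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Set


@dataclass(frozen=True)
class Chunk:
    """One slice of a resource (hypothetical structure, for illustration)."""
    index: int
    data: bytes
    sha256: str


def split_into_chunks(payload: bytes, chunk_size: int) -> Iterator[Chunk]:
    """Yield fixed-size chunks, each tagged with its index and checksum."""
    for index, start in enumerate(range(0, len(payload), chunk_size)):
        data = payload[start:start + chunk_size]
        yield Chunk(index, data, hashlib.sha256(data).hexdigest())


def pending_chunks(chunks: Iterable[Chunk], acknowledged: Set[int]) -> List[Chunk]:
    """Return the chunks whose indices the server has not acknowledged."""
    return [c for c in chunks if c.index not in acknowledged]


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Verify every checksum, then join chunks in index order."""
    ordered = sorted(chunks, key=lambda c: c.index)
    for c in ordered:
        if hashlib.sha256(c.data).hexdigest() != c.sha256:
            raise ValueError(f"chunk {c.index} failed checksum")
    return b"".join(c.data for c in ordered)
```

Because each chunk carries its own index and checksum, chunks can arrive in any order and be retried independently, which suits field devices with intermittent connectivity.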

Tech Stack: Flutter, Dart, Flask, Python, PostgreSQL, SQLite, Redis, Celery, MinIO, Vue.js 3, SQLAlchemy, JWT Authentication

Impact: Creating the foundation for culturally-intelligent AI models that understand India’s diverse languages and traditions

Cultural Data Collection

View Live Demo


Indic Parler-TTS

Published at Interspeech 2025 (RASMALAI)

Developed a multilingual text-to-speech model that synthesizes speech in 24 languages (23 Indic languages + English). Built on the Parler-TTS architecture, the model gives fine-grained control over speech characteristics through descriptive text prompts and produces natural-sounding speech.

My Contributions:

  • Curated and cleaned 8,385-hour multilingual speech dataset with detailed caption annotations
  • Contributed to training setup, performance monitoring, and troubleshooting
  • Participated in experimental design and refined training strategies
  • Co-authored research paper accepted at Interspeech 2025 (RASMALAI)

Research Impact:

  • Introduces RASMALAI dataset for controllable text-to-speech
  • Enables fine-grained prosodic control via text prompts
  • Supports 24 languages with high-quality synthesis
  • Advances multilingual TTS research for Indian languages

Tech Stack: Python, PyTorch, HuggingFace Transformers, TTS, Speech Processing

View on HuggingFace


Machine Learning Algorithms

Adopted by IIT Madras as Official Course Notes

A comprehensive resource on machine learning algorithm foundations, covering both theoretical concepts and practical implementations. This project was created as part of my teaching assistant role and has been officially adopted by IIT Madras as standard course material for their Machine Learning course.

Content Includes:

  • Detailed explanations of classical ML algorithms
  • Python implementations from scratch
  • Supervised and unsupervised learning techniques
  • Model internals and mathematical foundations
  • Effective EDA and model building guidance

Recognition:

  • Used as official course notes at IIT Madras
  • Comprehensive coverage from basics to advanced topics
  • Combines theoretical rigor with practical implementation

Tech Stack: Python, Scikit-Learn, Pandas, TensorFlow, Quarto, Markdown, LaTeX

View Documentation

View on GitHub

Blog-Lite v2

Awarded ‘Best Project’ - IIT Madras Coursework

A full-stack web application for creating and sharing text and image blogs with social features. Built as part of IIT Madras coursework and recognized as the Best Project in the cohort.

Key Features:

  • User authentication and profile management
  • Text and image blog creation and sharing
  • Social features: follow users, search, and comment on blogs
  • RESTful API for data access and modification
  • Background task processing with Celery
  • Redis caching for performance optimization

Tech Stack: Vue.js 3, Python, Flask, SQLite, Redis, Celery, SQLAlchemy

View on GitHub


Other Projects

View Additional Projects

Business Data Management Capstone: Decoding Real-World Business Challenges

Score: 92/100

Plot showing Sales during Various Time slots

A comprehensive data analytics project tackling real-world business challenges in collaboration with a local cafe. Applied data science techniques to analyze business operations, identify patterns, and provide actionable insights.

Project Highlights:

  • Comprehensive business proposal with problem identification
  • Extensive data analysis and visualization
  • Statistical analysis of sales patterns across time slots
  • Actionable recommendations based on data insights
  • Scored 92/100 in the final evaluation

Skills Demonstrated: Data Analysis, Business Intelligence, Statistical Analysis, Data Visualization, Report Writing

View Project


Building a Cat Image Classifier using Neural Networks from Scratch

This project showcases the implementation of deep learning algorithms from scratch for cat image classification using NumPy and SciPy in Python.

The neural network architecture is developed within a Jupyter Notebook, emphasizing the creation of layers, activation functions like ReLU and Sigmoid, and training procedures.

Quarto converts the notebook into a webpage that gives an accessible overview of the custom-built deep learning algorithms used to classify cat images.
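
As a rough illustration of the building blocks described above, the sketch below implements ReLU and Sigmoid activations and a forward pass in NumPy. It is not the notebook's code: the layer sizes, the small random initialization, and the function names are assumptions made for the example.

```python
import numpy as np


def relu(z):
    """Element-wise ReLU: max(0, z)."""
    return np.maximum(0, z)


def sigmoid(z):
    """Element-wise logistic sigmoid."""
    return 1 / (1 + np.exp(-z))


def init_params(layer_dims, seed=0):
    """Create (W, b) pairs for each layer; small random weights, zero biases."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:]):
        W = rng.standard_normal((n_out, n_in)) * 0.01
        b = np.zeros((n_out, 1))
        params.append((W, b))
    return params


def forward(X, params):
    """ReLU on hidden layers, Sigmoid on the output layer.

    X has shape (n_features, n_examples); the result has shape
    (n_outputs, n_examples) with values in (0, 1).
    """
    A = X
    for W, b in params[:-1]:
        A = relu(W @ A + b)
    W, b = params[-1]
    return sigmoid(W @ A + b)
```

For a binary cat/non-cat classifier, the single sigmoid output can be read as the predicted probability that an image contains a cat.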

View Project

Exploring Regularization Techniques in Neural Networks for French Football Corporation

Field Tactic Image

This project explores the implementation of deep learning algorithms, focusing on neural networks and regularization techniques like L2 and Dropout, applied to optimize ball kicking positions for French football players.

The project was primarily built using Python’s NumPy and SciPy libraries to create the neural network from scratch.

The notebook has been converted into a webpage using Quarto.
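
The two regularization techniques named above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the notebook's code: the function names, the `lambd` and `keep_prob` parameters, and the use of inverted dropout are assumptions.

```python
import numpy as np


def l2_penalty(weights, lambd, m):
    """L2 term added to the cost: (lambd / (2 * m)) * sum of squared weights.

    weights is a list of weight matrices, m is the number of examples.
    """
    return (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)


def inverted_dropout(A, keep_prob, rng):
    """Zero each activation with probability 1 - keep_prob.

    Surviving activations are divided by keep_prob so the expected
    value of the layer output is unchanged. The mask is returned so
    the same units can be dropped during backpropagation.
    """
    mask = rng.random(A.shape) < keep_prob
    return (A * mask) / keep_prob, mask
```

The L2 term discourages large weights, while dropout keeps the network from relying on any single hidden unit. Dropout is applied only during training and is switched off at prediction time.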

View Project


Neural Image Style Transfer using PyTorch

This project uses PyTorch to perform neural image style transfer: rendering the content of one image in the artistic style of another.

The notebook uses pre-trained VGG19 models as feature extractors. It explains how to extract the important details from images and defines a measure of how similar or different two images are. It then uses these to gradually shift one image's style toward that of another.

The notebook is a good starting point for anyone interested in style transfer. It shows how an image can be made to resemble a given art style by combining its content with another image's artistic details.
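
A common way to measure how close two images are in style transfer is to compare Gram matrices of their feature maps. The sketch below shows that idea with NumPy arrays standing in for VGG19 feature maps. It is an assumption-laden illustration rather than the notebook's PyTorch code: the function names, the `(channels, height, width)` layout, the normalization, and the `alpha` and `beta` weights are all choices made for the example.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map.

    Normalizing by C * H * W is one common convention, assumed here.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return (flat @ flat.T) / (c * h * w)


def content_loss(generated, content):
    """Mean squared difference between feature maps."""
    return float(np.mean((generated - content) ** 2))


def style_loss(generated, style):
    """Mean squared difference between Gram matrices."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))


def total_loss(generated, content, style, alpha=1.0, beta=1e3):
    """Weighted sum that the generated image is optimized to reduce."""
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)
```

In a full pipeline the generated image's pixels would be updated by gradient descent on this total loss, which is where an automatic differentiation library such as PyTorch comes in.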

View Project

Style Transfer example