Hi, I'm Ashutosh Roy


Experienced in Machine Learning, NLP, and LLM-based systems, with a focus on building scalable, production-ready AI solutions. Worked on OCR and RAG pipelines at IIT Delhi. Proficient in Python, PyTorch, Hugging Face, OpenCV, and FastAPI.

Welcome

About Me

Get to know me better

Ashutosh Roy

AI/ML Engineer

Experienced in Machine Learning, NLP, and LLM-based systems, with a focus on building scalable, production-ready AI solutions. Worked on OCR and Retrieval-Augmented Generation (RAG) pipelines at IIT Delhi for large-scale document understanding.

Full Stack AI Developer

Proficient in Python, PyTorch, Hugging Face, OpenCV, and FastAPI, with strong experience in developing, optimizing, and deploying end-to-end AI systems using vector databases like FAISS and Pinecone. Currently pursuing a BTech (Hons) in CSE (AI) at CSVTU, Bhilai.

Beyond the Code

Outside of AI development, I play badminton. I also run a YouTube channel where I post vlogs and study tutorials, and I manage the Instagram page @atrexplains, where I share useful websites and tech insights.


My Projects

Check out my recent work

Speech Emotion & Stress Detection

Built ML/DL models using MFCC, pitch, and energy features. Applied SVM, Random Forest, CNN, and wav2vec2 embeddings. Improved performance using augmentation and feature engineering. Accepted at RECAAP 2026 (IIT Palakkad).

Python PyTorch Wav2Vec2 BiLSTM SVM Deep Learning

Archaeological Search Engine (IIT Delhi)

Built an OCR + ML pipeline for extracting structured data from scanned archaeological documents. Enabled semantic search using embeddings and hybrid retrieval with FAISS/Pinecone and Apache Solr. Co-authored the SARCH paper with IIT Delhi.

Python FastAPI FAISS Pinecone OCR RAG
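
The hybrid retrieval mentioned for the Archaeological Search Engine can be illustrated with a small sketch. This is not the project's actual code: the function names, the `alpha` weight, the toy three-dimensional vectors, and the sample documents are all assumptions made for illustration. A term-overlap score stands in for the lexical engine (Apache Solr in the project) and a brute-force cosine similarity stands in for the vector index (FAISS/Pinecone in the project).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def keyword_score(query, text):
    """Fraction of query terms found in the text (hypothetical stand-in for a lexical score)."""
    query_terms = query.lower().split()
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(1 for term in query_terms if term in text_terms) / len(query_terms)

def hybrid_search(query, query_vec, docs, alpha=0.5, top_k=3):
    """Rank docs by a weighted sum of dense and lexical scores.

    alpha is an assumed tuning weight: 1.0 means dense only, 0.0 means lexical only.
    """
    scored = []
    for doc in docs:
        dense = cosine(query_vec, doc["vec"])
        lexical = keyword_score(query, doc["text"])
        scored.append((alpha * dense + (1 - alpha) * lexical, doc["id"]))
    scored.sort(reverse=True)  # highest combined score first
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy corpus with made-up embeddings, purely for illustration.
DOCS = [
    {"id": "d1", "text": "copper coin hoard excavation report", "vec": [0.9, 0.1, 0.0]},
    {"id": "d2", "text": "temple inscription survey", "vec": [0.1, 0.9, 0.0]},
    {"id": "d3", "text": "pottery shards and coin fragments", "vec": [0.7, 0.2, 0.1]},
]

results = hybrid_search("coin excavation", [1.0, 0.0, 0.0], DOCS)
print(results)
```

In a real deployment the two scores would typically come from the search engine and the vector store rather than being computed in a loop, and they would usually need normalizing before being combined.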

Exam-Helper RAG

Built an LLM-based study assistant using LLaMA 3 and a RAG architecture. Integrated FAISS/Pinecone for efficient retrieval. Enabled multimodal querying over PDFs, images, and YouTube content for comprehensive exam preparation.

LLaMA 3 RAG FAISS Pinecone Python LangChain

Medical Chatbot (LLM)

Fine-tuned Mistral 7B using QLoRA for domain-specific healthcare NLP tasks. Integrated an OCR pipeline for extracting insights from medical reports. Built an AI chatbot that improved patient query resolution by 40%.

Mistral 7B QLoRA NLP OCR Python Hugging Face

Music Identification App

Built a real-time audio recognition system using the ACRCloud API. Designed an end-to-end pipeline: audio capture → feature extraction → matching → metadata retrieval for instant song identification.

Python ACRCloud API Audio Processing REST API Streamlit

Portfolio Website

Modern, animated portfolio website featuring advanced CSS animations, smooth transitions, dark/light mode, and an immersive user experience.

HTML5 CSS3 JavaScript

My Skills

Technologies I work with

Python

PyTorch

TensorFlow

Hugging Face

NLP & LLMs

OpenCV

scikit-learn

FastAPI

FAISS & Pinecone

Docker

Streamlit

Git & GitHub

C++

Transformers

RAG Systems

AI Agents

LangChain

LlamaIndex

Get In Touch

Let's work together

Email

ashu2003roy@gmail.com


WhatsApp

+91 98105 99837
