Dhruv Sood

New Delhi · (+91) 7011489011 · dhruvsood6@gmail.com

Currently Doing

AI Agent Simulations

Working on AI agents for targeted Monte Carlo simulations of bias across Indian socio-cultural contexts in the healthcare and judiciary domains.

Text-to-SQL Agents

Working on text-to-SQL using agentic workflows and novel frameworks.

AIISC

LLM fairness & constitutional alignment; dataset creation and evaluations.

A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models

MIT Tech Review, TACL Review

Probo

Production RAG systems, Microservices Architecture, API-level SQL converters

Experience

AI Research Intern

Artificial Intelligence Institute of South Carolina USA (Remote) {Dr. Amitava Das}
  • Collaborating with Prof. Amitava Das on an LLM fairness framework for mitigating caste and religious bias in Indian socio-cultural contexts; developing novel bias detection algorithms for constitutional AI alignment
  • Curated an AI Constitution dataset of 46K prompts spanning diverse religious and caste identities; benchmarked GPT-4, LLaMA, and Indic-specific language models for bias consistency
  • AMBEDKAR: A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models (Under review at TACL 2026) — arXiv:2509.02133
Feb 2025 – Present

Software Engineer

Probo
  • Architected production-grade RAG multi-agent chatbot system with Server-Sent Events, entity extraction algorithms, and real-time widgets; delivered 35% surge in user engagement and page visit metrics
  • Engineered high-performance retrieval infrastructure featuring API-level SQL converters and OpenSearch integration; cut query response times from 1.5s to 60ms (a 96% latency reduction) under production traffic
  • Spearheaded News Content Platform with autonomous content generation pipeline and SEO optimization; scaled platform to serve 3K+ daily active users with intelligent recommendation algorithms
  • Orchestrated intelligent Apache Airflow ETL pipelines processing 100K+ financial records daily; established automated 10-minute update cycles for CPI/CFPI metrics with comprehensive monitoring
Jan 2025 – Sept 2025

Research Intern

Tav Labs, IIITD {Dr. Tavpritesh Sethi}
  • Preprocessed 8,000 cancer datasets for the Integrated Healthcare Platform, enabling clustering of similar datasets through dimensionality reduction techniques
  • Cleaned, integrated, and clustered the data using BioBERT embeddings and t-SNE visualization, enabling near-instant identification of similar datasets, eliminating manual review, and accelerating data integration
April 2024 – Present

ML Engineering Intern

National Stock Exchange (NSE India), New Delhi, India
  • Designed enterprise-scale Retrieval-Augmented Generation system with Qdrant vector database and LlamaIndex orchestration; enhanced regulatory document retrieval accuracy by 50% while eliminating manual processing overhead
  • Deployed 10K+ financial regulatory documents into vector database with hybrid search capabilities; accelerated information retrieval efficiency by 50% through semantic similarity and keyword matching fusion
May 2024 – Jul 2024

Research Intern

AMS Lab, IIITD {Dr. Sujay Deb}
  • Collaborated with orthodontists to develop an intra-oral monitoring system, researching sensors and creating a prototype
  • Integrated circuitry and programming for real-time tracking, increasing wear time reporting accuracy by 23%
  • Achieved 98% precision in varied intra-oral conditions, enhancing oral health monitoring metrics
Feb 2024 – Present

Teaching Assistant, Human-Computer Interaction (HCI)

MIDAS Lab, IIITD {Dr. Rajiv Ratn Shah}
  • Led weekly tutorials for 30+ students, resulting in a 100% satisfaction rate
  • Graded assignments for 600+ students, achieving a 98% on-time submission rate
July 2023 – Dec 2023

Education

Indraprastha Institute of Information Technology, Delhi (IIIT Delhi)

B.Tech. in Computer Science with Specialization in Social Sciences

CGPA: 8.14

Dec 2021 – May 2025

Delhi Public School RK Puram, New Delhi

High School

Gold Medalist | Qualified NSEC Exam, School Topper in NGSE Exam

May 2018 – March 2020

Skills

Programming Languages
Python Java C++ C SQL TypeScript
Machine Learning & AI
PyTorch TensorFlow scikit-learn LangChain LangGraph Neural Networks Deep Learning
Data Science & Analytics
NumPy Pandas SciPy Matplotlib Librosa Statistical Modeling Data Mining Feature Engineering
Backend & APIs
FastAPI REST APIs Server-Sent Events (SSE) Microservices Architecture
Databases & Search
MySQL OpenSearch Elasticsearch Qdrant Vector Database Neo4j Database Optimization
DevOps & Cloud
Docker Kubernetes Apache Airflow AWS (S3, EC2) Git CI/CD Kong Gateway
Specialized Tools
Selenium Figma Unsloth Jina AI Devtron

Interests

Technical Interests

I have a deep interest in several technical domains, including:

  • Natural Language Processing (NLP)
  • Machine Learning (ML)
  • Large Language Models (LLMs)
  • Information Retrieval
  • Database Management
  • Algorithm Design
  • Object-Oriented Programming (OOP)
  • Computation in Medicine
Personal Interests

Outside the technical realm, I am passionate about:

  • Listening to Music
  • Playing Badminton
  • Mentorship
  • Community Work

Awards & Achievements

  • AIR 88 in Naukri Campus Young Turks Competition among 800k candidates.
  • Team ranked in the top 5 percentile in the Adobe Gensolve’24 hackathon across India.
  • 2nd place in the IEEE BlackSlash Cryptic Hunt 2024.
  • 99+ percentile in the AMCAT Computer Science test among 20 million candidates.
  • Top 1.5% in JEE Mains among 1 million candidates.
  • Global rank 613 in CodeChef Starters 52; rated ‘Pupil’ on Codeforces.
  • Awarded Gold Medal for 6 years of academic excellence.
  • Community Worker (Rural Agriculture Development Society).
  • SMP Mentor (Guided a group of freshman students, providing support and advice to help them acclimate to university life and academic challenges).

Projects

Tank Stars Clone

Java, LibGDX, OOP, In-Game Mechanics, Event-Driven Programming

This project is a clone of the Tank Stars game built with the LibGDX framework, featuring object-oriented design, dynamic terrain, and customizable multiplayer settings.

Pharmacy Management System

SQL, OLAP Queries, Triggers, Database Design

An online database management system that manages inventory, sales, and prescriptions for a retail pharmacy, built with OLAP queries and triggers.

Resume Job Matching Platform

Python, TF-IDF, BERT, Gemini AI, Information Retrieval

Developed a BERT-based platform for matching resumes to job postings, with precision, recall, and F1 scores between 0.98 and 0.99. Integrated a resume optimization feature through the Gemini AI API.
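The TF-IDF side of this kind of matching can be sketched in a few lines. This is an illustrative baseline only, not the project's code: the sample texts, the whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_vectors` and `cosine` are all assumptions, and the BERT and Gemini components are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample data for illustration.
jobs = ["python backend engineer fastapi", "graphic designer figma branding"]
resume = "python developer with fastapi backend experience"

vectors = tfidf_vectors(jobs + [resume])
scores = [cosine(vectors[-1], job_vec) for job_vec in vectors[:-1]]
best = max(range(len(jobs)), key=scores.__getitem__)
print(jobs[best])  # python backend engineer fastapi
```

The resume shares three terms with the first posting and none with the second, so the first posting ranks highest.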

Toolset Based on Amazon Dataset

LLM FineTuning, Collaborative Filtering, CNNs, TF-IDF

  • Multimodal Retrieval: Fine-tuned a ResNet CNN for image feature extraction and utilized TF-IDF for text processing, achieving a composite similarity score exceeding 90% in matching images with corresponding text.
  • Recommendation System: Applied collaborative filtering techniques to analyze user interactions and preferences, recommending relevant items with a mean absolute error (MAE) of 0.3 by matching users with similar histories.
  • Review Summarization: Fine-tuned GPT-2 to summarize reviews, achieving a ROUGE score of 55%. This involved building custom DataLoader classes and optimizing hyperparameters.
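For the recommendation component, a minimal sketch of user-based collaborative filtering and the MAE metric is shown below. The toy `ratings` table, the cosine similarity over co-rated items, and the helper names are assumptions made for illustration; the project's actual data and similarity measure are not stated here.

```python
import math

# Hypothetical user -> {item: rating} table.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 3, "c": 5, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}


def similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    neighbors = [
        (similarity(user, other), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in neighbors) / total


def mae(pairs):
    """Mean absolute error over (actual, predicted) pairs."""
    return sum(abs(actual - predicted) for actual, predicted in pairs) / len(pairs)


print(predict("u1", "d"))
print(mae([(3, 3.5), (4, 3.5)]))  # 0.5
```

Here `u1` is much closer to `u2` than to `u3`, so the predicted rating for item `d` leans toward `u2`'s rating of 4.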

Signal Source Separation

Machine Learning, SSSpy, librosa

Implemented signal source separation using classical ML techniques, achieving a Signal-to-Noise Ratio of 20+ dB when separating voice from background music.
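For reference, an SNR figure in dB is commonly computed as 10·log10 of the reference signal energy over the residual energy. The sketch below assumes that definition and uses made-up sample values; how the project measured its SNR is not stated here, and `snr_db` is a hypothetical helper.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB: reference energy divided by the energy of (estimate - reference)."""
    signal_energy = sum(r * r for r in reference)
    noise_energy = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    if noise_energy == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_energy / noise_energy)


# Hypothetical clean source and separated estimate (each sample off by 0.1).
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.9, -1.1]
print(round(snr_db(reference, estimate), 1))  # 20.0
```

A 20 dB result means the residual error carries about 1/100 of the energy of the reference signal.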

Unix Shell System

Shell, OS, CLI, Threading

Implemented a shell supporting commands such as mkdir, ls, date, cd, echo, cat, pwd, and rm. Additional features include symbolic link handling, output redirection, and threading and forking for external commands.

PnL Tracker

FastAPI, Backend, Python, FIFO Accounting, Decimal Safety

A small, clean backend that records trades, shows current portfolio, and computes realized & unrealized PnL. Single-user, in-memory, FIFO accounting with Decimal safety and a simple FastAPI surface.
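The FIFO accounting described above can be sketched as follows. This is a simplified illustration, not the project's code: the `FifoBook` class, its method names, and the single-symbol, long-only assumption are all hypothetical, and the FastAPI layer is omitted.

```python
from collections import deque
from decimal import Decimal


class FifoBook:
    """Hypothetical single-symbol, long-only position book using FIFO lots."""

    def __init__(self):
        self.lots = deque()           # open lots as [quantity, price], oldest first
        self.realized = Decimal("0")  # realized PnL so far

    def buy(self, qty, price):
        # Pass strings, not floats, so Decimal keeps exact values.
        self.lots.append([Decimal(qty), Decimal(price)])

    def sell(self, qty, price):
        qty, price = Decimal(qty), Decimal(price)
        if qty > sum(lot[0] for lot in self.lots):
            raise ValueError("cannot sell more than the open quantity")
        while qty > 0:
            lot = self.lots[0]               # oldest lot is consumed first
            matched = min(qty, lot[0])
            self.realized += matched * (price - lot[1])
            lot[0] -= matched
            qty -= matched
            if lot[0] == 0:
                self.lots.popleft()

    def unrealized(self, mark_price):
        mark = Decimal(mark_price)
        return sum(((mark - price) * qty for qty, price in self.lots), Decimal("0"))


book = FifoBook()
book.buy("10", "100.00")
book.buy("5", "110.00")
book.sell("12", "120.00")
print(book.realized)               # 220.00
print(book.unrealized("115.00"))   # 15.00
```

The sale of 12 units consumes the whole first lot (10 × 20.00) and 2 units of the second (2 × 10.00), leaving 3 units at 110.00 open.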

Groundwater Quality Analysis

R, RStudio, Kuznets Curve, Regression Analysis, Statistical Testing

Formulated a robust multiple regression model to explore groundwater quality and socio-economic impact, establishing insights for sustainable development. Presented findings and statistical tests through detailed reports to support data-driven decision-making.

Skin Lesion Detection

Deep Learning, CNN, HAM10000, Jupyter, Medical AI

Built a deep learning model to classify skin lesions using the HAM10000 dataset, focusing on distinguishing between benign and malignant types, with a web interface for real-time analysis.
