Nikhil Mourya - ML Engineer · Multi-Agent Systems

About

I'm Nikhil Mourya. I build ML systems where the math decides when to stop, not the model's confidence. Retrieval scores over self-reported certainty. Pruned weights over bloated checkpoints. Shipped code over benchmark screenshots.

BIT Mesra, studying what doesn't fit in a lecture hall. Right now: questioning whether your retrieval pipeline actually needs that vector DB, or just needs a better question.

I'm always maintaining unbroken eye contact with that single code block like we got beef.

Model Compression

Pruning and LoRA are quiet admissions that full fine-tuning is often billing for capacity you never needed. I like models that shrink without forgetting what they're for.

efficiency

NLP & Text Systems

Summarization, classification, multi-agent pipelines where fluent isn't the same as faithful. Production NLP is mostly telling confident hallucinations they can't sit with us.

production

Multi-Agent Systems & MLOps

LangGraph orchestration, eval loops that catch what the LLM won't admit, FastAPI + Docker + AWS: the glue between a Jupyter notebook and someone else's pager.

infra

Competitive Programming

Codeforces Specialist, 1400+. Graphs have a way of humbling you on a schedule; the upside is you stop trusting clever one-liners without proof.

cf · 1400+

ongoing

Specialist Codeforces 1400+

Still grinding rated rounds. The graphs are optional; the ego damage isn't. Proof you can think with a clock breathing down your neck.

peak: 1400+ · still climbing

2024

Finalist SIH - Hospital Mgmt System

99.5% uptime with real models in the loop; the 0.5% was character development. Backend stayed polite even when the night shift wasn't.

99.5% uptime · production ML

2023

Finalist IIIT Delhi - ResNet50

94% accuracy after grid search stopped me from brute-forcing the hyperparameter void. Sometimes the boring search is the clever move.

94% acc · −30% train time

Skills

PyTorch

TensorFlow

HuggingFace

Scikit-learn

NumPy

Pandas

Python

FastAPI

Git

MySQL

LangChain

LangGraph

Redis

🧠 ChromaDB

🦙 Ollama

⚡ Groq

C++

AWS

Linux

Jupyter

GitHub

📊 Matplotlib

OpenTelemetry

🔍 LangSmith

Pydantic

📐 RAGAS

🌳 tree-sitter

Docker

RAGs

Projects

// flagship.projects

HiveMind

GitHub

"Can a research system know its own answer isn't good enough, and fix it?"

Production-grade autonomous research system with a 5-agent sequential pipeline (Planner → Researcher → Critic → Writer → Evaluator). The orchestrator loops autonomously until a deterministic confidence threshold is met; confidence is calculated as the mean retrieval score, not LLM self-report.
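A minimal sketch of that stopping rule, not the actual HiveMind code. The `research` callable, the 0.75 threshold, and the 5-round cap are all assumptions made for illustration; the only idea taken from the description is that confidence is the mean of the retrieval scores.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical shape: a research step returns (passage, similarity score) pairs.
Retrieval = List[Tuple[str, float]]


def retrieval_confidence(hits: Retrieval) -> float:
    """Mean retrieval score; 0.0 when nothing was retrieved."""
    return mean(score for _, score in hits) if hits else 0.0


def run_until_confident(
    research: Callable[[str, int], Retrieval],
    query: str,
    threshold: float = 0.75,  # assumed value, not from the project
    max_rounds: int = 5,      # assumed cap so the loop always terminates
) -> Tuple[Retrieval, float, int]:
    """Re-run the research step until the mean score clears the threshold."""
    hits: Retrieval = []
    confidence = 0.0
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        hits = research(query, rounds)
        confidence = retrieval_confidence(hits)
        if confidence >= threshold:
            break
    return hits, confidence, rounds
```

The model never gets a vote here: the loop exits on a number computed from the retriever, or on the round cap.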

Infra: asyncio.Lock to serialize LLM execution (one call at a time) · Docker Compose (FastAPI + Redis + ChromaDB + OpenTelemetry) · SSE streaming
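The asyncio.Lock part can be pictured roughly like this. It is a sketch under assumptions: `call_llm` is a hypothetical stand-in (the sleep plays the role of inference), and the real project may hold the lock at a different layer.

```python
import asyncio
from typing import List


async def call_llm(prompt: str, lock: asyncio.Lock, log: List[str]) -> str:
    """Hypothetical LLM call; only one coroutine may be inside the lock."""
    async with lock:
        log.append(f"start {prompt}")
        await asyncio.sleep(0.01)  # stands in for model inference
        log.append(f"end {prompt}")
    return prompt.upper()


async def main() -> List[str]:
    lock = asyncio.Lock()  # created inside the running loop
    log: List[str] = []
    # Three agents fire at once, but their LLM calls run one after another.
    await asyncio.gather(
        *(call_llm(p, lock, log) for p in ("plan", "research", "critique"))
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Every "start" in the log is followed by its own "end" before the next call begins, which is the point when a single local model cannot handle concurrent requests.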

5-agent pipeline · autonomous eval loop · deterministic confidence scoring

Python · FastAPI · ChromaDB · Redis · OpenTelemetry · Groq · Ollama · Docker · Pydantic · asyncio

CodeLens

GitHub · Demo

"Can you search a codebase by intent, not keywords, fully offline?"

Fully offline VS Code extension that indexes codebases using AST-based chunking via tree-sitter and 768-dim embeddings via Ollama. Natural language query → ranked semantic results → one-click jump to exact line. No internet, no API keys.
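To show what AST-based chunking means in practice, here is a small sketch that swaps tree-sitter for Python's built-in `ast` module so it runs with no dependencies. That swap, the `Chunk` type, and the choice to cut at function and class boundaries are assumptions for illustration, not the extension's code.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    name: str
    kind: str
    start_line: int
    end_line: int
    text: str


def chunk_python_source(source: str) -> List[Chunk]:
    """One chunk per function or class definition, nested ones included."""
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                Chunk(node.name, type(node).__name__, node.lineno, node.end_lineno, text)
            )
    return sorted(chunks, key=lambda c: c.start_line)
```

Each chunk keeps its start line, which is what makes a jump to the exact line possible once a chunk is ranked. Embedding and ranking would sit on top of this and are left out.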

9K+ chunks indexed · sub-10ms ANN query · 100% offline

Python · TypeScript · FastAPI · tree-sitter · Ollama · VS Code API · Docker

Pruned U-Net

GitHub

"How much of a segmentation model is actually load-bearing?"

Structured magnitude pruning pipeline targeting channels, not weights. Key finding: IoU held flat until ~96% reduction then degraded sharply, meaning there's a wide safe compression window most implementations never explore. IIT Kharagpur research collaboration.
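A rough sketch of channel-level magnitude pruning on one convolution weight, written in NumPy so it stays framework-agnostic. Ranking channels by L1 norm and the `keep_ratio` parameter are assumptions; the project's actual criterion and schedule are not stated here.

```python
import numpy as np


def prune_conv_channels(weight: np.ndarray, keep_ratio: float):
    """Keep the output channels with the largest L1 norm.

    weight has shape (out_channels, in_channels, kH, kW).
    Returns the pruned weight and the sorted indices of the kept channels.
    """
    out_channels = weight.shape[0]
    n_keep = max(1, int(round(out_channels * keep_ratio)))
    # One magnitude score per output channel (whole filters, not single weights).
    l1_norms = np.abs(weight).sum(axis=(1, 2, 3))
    keep = np.sort(np.argsort(l1_norms)[-n_keep:])
    return weight[keep], keep
```

Because whole channels disappear, the next layer's input channels have to be sliced with the same `keep` indices. That is what turns the reduction into real FLOPs savings instead of a sparse mask.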

97.3% parameter reduction · 92% FLOPs reduction · IoU > 0.95 on MoNuSeg

PyTorch · U-Net · Model Pruning · Computer Vision · MoNuSeg

// other.work

01

AttentionIsALLICode

Not "used a framework." Actually from scratch.

Full architecture · Multi-head attention · Custom training loop

PyTorch · Transformers · NLP · From Scratch

02

Vectorless RAGs

Vector databases are the default. I wanted to know if the default was actually necessary.
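One way retrieval without embeddings can look is a walk down a document tree. This is a toy sketch, not the project's method: the `Node` type is hypothetical, and the word-overlap scorer is a stand-in for whatever actually picks the branch (for example, an LLM).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    title: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def overlap(query: str, node: Node) -> int:
    """Count query words that appear in the node's title or text."""
    words = set(query.lower().split())
    return len(words & set(f"{node.title} {node.text}".lower().split()))


def retrieve(root: Node, query: str) -> Node:
    """Walk from the root, always descending into the best-matching child."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: overlap(query, child))
    return node
```

No vectors are stored anywhere; the document's own structure does the indexing, and the cost per query is one scoring pass per tree level.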

Zero embeddings · Zero vector DB · Full retrieval

Python · Ollama · LLaMA 3 · RAG · Tree Traversal · Streamlit

03

Second Brain Debugger

A senior engineer code-reviewed my brain. Six stages. Real AI. No affirmations.

6-stage pipeline · SSE streaming · Multimodal input

Next.js 14 · TypeScript · Mistral-7B · Whisper · Stable Diffusion · Zod

04

DermaVision

7 skin lesion classes. 57:1 class imbalance. Focal Loss said no problem.
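For readers who haven't met it, focal loss scales cross-entropy by (1 - p_t)^gamma, so easy, well-classified samples contribute almost nothing and rare, hard ones dominate the gradient. A plain-Python sketch for a single sample follows; gamma = 2 and alpha = 1 are common defaults used as assumptions, not the values tuned for this project.

```python
import math
from typing import Sequence


def focal_loss(
    probs: Sequence[float],
    target: int,
    gamma: float = 2.0,  # assumed default; larger values down-weight easy samples more
    alpha: float = 1.0,  # assumed; in practice often a per-class weight
) -> float:
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 this reduces to ordinary cross-entropy, which makes it easy to compare the two on the same predictions.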

HAM10000 dataset · Grad-CAM XAI · ONNX export

EfficientNet-B3 · PyTorch · FastAPI · Next.js 14 · Docker · Albumentations

05

PEGASUS + LoRA · Efficient Summarization

Fine-tuned a 767M-parameter model using only 1.57M trainable parameters via LoRA. Full fine-tuning produced incoherent outputs on unseen domains. LoRA didn't.
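The arithmetic behind those numbers can be checked in a few lines. The configuration below (hidden size 1024, rank 8, query and value projections adapted in 48 attention blocks) is an assumption, not something stated here; it is simply one setup that lands on about 1.57M trainable parameters.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int, n_matrices: int) -> int:
    """Each adapted weight gains two factors: A (rank x d_in) and B (d_out x rank)."""
    return n_matrices * rank * (d_in + d_out)


# Assumed configuration (not from the source): 48 attention blocks,
# two adapted projections each (query and value), hidden size 1024, rank 8.
trainable = lora_trainable_params(d_in=1024, d_out=1024, rank=8, n_matrices=48 * 2)
total = 767_000_000  # base model size quoted above
reduction_pct = 100 * (1 - trainable / total)

print(f"{trainable:,} trainable parameters, {reduction_pct:.1f}% fewer than full fine-tuning")
```

The base weights stay frozen, so the optimizer only ever touches the small low-rank factors.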

99.8% param reduction · 27× faster training · 767M → 1.57M

PyTorch · HuggingFace · PEFT · LoRA · NLP · XSum

// ventures

Founder

DevPath

Structured learning for developers who are tired of tutorial hell.

Growing curated roadmaps

Daily focused tasks

Free to get started

Visit DevPath ↗

// experience.log

LAYER_01: HireBuddy_Software_Engineer_ML (2025) [STATUS: DEPLOYED]

[ROLE] Software Engineer - Machine Learning

[INIT] Resume-JD matching via RAG + Transformers. Zero regex. Zero vibes.

[THROUGHPUT] NLP_pipeline.ingest(resumes=10k/day, mode="shortlist")

[METRIC] shortlist_acc: +35% · chain_latency: −28%

[DEPLOY] FastAPI + Docker on AWS · REST API, Friday-proof since day one.

[EVAL] eval_stack: RAGAS · LangSmith, because vibes aren't a metric.

[SIGNAL] Engagement: +40%; the model got better at reading people than the recruiters did.

If you've made it this far,
you might as well say Hi.

Email tsmftxnikhil14@gmail.com

LinkedIn nikhil-mourya

GitHub TryingtobeingNikhil

Twitter / X @GonnabeNikhil

Instagram @not_nikhil14
