Hi, I'm Dhananjay,
AI Engineer specializing in
Computer Vision & LLMs.

Building end-to-end AI platforms for education, construction, and document intelligence.
IIT Bombay | Based in Delhi, India

About

Dhananjay Agnihotri

Building Intelligent Systems

AI Engineer specializing in Computer Vision, LLMs, and RAG systems, with hands-on experience designing and deploying end-to-end AI platforms for education, construction, and document intelligence.

Proven expertise in handwritten OCR, automated grading systems, blueprint analysis, and AI agents that combine vision models with large language models for real-world decision-making. Strong background in system design, scalable inference, and explainable AI.

2+ Years Experience
IIT Bombay
1 Preprint

Skills & Expertise

Core AI & ML

Computer Vision | OCR & Layout Parsing | LLMs & VLMs | RAG Systems | End-to-End CV–LLM Pipelines | Fine-tuning (SFT, RFT, GRPO)

Frameworks & Libraries

PyTorch | TensorFlow | LangChain | LlamaIndex | YOLO | OpenCV | FAISS | Transformers | Hugging Face

APIs & Models

OpenAI API | Gemini API | vLLM | Sentence Transformers | hOCR & Document Parsing

Backend & Deployment

AWS | GCP | FastAPI | Docker | Vast.ai | RunPod | CI/CD

Featured Projects

Inscanner — Research Preprint

Preprint

Research on dual-phase detection and classification of auxiliary insulation using YOLOv8 models. The resulting AI solution detects missing insulation in construction blueprints with 95% accuracy.

YOLOv8 | Computer Vision | PyTorch | OpenCV
View Preprint →

EchoMind — Document Chat AI

RAG

Full-stack AI chatbot for context-aware document conversations. Supports PDF, TXT, and DOCX uploads, with a hybrid query engine that classifies each user query as either document retrieval or general AI chat.

FastAPI | LlamaIndex | FAISS | AWS S3
View Project →
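
The hybrid routing described for EchoMind can be illustrated with a minimal sketch. This is not EchoMind's actual implementation: the cue-word list, the `route_query` function, and the rule that a query is only sent to retrieval when documents have been uploaded are all assumptions made for illustration. A production system would more likely use an LLM or a trained classifier for this decision.

```python
import re

# Hypothetical cue words suggesting the user is asking about an uploaded file.
DOCUMENT_CUES = {"document", "file", "pdf", "page", "section", "uploaded", "report"}


def route_query(query: str, has_documents: bool) -> str:
    """Return "retrieval" for document questions, otherwise "chat"."""
    # With nothing uploaded there is nothing to retrieve from.
    if not has_documents:
        return "chat"
    # Lowercase and split into alphabetic tokens, then look for any cue word.
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    return "retrieval" if tokens & DOCUMENT_CUES else "chat"


if __name__ == "__main__":
    print(route_query("Summarize section 2 of the uploaded PDF", has_documents=True))
    print(route_query("Tell me a joke", has_documents=True))
```

Keeping the router as a single function that returns a label means the keyword rule could later be swapped for a model call without touching the retrieval or chat code paths.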

Smart Scheduler Agent

AI Agent

Voice-enabled meeting assistant integrated with Google Calendar via Google OAuth2. Uses Gemini for intent extraction, Whisper for speech-to-text, and OpenAI TTS for voice responses.

FastAPI | React | Gemini | Whisper
View Project →

Blueprint Intelligence System

Proprietary

Developed an AI system for construction blueprint management at DoAZ that enables natural language queries on drawing content. Implemented layout parsing, OCR, and VLMs with hybrid semantic search.

VLMs | OCR | Vector DB | Semantic Search
Proprietary Work
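
One common way to combine keyword and embedding results in a hybrid search is reciprocal rank fusion (RRF). The sketch below is illustrative only and is not taken from the DoAZ system: the function name, the sample sheet IDs, and the constant `k = 60` (a conventional default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit adds 1 / (k + rank) to its ID."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical rankings from a keyword index and an embedding index.
    keyword_hits = ["sheet-A101", "sheet-S201", "sheet-M301"]
    vector_hits = ["sheet-S201", "sheet-E401", "sheet-A101"]
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Because RRF works on ranks rather than raw scores, it does not require the keyword and vector scores to be on the same scale.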

Geotechnical Report AI

Proprietary

Built an AI-powered agent at DoAZ that automates borehole log extraction from PDFs using OCR, VLLM, and computer vision. Generates 2D/3D visualizations with AI-based soil classification insights.

OCR | VLLMs | 3D Visualization | GPU Inference
Proprietary Work

Automated Grading System

Proprietary

Developed an AI-driven handwritten OCR and automated grading platform at Infutrix for educational institutions. Achieved ~80% cost reduction with rubric-aware grading and explainable AI feedback.

Computer Vision | OCR | LLMs | Explainable AI
Proprietary Work

Code Snippets

RAG Pipeline

Python
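# Import paths assume LangChain 0.1 to 0.3 with langchain-community and langchain-openai installed
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Placeholder document and chat model; swap in your own corpus and LLM.
# Requires OPENAI_API_KEY in the environment.
docs = [Document(page_content="FAISS is a library for vector similarity search.")]
llm = ChatOpenAI()

# Embed the documents and index them in an in-memory FAISS store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)

# Build a question-answering chain on top of the FAISS retriever
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever()
)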
View Full Code →

Model Fine-tuning

Python
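from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

# Placeholder checkpoint and toy data; replace with your own model and dataset
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tokenize to fixed-length inputs so the default collator can batch them
raw = Dataset.from_dict({"text": ["good", "bad"], "label": [1, 0]})
dataset = raw.map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=32))

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

# Trainer runs the training loop and writes checkpoints to output_dir
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()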
View Full Code →

Let's Connect

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision. Let's build something amazing together.