Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Arjun Parasuram Prasad
About me
Posts
Future Blog Post
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Fine-Tuning Llama 3.2–3B on IPCC Climate Reports
Fine-tuning, pretraining, and building a RAG pipeline with a Llama 3.2-3B model on IPCC climate reports in a distributed environment
AI as a Service – Containerized ML Pipeline on Kubernetes
Cloud-native ML pipeline with CNN training as a K8s batch job and scalable multi-replica FastAPI inference on GKE
Market Recovery Dynamics During the 2020 Crash
Large-scale data lake pipeline integrating stock, COVID, mobility, and macroeconomic data via HDFS, MapReduce, and Hive/Trino for crash/recovery analysis
MOT-TR: DETR Fine-Tuning for Moved-Object Detection
Robust, reproducible pipeline for fine-tuning DETR (DEtection TRansformer) to detect moved objects in pairs of images taken from parking lots and intersections (VIRAT dataset).
Limit Order Book Midpoint prediction for Cryptocurrency Stocks
Evaluated classical ML models and designed lightweight neural networks with custom loss functions for SoTA limit order book midpoint prediction for a cryptocurrency stock
OS Simulation Suite
Compact suite of operating system simulations in modern C++ spanning linking/loading, CPU scheduling, memory paging, and disk I/O scheduling
CoBaLI – Continuous Batching for LLM Inference
C++/CUDA inference engine achieving 6.2× throughput via GPU-assisted scheduling on top of llama.cpp, without modifying model kernels
EcoScan – Sustainable Fashion Scanner
1st Place HackNYU 2025 (MLH) — Mobile app that scans clothing items and returns Eco-Scores with AI-powered greener alternatives
RepoRecSys – GitHub Repository Recommendation System
Two-tower neural recommender with cold-start handling, hot model reloading, and containerized FastAPI inference (<100ms latency)
VisaPulse – AI-Powered F-1 Visa Mock Interview Platform
LangGraph-orchestrated agentic interview system with voice interaction, document analysis, RAG over Reddit cases, and personalized PDF evaluations
Handwritten Character Classification Pipeline
ResNet50V2-based EMNIST classifier achieving 93.55% test accuracy with mixed precision and optimized data pipeline
Publications
Context-Aware Behavioral Fingerprinting in IoT Devices
Published in 20th International Conference on Security and Cryptography, SECRYPT 2023, 2023
We propose a context-aware behavioral fingerprinting of IoT devices that takes into account the circumstances or contexts under which the devices are operating. Our fingerprinting strategy uses supervised learning for classifying the IoT devices. Experimental results show that our fingerprinting technique is effective and is capable of identifying IoT devices with more than 94% accuracy.
Recommended citation: Prasad, A.; Biju, K.; Somani, S. and Mitra, B. (2023). Context-Aware Behavioral Fingerprinting of IoT Devices via Network Traffic Analysis. 20th International Conference on Security and Cryptography, SECRYPT 2023.
Auto-Markup BenchMark: towards an industry-standard benchmark for evaluating automatic document markup
Published in Balisage: The Markup Conference 2023 — Balisage Series on Markup Technologies, Vol. 28, 2023
We introduce an early benchmark (Auto-Markup BenchMark) for evaluating automatic markup engines and propose XATER (XML Translation Edit Rate) alongside a validation-error metric to standardize comparisons across tools and tasks.
Recommended citation: Prescod, P.; Feuer, B.; Hladkyi, A.; Paulk, S.; Prasad, A. (2023). Auto-Markup BenchMark: towards an industry-standard benchmark for evaluating automatic document markup. Proceedings of Balisage: The Markup Conference 2023, Balisage Series on Markup Technologies, Vol. 28.
Research Experience
Graduate Researcher – Search and Ranking
Architected multi-stage RAG pipelines and embedding infrastructure for 5.7M+ documents; reduced retrieval latency from 380ms to 130ms.
Research Project
Built a metric-extraction + ML pipeline to flag bug-prone Java files with >91% accuracy; informed Agile policy updates to reduce post-release defects.
Research Project
Built a traffic-analysis pipeline and models that identify IoT devices with >94% accuracy; published at SECRYPT 2023.
Graduate Independent Study
Benchmarking ML algorithms for image-change detection tasks, focused on identifying objects removed from vending machines.
Talks
Talk 1 on Relevant Topic in Your Field
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
Teaching
Teaching Assistant
Vision Meets Machine Learning, New York University, Courant Institute of Mathematical Sciences, 2025
Course Assistant and Grader for the Vision Meets Machine Learning course under Prof. Davi Geiger.
Work Experience
Software Engineer Intern, AI/ML
Building AI-driven data enrichment pipelines, MCP tool integrations, and embedding infrastructure for an AI-native data platform.
Data Analytics Engineer
Built marketing analytics and ETL pipelines for large-scale data migrations; automated deployments; eliminated major outsourcing spend.
Machine Learning Intern
Developed analytical dashboards, OCR, and an early RAG assistant to automate ops and reduce drop-offs.
