Open to full-time roles · Graduating December 2026

Aaditya Pai

ML Researcher, LLM Agent Security

Columbia MS Data Science · Purdue BS Computer Engineering · New York, NY

01

About

I'm an ML researcher focused on the security of LLM agent systems: I build attacks, defenses, and evaluation frameworks for agents deployed in the real world. I'm currently a Graduate Researcher at Columbia's DAPLab (Data, Agents, and Processes Lab), where I work on prompt injection, unsafe tool use, and policy enforcement across multi-step workflows. I'm also a Software Engineer Intern at CodeIntegrity. Alongside the security work, I'm drawn to quantitative finance and to problems where careful evaluation meets real deployment constraints.

03

Experience

  1. Software Engineer Intern · CodeIntegrity

    Jul 2026 – Present

    New York, NY

    • Building agent security benchmarking and evaluation infrastructure at a funded AI agent security startup.
  2. Graduate Researcher · Columbia DAPLab (Data, Agents, and Processes Lab)

    Feb 2026 – Present

    New York, NY

    • Researching adversarial robustness and security of LLM agent systems: prompt injection, unsafe tool use, and policy enforcement across multi-step workflows.
    • Built evaluation infrastructure and attack pipelines across AgentDojo, WebArena, InjecAgent, and ToolBench to measure detector failure modes, attack transferability, and agent vulnerability under adversarial conditions.
    • Designed domain-camouflaged injection attacks and provenance-aware defenses for multi-agent systems.
  3. Data Engineer · Cummins Inc.

    Jul 2023 – Jul 2025

    Chicago, IL

    • Engineered anomaly detection and validation pipelines over millions of enterprise telemetry records using Python and SQL; generated automated action lists that reduced annual spend by $550K through systematic identification of inactive licenses and unused resources.
    • Built scalable ETL automation frameworks with Python, SQL, PowerShell, and REST APIs; defined data quality checks for missing values, anomalous distributions, and schema drift across staging and production workflows.
  4. Software Engineer Intern · ADVANCE.AI

    May 2022 – Aug 2022

    Singapore

    • Deployed CI/CD workflows and service routing infrastructure for distributed ML services; integrated Istio service mesh within Kubeflow across 3+ microservices to improve observability and deployment reliability.
    • Engineered real-time monitoring and anomaly detection for a 10+ node Kubernetes cluster on Linux; defined threshold-based alerting rules for latency, request size, and data transfer.
  5. Undergraduate Researcher · Autonomous Motorsports Purdue (AMP)

    Aug 2022 – May 2023

    West Lafayette, IN

    • Built a CNN with ELU activations for real-time steering prediction in a simulated autonomous driving environment, combining a hybrid FAST-ORB feature pipeline with a regression steering model trained on 45K+ frames of sensor data. Reached under 5 degrees mean absolute error at 30 FPS and improved obstacle avoidance accuracy by 30% under real-time latency constraints.
    • Benchmarked feature extraction methods, evaluated model stability across simulation conditions, and optimized the inference pipeline for real-time deployment; the work won the "Share with the World" VIP Award at the Purdue Undergraduate Research Conference.
  6. Undergraduate Researcher · Lunabotics (NASA Robotic Mining Competition)

    Aug 2021 – Dec 2021

    West Lafayette, IN

    • Benchmarked object detection and tracking algorithms under noisy visual conditions across varied lighting and terrain, analyzing accuracy, stability, and failure modes; selected the MOSSE filter through systematic evaluation, reaching 95% tracking accuracy under real-time constraints.
    • Built an end-to-end OpenCV perception pipeline integrating detection and tracking for autonomous navigation in the NASA Lunabotics Mining Competition environment, validated across edge cases with documented, reproducible evaluation methodology.

04

Projects

NBA Betting Multi-Agent Reasoning System

Columbia University · Jan 2026 – May 2026

Multi-agent debate system for pre-game NBA line prediction. Ingested live odds, statistical, and contextual data across six sources, implemented a ReAct-style agent loop with ChromaDB retrieval, and benchmarked multi-agent debate against single-agent chain-of-thought using Brier scores and calibration curves. Ran ablations isolating each data source and backtested against closing line value on a held-out season.
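
For readers unfamiliar with the metric, here is a minimal sketch of how a Brier score comparison between two forecasters could be computed. The function name and all forecast numbers are hypothetical and for illustration only; this is not the project's code or its results.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; a constant 0.5 forecast scores 0.25.
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical home-win probabilities for four games (made-up numbers).
debate_probs = [0.7, 0.4, 0.9, 0.2]   # multi-agent debate forecasts
single_probs = [0.6, 0.5, 0.6, 0.5]   # single-agent chain-of-thought forecasts
outcomes = [1, 0, 1, 0]               # 1 = home team won

debate_bs = brier_score(debate_probs, outcomes)
single_bs = brier_score(single_probs, outcomes)
print(round(debate_bs, 3))  # 0.075
print(round(single_bs, 3))  # 0.205
```

A lower score for one system on the same games indicates sharper, better-calibrated probabilities on that sample. Calibration curves would additionally show where each system is over- or under-confident.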

Statistical Arbitrage Research Platform

Columbia University · Jan 2026 – May 2026

Reimplemented the Avellaneda and Lee (2010) statistical arbitrage framework from scratch: PCA factor decomposition, OU residual modeling, and s-score signal generation. Extended it with HMM regime filtering to suppress signals in trending markets, volatility-targeted position sizing, and Almgren-Chriss optimal execution modeling. Interactive React frontend for backtesting with automated Sharpe ratio, drawdown, and turnover reporting.
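
As a rough illustration of the s-score step, the sketch below fits an AR(1) model to a cumulative residual series and converts it to Ornstein-Uhlenbeck equilibrium parameters. It assumes NumPy, uses simulated data, and the function name `s_score` is hypothetical; it is a simplified reading of the general approach, not the platform's implementation.

```python
import numpy as np


def s_score(residual_returns):
    """Standardized distance of the cumulative residual from its OU mean.

    Fits X[t+1] = a + b * X[t] + noise on the cumulative residual X and
    returns (X_last - m) / sigma_eq, or None if no mean reversion is found.
    """
    x = np.cumsum(residual_returns)           # cumulative residual process X_t
    x_prev, x_next = x[:-1], x[1:]
    b, a = np.polyfit(x_prev, x_next, 1)      # slope b, intercept a
    if not 0.0 < b < 1.0:
        return None                           # not mean-reverting on this window
    noise = x_next - (a + b * x_prev)
    m = a / (1.0 - b)                         # equilibrium mean
    sigma_eq = np.sqrt(np.var(noise) / (1.0 - b * b))  # equilibrium std dev
    return float((x[-1] - m) / sigma_eq)


# Simulated mean-reverting residual (assumed data, for illustration only).
rng = np.random.default_rng(0)
x_sim = np.zeros(500)
for t in range(499):
    x_sim[t + 1] = 0.9 * x_sim[t] + rng.normal(scale=0.01)

s = s_score(np.diff(x_sim, prepend=0.0))
print(s)
```

In a strategy of this kind, a large positive s-score suggests the residual is rich relative to its equilibrium and a large negative one suggests it is cheap. The entry and exit thresholds are a separate modeling choice not shown here.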

ZK-KYA: Zero-Knowledge Identity Layer for AI Agents

Columbia University · Jan 2026 – May 2026

Privacy-preserving authorization protocol for autonomous AI agents using zk-SNARKs. Agents prove compliance with identity and scope constraints without revealing credentials. Implemented Groth16 circuits in circom, deployed Solidity verifier contracts on an Ethereum testnet, and benchmarked proof generation time, R1CS constraint counts, and on-chain gas costs.

Low-Latency Order Book Engine

Columbia University · Jan 2026 – Mar 2026

High-performance limit order book in C++ with price-time priority matching, a custom object-pool allocator, and an Avellaneda-Stoikov market maker. Achieved 166 ns median and 750 ns p99 matching latency across 10M+ order events. Validated reconstructed book state against LOBSTER NASDAQ tick data over 400K real market events, and implemented a FIX 4.2 protocol parser with an end-to-end order round-trip demo.
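
To make "price-time priority" concrete, here is a small sketch of the matching rule: the best price trades first, and among equal prices the earliest order trades first. It is written in Python for brevity rather than the project's C++, the class and method names are hypothetical, and it omits cancels, the object-pool allocator, and everything latency-related.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book with price-time priority (integer tick prices)."""

    def __init__(self):
        self._bids = []       # heap of (-price, seq, order_id, qty): highest bid first
        self._asks = []       # heap of (price, seq, order_id, qty): lowest ask first
        self._seq = count()   # arrival counter, breaks ties at equal prices

    def submit(self, order_id, side, price, qty):
        """Match an incoming limit order, rest any remainder, return the fills."""
        fills = []
        book, opp = (self._bids, self._asks) if side == "buy" else (self._asks, self._bids)
        while qty > 0 and opp:
            key, seq, rest_id, rest_qty = opp[0]
            rest_price = key if side == "buy" else -key
            crosses = price >= rest_price if side == "buy" else price <= rest_price
            if not crosses:
                break
            traded = min(qty, rest_qty)
            fills.append((rest_id, rest_price, traded))  # trade at the resting price
            qty -= traded
            if traded == rest_qty:
                heapq.heappop(opp)
            else:
                # Partial fill: same key and seq, so the order keeps its queue position.
                heapq.heapreplace(opp, (key, seq, rest_id, rest_qty - traded))
        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, next(self._seq), order_id, qty))
        return fills


book = OrderBook()
book.submit("a", "sell", 101, 5)
book.submit("b", "sell", 100, 5)
book.submit("c", "sell", 100, 5)
print(book.submit("d", "buy", 100, 7))  # [('b', 100, 5), ('c', 100, 2)]
```

Order "b" fills before "c" because it arrived first at the same price, and "a" is untouched because its price is worse than the incoming buy limit.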

05

Skills

Languages

  • Python
  • C++
  • C
  • SQL
  • R
  • SAS
  • MATLAB
  • Solidity
  • PowerShell
  • Embedded C
  • SystemVerilog
  • VHDL
  • Assembly

ML & AI

  • PyTorch
  • TensorFlow
  • scikit-learn
  • pandas
  • NumPy
  • statsmodels
  • LangChain
  • ChromaDB
  • OpenCV

Cloud & Infrastructure

  • AWS
  • GCP
  • Azure
  • Vertex AI
  • Docker
  • Kubernetes
  • Kubeflow
  • Helm
  • Amazon EKS
  • CI/CD
  • Git
  • Linux
  • Grafana
  • Prometheus
  • BigQuery
  • Firebase

Frontend & Visualization

  • React
  • Power BI
  • Tableau
  • Matplotlib
  • Jupyter

Focus Areas

  • LLM agent security
  • prompt injection
  • adversarial robustness
  • RAG
  • multi-agent systems
  • quantitative modeling
  • backtesting

06

Education

Columbia University

M.S. in Data Science

New York, NY · Aug 2025 – Dec 2026

Purdue University

B.S. in Computer Engineering

West Lafayette, IN · Aug 2019 – May 2023

07

Awards & Certifications

Awards

  • Share with the World ML Research Award

    Purdue Undergraduate Research Exposition

  • Eta Kappa Nu (IEEE-HKN), Beta Chapter

    ECE Honor Society

  • Dean's List & Semester Honors

    7 of 8 semesters

  • Charles W. Brown ECE Scholarship

    Purdue ECE

  • Eli Shay Electrical Engineering Scholarship

    Purdue ECE

Certifications

  • Advanced Calculus for Financial Engineering

    Baruch College, CUNY

  • Financial Markets

    Yale University

  • Managing ML Projects with Google Cloud

    Google

  • Software Engineering Virtual Experience

    JPMorgan Chase & Co.

  • Intermediate C++

    Microsoft