Aman Pandey

Master's candidate in Data Science at Arizona State (4.00 GPA) with 4+ years shipping machine-learning and data systems in production. Building toward agentic AI and large-scale ML.

  • 4.00 GPA @ ASU
  • 2M+ DAILY TXNS
  • 114 LANGUAGES · NLP
  • MS · DEC 2026

TEMPE, AZ

Four years of shipping production ML, now building toward research depth.

Before graduate school I spent four years as a software and ML engineer in Delhi, architecting distributed task queues that processed two million transactions a day, then building a 114-language NLP pipeline that reshaped how an entertainment studio localized its catalog.

At ASU I've carried that practitioner instinct into research-leaning work: 450+ experiments characterizing Temporal Fusion Transformers on financial time series, a diffusion framework for privacy-preserving CGM data, and a market-basket system on 32.4M transactions.

Off the clock I'm usually deep in a paper I don't strictly need to read (lately, why a Temporal Fusion Transformer refuses to stabilise on noisy financial returns), or annotating a plot that already works. I keep a shortlist of side projects I'll probably never finish; the good ones tend to finish themselves once the right question shows up.

Coursework that earns the GPA behind it.

Arizona State University

M.S., Data Science, Analytics and Engineering

Jan 2025 – Dec 2026 · Tempe, AZ

4.00 / 4.00
19
Dec 2026
Coursework · 6 completed · 3 in progress
  • CSE 511 · Data Processing at Scale · A+
  • CSE 572 · Data Mining · A+
  • DSE 501 · Statistics for Data Analysts · A
  • CSE 575 · Statistical Machine Learning · A
  • DSE 506 · Computing Data-Driven Optimization · A+
  • EEE 598 · Deep Learning: Foundations & Applications · A
  • CSE 543 · Information Assurance & Security · in progress
  • CSE 571 · Artificial Intelligence · in progress
  • EEE 515 · Machine Vision & Pattern Recognition · in progress

Amity University

B.Tech., Computer Science

Jul 2016 – May 2020 · Noida, India

  • Adelphi University

    Garden City, NY

  • Birkbeck, University of London

    London, UK

Four years of shipped production systems.

Distributed queues, multilingual NLP pipelines, and churn analytics: the places where the model's RMSE matters far less than whether the pipeline stayed up.

  1. Software Engineer · My Next Film

    Apr 2023 – Dec 2024 · New Delhi, India

    • Engineered a multilingual NLP pipeline supporting 114 languages using seq2seq Transformers on AWS with Google and Azure speech APIs, improving translation accuracy by 76% and reducing manual review costs.
    • Built a reviewer web app with automated task allocation that cut project cycle time by 41% and lifted translation quality by 20%.
    • Automated 400+ voice narrations with Amazon Polly, matching accent and timbre to character profiles across markets.
    Python · PyTorch · seq2seq Transformers · AWS (EC2, S3, Lambda) · Google/Azure Speech · Tableau
  2. Data Analyst · Youth Buzz

    Sep 2022 – Mar 2023 · Noida, India

    • Lifted Net Promoter Score by 10 points through analytics-driven strategy; automated survey reporting with zero-shot NLP classification and shipped Power BI dashboards for leadership.
    • Built churn prediction models (logistic regression, 0.84 AUC) on 50K+ customer records and ran RFM clustering to identify fee-driven attrition.
    • Advised a retention strategy that cut attrition by ~50% in fee-sensitive cohorts within one quarter.
    Python · pandas · scikit-learn · SQL · Power BI · Zero-shot NLP
  3. Software Developer · Invesca Technology

    Dec 2020 – Jul 2022 · Noida, India

    • Architected Celery and Redis distributed task queues processing 2M+ daily transactions, cutting pipeline latency by 40% and sustaining 99.9% SLA across peak windows at 10x normal traffic.
    • Built log analytics dashboards and multithreaded Python services that raised backend throughput by 35%.
    • Automated anomaly detection alerts that prevented overload incidents during peak campaign operations.
    Python · Celery · Redis · Distributed systems · Multithreading · Log analytics

A research-leaning portfolio that ships.

Three projects that each demonstrate something different: graduate-grade rigor, end-to-end analytics at scale, and generative modeling where data can't leave the room.

2026

FinFusion

Result: 59.1% directional accuracy · weekly · 9-fold rolling.

Temporal Fusion Transformers for S&P 500 return forecasting, with 450+ experiments and documented negative results.

PyTorch Lightning · pytorch-forecasting · Python · FRED API · yfinance
Case study · GitHub
2026

GlucoCast · In progress

Result: 18% RMSE improvement vs. LSTM / CNN baselines.

Conditional diffusion framework for privacy-preserving blood glucose forecasting.

PyTorch · Diffusion models · OhioT1DM dataset · Conditional generation
Case study
2026

BasketIQ

Result: 32.4M Instacart transactions analysed.

Market-basket analysis and customer segmentation at 32.4M-transaction scale.

Python · pandas · mlxtend (Apriori) · scikit-learn · Chart.js
Case study · GitHub

Traitlytics

Predicting Big-Five personality traits from LinkedIn text with fine-tuned BERT.

Pulse2Symphony · In progress

Biosignal-conditioned music generation from smartphone-camera PPG.

Gaze-Tracker

Real-time driver drowsiness detection from facial landmarks on CPU.

Tools I reach for.

Grouped honestly. Chips I haven't shipped to production recently are tagged as such elsewhere on this site. Nothing inflated to fill a keyword list.

Python · SQL · R · Java · Git · Shell
PyTorch · TensorFlow · Transformers · LSTM · CNN · RNN · Diffusion models · scikit-learn · Feature engineering · Statistical inference
BERT · RoBERTa · seq2seq Transformers · Zero-shot classification · LangChain · Retrieval-Augmented Generation · Model Context Protocol (MCP) · Prompt engineering
pandas · NumPy · SciPy · PySpark · PostgreSQL · BigQuery · Snowflake · ETL / ELT · Data modeling
Docker · Kubernetes · CI/CD · FastAPI · Celery · Redis · AWS (EC2, S3, Lambda, API Gateway) · Azure · Distributed systems
Tableau · Power BI · Looker Studio · Matplotlib · Seaborn · Chart.js · Google Analytics

Let’s talk about ML engineering, research, or a problem you’re stuck on.

The fastest path is email. For recruiter outreach, LinkedIn works too. For code-flavored conversations, GitHub issues and DMs are fine.

Or download the PDF:

Résumé