Govind Thakur | AI Systems Engineer

About & Education

Background

I am an AI Systems Engineer with a deep focus on bridging the gap between cutting-edge model experimentation and reliable, large-scale production deployments.

Currently at American Express, I drive multiple high-impact AI initiatives. Beyond building GenAI enterprise Search APIs serving 1M+ users and architecting the AGX evaluation platform, I am part of a specialized team developing intelligent GenAI voice bots designed to transform and elevate customer service experiences.

My expertise covers the full MLOps lifecycle, from deploying distributed data pipelines via Airflow and Spark to designing production-grade LLM evaluation frameworks and optimizing semantic search.

Education

Master of Science, Computer Science – Artificial Intelligence

University of Southern California (USC)

Aug 2022 – May 2024

Bachelor of Engineering, Computer Engineering

Dwarkadas J Sanghvi College of Engineering

Jul 2018 – Jun 2022

Technical Skills

Core Competencies

AI Engineering

ML Systems

LLMs

Semantic Search

Generative AI

RAG

Deep Learning

Tech Stack

Python

C++

Java

SQL

React

FastAPI

Flask

TensorFlow

PyTorch

Docker

AWS (SageMaker, EC2, S3)

Tools & Platforms

GitHub Actions

Jenkins

Kubernetes

SonarQube

Splunk

Grafana

CI/CD

Cloud Deployments

Professional Experience

AIML Engineer-II, GenAI Search Team

American Express

📍 Phoenix, AZ 🗓 Sept 2024 – Present

Designed and maintained production Search APIs (Python, Java) powering GenAI Search for 1M+ monthly mobile users; optimized retrieval pipelines and parallelization to reduce latency across releases.
Led end-to-end development of AGX, a full-stack (React + Python) internal search evaluation platform enabling engineers to simulate production behavior, analyze quality metrics, and detect regressions prior to release.
Defined and operationalized search quality metrics (NDCG@k, precision/recall, containment, latency) across traffic from 1M+ monthly users to guide production release decisions.
Led LLM-based experimentation initiatives (RAG, ranking refinement, containment optimization), delivering AGX-powered executive demos that translated large-scale metric analysis into product and governance decisions.
Architected automated regression and content validation pipelines (Python, GitHub Actions) running daily to detect application runtime failures and search content defects, reducing regression defects by 35% and removing significant manual QA effort.

Machine Learning Engineer

Northern Lights

📍 Los Angeles, CA 🗓 Jun 2023 – Sept 2024

Co-led architecture and implementation of a multimodal analytics MVP, designing distributed ingestion pipelines and backend integrations for production deployment.
Built containerized ML workflows on AWS SageMaker and Docker for automated training, evaluation, and CI/CD delivery.
Designed Python Airflow DAGs orchestrating data pipelines processing millions of records for multimodal model training.
Deployed and productionized LLM-based systems to extract structured insights from unstructured enterprise data.

Software Engineering Intern

JPMorgan Chase & Co.

📍 Mumbai, India 🗓 Jun 2021 – Aug 2021

Improved NLP accuracy of JPMC's E-Trading Assistant Bot by 20% through Rasa model tuning and advanced intent classification.

Patents & Projects

Patent Pending

System and Method for Estimating Intrinsic Popularity of Content

US Serial No. 63/642,980. Developed predictive models for ranking content popularity, improving engagement metrics in adaptive systems.

Predictive Modeling, Ranking Systems

Patent Registered

Automated Attendance Management System

ROC No. L-110318/2022. Designed an automated ecosystem to efficiently manage user attendance without relying on manual intervention.

Automation, Systems Architecture

Project • Dec 2020 – Jun 2022

Speech to Code: Voice-Driven Developer Environment

Built a PyTorch-based, voice-controlled coding interface for visually impaired developers using an LSTM seq2seq model. Integrated the Stack Overflow API for autonomous error correction and contextual code suggestions.

PyTorch, LSTM seq2seq, API Integration

Project

TextGuard: NLP Toxicity Detection

Multi-model NLP system using BERT masked language modeling to detect and reverse toxic content. Achieved an 87% toxicity reduction while maintaining text coherence.

BERT, NLP, Python

Project

Sentiment Analysis API

Production-ready sentiment analysis system deployed on AWS SageMaker. Built an end-to-end ML pipeline covering a complete MLOps workflow.

AWS SageMaker, Deep Learning, REST API

Project

Plagiarism Detection System

NLP-based plagiarism detection using text similarity algorithms. Deployed as a production-ready REST API on AWS SageMaker with 92% detection accuracy.

Python, NLP, AWS SageMaker

Research, Certifications & Honors

Publications & Research

Emergent Ethics in Agentic Simulations: Moral Incubation and Conscious Development across GPT, Claude, Gemini, and XAI

Springer Nature – AI and Ethics Journal (Under Peer Review, 2025)

Thakur, G.

Comparison of Tabular Synthetic Data Generation Techniques using Propensity and Cluster Log Metric

Elsevier – International Journal of Information Management Data Insights (2023)

Pathare, A., Mangrulkar, R., Thakur, G., et al.

Google Scholar Profile

Certifications & Honors

IBM AIML Specialization
Udacity ML Nanodegrees (3)
AWS Certified Cloud Practitioner (Planned)

Extracurriculars

Course Producer for USC CSCI 401 & 102
Mentored 50+ students in ML and GenAI projects, providing architectural guidance and debugging distributed systems issues.

Govind Thakur. I build AI systems that improve lives.
