Taj Khunkhun

Senior AI Engineer

Transforming experimental prototypes into production-grade agentic AI systems. 8+ years in backend systems, specializing in Multi-Agent Orchestration and Autonomous Reasoners.

LangGraph · CrewAI · AutoGen · MCP · RAG

What I Work With

Technical Skills

🤖

AI & Agents

LangChain · LangGraph · CrewAI · AutoGen · AutoGPT · MCP · Agentic RAG · LangSmith · Multi-Agent Orchestration · Prompt Engineering
🧠

ML & Deep Learning

PyTorch · TensorFlow · Keras · scikit-learn · spaCy · NLP · LoRA / QLoRA · Reinforcement Learning · Neural Networks · MLOps / LLMOps

Languages & Frameworks

Python · Golang · TypeScript · JavaScript · Django · FastAPI · Flask · Node.js · React · Angular
🗄️

Data & Infrastructure

Snowflake · Databricks · Spark · Kafka · Airflow · BigQuery · Neo4j · Redis · Pinecone · Elasticsearch
☁️

Cloud & DevOps

AWS · Azure · GCP · Kubernetes · Docker · Terraform · CI/CD · Microsoft Fabric · Copilot Studio · Azure Data Lake

Where I've Worked

Experience

Hewlett Packard Enterprise

Senior AI Engineer | Agentic AI Engineer

Spring, Texas (Remote)

Jan 2023 - Dec 2025

HPE Private Cloud AI - NVIDIA AI Computing

2024-2025
  • Designed multi-agent Plan-and-Execute architecture using LangGraph, reducing infinite loop failures by 35%.
  • Architected MCP server layer standardizing tool integration, reducing custom tool-binding code by 60%.
  • Reduced inference costs by 45% via Router Agent dynamically triaging between Llama 2 7B and GPT-4.
  • Implemented HITL checkpoint system for financial workflows with 0.85 confidence threshold.
  • Built observability dashboards using LangSmith and Arize Phoenix, resolving 10s+ latency bottlenecks.
  • Architected LLM evaluation framework using DeepEval across 5,000+ test cases.
  • Reduced costs by 52% and p95 latency by 300ms through semantic caching with Redis.
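
A minimal sketch of the kind of confidence gate behind the HITL checkpoint above. It is illustrative only, not the production implementation: `AgentDecision` and `route_decision` are hypothetical names, and treating a score exactly at the 0.85 threshold as passing is an assumption.

```python
from dataclasses import dataclass

# Threshold quoted in the HITL bullet above.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class AgentDecision:
    """Hypothetical container for an agent's proposed action."""
    action: str        # e.g. "approve_invoice" (illustrative label)
    confidence: float  # model-reported confidence in [0, 1]


def route_decision(decision: AgentDecision,
                   threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "auto_execute" if confidence clears the threshold, else "human_review"."""
    if not 0.0 <= decision.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # Assumption: a score equal to the threshold is allowed to proceed.
    if decision.confidence >= threshold:
        return "auto_execute"
    return "human_review"
```

Anything routed to `"human_review"` would pause the workflow until a person approves or rejects the proposed action.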

HPE Ezmeral Unified Analytics & Data Fabric

2023-2024
  • Eliminated catastrophic forgetting in Llama 2 by mixing 15% pre-training replay data.
  • Reduced training VRAM by 65% using QLoRA 4-bit quantization for 70B parameter models.
  • Improved inference throughput 4x via multi-LoRA serving (vLLM/LoRAX) for 10+ adapters.
  • Architected multi-region Kafka with sub-second cross-region replication.
  • Engineered tiered memory system preserving intent across 20+ agent handoffs.

HPE GreenLake Cloud & OpsRamp AIOps

2023-2025
  • Led migration from a monolithic Django application to microservices with Docker and Kubernetes.
  • Built high-concurrency FastAPI architecture handling 10k+ concurrent WebSocket connections.
  • Built MCP-compliant tool registry enabling dynamic tool discovery at runtime.
  • Architected Self-Correction loop for SQL agent, reducing syntax errors by 50%.
  • Built NER pipeline with Keras Bi-LSTM achieving 25% F1-score improvement.
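
The self-correction loop for the SQL agent mentioned above could be sketched roughly as follows. This is an assumption-laden illustration, not the production code: it uses SQLite's `EXPLAIN` to compile a statement without running it, and `generate` stands in for a hypothetical LLM call that receives the previous error message as feedback.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical generator signature: (question, previous_error) -> SQL string.
SqlGenerator = Callable[[str, Optional[str]], str]


def self_correcting_sql(generate: SqlGenerator,
                        question: str,
                        conn: sqlite3.Connection,
                        max_attempts: int = 3) -> str:
    """Ask the generator for SQL, validate it, and retry with the error on failure."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate(question, error)
        try:
            # EXPLAIN compiles the statement (catching syntax and schema
            # errors) without executing the underlying query.
            conn.execute(f"EXPLAIN {sql}")
            return sql
        except sqlite3.Error as exc:
            # Feed the database error back into the next generation attempt.
            error = str(exc)
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {error}")
```

Capping the number of attempts keeps a persistently failing generator from looping forever.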

Adobe

Data & Machine Learning Engineer

San Jose, California

Aug 2019 - Jan 2023

Adobe Experience Platform Pipeline & Data Lake

2019-2022
  • Architected ETL pipeline processing 5TB+ multi-modal data using Spark.
  • Eliminated vector-relational desync via CDC (Debezium + Kafka) with 99.9% consistency.
  • Engineered Blue-Green re-indexing for zero-downtime migrations across 50M+ vectors.
  • Optimized semantic search by 40% via hierarchical document indexing.
  • Reduced vector storage costs by $8k/month through tiered data strategy.

Adobe Sensei ML Framework & Content Intelligence

2020-2023
  • Deployed Semantic Data Guard monitoring data drift with 15% deviation alerting.
  • Standardized AI Data Contracts across four teams enforcing GDPR/CCPA compliance.
  • Reduced inference latency by 65% via model distillation on NVIDIA A100 GPUs.
  • Built synthetic data engine using SDV and GPT-3.5, improving minority tasks by 18%.
  • Engineered Fail-Soft orchestration saving $15k/month in compute costs.
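
A minimal sketch of the 15% deviation alerting described for the Semantic Data Guard above. The drift metric here (relative change in a feature's mean against a baseline window) is an assumption chosen for illustration; the function names are hypothetical and the actual metric may differ.

```python
from statistics import fmean
from typing import Sequence

# Alert threshold quoted in the drift-monitoring bullet above.
DEVIATION_THRESHOLD = 0.15


def relative_deviation(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Relative change of the current mean with respect to the baseline mean."""
    base_mean = fmean(baseline)
    if base_mean == 0:
        raise ValueError("baseline mean is zero; relative deviation is undefined")
    return abs(fmean(current) - base_mean) / abs(base_mean)


def drift_alert(baseline: Sequence[float],
                current: Sequence[float],
                threshold: float = DEVIATION_THRESHOLD) -> bool:
    """Return True when the current window deviates from baseline by more than threshold."""
    return relative_deviation(baseline, current) > threshold
```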

Enterprise Work

Projects

HPE Private Cloud AI & NVIDIA AI Computing

2024 - 2025

Multi-agent Plan-and-Execute architectures, Router Agents, MCP server integrations, HITL checkpoints, semantic caching, shadow deployment pipelines, and LLM evaluation frameworks for HPE's enterprise AI platform.

LangGraph · MCP · NVIDIA NIM · RAG · Multi-Agent · LangSmith · DeepEval · GPT-4 · Llama 2

HPE Ezmeral Unified Analytics & Data Fabric

2023 - 2024

QLoRA fine-tuning of LLMs on GPU clusters, multi-region Kafka replication, NER pipelines, multi-agent memory systems, and multi-LoRA inference serving across a Kubernetes-based ML platform.

Spark · Kafka · Kubernetes · spaCy · Keras · Bi-LSTM · PyTorch · QLoRA · vLLM

HPE GreenLake Cloud & OpsRamp AIOps

2023 - 2025

Monolith-to-microservices migration, high-concurrency event-driven APIs, MCP-compliant tool registries, SQL-generating agents, and GIL/integration bottleneck resolution.

Django · FastAPI · Docker · Kubernetes · MCP · Agentic AI · Celery · Redis

Adobe Experience Platform & Sensei ML

2019 - 2023

High-throughput ETL/search pipelines, CDC-based vector sync, embedding drift management, synthetic data generation, inference optimization, and A/B testing frameworks.

Spark · Kafka · Debezium · Pinecone · Elasticsearch · Grafana · GPT-3.5 · NVIDIA A100

Open Source

Side Projects

Personal projects exploring agentic AI patterns, multi-agent architectures, and intelligent automation.

Agentic AI Chat Analyzer

AI-powered platform for analyzing agent chat transcripts. Performs exploratory data analysis, LLM-based summarization, and sentiment classification through an interactive Streamlit frontend.

  • Modular data pipeline (ingestion, cleaning, transformation)
  • EDA with word clouds and sentiment visualizations
  • Model caching for offline operation
FastAPI · Streamlit · HuggingFace · Flan-T5 · RoBERTa · Docker · Pandas
View on GitHub

AI Recruiter

Intelligent recruitment platform that automates candidate discovery by scanning GitHub profiles and Google Scholar to identify qualified AI/ML professionals with relevance scoring.

  • Multi-source profile analysis (GitHub + Google Scholar)
  • Relevance-based intelligence scoring for AI/ML skills
  • Geographic filtering and co-author extraction
Python · Flask · Web Scraping · Docker · NLP
View on GitHub

AI Email Agent (Supervisor Mode)

Email automation system using supervisor-pattern multi-agent architecture that categorizes emails, generates RAG-powered responses, proofreads with AI, and sends replies via Gmail.

  • Supervisor pattern for dynamic agent coordination
  • RAG-powered response generation from knowledge base
  • AI proofreading layer before sending
LangChain · LangGraph · Groq · Llama 3.3 · ChromaDB · Gmail API · FastAPI
View on GitHub

Simple Chatbot

Lightweight, rule-based chatbot with Gradio UI that answers questions about healthcare automation agents using fuzzy string matching -- no external LLM calls required.

  • Weighted scoring: string similarity (60%) + keyword matching (40%)
  • Confidence threshold for answer selection
  • Graceful fallback responses listing available topics
Python · Gradio · NLP · Fuzzy Matching
View on GitHub
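
The Simple Chatbot's weighted scoring can be sketched roughly as below. This is an illustration under stated assumptions rather than the repository's code: `difflib.SequenceMatcher` stands in for the fuzzy matcher, the 0.5 confidence threshold is a made-up value (the project description does not state one), and the two FAQ entries are hypothetical.

```python
from difflib import SequenceMatcher

SIMILARITY_WEIGHT = 0.6     # string-similarity weight from the project description
KEYWORD_WEIGHT = 0.4        # keyword-matching weight from the project description
CONFIDENCE_THRESHOLD = 0.5  # assumed value; not stated in the description

# Hypothetical knowledge base for illustration only.
FAQ = {
    "scheduling": {
        "question": "how does the appointment scheduling agent work",
        "keywords": {"appointment", "scheduling", "schedule"},
        "answer": "The scheduling agent books and reschedules appointments.",
    },
    "billing": {
        "question": "how does the billing agent handle claims",
        "keywords": {"billing", "claims", "invoice"},
        "answer": "The billing agent prepares and tracks insurance claims.",
    },
}


def score(query: str, entry: dict) -> float:
    """Weighted sum of fuzzy string similarity and keyword overlap."""
    q = query.lower()
    similarity = SequenceMatcher(None, q, entry["question"]).ratio()
    hits = len(set(q.split()) & entry["keywords"]) / len(entry["keywords"])
    return SIMILARITY_WEIGHT * similarity + KEYWORD_WEIGHT * hits


def answer(query: str) -> str:
    """Return the best-matching answer, or a fallback listing available topics."""
    best_topic, best_score = max(
        ((topic, score(query, entry)) for topic, entry in FAQ.items()),
        key=lambda pair: pair[1],
    )
    if best_score >= CONFIDENCE_THRESHOLD:
        return FAQ[best_topic]["answer"]
    return "I can help with: " + ", ".join(sorted(FAQ))
```

A query with no overlap with any entry falls below the threshold and receives the topic list instead of a low-confidence answer.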

Academic Background

Education

Master of Computer Science

Santa Clara University

Aug 2019 - Jun 2020

Bachelor of Computer Science

Santa Clara University

Sep 2015 - Jun 2019

Let's Connect

Get in Touch

I'm always open to discussing new opportunities in Agentic AI, Multi-Agent Systems, and production ML engineering.