Hi, I'm
Kevin Shah
Software Engineer | AI Engineer
I build and operate production LLM systems — agentic workflows, RAG pipelines, and eval-driven AI platforms on AWS.
I take AI systems from prototype to production: LLM orchestration platforms, retrieval-augmented pipelines, prompt evaluation/optimization, and the observability and guardrails that keep them reliable at scale. Currently an AI Engineer on the MLOps platform team at DNV; previously a Software Development Engineer at Amazon. Open to AI Engineer, MLOps, and Forward Deployed roles in the USA, Canada, Europe, and remote.
📍 Houston, TX — Open to USA, Canada, Europe & Remote
Experience
My professional journey building LLM systems, ML platforms, and backend services.
AI Engineer, Machine Learning Operations
Current
- ▸Designed and own a production-grade LLM orchestration platform that lets internal teams build, deploy, and manage agentic workflows with persistent state, multi-step reasoning, and fault tolerance, built with Python, FastAPI, LangChain, and LangGraph.
- ▸Applied Andrej Karpathy's LLM Wiki concept to internal repos: LLM-compiled, cross-referenced wiki documents that give AI coding tools (Kiro, Claude) synthesized repo context instead of raw source, reducing token usage by ~60–70% and cutting AI response time by ~80%.
- ▸Built a prompt-management system with an LLM-based optimizer that enables systematic versioning, evaluation, and iterative refinement of prompts, cutting a 20-hour manual task to 3–4 hours (~80–85%) while reducing hallucinations and improving output reliability.
- ▸Built a long-term memory system for agents using MongoDB vector embeddings, retrieving the top-k semantically similar memories per query for personalized, context-aware responses at scale.
- ▸Deployed AWS Bedrock Guardrails across environments via CloudFormation IaC, enforcing content safety, PII filtering, and hallucination-mitigation policies at the inference layer.
- ▸Implemented end-to-end observability for LLM systems (distributed tracing, token-usage metrics, agent-step dashboards), sharply reducing incident triage time and improving visibility into model behavior in production.
- ▸Architected serverless backend services on AWS Lambda, ECS, and S3, reducing infrastructure costs by ~50% while improving scalability and reliability; built a state-machine-inspired workflow engine integrating SageMaker and lightweight workers for ~40% lower execution latency.
- ▸Led backend design discussions and architecture reviews, standardized build/deploy workflows, partnered with QA and product on complex production issues, and mentored junior engineers and interns.
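To illustrate the top-k memory retrieval mentioned in the list above, here is a minimal sketch in plain Python. It is not the production implementation: the `Memory` class, `cosine_similarity`, and `top_k_memories` are hypothetical names chosen for this example, and the brute-force in-memory scan stands in for a MongoDB vector index, which is assumed to do the equivalent ranking at scale.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    """A stored memory: the original text plus its embedding vector (hypothetical shape)."""
    text: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_memories(query_embedding: list[float], memories: list[Memory], k: int = 3) -> list[Memory]:
    """Return the k memories most similar to the query, most similar first."""
    ranked = sorted(
        memories,
        key=lambda m: cosine_similarity(query_embedding, m.embedding),
        reverse=True,
    )
    return ranked[:k]
```

In a setup like this, the query embedding would come from the same embedding model used when the memories were written, and the returned texts would be added to the agent's prompt as context.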
Software Development Engineer
- ▸Led backend changes supporting customer-experience analysis across 21 global marketplaces, contributing to a zero-downtime migration impacting 550M+ users.
- ▸Built runtime monitoring and alerting for latency and customer-impact metrics, enabling real-time detection of production issues at global scale.
- ▸Designed and implemented backend components for a large-scale Order Summary system, integrating with multiple critical services and legacy systems.
- ▸Participated in on-call rotations, independently diagnosing and resolving high-severity production incidents and contributing to operational reviews for a global team.
Education
Master of Science in Computer Science
Bachelor in Computer Engineering
Skills
Technologies and tools I work with day-to-day.
MLOps / AI
Languages & Scripting
Backend & API
Cloud & DevOps
Databases
Frontend & Reliability
Get In Touch
Open to AI Engineer, MLOps, and Forward Deployed roles in the USA, Canada, Europe & remote.
Whether you have a role in mind, want to collaborate on a project, or just want to say hi — my inbox is always open. I'll do my best to get back to you promptly.
Email
kevinjshah2207 [at] gmail [dot] com
Location
Houston, TX · Open to USA, Canada, Europe & Remote
* Email shown as plain text to reduce spam. Use the form to send a message directly.