Available for freelance & remote work
I build reliable agent systems.
AI engineer specialising in agentic automation, voice agents and RAG — plus the evaluation and guardrails that keep them working in production.
Currently AI Engineer at Joblogic · MS Artificial Intelligence, LUMS
Selected work
Systems shipped, not slides.
Four builds — three at Joblogic, one for a paying freelance client. Each links to a full case study with the architecture and the hard parts.
Currently building
What's next.
Open-source and product work in progress — the tools I wish existed for building and trusting agents.
Voice AI
Coming soon
Vertical Voice AI Agent
Booking, lead qualification and FAQs over the phone, with calendar/CRM sync and a call-outcomes dashboard.
- Voice AI
- ElevenLabs
- Vapi / Retell
- Twilio
Infrastructure
Coming soon
Governed MCP Server / Gateway
A production Model Context Protocol (MCP) server/gateway that adds authentication, per-tool authorization, rate limiting, usage metering and audit logging. Open-source.
- MCP
- Infra
- Governance
- Open-source
LLMOps
Coming soon
Agent Reliability & Eval Suite
Hallucination and faithfulness scoring alongside cost and latency, reported with statistical confidence intervals, plus a CI action that fails PRs on regressions. Open-source, with a hosted demo.
- Evals
- Observability
- LLMOps
- Open-source
What I do
Hire me to build the hard parts.
I work with founders and teams shipping LLM products — from a first agent to the evaluation and guardrails that keep it reliable.
Agentic automation
Multi-step workflows an LLM agent plans and executes end to end — collapsing manual processes into a guided pipeline.
- LangGraph & multi-agent orchestration
- Tool-calling and API integration
- RAG over your code, docs and data
Voice AI agents
Inbound and outbound voice agents that book, qualify leads and answer questions — wired into your calendar and CRM.
- ElevenLabs · Vapi · Retell + Twilio
- Calendar & CRM sync
- Call-outcome analytics
AI reliability & evals
The tracing, evaluation and guardrails that turn a flaky demo into something you can trust in production.
- Faithfulness & hallucination scoring
- Regression gating in CI
- Cost & latency observability
RAG & data pipelines
Turning messy, multimodal archives into searchable, retrievable data and insight at scale: the groundwork good agents stand on.
- Embeddings & vector search
- Clustering & topic discovery
- Transcription & extraction
About
Engineering judgement, not just prompts.
I’m Hanzla — an AI engineer in Lahore, Pakistan, working with clients worldwide. For the past few years I’ve built production AI: agentic platforms that write their own integrations, no-code agent builders, voice agents, and the pipelines that feed them.
I care most about the unglamorous part — the evals, guardrails and tracing that decide whether an agent survives contact with real users. That’s the difference between a demo and a system you can sell.
Credentials
MS Artificial Intelligence
2024 – 2026 · LUMS
CGPA 3.82/4.0 · Full merit scholarship · Top-5 cohort
BS Mathematics with Computer Science
Graduated · GCU Lahore
Gold Medal · Top of cohort