Amit
Das.
I build distributed systems that scale, Gen AI products that ship, and platforms that don't break at 2 a.m.
◆ Software Development Engineer II at Yellow.ai · NIT Trichy CSE · Dhaka, Bangladesh
About me.
High-impact engineer who treats shipping as a craft and observability as a first-class citizen.
I'm a Product Engineer at Yellow.ai with 4+ years of experience building Gen AI solutions, distributed messaging systems, and data pipelines that run at production scale across hundreds of enterprise clients.
My work sits at the intersection of backend engineering and applied AI. I design systems that need to be fast, reliable, and maintainable — simultaneously. I've scaled platforms from idea to 1.5M messages per minute, replaced legacy infrastructure with solutions that save six figures annually, and taken features from concept to production in under a week.
I care about the whole product: not just the code that ships, but the observability around it, the cost model behind it, and the engineers who'll maintain it after me. That means architecture docs, code reviews, and setting up dashboards before the incident, not after.
Nexus: a Gen AI-powered agentic interface that lets enterprises build and manage AI agents entirely through natural language, eliminating the manual configuration friction that slows platform adoption.
750+ enterprise AI agents · 40%+ adoption increase
Also currently rebuilding the notification engine in Go to scale beyond 5,000 TPS — pushing past the limits of the existing Node.js architecture with a leaner, more concurrent runtime.
The best systems are the ones you stop thinking about, because they just work.
On reliability
Where I've worked.
Four years at one company, three levels deep, and a long list of production incidents that didn't page anyone.
Yellow.ai
- Spearheaded Nexus, a Gen AI-powered agentic interface enabling enterprises to build and manage AI agents via natural language — driving a 40%+ increase in platform adoption across 750+ enterprise AI agents.
- Architected the Engage notification engine to handle peak loads of 1.5M messages/min, redesigning BullMQ queues into a per-channel throttling layer with idempotency and rate-limiting — zero message loss at scale.
- Replaced legacy cron-based archival (JSON dumps to Minio) with a structured cold storage layer using S3 Tables in Parquet + AWS Athena — eliminating hot database bloat and cutting annual infrastructure costs by $93,000.
- Designed real-time analytics pipelines on PostgreSQL with time-based partitioning via pg_partman, benchmarked at production scale across 18B+ conversations per quarter.
- Overhauled observability across all modules (Grafana, Prometheus, New Relic), cutting MTTD and MTTR by 90% and improving platform SLA from 98.1% to 99.5%.
- Shipped a high-precision metering and rate-limiting system using Openmeter, eliminating $50K+ in annual revenue leakage across 900+ enterprise accounts.
- Mentored junior engineers through code reviews and led technical design reviews for cross-module features to align implementation with platform-wide standards.
- Owned the Analyze module end-to-end — built an LLM-powered conversation analysis pipeline that indexed insights into ElasticSearch and automatically derived knowledge bases to improve downstream AI agent performance.
- Built core Engage platform features: an omni-channel marketing workflow builder, a CDP with user segmentation, and an event-driven campaign builder enabling personalized outreach across 10+ communication channels.
- Optimized query performance on ElasticSearch and MongoDB through index planning, query plan analysis, and schema restructuring — reducing p95 latency and ensuring 99.9% uptime for high-traffic integrations.
- Led rapid prototyping and end-to-end delivery of Meta platform integrations (WhatsApp, Facebook) — taking WhatsApp Calling and MMLite from concept to production in under 1 week, directly enabling new enterprise client acquisition.
- Built a cross-platform SDK using Xamarin, streamlining the client integration process and accelerating platform adoption.
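For context on the SLA figures in the list above, an availability percentage converts to a downtime budget with simple arithmetic. The sketch below assumes a 30-day month and an SLA measured as plain uptime percentage. Both are assumptions for illustration, not details of how the platform actually reports its SLA.

```python
def downtime_minutes(sla_percent, period_days=30):
    """Allowed downtime, in minutes, for a given availability over a period.

    Assumes the SLA is a simple uptime percentage over the whole period.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


before = downtime_minutes(98.1)  # budget at the earlier SLA
after = downtime_minutes(99.5)   # budget at the improved SLA

print(round(before, 1))  # 820.8
print(round(after, 1))   # 216.0
```

Under those assumptions, 98.1% allows about 821 minutes (roughly 13.7 hours) of downtime per month, while 99.5% allows 216 minutes (3.6 hours).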
Featured work.
The systems I'm most proud of — each one a lesson in scale, cost, or reliability.
Nexus — Natural Language Agent Builder
40%+ adoption increase · 750+ enterprise agents
A Gen AI-powered interface that lets enterprises build, configure, and manage AI agents entirely through natural language, eliminating manual onboarding friction. Nexus abstracts complex agent configuration into a conversational loop, making enterprise AI accessible without engineering expertise. Directly drove platform-wide adoption growth.
Engage Notification Engine
1.5M messages/min peak · Zero message loss
Redesigned a plain BullMQ queue architecture into a per-channel throttling layer with idempotency guarantees and configurable rate-limiting. Burst traffic is shaped into controlled, fixed-rate downstream delivery, so enterprise SLAs hold even during spikes. Simultaneously handles WhatsApp, email, SMS, push, and 10+ other channels.
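The idea of per-channel throttling with idempotency can be illustrated with a minimal, in-memory sketch. This is not the production BullMQ design: the class names `ChannelThrottle` and `Dispatcher`, the token-bucket choice, and the in-process `set` used for idempotency keys are all assumptions made for illustration. A real deployment would more likely keep keys in a shared store with expiry.

```python
import time
from collections import deque


class ChannelThrottle:
    """Token bucket for one channel (hypothetical helper, not a BullMQ API)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        # Refill tokens for the time elapsed, capped at the burst capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Dispatcher:
    """Queues messages per channel and drains each queue at its own rate."""

    def __init__(self, limits, clock=time.monotonic):
        # limits maps channel name -> (rate_per_sec, burst)
        self.throttles = {ch: ChannelThrottle(r, b, clock) for ch, (r, b) in limits.items()}
        self.queues = {ch: deque() for ch in limits}
        self.seen = set()  # idempotency keys already accepted (unbounded in this sketch)

    def enqueue(self, channel, key, payload):
        # A repeated idempotency key is dropped, so retries never double-send.
        if key in self.seen:
            return False
        self.seen.add(key)
        self.queues[channel].append((key, payload))
        return True

    def drain(self):
        # Send whatever each channel's bucket allows right now; the rest stays queued.
        sent = []
        for ch, q in self.queues.items():
            while q and self.throttles[ch].try_acquire():
                sent.append((ch, *q.popleft()))
        return sent
```

Calling `drain()` on a timer shapes a burst into a steady stream: messages beyond the channel's budget stay queued rather than being dropped, and one noisy channel cannot consume another channel's capacity.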
Cold Storage Migration — S3 Tables + Athena
$93,000 saved annually · 18B+ conversations
Replaced a legacy cron-based pipeline (single-day JSON dumps to Minio) with a structured cold storage layer using S3 Tables in Parquet format and AWS Athena for on-demand querying. Eliminated hot database bloat, enabled granular time-range queries, and cut infrastructure costs by $93,000 per year without sacrificing data accessibility.
Analyze — LLM Conversation Pipeline
99.9% uptime · ElasticSearch at scale
End-to-end ownership of the Analyze module: an LLM-powered pipeline that processes conversation data, indexes actionable insights into ElasticSearch, and automatically derives knowledge bases to improve downstream AI agent performance. Built the query optimization layer that kept p95 latency under control at high-traffic volume.
Metering & Rate-Limiting — Openmeter
$50K+ revenue leakage eliminated · 900+ accounts
Designed and shipped a high-precision metering system using Openmeter to track consumption and enforce rate limits across 900+ enterprise accounts. Resolved systematic billing discrepancies that had been quietly leaking revenue, and gave finance and ops teams a reliable, auditable source of truth for usage data.
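The core of usage metering, counting each event exactly once and aggregating it into time windows, can be sketched in plain Python. This does not use or represent Openmeter's API: the `Meter` class, its method names, the hourly window, and the `messages_sent` meter name are hypothetical and exist only to show the mechanism.

```python
from collections import defaultdict


class Meter:
    """In-memory usage meter with deduplication (illustrative only)."""

    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.seen = set()  # event IDs already counted
        # (account, meter, window_start) -> total quantity
        self.totals = defaultdict(int)

    def ingest(self, event_id, account, meter, quantity, timestamp):
        # Counting a redelivered event twice would skew the bill,
        # so each event ID is accepted only once.
        if event_id in self.seen:
            return False
        self.seen.add(event_id)
        window_start = timestamp - timestamp % self.window
        self.totals[(account, meter, window_start)] += quantity
        return True

    def usage(self, account, meter, start, end):
        # Sum all windows whose start falls in [start, end).
        return sum(
            qty
            for (acc, name, window_start), qty in self.totals.items()
            if acc == account and name == meter and start <= window_start < end
        )

    def over_limit(self, account, meter, start, end, limit):
        # Enforcement hook: True once usage in the range exceeds the limit.
        return self.usage(account, meter, start, end) > limit
```

For example, after ingesting 10 units at timestamp 100 and 5 units at timestamp 3700 for the same account, `usage(account, "messages_sent", 0, 7200)` returns 15, and re-sending either event leaves that total unchanged.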
Meta Platform Integrations — WhatsApp & Facebook
Concept → production in < 1 week
Led end-to-end delivery of WhatsApp Calling and MMLite beta features, from API design to production rollout. Compressed the typical vendor integration cycle from weeks into under seven days. These integrations became a competitive differentiator for Yellow.ai's go-to-market strategy and directly enabled new enterprise client acquisition.
Technical skills.
The tools I reach for first — and the ones I'll pick up when the job demands it.
Where I studied.
CS fundamentals from one of India's top engineering institutions.
National Institute of Technology, Tiruchirappalli
NIT Trichy · Top 10 Engineering Institution, India
Built a strong foundation in algorithms, data structures, distributed systems, and software engineering principles at one of India's premier technical institutes. The rigour of NIT Trichy's CS program directly shapes how I think about system design, trade-offs, and correctness under constraints.
Writing & thinking.
Engineering essays on distributed systems, Gen AI, and the hard lessons from production at scale.
I'm working on a series of deep-dives on the systems I've built — the decisions, the trade-offs, and what I'd do differently with hindsight. First posts dropping soon. Drop me a line and I'll let you know when they're live.
Scaling to 1.5M messages per minute — lessons from the Engage engine
How we redesigned a flat BullMQ queue into a per-channel throttling layer with idempotency. What burst traffic actually looks like at enterprise scale, and why "just add more workers" is the wrong answer.
Why we migrated from JSON dumps to Parquet + Athena (and saved $93K)
A walkthrough of replacing a cron-based Minio archival pipeline with S3 Tables in Parquet format. Why the cold storage decision compounded into $93K in annual infrastructure savings, and the benchmarks that made the case.
Building agentic interfaces with natural language: inside Nexus
What it takes to let non-technical users configure enterprise AI agents through conversation. Architecture, prompting patterns, and the UX challenges nobody talks about when building Gen AI products at scale.
Observability as a product decision, not an afterthought
How adding Grafana, Prometheus, and New Relic across our platform cut MTTD and MTTR by 90%. Why I now treat dashboards and alerting as a first-class engineering concern — before the first line of feature code.
Let's talk.
Open to interesting conversations — product engineering, Gen AI, distributed systems, or what you're building.
I'm currently at Yellow.ai and open to conversations about ambitious engineering challenges. If you're building something interesting, want a second opinion on a system design, or are looking for an engineer who's shipped at scale — reach out.
I'm particularly interested in opportunities at the intersection of AI and large-scale backend systems. Remote-friendly. Based in Dhaka, Bangladesh.
Based in Dhaka, Bangladesh · Available for remote roles worldwide.