ADR-011: Gradio Demo, Not Deployed to Production

Status: Accepted
Date: 2026-03-10
Authors: Duque Ortega Mutis
Related: ADR-009 (simplification philosophy)

TL;DR: Kept Gradio as a local development tool only. Deploying it to K8s would create architectural inconsistency (1 of 3 services with a UI), duplicate Swagger UI functionality, and consume cluster resources for a component that doesn't serve the portfolio's MLOps thesis.

Decision

Do NOT deploy Gradio to the Kubernetes cluster. Keep it as a local development tool only.

Context

After removing CarVision (and its Streamlit dashboard) from the portfolio in v3.5.0, the three remaining services (BankChurn, NLPInsight, ChicagoTaxi) are all API-only FastAPI applications. The question arose whether BankChurn's Gradio demo should be deployed to fill the "interactive UI" gap.

Evaluation

Arguments FOR deploying Gradio

Argument	Weight
Visual demo impresses non-technical reviewers	Medium
Shows ability to build user-facing ML interfaces	Low (not the portfolio's focus)
Fills gap left by CarVision Streamlit removal	Low

Arguments AGAINST deploying Gradio

Argument	Weight
Inconsistency: Only 1 of 3 projects would have a UI → asymmetric architecture	High
Not suited for a hardened service: only basic built-in password auth, no RBAC, limited scaling	High
Redundant: Swagger UI already provides interactive API testing for all 3 services	High
Resource impact: Additional pod on an e2-medium cluster (shared vCPU, 4GB RAM)	Medium
Scope creep: Portfolio demonstrates MLOps (CI/CD, K8s, monitoring), not UI/UX	Medium
Over-engineering: Adding infrastructure that doesn't serve the portfolio's thesis	Medium

What mature teams actually do

In production ML platforms, the frontend is typically:
- A dedicated React/Vue/Next.js application built by a frontend team
- An internal tool (Retool, Streamlit Cloud) for stakeholders
- NOT a Gradio widget embedded in the ML service pod

Deploying Gradio alongside FastAPI conflates two concerns: ML serving (FastAPI) and user interface (Gradio). Mature service architecture separates these.
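
To illustrate that separation, here is a minimal sketch of how a standalone UI could talk to the serving API purely over HTTP, so the UI never shares a process or pod with the model. The base URL, the `/predict` path, and the function names are assumptions made for this example; they are not taken from the BankChurn code.

```python
import json
import urllib.request

# Hypothetical values for illustration only; the real service URL and
# prediction route are not specified in this ADR.
API_BASE_URL = "http://localhost:8000"
PREDICT_PATH = "/predict"


def build_predict_request(
    features: dict, base_url: str = API_BASE_URL
) -> urllib.request.Request:
    """Build the HTTP request a separate UI would send to the serving API."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + PREDICT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(
    features: dict, base_url: str = API_BASE_URL, timeout: float = 5.0
) -> dict:
    """Send the request and decode the JSON response (needs a running API)."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With this shape, any UI (Gradio run locally, a React app, an internal tool) depends only on the API contract, and the ML service can be deployed and scaled without knowing a UI exists.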

Alternatives Considered

Alternative	Verdict
Deploy Gradio as sidecar in BankChurn pod	Rejected — increases pod resource usage, inconsistent with other services
Deploy Gradio as separate Deployment + Ingress path	Rejected — full K8s stack for a demo UI that duplicates Swagger UI
Add Gradio to all 3 projects	Rejected — NLPInsight and ChicagoTaxi have simple inputs; Swagger UI is sufficient
Replace Gradio with a React frontend	Rejected — significant effort, out of portfolio scope (MLOps, not frontend)

Consequences

app/gradio_demo.py remains in BankChurn for local development and demos (python app/gradio_demo.py)
No K8s deployment, service, or Ingress path for Gradio
Portfolio UI interaction is demonstrated via Swagger UI (/bankchurn/docs, /nlpinsight/docs, /chicagotaxi/docs)
If a recruiter or interviewer asks about UI capabilities, point to Gradio demo code as evidence of the skill, and explain the architectural decision to keep it out of production
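
One way to enforce the "local only" consequence in code is a small guard that refuses to start the demo inside a cluster and binds it to the loopback interface otherwise. The sketch below is an assumption about how such a guard could look, not the contents of app/gradio_demo.py; the helper names are hypothetical. It relies on Kubernetes normally setting the KUBERNETES_SERVICE_HOST environment variable in pod containers, and on 7860 being Gradio's default port.

```python
import os
from typing import Mapping, Optional

LOCAL_HOST = "127.0.0.1"  # loopback only; not reachable from other machines
DEFAULT_PORT = 7860  # Gradio's default port


def running_in_kubernetes(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the process appears to run inside a Kubernetes pod."""
    env = os.environ if env is None else env
    return bool(env.get("KUBERNETES_SERVICE_HOST"))


def local_launch_kwargs(
    env: Optional[Mapping[str, str]] = None, port: int = DEFAULT_PORT
) -> dict:
    """Return launch settings for a local run, or fail fast inside a cluster."""
    if running_in_kubernetes(env):
        raise RuntimeError(
            "The Gradio demo is a local development tool and must not run in K8s."
        )
    return {"server_name": LOCAL_HOST, "server_port": port}
```

Assuming the demo script holds its Gradio interface in a variable named `demo`, it could then start with `demo.launch(**local_launch_kwargs())`, which makes an accidental cluster deployment fail at startup rather than silently expose an unhardened UI.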

Principle

Every component in the cluster must justify its resource cost and operational burden. A demo UI that duplicates Swagger UI functionality does not meet this bar. The portfolio's value comes from the MLOps infrastructure (CI/CD, monitoring, auto-scaling, canary deployments), not from a prototype UI.