
Production template and agentic operating model

A reusable ML service template with governed AI-assisted engineering

The ML-MLOps Production Template is the strongest artifact in this portfolio. It packages the production lessons from the monorepo into a starter system for ML services: FastAPI serving, training/serving parity, CI/CD, Docker, Kubernetes, Terraform examples, observability hooks, runbooks and an explicit agentic governance model. The template also encodes 32 anti-patterns with corrective actions, SLSA L2 supply-chain security practices and closed-loop monitoring with statistical promotion gates.

Serving baseline: FastAPI scaffold with predict, batch predict, health, readiness, metrics and model operations endpoints.
Cloud posture: GKE and EKS ready, with Kubernetes overlays and Terraform examples for both paths.
Failure modes: 32 anti-patterns with corrective actions for serving, HPA, SHAP, IAM, CI/CD and deployment risks.
Supply chain: SLSA L2 posture covering security scanning, build hygiene and provenance-oriented release practices.
Agentic governance: AUTO / CONSULT / STOP rules for when agents can act, ask or halt for safety.
Monitoring loop: promotion gates tying closed-loop monitoring to statistical quality gates.

How To Read This Template

Recruiter view

Reusable system, not only a project

The important signal is that portfolio lessons became a repeatable starter system for future ML services, with defaults, guardrails and documentation.

Technical lead view

Inspect the operating contracts

The best evidence is in the rules, skills, workflows, manifest and anti-pattern catalog that constrain how services are generated and operated.

Team adoption view

Can another engineer use it?

The template is designed around quick start, service scaffold, tests, deployment artifacts and reviewable AI-assisted workflows.


Why This Is More Than A Template

The template is not only a folder structure. It is an operating contract for starting ML services with fewer avoidable mistakes. It defines what a generated service should contain, how it should be validated, which deployment paths it can follow and how AI agents are allowed to participate in the work.

The important point is the second-order artifact: I did not just build three portfolio services; I extracted the reusable production system behind them.

Technical reviewer signal

The most distinctive part is the agentic operating model: canonical rules, skills, workflows and risk escalation are written as code-adjacent governance, not left as informal prompting.

Production Scaffold Contract

1. Service API

FastAPI by default

The scaffold includes prediction endpoints, batch prediction, readiness, health, metrics, error envelopes, auth hooks and model metadata.

2. ML lifecycle

Training to serving parity

Feature engineering, schema validation, model loading and inference contracts are treated as one path instead of separate notebook and API worlds.
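One way to express "one path" is a single validated schema and a single transform function imported by both the training pipeline and the API; the field names and the scaling choice below are hypothetical placeholders:

```python
from pydantic import BaseModel, Field


class Features(BaseModel):
    # One schema owned by both training and serving; invalid rows
    # fail the same way in a batch job and at the API boundary.
    age: float = Field(ge=0)
    income: float = Field(ge=0)


def to_vector(f: Features) -> list[float]:
    # The single feature transform; notebooks and endpoints call this,
    # so there is no separate "serving reimplementation" to drift.
    return [f.age, f.income / 1000.0]
```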

3. Quality gates

CI before adoption

YAML checks, workflow checks, scaffold tests, smoke paths, pre-commit hooks and targeted validations protect the generated service.

4. Runtime

Docker and Kubernetes

Containers, Kustomize overlays, readiness probes and deployment defaults are part of the service contract from the beginning.

5. Observability

Metrics and operations hooks

Prometheus-compatible metrics, prediction logging, tracing options, drift checks and retraining workflows are built into the template story.
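A sketch of Prometheus-compatible counters and histograms using `prometheus_client`; the metric names and labels are assumptions for illustration:

```python
from prometheus_client import Counter, Histogram, generate_latest

PREDICTIONS = Counter(
    "predictions_total", "Total predictions served", ["model_version"]
)
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency in seconds")


def record_prediction(model_version: str, seconds: float) -> None:
    # Called once per request; a /metrics endpoint would return
    # generate_latest() in Prometheus text exposition format.
    PREDICTIONS.labels(model_version=model_version).inc()
    LATENCY.observe(seconds)
```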

6. Cloud path

GCP and AWS patterns

The template documents GKE/EKS deployment expectations, identity patterns, artifact registries and operational runbooks without pretending local tests are cloud proof.

Agentic Operating Model

Canonical rules

One source of truth

AGENTS.md, agent context and the manifest define the behavior contract. Adapter files point to canonical rules instead of drifting into parallel policy.

Risk matrix

AUTO / CONSULT / STOP

Low-risk work can be automated, ambiguous production work requires user consultation, and destructive or safety-sensitive actions must stop.
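The risk matrix can be read as a small classification function. The task names below are invented examples; the real rules live in the governance files, but the shape is a three-way decision:

```python
from enum import Enum


class RiskAction(Enum):
    AUTO = "auto"        # agent may proceed
    CONSULT = "consult"  # agent must ask the user first
    STOP = "stop"        # agent must halt


# Hypothetical task classes for illustration.
DESTRUCTIVE = {"delete_namespace", "rotate_prod_secrets", "drop_table"}
PRODUCTION_SENSITIVE = {"deploy_prod", "scale_prod", "promote_model"}


def classify(task: str) -> RiskAction:
    if task in DESTRUCTIVE:
        return RiskAction.STOP
    if task in PRODUCTION_SENSITIVE:
        return RiskAction.CONSULT
    return RiskAction.AUTO
```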

Workflow skills

Repeatable MLOps actions

New service creation, drift checks, retraining, rollback, cost review, release, incident and security workflows are encoded as reusable agent skills.

Reviewability

Agents leave evidence

The model is designed around validation logs, changelogs, ADRs, runbooks and explicit test commands so agent-assisted work remains auditable.

Rules, Skills And Workflows

Rules

Context-aware engineering constraints

Rules cover Python serving, training, Kubernetes, Terraform, Docker, GitHub Actions, monitoring, data validation, security, API contracts and documentation. The goal is to make failure modes harder to reintroduce.

Skills

Reusable MLOps procedures

Skills include new service creation, EDA, deploy to GKE/EKS, drift checks, model retraining, release checklist, rollback, cost audit, security audit and incident response.

Workflows

Slash-command operating paths

Workflows such as /new-service, /incident, /release, /drift-check, /retrain, /rollback and /secret-breach turn repeatable MLOps work into auditable steps.

Anti-Pattern Catalog

Serving

No uvicorn --workers N under Kubernetes

Corrective action: one worker per pod, HPA for horizontal scaling, and ThreadPoolExecutor for CPU-bound inference.

Autoscaling

No memory-based HPA for ML pods

Corrective action: use CPU as the scaling signal because loaded models keep a fixed memory footprint even when traffic drops.

Async APIs

No direct model.predict() in async endpoints

Corrective action: move CPU-bound prediction work behind loop.run_in_executor() (or asyncio.to_thread()) so request handling stays responsive.
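A minimal sketch of the executor pattern; `cpu_bound_predict` is a stand-in for the real `model.predict()` call:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

EXECUTOR = ThreadPoolExecutor(max_workers=4)


def cpu_bound_predict(features: list[float]) -> float:
    # Stand-in for model.predict(); blocks its thread, not the event loop.
    return sum(features)


async def predict_async(features: list[float]) -> float:
    # Offload the blocking call so other requests keep being served.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(EXECUTOR, cpu_bound_predict, features)
```

Calling `model.predict()` directly inside an `async def` endpoint would block the event loop for every concurrent request; the executor keeps the loop free.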

Explainability

No TreeExplainer for StackingClassifier

Corrective action: use KernelExplainer with a predict-proba wrapper in the original feature space.
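The wrapper is the key piece: KernelExplainer only needs a plain function from original-feature-space inputs to probabilities. A sketch, with the SHAP call shown as a commented usage note since it assumes `shap` is installed:

```python
import numpy as np


class ProbaWrapper:
    """Expose an ensemble as f(X) -> class probabilities in the
    original feature space, which is the callable KernelExplainer expects."""

    def __init__(self, model):
        self.model = model

    def __call__(self, X):
        return self.model.predict_proba(np.asarray(X))


# Usage (assumes shap is installed and background_sample is a small
# representative slice of training data):
#   explainer = shap.KernelExplainer(ProbaWrapper(stacking_clf), background_sample)
#   shap_values = explainer.shap_values(X_test[:50])
```

TreeExplainer cannot traverse a StackingClassifier because the final estimator is not a single tree ensemble; the model-agnostic KernelExplainer sidesteps that at the cost of more compute.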

Packaging

No model artifacts baked into Docker images

Corrective action: keep images immutable and load model artifacts through runtime storage patterns such as init containers.
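A sketch of the runtime-loading side of that pattern. The `MODEL_PATH` default and pickle format are assumptions; the contract is simply that an init container (or artifact sync) places the file before the server starts, so the image itself never changes per model version:

```python
import os
import pickle
from pathlib import Path

# An init container mounts/downloads the artifact here before startup.
MODEL_PATH = Path(os.environ.get("MODEL_PATH", "/models/current/model.pkl"))


def load_model(path: Path = MODEL_PATH):
    # Fail fast (and fail readiness) if the artifact is missing,
    # rather than serving with no model.
    if not path.exists():
        raise RuntimeError(f"model artifact missing at {path}")
    with path.open("rb") as f:
        return pickle.load(f)
```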

Security

No static cloud credentials in production paths

Corrective action: use Workload Identity on GCP and IRSA on AWS, with CI/deploy/runtime identities separated by purpose.

Multi-IDE Governance

Canonical source

Windsurf as the body store

The rules, skills and workflows live in the canonical .windsurf/ tree so the operating model has one source of truth.

Portable adapters

Codex, Cursor and Claude pointers

Adapter directories are discovery layers. They point back to the canonical rules instead of becoming separate, drifting copies of the same policy.

Manifest

Cross-surface index

The manifest maps rules, skills and workflows across IDE surfaces so the governance model can be validated instead of trusted by memory.

Behavior protocol

AUTO / CONSULT / STOP everywhere

Risk class is independent of the assistant being used. Low-risk work can run, ambiguous work asks, and destructive or production-sensitive work stops.

What A Reviewer Should Inspect

First scaffold

Can it generate a usable service?

A reviewer should be able to scaffold a service, run the local checks and see a coherent FastAPI project with tests and deployment artifacts.

Serving contract

Does inference match training?

The service should preserve feature parity, validate inputs and expose readiness based on model loading rather than a superficial health endpoint.

Governance contract

Can agents act safely?

The strongest technical signal is whether agent workflows are bounded by rules, manifests, validations and explicit escalation points.

Cloud realism

Are claims scoped honestly?

The template distinguishes local validation from real GKE/EKS validation, which keeps the documentation useful without overstating production proof.

What It Shows About Me

Product thinking: I turned portfolio lessons into a reusable starter system.
MLOps discipline: The template treats serving, testing, packaging and deployment as one system.
Governance mindset: AI-assisted engineering is bounded by explicit rules, not only prompts.
Operational honesty: Cloud validation, cost control and limitations are documented instead of hidden.
Documentation taste: The repo is designed so another engineer can adopt, review and improve it.

Where To Go Next

Repository

Template source

Start here for the actual scaffold, docs, workflows, rules and release history.

Adoption path

Quick Start

Best entry point for generating and validating a new service from the template.

Decision trail

Architecture decisions

Review the trade-offs behind the template instead of only reading the final structure.

Portfolio context

Technical evidence

See how the template relates to the broader monorepo, cloud evidence and production ML portfolio.