Production Template
Production template and agentic operating model
A reusable ML service template with governed AI-assisted engineering
The ML-MLOps Production Template is the strongest artifact in this portfolio. It packages the production lessons from the monorepo into a starter system for ML services: FastAPI serving, training/serving parity, CI/CD, Docker, Kubernetes, Terraform examples, observability hooks, runbooks and an explicit agentic governance model. The template also encodes 32 anti-patterns with corrective actions, SLSA L2 supply-chain security practices and closed-loop monitoring with statistical promotion gates.
How To Read This Template
Recruiter view
Reusable system, not only a project
The important signal is that portfolio lessons became a repeatable starter system for future ML services, with defaults, guardrails and documentation.
Technical lead view
Inspect the operating contracts
The best evidence is in the rules, skills, workflows, manifest and anti-pattern catalog that constrain how services are generated and operated.
Team adoption view
Can another engineer use it?
The template is designed around quick start, service scaffold, tests, deployment artifacts and reviewable AI-assisted workflows.
Code Review Shortcuts
Service · Dockerfile · K8s deployment · CI template · Deploy GCP · Agent manifest · Agent rules · Agent skills · Agent workflows
Why This Is More Than A Template
The template is not only a folder structure. It is an operating contract for starting ML services with fewer avoidable mistakes. It defines what a generated service should contain, how it should be validated, which deployment paths it can follow and how AI agents are allowed to participate in the work.
The important point is the second-order artifact: I did not only build three portfolio services. I extracted the reusable production system behind them.
Technical reviewer signal
The most distinctive part is the agentic operating model: canonical rules, skills, workflows and risk escalation are written as code-adjacent governance, not left as informal prompting.
Production Scaffold Contract
1. Service API
FastAPI by default
The scaffold includes prediction endpoints, batch prediction, readiness, health, metrics, error envelopes, auth hooks and model metadata.
2. ML lifecycle
Training to serving parity
Feature engineering, schema validation, model loading and inference contracts are treated as one path instead of separate notebook and API worlds.
3. Quality gates
CI before adoption
YAML checks, workflow checks, scaffold tests, smoke paths, pre-commit hooks and targeted validations protect the generated service.
4. Runtime
Docker and Kubernetes
Containers, Kustomize overlays, readiness probes and deployment defaults are part of the service contract from the beginning.
5. Observability
Metrics and operations hooks
Prometheus-compatible metrics, prediction logging, tracing options, drift checks and retraining workflows are part of the template's design.
6. Cloud path
GCP and AWS patterns
The template documents GKE/EKS deployment expectations, identity patterns, artifact registries and operational runbooks without pretending local tests are cloud proof.
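The training-to-serving parity item above can be illustrated with a small sketch in which training and serving both call one feature function. The record fields (`amount`, `n_items`), the two features and the function names are hypothetical, chosen only for illustration; they are not taken from the template.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RawRecord:
    """Hypothetical raw input shared by the training set and the API payload."""
    amount: float
    n_items: int


def build_features(record: RawRecord) -> list[float]:
    """Single feature path: the only place where features are computed."""
    # Schema validation lives next to the feature logic, so both paths get it.
    if record.amount < 0 or record.n_items <= 0:
        raise ValueError("amount must be >= 0 and n_items must be > 0")
    return [math.log1p(record.amount), record.amount / record.n_items]


def training_matrix(records: list[RawRecord]) -> list[list[float]]:
    """Training side: build the feature matrix from historical records."""
    return [build_features(r) for r in records]


def serving_vector(payload: dict) -> list[float]:
    """Serving side: parse the request payload, then reuse the same function."""
    record = RawRecord(amount=float(payload["amount"]), n_items=int(payload["n_items"]))
    return build_features(record)
```

Because both paths go through `build_features`, a change to a feature definition cannot reach training without also reaching inference, which is the property a parity test would assert.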
Agentic Operating Model
Canonical rules
One source of truth
AGENTS.md, agent context and the manifest define the behavior contract. Adapter files point to canonical rules instead of drifting into parallel policy.
Risk matrix
AUTO / CONSULT / STOP
Low-risk work can be automated, ambiguous production work requires user consultation, and destructive or safety-sensitive actions must stop.
Workflow skills
Repeatable MLOps actions
New service creation, drift checks, retraining, rollback, cost review, release, incident and security workflows are encoded as reusable agent skills.
Reviewability
Agents leave evidence
The model is designed around validation logs, changelogs, ADRs, runbooks and explicit test commands so agent-assisted work remains auditable.
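As a rough illustration of the AUTO / CONSULT / STOP risk matrix described above, the sketch below maps actions to a risk class and gates unattended execution on it. The action names and the rule that unknown actions default to CONSULT are assumptions made for this example, not the template's actual matrix.

```python
from enum import Enum


class Risk(Enum):
    AUTO = "auto"        # low-risk work an agent may perform unattended
    CONSULT = "consult"  # ambiguous production work: ask the user first
    STOP = "stop"        # destructive or safety-sensitive: do not proceed


# Hypothetical action names, for illustration only.
RISK_MATRIX = {
    "format_code": Risk.AUTO,
    "run_unit_tests": Risk.AUTO,
    "change_hpa_limits": Risk.CONSULT,
    "deploy_to_production": Risk.CONSULT,
    "delete_model_bucket": Risk.STOP,
    "terraform_destroy": Risk.STOP,
}


def classify(action: str) -> Risk:
    """Look up the risk class; unknown actions escalate (an assumption here)."""
    return RISK_MATRIX.get(action, Risk.CONSULT)


def may_run_unattended(action: str) -> bool:
    """Only AUTO actions may run without a human in the loop."""
    return classify(action) is Risk.AUTO
```

Keeping the matrix as data rather than prompt text means it can be reviewed in a pull request and exercised in tests like any other configuration.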
Rules, Skills And Workflows
Rules
Context-aware engineering constraints
Rules cover Python serving, training, Kubernetes, Terraform, Docker, GitHub Actions, monitoring, data validation, security, API contracts and documentation. The goal is to make failure modes harder to reintroduce.
Skills
Reusable MLOps procedures
Skills include new service creation, EDA, deploy to GKE/EKS, drift checks, model retraining, release checklist, rollback, cost audit, security audit and incident response.
Workflows
Slash-command operating paths
Workflows such as /new-service, /incident, /release, /drift-check, /retrain, /rollback and /secret-breach turn repeatable MLOps work into auditable steps.
Anti-Pattern Catalog
Serving
No uvicorn --workers N under Kubernetes
Corrective action: one worker per pod, HPA for horizontal scaling, and ThreadPoolExecutor for CPU-bound inference.
Autoscaling
No memory-based HPA for ML pods
Corrective action: use CPU as the scaling signal because loaded models keep a fixed memory footprint even when traffic drops.
Async APIs
No direct model.predict() in async endpoints
Corrective action: move CPU-bound prediction work behind loop.run_in_executor() so the event loop keeps handling other requests.
Explainability
No TreeExplainer for StackingClassifier
Corrective action: use KernelExplainer with a predict_proba wrapper in the original feature space.
Packaging
No model artifacts baked into Docker images
Corrective action: keep images immutable and load model artifacts through runtime storage patterns such as init containers.
Security
No static cloud credentials in production paths
Corrective action: use Workload Identity on GCP and IRSA on AWS, with CI/deploy/runtime identities separated by purpose.
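The async-endpoint and ThreadPoolExecutor items in the catalog above can be sketched with the standard library alone. This is a minimal illustration, not the template's serving code: `DummyModel`, `predict_endpoint` and the pool size are hypothetical, and the blocking model call is simulated with `time.sleep`.

```python
import asyncio
import contextlib
import time
from concurrent.futures import ThreadPoolExecutor

# One process-wide pool; the size is an assumption and would be tuned per service.
_EXECUTOR = ThreadPoolExecutor(max_workers=2)


class DummyModel:
    """Stand-in for a loaded model whose predict() blocks the calling thread."""

    def predict(self, rows):
        time.sleep(0.05)  # simulate blocking inference work
        return [sum(row) for row in rows]


MODEL = DummyModel()


async def predict_endpoint(rows):
    """Async handler: the blocking call runs in the pool, not on the event loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_EXECUTOR, MODEL.predict, rows)


async def main():
    """Show that other coroutines keep running while a prediction is in flight."""
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    result = await predict_endpoint([[1.0, 2.0], [3.0, 4.0]])
    hb.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await hb
    return result, ticks
```

Running `asyncio.run(main())` returns the predictions together with a nonzero tick count, which shows the loop stayed responsive. Calling `MODEL.predict` directly inside the coroutine would leave the heartbeat stalled for the whole prediction. Threads only give real parallelism for CPU-bound work when the underlying library releases the GIL during inference, which many native-code numeric libraries do.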
Multi-IDE Governance
Canonical source
Windsurf as the body store
The rules, skills and workflows live in the canonical .windsurf/ tree so the operating model has one source of truth.
Portable adapters
Codex, Cursor and Claude pointers
Adapter directories are discovery layers. They point back to the canonical rules instead of becoming separate, drifting copies of the same policy.
Manifest
Cross-surface index
The manifest maps rules, skills and workflows across IDE surfaces so the governance model can be validated instead of trusted by memory.
Behavior protocol
AUTO / CONSULT / STOP everywhere
Risk class is independent of the assistant being used. Low-risk work can run, ambiguous work asks, and destructive or production-sensitive work stops.
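To make "validated instead of trusted by memory" concrete, here is one possible shape for a manifest check. The manifest layout (a mapping of kind to entries, each with a `canonical` path and a list of `adapters`) is an assumption for this sketch; the template's real manifest format is not shown in this page.

```python
from pathlib import Path


def validate_manifest(manifest: dict, repo_root: Path) -> list[str]:
    """Return a list of problems; an empty list means the manifest is consistent.

    Assumed (hypothetical) manifest shape:
        {"rules": {"docker": {"canonical": ".windsurf/rules/docker.md",
                              "adapters": [".cursor/rules/docker.md"]}}}
    """
    problems = []
    for kind, entries in manifest.items():
        for name, entry in entries.items():
            canonical = entry["canonical"]
            if not (repo_root / canonical).is_file():
                problems.append(f"{kind}/{name}: missing canonical file {canonical}")
            for adapter in entry.get("adapters", []):
                adapter_path = repo_root / adapter
                if not adapter_path.is_file():
                    problems.append(f"{kind}/{name}: missing adapter {adapter}")
                elif canonical not in adapter_path.read_text(encoding="utf-8"):
                    # An adapter should point at the canonical file, not copy policy.
                    problems.append(f"{kind}/{name}: {adapter} does not reference {canonical}")
    return problems
```

A check like this could run in CI so that an adapter that stops pointing at its canonical rule fails the build instead of drifting silently.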
What A Reviewer Should Inspect
First scaffold
Can it generate a usable service?
A reviewer should be able to scaffold a service, run the local checks and see a coherent FastAPI project with tests and deployment artifacts.
Serving contract
Does inference match training?
The service should preserve feature parity, validate inputs and expose readiness based on model loading rather than a superficial health endpoint.
Governance contract
Can agents act safely?
The strongest technical signal is whether agent workflows are bounded by rules, manifests, validations and explicit escalation points.
Cloud realism
Are claims scoped honestly?
The template distinguishes local validation from real GKE/EKS validation, which keeps the documentation useful without overstating production proof.
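The serving-contract point above, readiness based on model loading rather than a superficial health endpoint, can be sketched framework-free as follows. `ModelState`, `health` and `ready` are hypothetical names; in a FastAPI service these functions would sit behind the health and readiness routes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelState:
    """Tracks whether the model artifact was actually loaded."""
    loader: Callable[[], object]
    model: Optional[object] = None
    error: Optional[str] = None

    def load(self) -> None:
        """Try to load the model and remember the failure reason if it breaks."""
        try:
            self.model = self.loader()
            self.error = None
        except Exception as exc:  # surface any load failure through readiness
            self.model = None
            self.error = str(exc)


def health(state: ModelState) -> tuple[int, dict]:
    """Liveness: the process is up, regardless of model state."""
    return 200, {"status": "alive"}


def ready(state: ModelState) -> tuple[int, dict]:
    """Readiness: only report ready once a model is loaded."""
    if state.model is None:
        return 503, {"status": "not_ready", "reason": state.error or "model not loaded"}
    return 200, {"status": "ready"}
```

With this split, a Kubernetes readiness probe keeps traffic away from a pod whose model failed to load, while the liveness probe still sees a running process.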
What It Shows About Me
| Signal | What it means |
|---|---|
| Product thinking | I turned portfolio lessons into a reusable starter system. |
| MLOps discipline | The template treats serving, testing, packaging and deployment as one system. |
| Governance mindset | AI-assisted engineering is bounded by explicit rules, not only prompts. |
| Operational honesty | Cloud validation, cost control and limitations are documented instead of hidden. |
| Documentation taste | The repo is designed so another engineer can adopt, review and improve it. |
Where To Go Next
Repository
Template source
Start here for the actual scaffold, docs, workflows, rules and release history.
Adoption path
Quick Start
Best entry point for generating and validating a new service from the template.
Decision trail
Architecture decisions
Review the trade-offs behind the template instead of only reading the final structure.
Portfolio context
Technical evidence
See how the template relates to the broader monorepo, cloud evidence and production ML portfolio.