
Production template and agentic operating model

A reusable ML service template with governed AI-assisted engineering

The ML-MLOps Production Template is the strongest artifact in this portfolio. It packages the production lessons from the monorepo into a starter system for ML services: FastAPI serving, training/serving parity, CI/CD, Docker, Kubernetes, Terraform examples, observability hooks, runbooks and an explicit agentic governance model. The template also encodes 32 anti-patterns with corrective actions, SLSA L2 supply-chain security practices and closed-loop monitoring with statistical promotion gates.

Serving baseline: FastAPI scaffold with predict, batch predict, health, readiness, metrics and model operations endpoints.
Cloud posture: GKE and EKS ready, with Kubernetes overlays and Terraform examples for both paths.
Failure modes: 32 anti-patterns with corrective actions for serving, HPA, SHAP, IAM, CI/CD and deployment risks.
Supply chain: SLSA L2 posture covering security scanning, build hygiene and provenance-oriented release practices.
Agentic governance: AUTO / CONSULT / STOP rules for when agents can act, ask or halt for safety.
Monitoring loop: promotion gates tying closed-loop monitoring to statistical quality gates.

How To Read This Template

Recruiter view

Reusable system, not only a project

The important signal is that portfolio lessons became a repeatable starter system for future ML services, with defaults, guardrails and documentation.

Technical lead view

Inspect the operating contracts

The best evidence is in the rules, skills, workflows, manifest and anti-pattern catalog that constrain how services are generated and operated.

Team adoption view

Can another engineer use it?

The template is designed around quick start, service scaffold, tests, deployment artifacts and reviewable AI-assisted workflows.


Why This Is More Than A Template

The template is not only a folder structure. It is an operating contract for starting ML services with fewer avoidable mistakes. It defines what a generated service should contain, how it should be validated, which deployment paths it can follow and how AI agents are allowed to participate in the work.

The important point is the second-order artifact: I did not just build three portfolio services; I extracted the reusable production system behind them.

Technical reviewer signal

The most distinctive part is the agentic operating model: canonical rules, skills, workflows and risk escalation are written as code-adjacent governance, not left as informal prompting.

Production Scaffold Contract

1. Service API

FastAPI by default

The scaffold includes prediction endpoints, batch prediction, readiness, health, metrics, error envelopes, auth hooks and model metadata.

2. ML lifecycle

Training to serving parity

Feature engineering, schema validation, model loading and inference contracts are treated as one path instead of separate notebook and API worlds.
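One way to express "one path" is a single validated schema and a single transform function imported by both the training pipeline and the API; the field names and the scaling choice below are hypothetical placeholders:

```python
from pydantic import BaseModel, Field


class Features(BaseModel):
    # One schema owned by both training and serving; invalid rows
    # fail the same way in a batch job and at the API boundary.
    age: float = Field(ge=0)
    income: float = Field(ge=0)


def to_vector(f: Features) -> list[float]:
    # The single feature transform; notebooks and endpoints call this,
    # so there is no separate "serving reimplementation" to drift.
    return [f.age, f.income / 1000.0]
```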

3. Quality gates

CI before adoption

YAML checks, workflow checks, scaffold tests, smoke paths, pre-commit hooks and targeted validations protect the generated service.

4. Runtime

Docker and Kubernetes

Containers, Kustomize overlays, readiness probes and deployment defaults are part of the service contract from the beginning.

5. Observability

Metrics and operations hooks

Prometheus-compatible metrics, prediction logging, tracing options, drift checks and retraining workflows are built into the template story.
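A sketch of Prometheus-compatible counters and histograms using `prometheus_client`; the metric names and labels are assumptions for illustration:

```python
from prometheus_client import Counter, Histogram, generate_latest

PREDICTIONS = Counter(
    "predictions_total", "Total predictions served", ["model_version"]
)
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency in seconds")


def record_prediction(model_version: str, seconds: float) -> None:
    # Called once per request; a /metrics endpoint would return
    # generate_latest() in Prometheus text exposition format.
    PREDICTIONS.labels(model_version=model_version).inc()
    LATENCY.observe(seconds)
```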

6. Cloud path

GCP and AWS patterns

The template documents GKE/EKS deployment expectations, identity patterns, artifact registries and operational runbooks without pretending local tests are cloud proof.

Agentic Operating Model

Canonical rules

One source of truth

AGENTS.md, agent context and the manifest define the behavior contract. Adapter files point to canonical rules instead of drifting into parallel policy.

Risk matrix

AUTO / CONSULT / STOP

Low-risk work can be automated, ambiguous production work requires user consultation, and destructive or safety-sensitive actions must stop.
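The risk matrix can be read as a small classification function. The task names below are invented examples; the real rules live in the governance files, but the shape is a three-way decision:

```python
from enum import Enum


class RiskAction(Enum):
    AUTO = "auto"        # agent may proceed
    CONSULT = "consult"  # agent must ask the user first
    STOP = "stop"        # agent must halt


# Hypothetical task classes for illustration.
DESTRUCTIVE = {"delete_namespace", "rotate_prod_secrets", "drop_table"}
PRODUCTION_SENSITIVE = {"deploy_prod", "scale_prod", "promote_model"}


def classify(task: str) -> RiskAction:
    if task in DESTRUCTIVE:
        return RiskAction.STOP
    if task in PRODUCTION_SENSITIVE:
        return RiskAction.CONSULT
    return RiskAction.AUTO
```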

Workflow skills

Repeatable MLOps actions

New service creation, drift checks, retraining, rollback, cost review, release, incident and security workflows are encoded as reusable agent skills.

Reviewability

Agents leave evidence

The model is designed around validation logs, changelogs, ADRs, runbooks and explicit test commands so agent-assisted work remains auditable.

Rules, Skills And Workflows

Rules

Context-aware engineering constraints

Rules cover Python serving, training, Kubernetes, Terraform, Docker, GitHub Actions, monitoring, data validation, security, API contracts and documentation. The goal is to make failure modes harder to reintroduce.

Skills

Reusable MLOps procedures

Skills include new service creation, EDA, deploy to GKE/EKS, drift checks, model retraining, release checklist, rollback, cost audit, security audit and incident response.

Workflows

Slash-command operating paths

Workflows such as /new-service, /incident, /release, /drift-check, /retrain, /rollback and /secret-breach turn repeatable MLOps work into auditable steps.

Anti-Pattern Catalog

Serving

No uvicorn --workers N under Kubernetes

Corrective action: one worker per pod, HPA for horizontal scaling, and ThreadPoolExecutor for CPU-bound inference.

Autoscaling

No memory-based HPA for ML pods

Corrective action: use CPU as the scaling signal because loaded models keep a fixed memory footprint even when traffic drops.

Async APIs

No direct model.predict() in async endpoints

Corrective action: move CPU-bound prediction work behind loop.run_in_executor() (or asyncio.to_thread()) so request handling stays responsive.
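A minimal sketch of the executor pattern; `cpu_bound_predict` is a stand-in for the real `model.predict()` call:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

EXECUTOR = ThreadPoolExecutor(max_workers=4)


def cpu_bound_predict(features: list[float]) -> float:
    # Stand-in for model.predict(); blocks its thread, not the event loop.
    return sum(features)


async def predict_async(features: list[float]) -> float:
    # Offload the blocking call so other requests keep being served.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(EXECUTOR, cpu_bound_predict, features)
```

Calling `model.predict()` directly inside an `async def` endpoint would block the event loop for every concurrent request; the executor keeps the loop free.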

Explainability

No TreeExplainer for StackingClassifier

Corrective action: use KernelExplainer with a predict-proba wrapper in the original feature space.
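The wrapper is the key piece: KernelExplainer only needs a plain function from original-feature-space inputs to probabilities. A sketch, with the SHAP call shown as a commented usage note since it assumes `shap` is installed:

```python
import numpy as np


class ProbaWrapper:
    """Expose an ensemble as f(X) -> class probabilities in the
    original feature space, which is the callable KernelExplainer expects."""

    def __init__(self, model):
        self.model = model

    def __call__(self, X):
        return self.model.predict_proba(np.asarray(X))


# Usage (assumes shap is installed and background_sample is a small
# representative slice of training data):
#   explainer = shap.KernelExplainer(ProbaWrapper(stacking_clf), background_sample)
#   shap_values = explainer.shap_values(X_test[:50])
```

TreeExplainer cannot traverse a StackingClassifier because the final estimator is not a single tree ensemble; the model-agnostic KernelExplainer sidesteps that at the cost of more compute.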

Packaging

No model artifacts baked into Docker images

Corrective action: keep images immutable and load model artifacts through runtime storage patterns such as init containers.
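A sketch of the runtime-loading side of that pattern. The `MODEL_PATH` default and pickle format are assumptions; the contract is simply that an init container (or artifact sync) places the file before the server starts, so the image itself never changes per model version:

```python
import os
import pickle
from pathlib import Path

# An init container mounts/downloads the artifact here before startup.
MODEL_PATH = Path(os.environ.get("MODEL_PATH", "/models/current/model.pkl"))


def load_model(path: Path = MODEL_PATH):
    # Fail fast (and fail readiness) if the artifact is missing,
    # rather than serving with no model.
    if not path.exists():
        raise RuntimeError(f"model artifact missing at {path}")
    with path.open("rb") as f:
        return pickle.load(f)
```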

Security

No static cloud credentials in production paths

Corrective action: use Workload Identity on GCP and IRSA on AWS, with CI/deploy/runtime identities separated by purpose.

Multi-IDE Governance

Canonical source

Windsurf as the body store

The rules, skills and workflows live in the canonical .windsurf/ tree so the operating model has one source of truth.

Portable adapters

Codex, Cursor and Claude pointers

Adapter directories are discovery layers. They point back to the canonical rules instead of becoming separate, drifting copies of the same policy.

Manifest

Cross-surface index

The manifest maps rules, skills and workflows across IDE surfaces so the governance model can be validated instead of trusted by memory.

Behavior protocol

AUTO / CONSULT / STOP everywhere

Risk class is independent of the assistant being used. Low-risk work can run, ambiguous work asks, and destructive or production-sensitive work stops.

What A Reviewer Should Inspect

First scaffold

Can it generate a usable service?

A reviewer should be able to scaffold a service, run the local checks and see a coherent FastAPI project with tests and deployment artifacts.

Serving contract

Does inference match training?

The service should preserve feature parity, validate inputs and expose readiness based on model loading rather than a superficial health endpoint.

Governance contract

Can agents act safely?

The strongest technical signal is whether agent workflows are bounded by rules, manifests, validations and explicit escalation points.

Cloud realism

Are claims scoped honestly?

The template distinguishes local validation from real GKE/EKS validation, which keeps the documentation useful without overstating production proof.

What It Shows About Me

Product thinking: I turned portfolio lessons into a reusable starter system.
MLOps discipline: The template treats serving, testing, packaging and deployment as one system.
Governance mindset: AI-assisted engineering is bounded by explicit rules, not only prompts.
Operational honesty: Cloud validation, cost control and limitations are documented instead of hidden.
Documentation taste: The repo is designed so another engineer can adopt, review and improve it.

Where To Go Next

Repository

Template source

Start here for the actual scaffold, docs, workflows, rules and release history.

Adoption path

Quick Start

Best entry point for generating and validating a new service from the template.

Decision trail

Architecture decisions

Review the trade-offs behind the template instead of only reading the final structure.

Portfolio context

Technical evidence

See how the template relates to the broader monorepo, cloud evidence and production ML portfolio.