// MILESTONE PROJECT · REGULATED KUBERNETES · VIRTUAL LAB
GxP BioInfra Platform
Regulated Kubernetes Architecture & Virtual Lab for Bioinformatics Workloads
A GxP-aligned Kubernetes platform architecture, with a virtual lab implementation in progress, designed to run reproducible bioinformatics pipelines with GitOps change control, policy-as-code, observability, runtime audit trails, disaster recovery planning, and IQ/OQ/PQ-style validation documentation. The platform is designed and currently being prototyped: it is a compliance-ready platform design exercise, not a claim of full production certification.
Implementation repository currently private while the virtual lab build is in progress.
// 01 · ARCHITECTURE & INTENT
GxP BioInfra Platform
A regulated Kubernetes platform architecture and virtual lab implementation designed to demonstrate how cloud-native infrastructure can support reproducible bioinformatics workloads, auditability, access control, GitOps change management, and compliance-ready operations.
This work began as a project tracing document: a full breakdown of the environment, components, risks, controls, and validation evidence required to build the platform correctly. I am now building the virtual lab implementation to test the design and turn the architecture into working proof.
The goal is not to claim a finished production-certified system. The goal is to prove that I understand how to design, document, secure, operate, and validate a Kubernetes-based platform in a regulated context.
Why this project matters
Pharma, biotech, fintech, and regulated SaaS environments need platform engineers who understand more than deployment. They need engineers who can connect infrastructure decisions to auditability, change control, risk, access management, security controls, and operational evidence.
This project is designed to show that bridge:
- Platform engineering: Kubernetes, GitOps, workload placement, storage, ingress, and observability.
- DevSecOps: policy-as-code, secrets management, runtime security, image pinning, and incident response.
- Compliance thinking: GxP-style validation, ISO 27001 alignment, risk registers, supplier assessment, and IQ/OQ/PQ-style documentation.
- Operational maturity: runbooks, disaster recovery plans, patch management, onboarding, and test scripts.
Target architecture
The target platform is designed around a Kubernetes cluster running regulated bioinformatics workloads through a controlled GitOps workflow. Core components:
- Kubernetes / Talos OS — immutable cluster operating system and workload orchestration.
- ArgoCD — GitOps controller enforcing repository-defined cluster state.
- Forgejo — self-hosted Git platform for change control and pull-request-based infrastructure changes.
- Authentik — OIDC/SSO identity provider for centralised access control.
- MinIO — S3-compatible object storage for pipeline input, output, work directories, and audit log persistence.
- OPA Gatekeeper — policy-as-code admission control for resource limits, image pinning, labels, approved registries, and privilege restrictions.
- Falco — runtime security and audit event capture.
- Prometheus / Grafana / Loki — metrics, dashboards, logs, and audit trail visibility.
- Nextflow / nf-core — reproducible bioinformatics pipeline execution.
- Cloudflare Tunnel / Ingress-NGINX / Cert-Manager — secure access, routing, and TLS.
Architecture flow
Horizontal control path from source of truth to validation evidence (virtual lab / target design).
// 02 · WHAT THIS PROVES
Capabilities & evidence
This project demonstrates my ability to think beyond installing tools — designing a platform around operational, security, and compliance requirements from the beginning.
| Capability | Evidence |
|---|---|
| Kubernetes platform design | Node roles, workload placement, ingress, storage, and phased deployment plan |
| GitOps and change control | ArgoCD, Forgejo PR workflow, configuration-as-code, no manual cluster drift |
| DevSecOps controls | OPA Gatekeeper policies, image digest pinning, approved registries, privileged container restrictions |
| Runtime auditability | Falco event capture, Loki storage, Grafana investigation workflow |
| Observability | Prometheus scrape targets, alert rules, Grafana dashboards, Loki logs |
| Secrets management | Sealed Secrets lifecycle, rotation procedures, no plaintext credentials in Git |
| Disaster recovery | etcd snapshot process, MinIO mirror plan, recovery priority order |
| Compliance mapping | EU GMP Annex 11, FDA 21 CFR Part 11, GAMP 5, and ISO 27001-style control mapping |
| Risk management | Formal risk register with likelihood, impact, controls, and residual risk |
| Operational maturity | Runbooks, patch management, incident response, onboarding, OQ test scripts |
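The risk register row above (likelihood, impact, controls, residual risk) can be pictured as a small data structure. This is a minimal sketch, assuming a 1 to 5 scale and a multiplicative score; the scale, the rating thresholds, the field names, and the example entry `R-001` are illustrative assumptions, not the project's actual register.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    """One risk register entry. The 1-5 scales are an assumed convention."""
    risk_id: str
    description: str
    likelihood: int                 # 1 (rare) .. 5 (almost certain), before controls
    impact: int                     # 1 (negligible) .. 5 (severe), before controls
    controls: list[str] = field(default_factory=list)
    residual_likelihood: int = 1    # after controls are applied
    residual_impact: int = 1        # after controls are applied

    def inherent_score(self) -> int:
        return self.likelihood * self.impact

    def residual_score(self) -> int:
        return self.residual_likelihood * self.residual_impact


def rating(score: int) -> str:
    """Map a score to a band. Thresholds are hypothetical."""
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"


# Hypothetical example entry, for illustration only.
example = Risk(
    risk_id="R-001",
    description="Manual cluster change bypasses GitOps change control",
    likelihood=3,
    impact=5,
    controls=["PR-only changes in Forgejo", "ArgoCD reverts drift"],
    residual_likelihood=1,
    residual_impact=5,
)

print(example.risk_id, "inherent:", rating(example.inherent_score()))
print(example.risk_id, "residual:", rating(example.residual_score()))
```

Keeping inherent and residual scores side by side makes it visible how much each control is expected to reduce the risk, which is the part a reviewer usually challenges.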
// 03 · DOCUMENTATION
Documentation produced
The project includes a structured documentation set similar to what would be expected in regulated infrastructure environments:
- Architecture Decision Records: key decisions, alternatives, and trade-offs.
- Infrastructure baseline: hardware, network, cluster state, storage, and topology.
- Architecture and component map: how all services connect and why they exist.
- Operational runbooks: daily, weekly, and monthly platform procedures.
- Disaster recovery plan: etcd recovery, storage failure recovery, GitOps restoration.
- Patch management process: CVE severity classification and response windows.
- Security architecture: threat model, protected assets, attack vectors, and layered controls.
- Component hardening guide: verification steps for Talos, ArgoCD, Forgejo, Authentik, MinIO, Gatekeeper, Falco, and other components.
- Incident response playbook: detection, containment, investigation, and recovery procedures.
- Secrets and key management standard: creation, sealing, storage, rotation, and retirement.
- Network security policy: current Flannel limitation and target Cilium enforcement model.
- Compliance matrix: mapping platform controls to Annex 11, Part 11, GAMP 5, and ISO 27001.
- Risk register: formal risks, mitigations, residual ratings, and review triggers.
- Supplier assessment register: third-party software risk and advisory tracking.
- Statement of Applicability: ISO 27001-style control applicability and implementation status.
- IQ/OQ/PQ validation templates: installation, operational, and performance qualification evidence structure.
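The patch management entry in the list above pairs CVE severity classification with response windows. A minimal sketch of that mapping follows. The severity bands are the standard CVSS v3.x qualitative ratings; the response windows are hypothetical placeholders, not the values in the project's documentation.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical response windows; the project's documented values may differ.
RESPONSE_WINDOWS: dict[str, Optional[timedelta]] = {
    "critical": timedelta(days=2),
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
    "low": timedelta(days=90),
    "none": None,  # no action required
}


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"


def response_window(score: float) -> Optional[timedelta]:
    """Return the assumed patch deadline for a given CVSS base score."""
    return RESPONSE_WINDOWS[cvss_severity(score)]


print(cvss_severity(9.8), response_window(9.8))
```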
// 04 · VALIDATION APPROACH
IQ / OQ / PQ-style evidence
Validation-style documentation means structuring evidence the way regulated teams expect: Installation Qualification (IQ) for “built as designed,” Operational Qualification (OQ) for “behaves correctly under defined conditions,” and Performance Qualification (PQ) for reproducible performance under realistic workload assumptions. In the virtual lab, I am aligning test scripts, sync evidence, policy denial cases, and observability exports to that structure — as educational and architectural evidence, not as legal certification.
Phase 7 of the roadmap (validation evidence pack) consolidates IQ baselines, OQ scripted tests (including negative tests where policies must deny bad deploys), and PQ-style reproducibility checks for pipeline runs — always labeled honestly against what the lab can prove today versus target state.
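To make the idea of OQ negative tests concrete, here is a minimal sketch of a test harness in which a case passes only when the observed admission decision matches the expected one. The case IDs, manifest paths, and the `fake_submit` stand-in are all hypothetical; in the lab, the submit function could, for example, wrap a server-side dry run such as `kubectl apply --dry-run=server`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OQCase:
    case_id: str            # hypothetical ID scheme
    description: str
    manifest: str           # path to the manifest under test (hypothetical)
    expect_admitted: bool   # False marks a negative test


@dataclass(frozen=True)
class OQResult:
    case_id: str
    expected: str
    observed: str
    passed: bool


def run_cases(cases: list[OQCase], submit: Callable[[str], bool]) -> list[OQResult]:
    """Run each case through `submit`, which returns True if the deploy was admitted."""
    results = []
    for case in cases:
        admitted = submit(case.manifest)
        results.append(OQResult(
            case_id=case.case_id,
            expected="admitted" if case.expect_admitted else "denied",
            observed="admitted" if admitted else "denied",
            passed=(admitted == case.expect_admitted),
        ))
    return results


CASES = [
    OQCase("OQ-001", "Digest-pinned image with limits is admitted",
           "tests/good-pod.yaml", True),
    OQCase("OQ-002", "Image using a mutable tag is denied",
           "tests/latest-tag-pod.yaml", False),
]


def fake_submit(manifest: str) -> bool:
    """Stand-in for a real cluster call; admits only the 'good' manifest."""
    return "good" in manifest


for result in run_cases(CASES, fake_submit):
    print(result.case_id, result.expected, result.observed,
          "PASS" if result.passed else "FAIL")
```

Recording expected and observed outcomes separately matters for evidence: a denied deploy is a pass for a negative test, and the record should show that explicitly.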
// 05 · ROADMAP
Phased delivery
Status labels reflect honest progress: Complete, In progress, Not started, Roadmap.
PH-00 — Cluster preparation
Status: Complete
- Talos nodes Ready with node-role labels; control-plane node untainted for early workloads.
- Beelink storage node: HDD mounted, `local-hdd` default StorageClass, PVC provisioning verified on disk.
- PH-00 etcd snapshot captured outside Git (implementation log).
PH-01 — Forgejo
Status: Complete
- Forgejo on Beelink with `local-hdd` PVC, SealedSecret admin credentials, TLS from homelab CA, SSH via NodePort 32222.
- MetalLB VIP on 80/443; platform repo `gxp-admin/gxp-platform` in Forgejo.
- ArgoCD root Application pulls from internal Forgejo mirror URL.
PH-02 — Authentik SSO
Status: In progress
- Authentik server, worker, and PostgreSQL running.
- OIDC clients for Forgejo, ArgoCD, and Grafana; platform groups (`platform-admin`, `developer`, `readonly`) created.
- Remaining: end-to-end browser logins per app, confirm group-claim role mapping in live tokens, disable local password fallback.
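Confirming group-claim role mapping in live tokens amounts to decoding a token's payload and checking which groups it carries. The sketch below does that for inspection only: it does not verify the signature, so it must not be used for authorisation decisions. The claim name `groups` and the role names on the right-hand side of the mapping are assumptions; the group names come from the list above.

```python
import base64
import json

# Group names are from the platform design; the role names are hypothetical.
GROUP_TO_ROLE = {
    "platform-admin": "admin",
    "developer": "write",
    "readonly": "read",
}


def decode_jwt_payload(token: str) -> dict:
    """Decode a compact JWT's payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected header.payload.signature)")
    payload_b64 = parts[1]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def roles_from_claims(claims: dict, claim_name: str = "groups") -> set[str]:
    """Translate the groups claim into roles, ignoring unknown groups."""
    groups = claims.get(claim_name, [])
    return {GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE}


def make_demo_token(claims: dict) -> str:
    """Build an unsigned demo token so the example runs without an IdP."""
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
    return f"e30.{body}.demo-signature"


demo = make_demo_token({"sub": "user-1", "groups": ["developer", "unrelated-group"]})
print(roles_from_claims(decode_jwt_payload(demo)))
```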
PH-03 — MinIO object storage
Status: In progress
- MinIO live on Beelink HDD with planned buckets (`pipeline-input`, `pipeline-output`, `pipeline-work`, `loki-chunks`) and least-privilege IAM-style users (`nextflow-sa`, `loki-sa`).
- Remaining: migrate Loki storage to `loki-chunks` bucket; migrate Prometheus TSDB onto `local-hdd` (per phase plan).
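As an illustration of what least-privilege access for `nextflow-sa` could look like, the sketch below generates an S3-style policy document of the kind MinIO accepts. Which buckets are read-only versus read-write, and the exact set of actions, are assumptions made for the example rather than the project's actual policy.

```python
import json


def least_privilege_policy(read_only: list[str], read_write: list[str]) -> dict:
    """Build an S3-style policy granting access only to the named buckets."""
    statements = []
    all_buckets = sorted(set(read_only) | set(read_write))
    if all_buckets:
        # Listing applies to the bucket itself, not to the objects inside it.
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{b}" for b in all_buckets],
        })
    if read_only:
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{b}/*" for b in sorted(read_only)],
        })
    if read_write:
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": [f"arn:aws:s3:::{b}/*" for b in sorted(read_write)],
        })
    return {"Version": "2012-10-17", "Statement": statements}


# Assumed split: inputs are read-only, work and output buckets are read-write.
nextflow_policy = least_privilege_policy(
    read_only=["pipeline-input"],
    read_write=["pipeline-work", "pipeline-output"],
)
print(json.dumps(nextflow_policy, indent=2))
```

Note that `loki-chunks` never appears in the generated document, so the pipeline account cannot read or alter the audit log store.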
PH-04 — OPA Gatekeeper
Status: Not started
- Policy-as-code admission control: image digest pinning, resource limits, approved registry constraints enforced at the API before workloads run.
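The admission rules named above can be illustrated with a plain Python check. Gatekeeper policies are written in Rego, so this is not how the constraints would be implemented on the cluster; it is only a sketch of the decision logic. The registry allow-list and the input shape (a simplified pod spec dictionary) are assumptions.

```python
import re

# Hypothetical allow-list; the project's approved registries may differ.
APPROVED_REGISTRIES = ("quay.io/biocontainers/", "registry.example.internal/")

# An image reference pinned by digest ends in @sha256:<64 hex characters>.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def violations(pod_spec: dict) -> list[str]:
    """Return human-readable policy violations for a simplified pod spec."""
    problems = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        image = container.get("image", "")
        if not DIGEST_RE.search(image):
            problems.append(f"{name}: image is not pinned by sha256 digest")
        if not image.startswith(APPROVED_REGISTRIES):
            problems.append(f"{name}: image is not from an approved registry")
        limits = container.get("resources", {}).get("limits", {})
        for key in ("cpu", "memory"):
            if key not in limits:
                problems.append(f"{name}: missing {key} limit")
        if container.get("securityContext", {}).get("privileged", False):
            problems.append(f"{name}: privileged containers are not allowed")
    return problems


bad_pod = {
    "containers": [{
        "name": "web",
        "image": "docker.io/library/nginx:latest",
        "securityContext": {"privileged": True},
    }]
}
for problem in violations(bad_pod):
    print("DENY:", problem)
```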
PH-05 — Falco runtime security
Status: Not started
- Falco DaemonSet across nodes; events to Loki via Falcosidekick for Annex 11-style runtime audit trail.
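To show roughly what travels along the Falco to Loki path, the sketch below converts one Falco JSON event into the body shape expected by Loki's push endpoint (`/loki/api/v1/push`): a list of streams, each with a label set and `[timestamp-in-nanoseconds, line]` pairs. This is not Falcosidekick's implementation, and the label choice (`source`, `priority`) is an assumption.

```python
import json
from datetime import datetime, timezone


def falco_time_to_ns(ts: str) -> int:
    """Convert e.g. '2024-01-01T00:00:00.123456789Z' to Unix epoch nanoseconds."""
    whole, _, frac = ts.rstrip("Z").partition(".")
    seconds = int(
        datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
        .replace(tzinfo=timezone.utc)
        .timestamp()
    )
    # Pad or truncate the fractional part to exactly nine digits.
    nanos = int((frac + "000000000")[:9]) if frac else 0
    return seconds * 1_000_000_000 + nanos


def to_loki_push(event: dict) -> dict:
    """Wrap a Falco event in a Loki push request body."""
    labels = {
        "source": "falco",
        "priority": str(event.get("priority", "unknown")).lower(),
    }
    # Keep the full event as the log line so no audit detail is lost.
    line = json.dumps(event, sort_keys=True)
    return {
        "streams": [{
            "stream": labels,
            "values": [[str(falco_time_to_ns(event["time"])), line]],
        }]
    }


sample_event = {
    "time": "2024-01-01T00:00:00.000000001Z",
    "priority": "Warning",
    "rule": "Terminal shell in container",
    "output": "A shell was spawned in a container (illustrative text)",
}
print(json.dumps(to_loki_push(sample_event), indent=2))
```

The rule name stays inside the log line rather than becoming a label, which keeps the number of distinct Loki streams small.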
PH-06 — Nextflow and nf-core
Status: Not started
- nf-core pipeline run (e.g. rnaseq) with Falco audit trail and outputs in MinIO, serving as the integration test for phases 0–5.
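One way to produce reproducibility evidence for such a pipeline run is to checksum the outputs of two runs and compare the manifests. The sketch below assumes the outputs have been copied from MinIO to local directories, and the function names are illustrative. Some pipeline outputs (for example reports with embedded timestamps) may legitimately differ between runs, so a real check would probably need an exclusion list.

```python
import hashlib
from pathlib import Path


def manifest(output_dir: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    result = {}
    for path in sorted(output_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(output_dir).as_posix()
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def compare(run_a: dict[str, str], run_b: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing from one run or whose contents differ."""
    shared = set(run_a) & set(run_b)
    return {
        "only_in_a": sorted(set(run_a) - set(run_b)),
        "only_in_b": sorted(set(run_b) - set(run_a)),
        "changed": sorted(name for name in shared if run_a[name] != run_b[name]),
    }


def is_reproducible(diff: dict[str, list[str]]) -> bool:
    """True when both runs produced the same files with identical contents."""
    return not any(diff.values())
```

Usage would look like `is_reproducible(compare(manifest(Path("run1")), manifest(Path("run2"))))`, with the two manifests stored alongside the run logs as evidence.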
PH-07 — GxP validation documentation
Status: Not started
- IQ / OQ / PQ documentation and supporting SOPs to QA-auditor standard; drafting can start in parallel from PH-04 onward.
Future improvements
Status: Roadmap
- Migrate from Flannel to Cilium for real NetworkPolicy enforcement.
- Add Hubble for network observability.
- Add Trivy scanning to CI/CD.
- Add Cosign image signing.
- Add Harbor registry.
- Replace Sealed Secrets with Vault or External Secrets Operator.
- Add external SIEM integration.
- Expand to multi-node HA control plane.
// 06 · KNOWN LIMITATIONS
Honest constraints
This is a virtual lab and architecture prototype, not a certified production environment.
- Implementation is in progress.
- The design is GxP-aligned but not externally audited.
- The virtual lab is intended to prove architecture and controls, not production scale.
- NetworkPolicy enforcement depends on a future Cilium migration.
- High availability is a target-state concept, not assumed in the first prototype.
- Full bioinformatics workload scale requires larger compute and memory resources.
- Compliance artifacts are educational and architectural evidence, not legal certification.
These limitations are documented deliberately because real platform engineering requires understanding risk, not hiding it.
// 07 · RECRUITER SUMMARY
At a glance
This project shows my ability to design a Kubernetes platform around real operational and compliance requirements. It combines GitOps, observability, security controls, risk management, incident response, secrets handling, and validation-style documentation into one coherent platform architecture. For cloud and DevOps roles, it demonstrates platform engineering thinking. For regulated industries, it demonstrates that I understand how infrastructure decisions connect to auditability, access control, change management, data integrity, and validation evidence.
// 08 · TARGET ROLES
Roles this project supports
Amsterdam focus
- Junior Cloud Engineer
- Junior DevOps Engineer
- Cloud Platform Engineer
- DevOps Consultant
- Junior Site Reliability Engineer
- DevSecOps Engineer
Basel focus
- Cloud Platform Engineer — regulated systems
- DevOps Engineer — pharma / biotech
- GxP Infrastructure Engineer
- CSV / Cloud Infrastructure Engineer
- DevSecOps Engineer
- Kubernetes Platform Engineer
// 09 · TECH STACK
Tags
Technologies, controls, and frameworks referenced in the architecture and documentation.