Secure · Scalable · Observable
Cloud & DevOps Engineer — AWS · GCP · Terraform · Kubernetes · SRE · DevSecOps
I specialise in production-grade platforms across AWS and GCP: Terraform-driven infrastructure, Kubernetes delivery, and CI/CD with SRE practices and DevSecOps. I use AI-assisted workflows where they improve speed without sacrificing correctness or compliance.
Based in Chester, UK · Open to remote and UK-wide roles
Featured projects
Recent work across AIOps, reliability, and platform engineering.
AIOps & SRE
Autonomous DevOps Incident Response Agent
Production-style AIOps system for first-line incident triage: FAISS-backed retrieval over runbooks and logs, LangGraph agents with guardrails, and structured JSON APIs for safe automation. Shipped with FastAPI services, n8n orchestration, Gradio and Next.js operator UIs, containerised on AWS ECS with Terraform, CloudWatch observability, and GitHub Actions CI/CD—targeting sub-30-second first responses and clearer root-cause narratives for SRE teams.
Case study →
AI/ML Platform
Enterprise AI/ML Platform (Production)
Operated ML inference and training paths like any other production service: SRE-style service level objectives, dashboards, and alerting on latency, errors, and cost; FinOps-style controls on GPU/CPU spend and autoscaling behaviour. The outcome was roughly 60% lower inference cost alongside more predictable capacity and faster incident detection for data science and platform stakeholders.
Case study →
Platform Engineering
FinBankOps — Multi-Region EKS (Fintech)
Fintech-grade Kubernetes platform on AWS EKS across regions: Istio service mesh, GitOps delivery with Argo CD, and PCI-aligned controls woven into pipelines and clusters. Prometheus and Grafana provide observability, and the deployment patterns support controlled rollouts, giving security, platform, and application teams a shared picture of compliance, cost, and reliability.
Case study →