Skip to content

Anthony Maio

Staff AI Platform Engineer | LLM Infrastructure & Reliability | Agent Systems & Safety Engineering

anthony@making-minds.ai Danbury, CT (Remote US)

Professional Summary

Staff+ AI Platform and Reliability Engineer with 20 years of production systems experience across regulated fintech, identity/access management, and high-throughput distributed systems. Currently focused on LLM infrastructure, agent orchestration, model lifecycle management, and AI safety engineering. Track record of measurable production impact: 99.99% login uptime, ~20% auth latency improvement, ~$654K/year infrastructure savings, event-sourced systems sustaining ~5K tx/sec at <10ms latency, and Kafka-based compliance pipelines processing ~5TB/day. Combines research-grade prototyping (model training, evaluation harnesses, preference optimization) with production shipping discipline (CI-gated validation, rollout gates, incident response, observability).

Core Skills

AI Platform & LLMOps: LLM evaluation harnesses, agent orchestration, model lifecycle management, inference infrastructure, regression testing, failure-mode analysis, monitoring/observability, safe rollout measurement, prompt/tool risk controls, sandboxed tool execution, AI platform governance, dataset curation, preference optimization (DPO/RLHF)
ML & Training Infrastructure: Python, PyTorch, Hugging Face Transformers/Datasets/TRL, distributed training (DDP), full fine-tuning, LoRA/PEFT, quantization (4-bit/GPTQ/AWQ), model optimization, Mixture-of-Experts architectures
Distributed Systems & Reliability: AWS (EC2, ECS, Lambda, S3, SQS, Kinesis), Kubernetes, Terraform, CI/CD (GitHub Actions, Jenkins), Kafka, CQRS/event sourcing, PostgreSQL, Redis, reliability engineering, incident response, cost optimization, autoscaling
Security & Compliance: AuthN/AuthZ, OAuth 2.0/OIDC, MFA/2FA, API security, audit/compliance pipelines, SOC 2, ISO 27001, PCI DSS, GDPR, KYC/KYB/AML, regulated environment operations
Languages & Tools: Python, C#/.NET, JavaScript/TypeScript, Bash, SQL, Git, Docker

Professional Experience

Staff Artificial Intelligence Engineer

Sep 2024 – Present

Making-Minds.ai · Remote

Independent AI platform engineering practice delivering LLM infrastructure, agent reliability tooling, and AI safety evaluation systems. Clients include C-level stakeholders at startups and mid-sized industrial organizations (manufacturing/IoT, mineral processing). All artifacts published as open-source with reproducible training/evaluation configurations.

  • Designed and built agent execution and verification workflows treating agents as production actors with explicit latency/cost budgets, deterministic replay/verification gates, and CI-gated trust scoring for tool outputs and generated code.
  • Implemented sandboxed, headless tool-execution infrastructure with constrained permissions, auditable failure modes, and documented operational tradeoffs for agent orchestration deployments (Agent Hardening Pack).
  • Shipped LLM evaluation and monitoring tooling focused on regression detection and failure classification under operational constraints (throughput/latency/cost), with automated CI pipeline integration for model/prompt/tool change validation.
  • Trained and released the Eve-2 model family: a 272M-parameter Mixture-of-Experts base model pretrained from scratch on ~10.5B tokens (FineWeb-edu) using PyTorch DDP, plus instruction-tuned and task-specialist derivatives optimized for CPU/edge inference.
  • Built Argos-Swarm, an automated red/blue teaming and orchestration framework for multi-model adversarial safety evaluation, including cross-model epistemic divergence measurement for detecting weak-verified failures in LLM outputs.
  • Applied preference optimization (DPO/RLHF-style) to fine-tune GLM-4.7 for domain-specific protocol adherence; documented quantization/pruning tradeoffs under controlled adversarial testing (PV-EAT framework).
  • Delivered AI product engineering reviews and modernization roadmaps to C-level stakeholders, covering inference infrastructure architecture, model lifecycle management, AI platform governance, and cost optimization strategies.

Staff Software Engineer, Identity & Access Management Platform

Jan 2023 – Aug 2024

DraftKings · Remote

Technical lead and architect for an 8-engineer IAM platform team supporting authentication, authorization, and compliance infrastructure across DraftKings' consumer and enterprise product lines.

  • Led and mentored 8-engineer remote IAM platform team; materially improved delivery throughput through tightened ownership norms, structured review workflows, paired programming practices, and pragmatic process standardization.
  • Executed zero-downtime 2FA/SMS provider migration: maintained 99.99% login uptime, improved auth latency ~20%, and reduced annual vendor spend by ~$450K; established organization-wide MFA standards across all platforms.
  • Reduced AWS infrastructure spend by ~$204K/year through right-sizing, targeted serverless adoption, Kubernetes HPA autoscaling tuning, and resource utilization optimization.
  • Architected and operated Kafka-based compliance data pipelines (Kafka to Redshift), processing ~5TB/day to support regulatory reporting, audit requirements, and KYC/AML compliance workflows.
  • Shipped AI-assisted development tooling (code review automation + knowledge retrieval workflows) adopted across 7 engineering teams; reduced review cycle time ~75% and measurably unblocked feature delivery velocity.

Staff Software Engineer, Trading Systems (New Ventures)

Aug 2022 – Dec 2022

DraftKings · Remote

  • Delivered 0-to-1 MVP for a greenfield crypto trading platform; drove technical strategy and architecture decisions across two distributed engineering teams during early-stage product development.
  • Built event-sourced CQRS core sustaining ~5K transactions/sec at <10ms p99 latency (PostgreSQL + Kafka), establishing the reliability and performance foundation for the trading platform.
  • Designed and implemented self-service experimentation infrastructure that reduced feature launch cycle time from days to <1 hour, enabling rapid product iteration and data-driven decision-making.

Staff Product Engineer, Enterprise FX Platform

Jan 2015 – Aug 2022

Broadridge Financial Solutions · Remote

  • Led staged modernization from monolithic .NET/WCF architecture to AWS microservices for an enterprise FX platform (UBS partnership; ~$2B AUM context), delivering incremental MVPs while onboarding pilot enterprise clients.
  • Architected Kafka-based orchestration spanning ~22 interdependent business processes across 7 financial institutions; designed for consistency/partition tolerance tradeoffs with full operational visibility and monitoring.
  • Enabled material revenue impact through reliability improvements, incremental platform adoption, and staged delivery methodology that de-risked enterprise client migrations.

Senior Product Engineer, Trading & Risk Platforms

Jul 2007 – Dec 2014

TwoFour Systems (acquired by Broadridge Jan 2015) · Remote

  • Founding engineer building trading and risk platforms for Tier-1 financial institutions in client-facing, on-site delivery roles; delivered real-time risk/margin capabilities as a key product differentiator.
  • Designed and shipped production risk calculation engines processing real-time market data feeds, supporting portfolio-level margin and exposure monitoring for institutional trading operations.

Education

Bachelor of Science, Computer Science, May 2007, 3.74 GPA

Binghamton University — Thomas J. Watson College of Engineering and Applied Science, Binghamton NY

Selected Projects

CoDA-GQA-L: Differential Attention Mechanism for LLM reducing KV-Cache VRAM 10–1,000x + 2 Triton Kernels. (info, paper, code)
Mnemos: Biomimetic Memory Architectures for Large Language Model Agents. (info, code, paper)
Cartograph: Repo analysis tool for agents, ships with skills for agents on how to use both the CLI and MCP. (info, code)
Safety-Lens: AI Safety Visualization Tool; see how models think, not just what they say. (code, paper)
Slipstream: 60–80% Agent Coordination Token Reduction Protocol + Models. (info, code, paper)
Argos-Swarm: Automated LLM red/blue teaming with evolutionary adversarial pipeline and swarm verification. (code)
Download Resume (PDF)