Course Outline
Foundations of Agentic Systems in Production
- Agentic architectures: loops, tools, memory, and orchestration layers.
- Agent lifecycle: development, deployment, and continuous operation.
- Challenges associated with managing agents at production scale.
Infrastructure and Deployment Models
- Deploying agents within containerized and cloud environments.
- Scaling patterns: horizontal vs. vertical scaling, concurrency, and throttling.
- Multi-agent orchestration and workload balancing.
Monitoring and Observability
- Key metrics: latency, success rate, memory usage, and agent call depth.
- Tracing agent activity and call graphs.
- Implementing observability using Prometheus, OpenTelemetry, and Grafana.
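As a rough illustration of the metrics listed above, the following sketch records latency, success rate, and call depth in process using only the Python standard library. The `AgentMetrics` class and its method names are hypothetical, chosen for this example; a production setup would more likely export these values through a metrics client than keep them in memory.

```python
import statistics
from dataclasses import dataclass, field


@dataclass
class AgentMetrics:
    """Hypothetical in-memory recorder for per-agent metrics."""
    latencies_ms: list = field(default_factory=list)
    successes: int = 0
    failures: int = 0
    max_call_depth: int = 0

    def record(self, latency_ms: float, ok: bool, call_depth: int) -> None:
        # Store one agent invocation: its latency, outcome, and nesting depth.
        self.latencies_ms.append(latency_ms)
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        self.max_call_depth = max(self.max_call_depth, call_depth)

    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

    def p95_latency_ms(self) -> float:
        # statistics.quantiles needs at least two data points.
        if not self.latencies_ms:
            return 0.0
        if len(self.latencies_ms) < 2:
            return self.latencies_ms[0]
        # n=20 yields 19 cut points; the last one is the 95th percentile.
        return statistics.quantiles(self.latencies_ms, n=20, method="inclusive")[-1]


metrics = AgentMetrics()
metrics.record(latency_ms=100.0, ok=True, call_depth=1)
metrics.record(latency_ms=200.0, ok=True, call_depth=3)
metrics.record(latency_ms=300.0, ok=False, call_depth=2)
print(metrics.success_rate(), metrics.max_call_depth, metrics.p95_latency_ms())
```

The inclusive quantile method is used here so the reported percentile never falls outside the observed latency range, which tends to be easier to read on a dashboard with few samples.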
Logging, Auditing, and Compliance
- Centralized logging and structured event collection.
- Compliance and auditability within agentic workflows.
- Designing audit trails and replay mechanisms for debugging purposes.
Performance Tuning and Resource Optimization
- Reducing inference overhead and optimizing agent orchestration cycles.
- Model caching and lightweight embeddings for faster retrieval.
- Load testing and stress scenarios for AI pipelines.
Cost Control and Governance
- Understanding agent cost drivers: API calls, memory, compute, and external integrations.
- Tracking agent-level costs and implementing chargeback models.
- Automation policies to prevent agent sprawl and idle resource consumption.
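To make agent-level cost tracking and chargeback concrete, here is a minimal sketch that attributes token usage to teams. The `UsageEvent` fields, the function names, and both price constants are assumptions made up for this example and do not reflect any real provider's pricing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical record of one agent call's token usage."""
    agent_id: str
    team: str
    input_tokens: int
    output_tokens: int


# Assumed prices in USD per 1,000 tokens (illustrative values only).
PRICE_PER_1K_INPUT = 0.5
PRICE_PER_1K_OUTPUT = 1.5


def event_cost(event: UsageEvent) -> float:
    """Cost of a single call under the assumed prices."""
    return (event.input_tokens / 1000 * PRICE_PER_1K_INPUT
            + event.output_tokens / 1000 * PRICE_PER_1K_OUTPUT)


def chargeback(events) -> dict:
    """Sum costs per team so each team can be billed for its agents."""
    totals = defaultdict(float)
    for event in events:
        totals[event.team] += event_cost(event)
    return dict(totals)


events = [
    UsageEvent("planner-1", "search", input_tokens=2000, output_tokens=1000),
    UsageEvent("planner-2", "search", input_tokens=1000, output_tokens=0),
    UsageEvent("summarizer-1", "support", input_tokens=4000, output_tokens=2000),
]
print(chargeback(events))
```

Grouping by `agent_id` instead of `team` would give the per-agent view; the same events can feed both reports.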
CI/CD and Rollout Strategies for Agents
- Integrating agent pipelines into CI/CD systems.
- Testing, versioning, and rollback strategies for iterative agent updates.
- Progressive rollouts and safe deployment mechanisms.
Failure Recovery and Reliability Engineering
- Designing for fault tolerance and graceful degradation.
- Retry, timeout, and circuit breaker patterns for agent reliability.
- Incident response and post-mortem frameworks for AI operations.
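The retry and circuit breaker patterns named above can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not a production implementation: `CircuitBreaker`, `CircuitOpenError`, and `retry` are hypothetical names, the thresholds are arbitrary, and a per-call timeout is assumed to be enforced inside the wrapped function.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable for testing
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result


def retry(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call fn, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # retrying against an open circuit is pointless
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A typical combination wraps the breaker inside the retry, for example `retry(lambda: breaker.call(call_tool))`, where `call_tool` stands in for whatever agent or tool invocation is being protected. Transient errors are then retried, while a persistently failing dependency trips the breaker and fails fast.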
Capstone Project
- Building and deploying an agentic AI system with comprehensive monitoring and cost tracking.
- Simulating load, measuring performance, and optimizing resource usage.
- Presenting the final architecture and monitoring dashboard to peers.
Summary and Next Steps
Requirements
- Solid grasp of MLOps and production machine learning systems.
- Hands-on experience with containerized deployments (Docker/Kubernetes).
- Familiarity with cloud cost optimization and observability tools.
Audience
- MLOps engineers.
- Site Reliability Engineers (SREs).
- Engineering managers responsible for AI infrastructure.
21 Hours
Testimonials (3)
The trainer is patient and very helpful. He knows the topic well.
CLIFFORD TABARES - Universal Leaf Philippines, Inc.
Course - Agentic AI for Business Automation: Use Cases & Integration
Good mix of knowledge and practice
Ion Mironescu - Facultatea S.A.I.A.P.M.
Course - Agentic AI for Enterprise Applications
The mix of theory and practice and of high level and low level perspectives