Multimodal LLM Workflows in Vertex AI Training Course
Vertex AI offers robust tools for constructing multimodal LLM workflows that seamlessly integrate text, audio, and image data into a unified pipeline. Leveraging support for long context windows and configurable Gemini API parameters, the platform enables sophisticated applications focused on planning, reasoning, and cross-modal intelligence.
This instructor-led, live training, available online or on-site, targets intermediate to advanced practitioners aiming to design, build, and optimize multimodal AI workflows within Vertex AI.
Upon completion of this training, participants will be equipped to:
- Utilize Gemini models for processing multimodal inputs and generating outputs.
- Implement long-context workflows to facilitate complex reasoning tasks.
- Design pipelines that effectively combine text, audio, and image analysis.
- Optimize Gemini API parameters to enhance performance and ensure cost efficiency.
Course Format
- Interactive lectures and discussions.
- Practical labs focused on multimodal workflows.
- Project-based exercises applied to real-world multimodal use cases.
Customization Options
- For customized training arrangements, please contact us.
Course Outline
Introduction to Multimodal LLMs in Vertex AI
- Overview of multimodal capabilities in Vertex AI.
- Gemini models and supported modalities.
- Enterprise and research use cases.
Setting Up the Development Environment
- Configuring Vertex AI for multimodal workflows.
- Working with datasets across different modalities.
- Hands-on lab: environment setup and dataset preparation.
Long Context Windows and Advanced Reasoning
- Understanding long-context workflows.
- Use cases in planning and decision-making.
- Hands-on lab: implementing long-context analysis.
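When an input exceeds even a long context window, a common fallback covered in workflows like this is splitting it into overlapping windows so adjacent chunks share context. The sketch below is a minimal stdlib illustration of that idea; the window and overlap sizes are placeholders, not Gemini limits:

```python
def sliding_windows(tokens, window_size, overlap):
    """Split a token list into overlapping windows so adjacent
    chunks share `overlap` tokens of context."""
    if overlap >= window_size:
        raise ValueError("overlap must be smaller than window_size")
    stride = window_size - overlap
    windows = []
    for start in range(0, len(tokens), stride):
        windows.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break
    return windows

# Toy example: 10 tokens, windows of 4 sharing 2 tokens each.
chunks = sliding_windows(list(range(10)), window_size=4, overlap=2)
```

Each chunk can then be analyzed independently and the partial results merged in a final reasoning step.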
Cross-Modal Workflow Design
- Combining text, audio, and image analysis.
- Chaining multimodal steps within pipelines.
- Hands-on lab: designing a multimodal pipeline.
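Chaining multimodal steps within a pipeline can be modeled as functions that read and extend a shared context. The sketch below uses plain-Python stand-ins (`transcribe_audio`, `caption_image`, and `summarize` are hypothetical stubs, not real model calls) to show the chaining pattern itself:

```python
def run_pipeline(steps, context):
    """Run each step in order; every step reads and extends the shared context."""
    for step in steps:
        context = step(context)
    return context

# Hypothetical stand-ins for model-backed steps.
def transcribe_audio(ctx):
    ctx["transcript"] = f"transcript of {ctx['audio_uri']}"
    return ctx

def caption_image(ctx):
    ctx["caption"] = f"caption of {ctx['image_uri']}"
    return ctx

def summarize(ctx):
    ctx["summary"] = f"summary combining: {ctx['transcript']} + {ctx['caption']}"
    return ctx

result = run_pipeline(
    [transcribe_audio, caption_image, summarize],
    {"audio_uri": "gs://bucket/meeting.wav", "image_uri": "gs://bucket/slide.png"},
)
```

In a real pipeline each step would call a model endpoint, but the contract is the same: steps stay composable because they only communicate through the shared context.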
Working with Gemini API Parameters
- Configuring multimodal inputs and outputs.
- Optimizing inference and efficiency.
- Hands-on lab: tuning Gemini API parameters.
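The key names below mirror Gemini API generation settings (`temperature`, `top_p`, `max_output_tokens`); the validation helper itself is a hypothetical sketch of the kind of tuning done in this lab, not part of any SDK:

```python
def make_generation_config(temperature=0.7, top_p=0.95, max_output_tokens=1024):
    """Build a generation-config dict with basic range checks.

    Keys mirror Gemini API generation settings; the helper itself
    is illustrative, not part of the Vertex AI SDK.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if max_output_tokens < 1:
        raise ValueError("max_output_tokens must be positive")
    return {
        "temperature": temperature,              # randomness of sampling
        "top_p": top_p,                          # nucleus-sampling cutoff
        "max_output_tokens": max_output_tokens,  # response length cap
    }

# Lower temperature for deterministic extraction; also caps output
# length, which is one lever for controlling inference cost.
extraction_config = make_generation_config(temperature=0.2, max_output_tokens=512)
```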
Advanced Applications and Integrations
- Interactive multimodal agents and assistants.
- Integrating external APIs and tools.
- Hands-on lab: building a multimodal application.
Evaluation and Iteration
- Testing multimodal performance.
- Metrics for accuracy, alignment, and drift.
- Hands-on lab: evaluating multimodal workflows.
Summary and Next Steps
Requirements
- Proficiency in Python programming.
- Experience in developing machine learning models.
- Familiarity with multimodal data types, including text, audio, and images.
Audience
- AI researchers.
- Advanced developers.
- Machine learning scientists.
Related Courses
Advanced LangGraph: Optimization, Debugging, and Monitoring Complex Graphs
35 Hours
LangGraph serves as a framework for constructing stateful, multi-actor LLM applications through composable graphs that maintain persistent state and provide precise control over execution flows.
This instructor-led, live training session, available both online and onsite, is designed for advanced AI platform engineers, AI-focused DevOps professionals, and ML architects seeking to optimize, debug, monitor, and operate production-grade LangGraph systems.
Upon completion of this training, participants will be capable of:
- Designing and optimizing complex LangGraph topologies to enhance speed, reduce costs, and improve scalability.
- Building system reliability through retries, timeouts, idempotency, and checkpoint-based recovery mechanisms.
- Debugging and tracing graph executions, inspecting state variables, and systematically reproducing issues encountered in production.
- Instrumenting graphs with logs, metrics, and traces; deploying to production environments; and monitoring SLAs and associated costs.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and hands-on practice.
- Hands-on implementation within a live-lab environment.
Customization Options
- To request customized training for this course, please contact us to make arrangements.
Building Coding Agents with Devstral: From Agent Design to Tooling
14 Hours
Devstral is an open-source framework specifically crafted for the creation and operation of coding agents. These agents interact with codebases, developer tools, and APIs to significantly boost engineering productivity.
This instructor-led live training, available online or on-site, targets intermediate to advanced ML engineers, developer-tooling teams, and Site Reliability Engineers (SREs) who aim to design, implement, and optimize coding agents leveraging Devstral.
Upon completion of this training, participants will be equipped to:
- Establish and configure Devstral for coding agent development.
- Design agentic workflows for exploring and modifying codebases.
- Integrate coding agents seamlessly with developer tools and APIs.
- Apply best practices for secure and efficient agent deployment.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical application.
- Hands-on implementation within a live lab environment.
Customization Options
- To arrange customized training for this course, please contact us.
Open-Source Model Ops: Self-Hosting, Fine-Tuning and Governance with Devstral & Mistral Models
14 Hours
Devstral and Mistral are open-source AI technologies engineered for flexible deployment, fine-tuning, and scalable integration.
This instructor-led live training (available online or onsite) is designed for intermediate to advanced machine learning engineers, platform teams, and research engineers who aim to self-host, fine-tune, and govern Mistral and Devstral models within production environments.
Upon completing this training, participants will be capable of:
- Setting up and configuring self-hosted environments for Mistral and Devstral models.
- Applying fine-tuning techniques to achieve domain-specific performance.
- Implementing versioning, monitoring, and lifecycle governance.
- Ensuring security, compliance, and responsible usage of open-source models.
Course Format
- Interactive lectures and discussions.
- Hands-on exercises focused on self-hosting and fine-tuning.
- Live-lab implementation of governance and monitoring pipelines.
Customization Options
- To arrange customized training for this course, please contact us to make the necessary arrangements.
LangGraph Applications in Finance
35 Hours
LangGraph serves as a robust framework for developing stateful, multi-agent LLM applications through composable graphs, offering persistent state management and precise execution control.
This instructor-led live training, available both online and onsite, is designed for intermediate to advanced professionals seeking to design, implement, and manage LangGraph-based financial solutions that adhere to strict governance, observability, and compliance standards.
Upon completion of this training, participants will be able to:
- Design LangGraph workflows tailored to financial sectors, ensuring alignment with regulatory and audit requirements.
- Integrate financial data standards and ontologies into graph states and supporting tools.
- Implement robust reliability, safety mechanisms, and human-in-the-loop controls for critical operations.
- Deploy, monitor, and optimize LangGraph systems to meet performance, cost, and Service Level Agreement (SLA) targets.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and hands-on practice sessions.
- Hands-on implementation within a live lab environment.
Customization Options
- To request a customized training program for this course, please contact us to arrange your schedule.
LangGraph Foundations: Graph-Based LLM Prompting and Chaining
14 Hours
LangGraph serves as a framework designed for developing LLM applications structured as graphs, enabling capabilities such as planning, branching, tool utilization, memory management, and controlled execution.
This live, instructor-led training, available either online or onsite, is tailored for beginner-level developers, prompt engineers, and data practitioners aiming to design and construct reliable, multi-step LLM workflows using LangGraph.
Upon completing this training, participants will be equipped to:
- Articulate core LangGraph concepts, including nodes, edges, and state, and understand their appropriate applications.
- Construct prompt chains that support branching, tool invocation, and memory retention.
- Integrate retrieval mechanisms and external APIs into graph-based workflows.
- Test, debug, and evaluate LangGraph applications to ensure reliability and safety.
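LangGraph's real API is richer, but the core concepts named above — nodes, edges, and state — can be sketched in plain Python. The node and edge names here are invented for illustration and do not reflect LangGraph's actual interface:

```python
def run_graph(nodes, edges, state, entry, end="END"):
    """Walk a graph: each node updates the state dict, and each edge
    names the next node (or computes it from the state, for branching)."""
    current = entry
    while current != end:
        state = nodes[current](state)
        edge = edges[current]
        current = edge(state) if callable(edge) else edge
    return state

# A two-node graph: plan, then answer.
nodes = {
    "plan":   lambda s: {**s, "plan": f"steps for {s['task']}"},
    "answer": lambda s: {**s, "answer": f"done: {s['plan']}"},
}
edges = {"plan": "answer", "answer": "END"}

final = run_graph(nodes, edges, {"task": "summarize a report"}, entry="plan")
```

Making an edge a function of the state is what enables branching; persisting the state dict between runs is the essence of checkpointing.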
Course Format
- Interactive lectures accompanied by facilitated discussions.
- Guided labs and code walkthroughs conducted within a sandbox environment.
- Scenario-based exercises focusing on design, testing, and evaluation.
Customization Options for the Course
- To arrange customized training for this course, please reach out to us.
LangGraph in Healthcare: Workflow Orchestration for Regulated Environments
35 Hours
LangGraph empowers the creation of stateful, multi-actor workflows driven by Large Language Models (LLMs), offering precise control over execution paths and state persistence. These capabilities are essential in the healthcare sector for ensuring compliance, enabling interoperability, and developing decision-support systems that seamlessly align with medical workflows.
This instructor-led training, available online or onsite, is designed for intermediate to advanced professionals seeking to design, implement, and manage LangGraph-based healthcare solutions while navigating regulatory, ethical, and operational challenges.
Upon completion of this training, participants will be equipped to:
- Design healthcare-specific LangGraph workflows with a strong focus on compliance and auditability.
- Integrate LangGraph applications with medical ontologies and standards, including FHIR, SNOMED CT, and ICD.
- Apply best practices for reliability, traceability, and explainability in sensitive environments.
- Deploy, monitor, and validate LangGraph applications within healthcare production settings.
Course Format
- Interactive lectures and discussions.
- Hands-on exercises featuring real-world case studies.
- Practical implementation in a live-lab environment.
Course Customization Options
- To arrange customized training for this course, please contact us.
LangGraph for Legal Applications
35 Hours
LangGraph is a framework designed for constructing stateful, multi-agent LLM applications as composable graphs, featuring persistent state and precise control over execution.
This instructor-led, live training (available online or onsite) targets intermediate to advanced professionals seeking to design, implement, and manage LangGraph-based legal solutions equipped with necessary compliance, traceability, and governance controls.
Upon completion of this training, participants will be capable of:
- Designing legal-specific LangGraph workflows that maintain auditability and compliance.
- Integrating legal ontologies and document standards into graph state and processing.
- Implementing guardrails, human-in-the-loop approvals, and traceable decision paths.
- Deploying, monitoring, and maintaining LangGraph services in production with observability and cost controls.
Course Format
- Interactive lectures and discussions.
- Numerous exercises and practice sessions.
- Hands-on implementation within a live-lab environment.
Course Customization Options
- For customized training arrangements for this course, please contact us.
Building Dynamic Workflows with LangGraph and LLM Agents
14 Hours
LangGraph serves as a framework designed for composing graph-structured LLM workflows, supporting features such as branching, tool utilization, memory management, and controllable execution.
This instructor-led live training, available both online and onsite, targets intermediate-level engineers and product teams seeking to integrate LangGraph’s graph logic with LLM agent loops. The goal is to develop dynamic, context-aware applications, including customer support agents, decision trees, and information retrieval systems.
Upon completion of this training, participants will be capable of:
- Designing graph-based workflows that effectively coordinate LLM agents, tools, and memory.
- Implementing conditional routing, retries, and fallback mechanisms to ensure robust execution.
- Integrating retrieval processes, APIs, and structured outputs into agent loops.
- Evaluating, monitoring, and securing agent behavior to enhance reliability and safety.
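Retries and fallbacks of the kind listed above can be expressed as a small wrapper around any tool or model call. In this sketch, `flaky_tool` is a hypothetical stand-in for an LLM or tool invocation that fails transiently:

```python
def with_retries(fn, fallback, attempts=3):
    """Call fn up to `attempts` times; after repeated failure,
    hand the last error to the fallback instead of crashing."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # in production, catch specific error types
            last_error = exc
    return fallback(last_error)

# Hypothetical tool that succeeds on the third call.
calls = {"n": 0}
def flaky_tool():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "tool result"

result = with_retries(flaky_tool, fallback=lambda err: f"fallback: {err}")
```

The same pattern generalizes to routing: the fallback can be a cheaper model, a cached answer, or an escalation to a human reviewer.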
Course Format
- Interactive lectures accompanied by facilitated discussions.
- Guided labs and code walkthroughs conducted within a sandbox environment.
- Scenario-based design exercises and peer reviews.
Customization Options
- For customized training arrangements, please contact us directly.
LangGraph for Marketing Automation
14 Hours
LangGraph operates as a graph-based orchestration framework designed to facilitate conditional, multi-step workflows involving Large Language Models (LLMs) and tools, making it highly suitable for automating and personalizing content pipelines.
This live, instructor-led training, available both online and on-site, targets intermediate-level marketers, content strategists, and automation developers eager to implement dynamic, branching email campaigns and content generation pipelines using LangGraph.
Upon completion of this training, participants will be equipped to:
- Design graph-structured workflows for content and email campaigns that incorporate conditional logic.
- Integrate LLMs, APIs, and various data sources to enable automated personalization.
- Effectively manage state, memory, and context throughout multi-step campaigns.
- Evaluate, monitor, and optimize workflow performance to improve delivery outcomes.
Course Format
- Interactive lectures paired with group discussions.
- Practical hands-on labs focused on implementing email workflows and content pipelines.
- Scenario-based exercises addressing personalization, segmentation, and branching logic.
Course Customization Options
- For organizations seeking tailored training, please reach out to us to arrange a customized session.
Le Chat Enterprise: Private ChatOps, Integrations & Admin Controls
14 Hours
Le Chat Enterprise offers a secure, customizable, and governed conversational AI solution designed for organizational use, featuring support for RBAC, SSO, connectors, and enterprise app integrations.
This instructor-led live training (available online or onsite) is designed for intermediate-level product managers, IT leads, solution engineers, and security/compliance teams who wish to deploy, configure, and govern Le Chat Enterprise in enterprise environments.
By the end of this training, participants will be able to:
- Set up and configure Le Chat Enterprise for secure deployments.
- Enable RBAC, SSO, and compliance-driven controls.
- Integrate Le Chat with enterprise applications and data stores.
- Design and implement governance and admin playbooks for ChatOps.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to make arrangements.
Cost-Effective LLM Architectures: Mistral at Scale (Performance / Cost Engineering)
14 Hours
Mistral is a high-performance suite of large language models designed for cost-effective production deployment at scale.
This instructor-led, live training (available online or onsite) is tailored for advanced-level infrastructure engineers, cloud architects, and MLOps leads who aim to design, deploy, and optimize Mistral-based architectures to achieve maximum throughput while minimizing costs.
Upon completing this training, participants will be able to:
- Implement scalable deployment patterns for Mistral Medium 3.
- Apply batching, quantization, and efficient serving strategies.
- Optimize inference costs without compromising performance.
- Design production-ready serving topologies for enterprise workloads.
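Batching, one of the cost levers listed above, amortizes per-request overhead by grouping prompts into a single inference call. The sketch below shows only the grouping logic; `process_batch` is a hypothetical stand-in for a model-serving call, and the batch size is a placeholder:

```python
def batched(requests, batch_size):
    """Group requests into fixed-size batches (the last may be smaller)."""
    return [requests[i:i + batch_size] for i in range(0, len(requests), batch_size)]

def process_batch(batch):
    # Stand-in for one batched inference call against a serving endpoint.
    return [f"response to {r}" for r in batch]

requests = [f"prompt-{i}" for i in range(7)]
responses = [
    resp
    for batch in batched(requests, batch_size=3)
    for resp in process_batch(batch)
]
```

Real serving stacks typically batch dynamically (by deadline as well as size), but the throughput-versus-latency trade-off is the same.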
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to make arrangements.
Productizing Conversational Assistants with Mistral Connectors & Integrations
14 Hours
Mistral AI provides an open AI platform that empowers teams to build and embed conversational assistants within enterprise operations and customer-facing workflows.
This instructor-led training, available online or onsite, is designed for product managers, full-stack developers, and integration engineers at the beginner to intermediate level who aim to design, integrate, and productize conversational assistants using Mistral’s connectors and integrations.
Upon completing this training, participants will be able to:
- Connect Mistral conversational models with enterprise and SaaS connectors.
- Implement retrieval-augmented generation (RAG) to ensure grounded, accurate responses.
- Create UX patterns for both internal and external chat assistants.
- Deploy assistants into product workflows to address real-world use cases.
Course Format
- Interactive lectures and discussions.
- Practical hands-on integration exercises.
- Live lab sessions for developing conversational assistants.
Customization Options
- To request customized training for this course, please contact us to make arrangements.
Enterprise-Grade Deployments with Mistral Medium 3
14 Hours
Mistral Medium 3 is a high-performance, multimodal large language model designed for production-grade deployment across enterprise environments.
This instructor-led, live training (online or onsite) is aimed at intermediate-level to advanced-level AI/ML engineers, platform architects, and MLOps teams who wish to deploy, optimize, and secure Mistral Medium 3 for enterprise use cases.
By the end of this training, participants will be able to:
- Deploy Mistral Medium 3 using API and self-hosted options.
- Optimize inference performance and costs.
- Implement multimodal use cases with Mistral Medium 3.
- Apply security and compliance best practices for enterprise environments.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to make arrangements.
Mistral for Responsible AI: Privacy, Data Residency & Enterprise Controls
14 Hours
Mistral AI serves as an open, enterprise-ready AI platform equipped with features designed to facilitate secure, compliant, and responsible AI deployment.
This instructor-led live training, available both online and onsite, targets intermediate-level compliance leads, security architects, and legal or operations stakeholders aiming to adopt responsible AI practices. Participants will learn to leverage privacy mechanisms, data residency options, and enterprise control frameworks within the Mistral ecosystem.
Upon completing this training, participants will be equipped to:
- Deploy privacy-preserving techniques within Mistral environments.
- Execute data residency strategies that satisfy regulatory mandates.
- Configure enterprise-grade controls, including Role-Based Access Control (RBAC), Single Sign-On (SSO), and comprehensive audit logging.
- Assess vendor and deployment alternatives to ensure alignment with compliance standards.
Course Format
- Interactive lectures and group discussions.
- Case studies and exercises focused on compliance.
- Practical, hands-on implementation of enterprise AI controls.
Customization Options
- For customized training solutions tailored to your organization, please contact us to make arrangements.
Multimodal Applications with Mistral Models (Vision, OCR, & Document Understanding)
14 Hours
Mistral models are open-source AI technologies that now extend into multimodal workflows, supporting both language and vision tasks for enterprise and research applications.
This instructor-led, live training (online or onsite) is aimed at intermediate-level ML researchers, applied engineers, and product teams who wish to build multimodal applications with Mistral models, including OCR and document understanding pipelines.
By the end of this training, participants will be able to:
- Set up and configure Mistral models for multimodal tasks.
- Implement OCR workflows and integrate them with NLP pipelines.
- Design document understanding applications for enterprise use cases.
- Develop vision-text search and assistive UI functionalities.
Format of the Course
- Interactive lecture and discussion.
- Hands-on coding exercises.
- Live-lab implementation of multimodal pipelines.
Course Customization Options
- To request customized training for this course, please contact us to make arrangements.