AI Workloads on AKS

Scaling AI Workloads with Ray

Learn how to deploy and scale distributed AI workloads using Ray on Azure Kubernetes Service (AKS). This lab covers Ray cluster setup, distributed machine learning training, and scaling AI inference workloads.

Build RAG applications with KAITO RAGEngine

Retrieval-Augmented Generation (RAG) is a technique that combines large language models (LLMs) with external knowledge sources. It produces more accurate and contextually relevant responses by retrieving information from knowledge bases or databases and supplying it to the LLM as additional context when it generates an answer.

Deploy AI Models with KAITO and Headlamp

Kubernetes AI Toolchain Operator (KAITO) is an open-source operator that automates AI/ML model inference and tuning workloads within Kubernetes clusters. It focuses on popular large models from Hugging Face, such as Falcon and Phi-3, while providing key capabilities including: