The AI world moves fast. GPT-4 leads today, but Llama or Claude might lead tomorrow. Connectors decouple your application from any single provider, so swapping models later is a configuration change rather than a rewrite.
The Azure OpenAI connector lets you use OpenAI models inside your own Azure tenant. Your data never leaves your private cloud and is never used to train the model, which is often exactly what it takes to get an AI feature through an enterprise security review.
Using the HuggingFace connector, you can talk to thousands of specialized open-source models. For sensitive data, you can use local connectors like **Ollama** or **LlamaSharp** to run models entirely on your own GPU/CPU with zero internet connection.
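The "sensitive data stays local" idea above can be sketched as a simple router. This is a hypothetical illustration in Python: `OllamaConnector` and `CloudConnector` are stand-in stubs, not real Semantic Kernel classes.

```python
# Hypothetical sketch: routing prompts by data sensitivity.
# The connector classes are illustrative stubs, not SK's actual API.

class OllamaConnector:
    """Stub for a local model served by Ollama; nothing leaves the machine."""
    def complete(self, prompt: str) -> str:
        return f"[local llama3] {prompt}"

class CloudConnector:
    """Stub for a hosted model reached over the internet."""
    def complete(self, prompt: str) -> str:
        return f"[cloud gpt-4] {prompt}"

def route(prompt: str, sensitive: bool) -> str:
    # Sensitive data stays on the local GPU/CPU; everything else may go out.
    connector = OllamaConnector() if sensitive else CloudConnector()
    return connector.complete(prompt)

print(route("summarize patient record", sensitive=True))
print(route("write a haiku", sensitive=False))
```

The point is that call sites only depend on `complete()`, so the local/cloud decision lives in one place.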
You can define your connectors in appsettings.json and inject them into the kernel. This allows you to use a cheap local model for testing and a high-end Azure model for your paying customers without changing your code logic.
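A minimal sketch of that config-driven setup, written in Python to mirror what `appsettings.json` plus dependency injection would do in a .NET Semantic Kernel app. The connector classes, config keys, and model names here are illustrative assumptions, not SK's real API surface.

```python
import json

# Hypothetical config, standing in for appsettings.json.
CONFIG = json.loads("""
{
  "AI": { "Provider": "local", "Model": "phi-3" }
}
""")

class LocalConnector:
    """Stub for a cheap local model used in testing."""
    def __init__(self, model): self.model = model
    def complete(self, prompt): return f"[{self.model}@local] {prompt}"

class AzureConnector:
    """Stub for a high-end Azure-hosted model used in production."""
    def __init__(self, model): self.model = model
    def complete(self, prompt): return f"[{self.model}@azure] {prompt}"

REGISTRY = {"local": LocalConnector, "azure": AzureConnector}

def build_connector(cfg):
    # Pick the connector class from config; no call-site code changes.
    ai = cfg["AI"]
    return REGISTRY[ai["Provider"]](ai["Model"])

connector = build_connector(CONFIG)
print(connector.complete("hello"))
```

Flipping `"Provider"` to `"azure"` in the config swaps the backing model for paying customers while the rest of the code stays untouched.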
Q: "How do you handle 'Load Balancing' across multiple AI instances?"
Architect Answer: "We use a **Gateway/API Proxy** pattern in front of the connectors. If the US-East Azure OpenAI instance has a quota of, say, 200 requests per minute, the gateway automatically fails over to the US-West or North Europe instance when US-East is throttled. This keeps the app available even during regional AI outages or rate-limit spikes."
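The failover behavior in that answer can be sketched as a tiny gateway. This is a hedged illustration: the endpoint names, the `RateLimitError` type, and the health flag are all hypothetical stand-ins for real regional deployments and 429 responses.

```python
class RateLimitError(Exception):
    """Stand-in for an HTTP 429 from a throttled regional endpoint."""

class Endpoint:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy
    def complete(self, prompt):
        if not self.healthy:
            raise RateLimitError(f"{self.name}: 429 quota exceeded")
        return f"[{self.name}] {prompt}"

class FailoverGateway:
    """Tries endpoints in priority order, skipping rate-limited regions."""
    def __init__(self, endpoints):
        self.endpoints = endpoints
    def complete(self, prompt):
        last_err = None
        for ep in self.endpoints:
            try:
                return ep.complete(prompt)
            except RateLimitError as err:
                last_err = err  # region throttled; fail over to the next one
        raise RuntimeError("all regions exhausted") from last_err

gateway = FailoverGateway([
    Endpoint("us-east", healthy=False),   # simulate a rate-limit spike
    Endpoint("us-west"),
    Endpoint("europe-north"),
])
print(gateway.complete("ping"))
```

A production gateway would also track per-region cooldowns and retry budgets, but the priority-ordered fallback loop is the core of the pattern.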