LangChain
Like Semantic Kernel (SK), LangChain is an open-source SDK and application development framework for building modern AI applications with LLMs. It provides out-of-the-box libraries and […]
Benefits of LLMOps
Developing and Operationalizing LLM-based Apps: Exploring Dev Frameworks and LLMOps
Autonomous agents
Autonomous agents are a more advanced implementation of standard agents (mentioned in the previous section) and are evolving at a rapid pace. Autonomous agents […]
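Conceptually, an autonomous agent runs a plan–act–observe loop: the model decides on an action, a tool executes it, and the result feeds back into the next decision until the goal is met. A minimal sketch of that loop, with hypothetical stubs standing in for the LLM and the tools (not any real framework's API):

```python
# Minimal autonomous-agent loop sketch. llm_decide and run_tool are
# hypothetical stubs standing in for a real LLM planner and real tools.
def llm_decide(goal, history):
    # Stub "planner": keeps searching until two results are collected.
    if len(history) >= 2:
        return ("finish", None)
    return ("search", goal)

def run_tool(tool, arg):
    # Stub tool execution; a real agent would call an actual tool here.
    return f"result of {tool}({arg})"

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):  # cap iterations so the agent cannot loop forever
        action, arg = llm_decide(goal, history)
        if action == "finish":
            return history
        history.append(run_tool(action, arg))
    return history

print(run_agent("summarize LLMOps"))
```

The step cap (`max_steps`) is the key design choice: without it, a mis-planning agent can loop indefinitely and burn tokens.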
LLMOps – Operationalizing LLM apps in production
In this section, we will look at what LLMOps entails. We will then explore the lifecycle of LLMs, […]
Platform – using Prompt Flow for LLMOps
Microsoft’s Azure Prompt Flow facilitates LLMOps integration for your organization, streamlining the operationalization of LLM applications and copilot […]
Putting it all together
Before we arrive at the last major section of this chapter, where we look at an actual case study and best practices, […]
LLMOps best practices
As we wrap up this final section, we know that successfully navigating the generative AI and LLM landscape requires effective, well-established practices. As […]
Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
Scaling design patterns
One area we haven’t covered yet is how these multiple TPM- or PTU-based Azure OpenAI accounts can work in unison. That is, […]
Retries with exponential backoff – the scaling special sauce
So, how do we control (or queue) messages when using multiple Azure OpenAI instances (accounts)? How […]
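The core of the technique is simple: when a call is throttled, wait, double the wait on each subsequent failure, and add random jitter so many clients do not retry in lockstep. A self-contained sketch (a `RuntimeError` stands in for an HTTP 429 from the service; names and delays are illustrative):

```python
import random
import time

def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry `call`, doubling the wait after each throttled attempt.

    Jitter (a random fraction of the delay) prevents synchronized retries."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for an HTTP 429 (Too Many Requests)
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))

# Demo: a call that is throttled twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # prints "ok" after 2 retries
```

Capping the delay (`max_delay`) matters as much as the doubling: it keeps worst-case latency bounded when the service stays throttled for a while.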
Rate Limiting Policy in Azure API Management
Rate limiting in Azure API Management is a policy that restricts the number of requests a user can […]
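In API Management, such a policy is declared as XML in the inbound processing pipeline. An illustrative fragment using the built-in `rate-limit` policy (the numbers are examples only, not recommendations):

```xml
<!-- Illustrative APIM inbound policy fragment; values are examples only. -->
<inbound>
    <base />
    <!-- Allow at most 100 calls per 60-second window per subscription. -->
    <rate-limit calls="100" renewal-period="60" />
</inbound>
```

When limits should apply per caller attribute (for example, per client IP) rather than per subscription, the related `rate-limit-by-key` policy can be used instead.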
HTTP return codes
HTTP return codes, sometimes generically called “error codes” and briefly mentioned in the previous section, provide a way to validate the outcome of a request. This is […]
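In a scaling context, the practical question is which codes are retryable. A hedged sketch of a response classifier (the function and its return values are hypothetical, but the status-code semantics follow the HTTP standard: 2xx success, 429 throttled with an optional `Retry-After` header, 5xx transient server error, other 4xx not retryable):

```python
# Hypothetical handler: decide what to do based on the HTTP status code.
def classify(status, headers=None):
    headers = headers or {}
    if 200 <= status < 300:
        return ("ok", None)
    if status == 429:
        # Throttled: honor the service's Retry-After hint if present.
        return ("retry", float(headers.get("Retry-After", 1)))
    if 500 <= status < 600:
        return ("retry", None)  # transient server error; safe to retry
    return ("fail", None)       # other 4xx errors are client bugs, not retryable

print(classify(200))                        # ('ok', None)
print(classify(429, {"Retry-After": "5"}))  # ('retry', 5.0)
```

Respecting `Retry-After` on a 429 is what ties this back to the backoff pattern: the service tells the client exactly how long to wait.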