LangChain
Like SK, LangChain is another open-source SDK and application development framework for building modern AI applications with LLMs. It provides out-of-the-box libraries and […]
LLMOps best practices
Agent collaboration frameworks – Developing and Operationalizing LLM-based Apps: Exploring Dev Frameworks and LLMOps
In this chapter, we have covered generative AI from the perspectives of developers and operations by introducing programming development frameworks and many […]
LLM lifecycle management
LLM lifecycle management is a fairly young concept; however, one fact remains: the LLM lifecycle covers quite a few discipline areas. It […]
Platform – using Prompt Flow for LLMOps
Microsoft’s Azure Prompt Flow facilitates LLMOps integration for your organization, streamlining the operationalization of LLM applications and copilot […]
Understanding limits – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
Any large-scale cloud deployment needs to be “enterprise-ready,” ensuring both that the end-user experience is acceptable and that the business objectives and requirements are […]
Understanding TPM, RPM, and PTUs
RPM
Beyond the TPM limit, an RPM (requests per minute) rate limit is also enforced, where the RPM available to a model is set proportionally to […]
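Azure OpenAI has documented allocating RPM in proportion to TPM; the 6-RPM-per-1,000-TPM ratio used below is an assumption worth verifying against the current quota documentation for your model and region. A minimal sketch of the arithmetic:

```python
# Sketch: deriving the effective RPM quota implied by a deployment's TPM quota.
# ASSUMPTION: a ratio of 6 RPM per 1,000 TPM; confirm against current Azure
# OpenAI quota documentation before relying on it.

RPM_PER_1000_TPM = 6  # assumed ratio, not guaranteed for every model

def rpm_quota(tpm_quota: int) -> int:
    """Return the requests-per-minute limit implied by a TPM quota."""
    return tpm_quota * RPM_PER_1000_TPM // 1000

# Under this ratio, a 30,000-TPM deployment allows 180 requests per minute.
print(rpm_quota(30_000))
```

The same arithmetic is useful in reverse: if your workload is request-heavy but token-light, the RPM ceiling may bind long before the TPM ceiling does.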
Scaling design patterns
One area we haven’t covered yet is how multiple TPM- or PTU-based Azure OpenAI accounts can work in unison. That is, […]
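One common pattern for making several quota-limited deployments work in unison is simple round-robin distribution of requests across them. A minimal stdlib sketch, where the endpoint names are illustrative placeholders rather than anything from the text:

```python
# Sketch: round-robin rotation across multiple Azure OpenAI deployments so
# their individual TPM/PTU quotas are consumed in unison.
# ASSUMPTION: endpoint URLs below are placeholders, not real deployments.
import itertools

ENDPOINTS = [
    "https://aoai-east.example/openai",   # placeholder
    "https://aoai-west.example/openai",   # placeholder
    "https://aoai-north.example/openai",  # placeholder
]

_rotation = itertools.cycle(ENDPOINTS)

def next_endpoint() -> str:
    """Pick the next deployment in round-robin order."""
    return next(_rotation)

for _ in range(4):
    print(next_endpoint())  # cycles back to the first endpoint on the 4th pick
```

Round-robin is the simplest such pattern; weighted or health-aware routing (e.g., skipping an endpoint that is returning 429s) builds on the same rotation idea.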
HTTP return codes
HTTP return codes, sometimes generically called “error codes” and briefly mentioned in the previous section, provide a way to validate whether a request succeeded. This is […]
Costs, training and support
To round off this chapter on deploying ChatGPT in the cloud with architecture design and scaling strategies, three additional areas are […]
Application Layer
Infrastructure Layer
Note: We advise implementing a telemetry solution early to monitor your application’s token usage for prompts and completions. This allows for […]
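The telemetry advice above can be sketched as a lightweight in-memory token counter; a real deployment would export these counts to a monitoring service, and the class name here is illustrative. The `usage` dict mirrors the shape of the `usage` field in OpenAI-style chat completion responses:

```python
# Sketch: aggregating prompt and completion token usage per response.
# ASSUMPTION: responses expose a `usage` dict with `prompt_tokens` and
# `completion_tokens`, as OpenAI-style chat completion APIs do.
from collections import Counter

class TokenTelemetry:
    """Illustrative in-memory accumulator for token usage."""

    def __init__(self):
        self.totals = Counter()

    def record(self, usage: dict) -> None:
        """Accumulate one response's token counts."""
        self.totals["prompt_tokens"] += usage.get("prompt_tokens", 0)
        self.totals["completion_tokens"] += usage.get("completion_tokens", 0)

    @property
    def total_tokens(self) -> int:
        return self.totals["prompt_tokens"] + self.totals["completion_tokens"]

telemetry = TokenTelemetry()
telemetry.record({"prompt_tokens": 120, "completion_tokens": 80})
telemetry.record({"prompt_tokens": 200, "completion_tokens": 150})
print(telemetry.total_tokens)  # 550
```

Tracking prompt and completion tokens separately matters because they are often priced differently and count against the same TPM quota.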