Platform – using Prompt Flow for LLMOps
Microsoft’s Azure Prompt Flow facilitates LLMOps integration for your organization, streamlining the operationalization of LLM applications and copilot […]
Putting it all together – Developing and Operationalizing LLM-based Apps: Exploring Dev Frameworks and LLMOps
Before we arrive at the last major section of this chapter, where we look at an actual case study and best practices, […]
LLMOps best practices – Developing and Operationalizing LLM-based Apps: Exploring Dev Frameworks and LLMOps
As we wrap up this final section, we know that successfully navigating the generative AI and LLM landscape requires effective practices. As […]
Understanding limits – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
Any large-scale cloud deployment needs to be “enterprise-ready,” ensuring both that the end-user experience is acceptable and that the business objectives and requirements are […]
Understanding TPM, RPM, and PTUs 2 – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
RPM: Beyond the TPM limit, an RPM rate limit is also enforced, where the RPM available to a model deployment is set proportionally to […]
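The exact ratio is truncated above, but Azure’s quota documentation has described it as 6 RPM granted per 1,000 TPM. A minimal sketch of that arithmetic, treating the ratio as an assumption to verify against the current Azure OpenAI docs:

```python
# Sketch: deriving the RPM limit implied by a deployment's TPM quota.
# ASSUMPTION: Azure's documented ratio of 6 RPM per 1,000 TPM; confirm
# against the current Azure OpenAI quota documentation.

RPM_PER_1000_TPM = 6

def rpm_limit(tpm_quota: int) -> int:
    """Requests-per-minute limit implied by a tokens-per-minute quota."""
    return tpm_quota * RPM_PER_1000_TPM // 1000

print(rpm_limit(120_000))  # a 120,000 TPM deployment -> 720 RPM
```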
Understanding TPM, RPM, and PTUs – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
As we scale, we will need to understand some additional terminology, such as tokens per minute (TPM), requests per minute […]
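Because TPM quotas are counted in tokens rather than characters or words, it helps to measure prompts with the model’s own tokenizer. A small sketch using the tiktoken library; the model name here is only an example:

```python
# Sketch: counting tokens the way the model's tokenizer does, so prompt
# sizes can be reasoned about against a TPM quota.
import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    encoding = tiktoken.encoding_for_model(model)  # model name is an example
    return len(encoding.encode(text))

prompt = "Summarize our Q3 support tickets in three bullet points."
print(count_tokens(prompt))  # tokens consumed, not characters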
Scaling design patterns – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
One area we haven’t covered yet is how multiple TPM- or PTU-based Azure OpenAI accounts can work in unison. That is, […]
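One common way to make several accounts work in unison, sketched here as an assumption rather than as the chapter’s own design, is to rotate requests across their endpoints. The endpoint URLs below are hypothetical placeholders:

```python
# Sketch: round-robin across several Azure OpenAI accounts so their
# TPM/PTU quotas are consumed in unison. Endpoint URLs are hypothetical.
from itertools import cycle

ENDPOINTS = cycle([
    "https://aoai-east.openai.azure.com",
    "https://aoai-west.openai.azure.com",
    "https://aoai-europe.openai.azure.com",
])

def next_endpoint() -> str:
    """Return the next endpoint in rotation, spreading load evenly."""
    return next(ENDPOINTS)

for _ in range(4):
    print(next_endpoint())  # east, west, europe, then east again
```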
Retries with exponential backoff – the scaling special sauce – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
So, how do we control (or queue) messages when using multiple Azure OpenAI instances (accounts)? How […]
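A minimal sketch of the pattern named in the title: exponential backoff with jitter around a rate-limited call. The `call_model` callable and the `RateLimitError` mapping for HTTP 429 are assumptions for illustration; production code often reaches for a library such as tenacity instead:

```python
# Sketch: retry with exponential backoff and jitter. RateLimitError is
# a stand-in for whatever your client raises on HTTP 429.
import random
import time

class RateLimitError(Exception):
    """Raised when the service answers with HTTP 429."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            # Double the wait each attempt; jitter avoids synchronized
            # retries from many clients hitting the limit at once.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("still rate limited after all retries")

# Usage: with_backoff(lambda: call_model(prompt))  # call_model is hypothetical
```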
Rate Limiting Policy in Azure API Management – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
Rate limiting in Azure API Management is a policy that restricts the number of requests a user can […]
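In APIM this is configured declaratively, for example with the built-in `rate-limit` policy and its `calls` and `renewal-period` attributes, with excess calls rejected as HTTP 429. The Python sketch below only mimics those fixed-window semantics for illustration:

```python
# Sketch: the fixed-window behavior behind a policy like
# <rate-limit calls="20" renewal-period="60" />. Illustration only;
# in APIM this lives in policy XML, not application code.
import time

class FixedWindowLimiter:
    def __init__(self, calls: int, renewal_period: float):
        self.calls = calls
        self.renewal_period = renewal_period
        self.window_start = time.monotonic()
        self.count = 0

    def allow(self) -> bool:
        """True if the call fits in the window; False maps to HTTP 429."""
        now = time.monotonic()
        if now - self.window_start >= self.renewal_period:
            self.window_start, self.count = now, 0  # window renewed
        if self.count < self.calls:
            self.count += 1
            return True
        return False

limiter = FixedWindowLimiter(calls=20, renewal_period=60)
print(limiter.allow())  # True until the 21st call inside one window
```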
HTTP return codes – Deploying ChatGPT in the Cloud: Architecture Design and Scaling Strategies
HTTP return codes, sometimes generically called “error codes” and briefly mentioned in the previous section, provide a way to validate the outcome of a request. This is […]
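As a rough illustration (not the book’s own code), a client can branch on the returned status code; a 429 response may carry a Retry-After header with the suggested wait in seconds:

```python
# Sketch: branching on HTTP return codes from an API response.
import requests

def handle(resp: requests.Response) -> str:
    if resp.status_code == 200:
        return "success"
    if resp.status_code == 429:
        wait = resp.headers.get("Retry-After", "1")
        return f"rate limited; retry after {wait}s"
    if 500 <= resp.status_code < 600:
        return "server error; safe to retry with backoff"
    return f"unexpected status: {resp.status_code}"

# Demo with a synthetic response (no network call needed):
fake = requests.Response()
fake.status_code = 429
fake.headers["Retry-After"] = "5"
print(handle(fake))  # -> rate limited; retry after 5s
```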