Deploy LLMs Using Azure ML

A tutorial on how to use the Microsoft Azure ML catalog for deploying LLM endpoints as APIs and a comparison with AWS

Skanda Vivek
7 min readJul 21, 2023
Azure ML Model Catalog

I recently wrote an article about deploying LLMs as APIs using AWS — when someone commented that they would like a similar article but using Azure ML instead. So I decided to write this article discussing how to deploy LLM endpoints on Azure ML. I also compare the price, ease, and customizability for LLM deployments in Azure ML vs AWS.

Microsoft has been making extremely successful partnerships in the Generative AI era, including those with OpenAI, GitHub Copilot, etc. Recently, Microsoft CEO Satya Nadella gave a talk at Microsoft Inspire, highlighting an incredible vision of Generative AI giving us a more natural interface to virtually every application and a reasoning engine that works on top of all your data, giving users more power.

Satya Nadella at Microsoft Inspire

Microsoft has also partnered with open-source platform Hugging Face, to make many of their models accessible on Microsoft’s Azure cloud platform. They also make available the recently released Llama-2 model family, which has been the talk of the town over…