What's new in Azure OpenAI Service

Article
06/19/2024

This article provides a summary of the latest releases and major documentation updates for Azure OpenAI.

June 2024

Retirement date updates

Updated gpt-35-turbo 0301 retirement date to no earlier than October 1, 2024.
Updated gpt-35-turbo & gpt-35-turbo-16k0613 retirement date to October 1, 2024.
Updated gpt-4 & gpt-4-32k 0314 deprecation date to October 1, 2024, and retirement date to June 6, 2025.

Refer to our model retirement guide for the latest information on model deprecation and retirement.

Token based billing for fine-tuning

Azure OpenAI fine-tuning billing is now based on the number of tokens in your training file – instead of the total elapsed training time. This can result in a significant cost reduction for some training runs, and makes estimating fine-tuning costs much easier. To learn more, you can consult the official announcement.

GPT-4o released in new regions

GPT-4o is now also available in:
- Sweden Central for standard regional deployment.
- Australia East, Canada East, Japan East, Korea Central, Sweden Central, Switzerland North, & West US 3 for provisioned deployment.

For the latest information on model availability, see the models page.

Customer-managed key (CMK) support for Assistants

Threads and Files in Assistants now supports CMK in the following region:

West US 3

May 2024

GPT-4o provisioned deployments

gpt-4o Version: 2024-05-13 is available for both standard and provisioned deployments. Provisioned and standard model deployments accept both text and image/vision inference requests. For information on model regional availability consult the model matrix for provisioned deployments.

Assistants v2 (preview)

A refresh of the Assistants API is now publicly available. It contains the following updates:

File search tool and vector storage
Max completion and max prompt token support for managing token usage.
tool_choice parameter for forcing the Assistant to use a specified tool. You can now create messages with the assistant role to create custom conversation histories in Threads.
Support for temperature, top_p, response_format parameters.
Streaming and polling support. You can use the helper functions in our Python SDK to create runs and stream responses. We have also added polling SDK helpers to share object status updates without the need for polling.
Experiment with Logic Apps and Function Calling using Azure OpenAI Studio. Import your REST APIs implemented in Logic Apps as functions and the studio invokes the function (as a Logic Apps workflow) automatically based on the user prompt.
AutoGen by Microsoft Research provides a multi-agent conversation framework to enable convenient building of Large Language Model (LLM) workflows across a wide range of applications. Azure OpenAI assistants are now integrated into AutoGen via GPTAssistantAgent, a new experimental agent that lets you seamlessly add Assistants into AutoGen-based multi-agent workflows. This enables multiple Azure OpenAI assistants that could be task or domain specialized to collaborate and tackle complex tasks.
Support for fine-tuned gpt-3.5-turbo-0125 models in the following regions:
- East US 2
- Sweden Central
Expanded regional support for:
- Japan East
- UK South
- West US
- West US 3
- Norway east

For more information, see the blog post about assistants.

GPT-4o model general availability (GA)

GPT-4o ("o is for "omni") is the latest model from OpenAI launched on May 13, 2024.

GPT-4o integrates text, and images in a single model, enabling it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions.
GPT-4o matches GPT-4 Turbo in English text and coding tasks while offering superior performance in non-English languages and in vision tasks, setting new benchmarks for AI capabilities.

For information on model regional availability, see the models page.

Global standard deployment type (preview)

Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard will provide the highest default quota for new models and eliminates the need to load balance across multiple resources.

For more information, see the deployment types guide.

Fine-tuning updates

GPT-4 fine-tuning is now available in public preview.
Added support for seed, events, full validation statistics, and checkpoints as part of the 2024-05-01-preview API release.

DALL-E and GPT-4 Turbo Vision GA configurable content filters

Create custom content filters for your DALL-E 2 and 3 and GPT-4 Turbo with Vision GA (gpt-4-turbo-2024-04-09) deployments. Content filtering

Asynchronous Filter available for all Azure OpenAI customers

Running filters asynchronously for improved latency in streaming scenarios is now available for all Azure OpenAI customers. Content filtering

Prompt Shields

Prompt Shields protect applications powered by Azure OpenAI models from two types of attacks: direct (jailbreak) and indirect attacks. Indirect Attacks (also known as Indirect Prompt Attacks or Cross-Domain Prompt Injection Attacks) are a type of attack on systems powered by Generative AI models that may occur when an application processes information that wasn’t directly authored by either the developer of the application or the user. Content filtering

2024-05-01-preview API release

For more information, see the API version lifecycle.

GPT-4 Turbo model general availability (GA)

The latest GA release of GPT-4 Turbo is:

gpt-4 Version: turbo-2024-04-09

This is the replacement for the following preview models:

gpt-4 Version: 1106-Preview
gpt-4 Version: 0125-Preview
gpt-4 Version: vision-preview

Differences between OpenAI and Azure OpenAI GPT-4 Turbo GA Models

OpenAI's version of the latest 0409 turbo model supports JSON mode and function calling for all inference requests.
Azure OpenAI's version of the latest turbo-2024-04-09 currently doesn't support the use of JSON mode and function calling when making inference requests with image (vision) input. Text based input requests (requests without image_url and inline images) do support JSON mode and function calling.

Differences from gpt-4 vision-preview

Azure AI specific Vision enhancements integration with GPT-4 Turbo with Vision aren't supported for gpt-4 Version: turbo-2024-04-09. This includes Optical Character Recognition (OCR), object grounding, video prompts, and improved handling of your data with images.

GPT-4 Turbo provisioned managed availability

gpt-4 Version: turbo-2024-04-09 is available for both standard and provisioned deployments. Currently the provisioned version of this model doesn't support image/vision inference requests. Provisioned deployments of this model only accept text input. Standard model deployments accept both text and image/vision inference requests.

Region availability

For information on model regional availability consult the model matrix for standard, and provisioned deployments.

Deploying GPT-4 Turbo with Vision GA

To deploy the GA model from the Studio UI, select GPT-4 and then choose the turbo-2024-04-09 version from the dropdown menu. The default quota for the gpt-4-turbo-2024-04-09 model will be the same as current quota for GPT-4-Turbo. See the regional quota limits.

April 2024

Fine-tuning is now supported in two new regions East US 2 and Switzerland West

Fine-tuning is now available with support for:

East US 2

gpt-35-turbo (0613)
gpt-35-turbo (1106)
gpt-35-turbo (0125)

Switzerland West

babbage-002
davinci-002
gpt-35-turbo (0613)
gpt-35-turbo (1106)
gpt-35-turbo (0125)

Check the models page, for the latest information on model availability and fine-tuning support in each region.

Multi-turn chat training examples

Fine-tuning now supports multi-turn chat training examples.

GPT-4 (0125) is available for Azure OpenAI On Your Data

You can now use the GPT-4 (0125) model in available regions with Azure OpenAI On Your Data.

March 2024

Risks & Safety monitoring in Azure OpenAI Studio

Azure OpenAI Studio now provides a Risks & Safety dashboard for each of your deployments that uses a content filter configuration. Use it to check the results of the filtering activity. Then you can adjust your filter configuration to better serve your business needs and meet Responsible AI principles.

Use Risks & Safety monitoring

Azure OpenAI On Your Data updates

You can now connect to an Elasticsearch vector database to be used with Azure OpenAI On Your Data.
You can use the chunk size parameter during data ingestion to set the maximum number of tokens of any given chunk of data in your index.

2024-02-01 general availability (GA) API released

This is the latest GA API release and is the replacement for the previous 2023-05-15 GA release. This release adds support for the latest Azure OpenAI GA features like Whisper, DALLE-3, fine-tuning, on your data, etc.

Features that are still in preview such as Assistants, text to speech (TTS), certain on your data datasources, still require a preview API version. For more information check out our API version lifecycle guide.

Whisper general availability (GA)

The Whisper speech to text model is now GA for both REST and Python. Client library SDKs are currently still in public preview.

Try out Whisper by following a quickstart.

DALL-E 3 general availability (GA)

DALL-E 3 image generation model is now GA for both REST and Python. Client library SDKs are currently still in public preview.

Try out DALL-E 3 by following a quickstart.

New regional support for DALL-E 3

You can now access DALL-E 3 with an Azure OpenAI resource in the East US or AustraliaEast Azure region, in addition to SwedenCentral.

Model deprecations and retirements

We have added a page to track model deprecations and retirements in Azure OpenAI Service. This page provides information about the models that are currently available, deprecated, and retired.

2024-03-01-preview API released

2024-03-01-preview has all the same functionality as 2024-02-15-preview and adds two new parameters for embeddings:

encoding_format allows you to specify the format to generate embeddings in float, or base64. The default is float.
dimensions allows you set the number of output embeddings. This parameter is only supported with the new third generation embeddings models: text-embedding-3-large, text-embedding-3-small. Typically larger embeddings are more expensive from a compute, memory, and storage perspective. Being able to adjust the number of dimensions allows more control over overall cost and performance. The dimensions parameter is not supported in all versions of the OpenAI 1.x Python library, to take advantage of this parameter we recommend upgrading to the latest version: pip install openai --upgrade.

If you are currently using a preview API version to take advantage of the latest features, we recommend consulting the API version lifecycle article to track how long your current API version will be supported.

Update to GPT-4-1106-Preview upgrade plans

The deployment upgrade of gpt-4 1106-Preview to gpt-4 0125-Preview scheduled for March 8, 2024 is no longer taking place. Deployments of gpt-4 versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after a stable version of the model is released.

For more information on the upgrade process refer to the models page.

February 2024

GPT-3.5-turbo-0125 model available

This model has various improvements, including higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.

For information on model regional availability and upgrades refer to the models page.

Third generation embeddings models available

text-embedding-3-large
text-embedding-3-small

In testing, OpenAI reports both the large and small third generation embeddings models offer better average multi-language retrieval performance with the MIRACL benchmark while still maintaining better performance for English tasks with the MTEB benchmark than the second generation text-embedding-ada-002 model.

For information on model regional availability and upgrades refer to the models page.

GPT-3.5 Turbo quota consolidation

To simplify migration between different versions of the GPT-3.5-Turbo models (including 16k), we will be consolidating all GPT-3.5-Turbo quota into a single quota value.

Any customers who have increased quota approved will have combined total quota that reflects the previous increases.
Any customer whose current total usage across model versions is less than the default will get a new combined total quota by default.

GPT-4-0125-preview model available

The gpt-4 model version 0125-preview is now available on Azure OpenAI Service in the East US, North Central US, and South Central US regions. Customers with deployments of gpt-4 version 1106-preview will be automatically upgraded to 0125-preview in the coming weeks.

For information on model regional availability and upgrades refer to the models page.

Assistants API public preview

Azure OpenAI now supports the API that powers OpenAI's GPTs. Azure OpenAI Assistants (Preview) allows you to create AI assistants tailored to your needs through custom instructions and advanced tools like code interpreter, and custom functions. To learn more, see:

OpenAI text to speech voices public preview

Azure OpenAI Service now supports text to speech APIs with OpenAI's voices. Get AI-generated speech from the text you provide. To learn more, see the overview guide and try the quickstart.

Note

Azure AI Speech also supports OpenAI text to speech voices. To learn more, see OpenAI text to speech voices via Azure OpenAI Service or via Azure AI Speech guide.

New Fine-tuning capabilities and model support

New regional support for Azure OpenAI On Your Data

You can now use Azure OpenAI On Your Data in the following Azure region:

South Africa North

Azure OpenAI On Your Data general availability

Azure OpenAI On Your Data is now generally available.

December 2023

Azure OpenAI On Your Data

Full VPN and private endpoint support for Azure OpenAI On Your Data, including security support for: storage accounts, Azure OpenAI resources, and Azure AI Search service resources.
New article for using Azure OpenAI On Your Data securely by protecting data with virtual networks and private endpoints.

GPT-4 Turbo with Vision now available

GPT-4 Turbo with Vision on Azure OpenAI service is now in public preview. GPT-4 Turbo with Vision is a large multimodal model (LMM) developed by OpenAI that can analyze images and provide textual responses to questions about them. It incorporates both natural language processing and visual understanding. With enhanced mode, you can use the Azure AI Vision features to generate additional insights from the images.

Explore the capabilities of GPT-4 Turbo with Vision in a no-code experience using the Azure OpenAI Playground. Learn more in the Quickstart guide.
Vision enhancement using GPT-4 Turbo with Vision is now available in the Azure OpenAI Playground and includes support for Optical Character Recognition, object grounding, image support for "add your data," and support for video prompt.
Make calls to the chat API directly using the REST API.
Region availability is currently limited to SwitzerlandNorth, SwedenCentral, WestUS, and AustraliaEast
Learn more about the known limitations of GPT-4 Turbo with Vision and other frequently asked questions.

November 2023

New data source support in Azure OpenAI On Your Data

You can now use Azure Cosmos DB for MongoDB vCore as well as URLs/web addresses as data sources to ingest your data and chat with a supported Azure OpenAI model.

GPT-4 Turbo Preview & GPT-3.5-Turbo-1106 released

Both models are the latest release from OpenAI with improved instruction following, JSON mode, reproducible output, and parallel function calling.

GPT-4 Turbo Preview has a max context window of 128,000 tokens and can generate 4,096 output tokens. It has the latest training data with knowledge up to April 2023. This model is in preview and is not recommended for production use. All deployments of this preview model will be automatically updated in place once the stable release becomes available.
GPT-3.5-Turbo-1106 has a max context window of 16,385 tokens and can generate 4,096 output tokens.

For information on model regional availability consult the models page.

The models have their own unique per region quota allocations.

DALL-E 3 public preview

DALL-E 3 is the latest image generation model from OpenAI. It features enhanced image quality, more complex scenes, and improved performance when rendering text in images. It also comes with more aspect ratio options. DALL-E 3 is available through OpenAI Studio and through the REST API. Your OpenAI resource must be in the SwedenCentral Azure region.

DALL-E 3 includes built-in prompt rewriting to enhance images, reduce bias, and increase natural variation.

Try out DALL-E 3 by following a quickstart.

Responsible AI

Expanded customer configurability: All Azure OpenAI customers can now configure all severity levels (low, medium, high) for the categories hate, violence, sexual and self-harm, including filtering only high severity content. Configure content filters
Content Credentials in all DALL-E models: AI-generated images from all DALL-E models now include a digital credential that discloses the content as AI-generated. Applications that display image assets can leverage the open source Content Authenticity Initiative SDK to display credentials in their AI generated images. Content Credentials in Azure OpenAI
New RAI models
- Jailbreak risk detection: Jailbreak attacks are user prompts designed to provoke the Generative AI model into exhibiting behaviors it was trained to avoid or to break the rules set in the System Message. The jailbreak risk detection model is optional (default off), and available in annotate and filter model. It runs on user prompts.
- Protected material text: Protected material text describes known text content (for example, song lyrics, articles, recipes, and selected web content) that can be outputted by large language models. The protected material text model is optional (default off), and available in annotate and filter model. It runs on LLM completions.
- Protected material code: Protected material code describes source code that matches a set of source code from public repositories, which can be outputted by large language models without proper citation of source repositories. The protected material code model is optional (default off), and available in annotate and filter model. It runs on LLM completions.
Configure content filters
Blocklists: Customers can now quickly customize content filter behavior for prompts and completions further by creating a custom blocklist in their filters. The custom blocklist allows the filter to take action on a customized list of patterns, such as specific terms or regex patterns. In addition to custom blocklists, we provide a Microsoft profanity blocklist (English). Use blocklists

October 2023

New fine-tuning models (preview)

gpt-35-turbo-0613 is now available for fine-tuning.
babbage-002 and davinci-002 are now available for fine-tuning. These models replace the legacy ada, babbage, curie, and davinci base models that were previously available for fine-tuning.
Fine-tuning availability is limited to certain regions. Check the models page, for the latest information on model availability in each region.
Fine-tuned models have different quota limits than regular models.
Tutorial: fine-tuning GPT-3.5-Turbo

Azure OpenAI On Your Data

New custom parameters for determining the number of retrieved documents and strictness.
- The strictness setting sets the threshold to categorize documents as relevant to your queries.
- The retrieved documents setting specifies the number of top-scoring documents from your data index used to generate responses.
You can see data ingestion/upload status in the Azure OpenAI Studio.
Support for private endpoints & VPNs for blob containers.

September 2023

GPT-4

GPT-4 and GPT-4-32k are now available to all Azure OpenAI Service customers. Customers no longer need to apply for the waitlist to use GPT-4 and GPT-4-32k (the Limited Access registration requirements continue to apply for all Azure OpenAI models). Availability might vary by region. Check the models page, for the latest information on model availability in each region.

GPT-3.5 Turbo Instruct

Azure OpenAI Service now supports the GPT-3.5 Turbo Instruct model. This model has performance comparable to text-davinci-003 and is available to use with the Completions API. Check the models page, for the latest information on model availability in each region.

Whisper public preview

Azure OpenAI Service now supports speech to text APIs powered by OpenAI's Whisper model. Get AI-generated text based on the speech audio you provide. To learn more, check out the quickstart.

Note

Azure AI Speech also supports OpenAI's Whisper model via the batch transcription API. To learn more, check out the Create a batch transcription guide. Check out What is the Whisper model? to learn more about when to use Azure AI Speech vs. Azure OpenAI Service.

New Regions

Azure OpenAI is now also available in the Sweden Central, and Switzerland North regions. Check the models page, for the latest information on model availability in each region.

Regional quota limits increases

Increases to the max default quota limits for certain models and regions. Migrating workloads to these models and regions will allow you to take advantage of higher Tokens per minute (TPM).

August 2023

Azure OpenAI on your own data (preview) updates

You can now deploy Azure OpenAI On Your Data to Power Virtual Agents.
Azure OpenAI On Your Data now supports private endpoints.
Ability to filter access to sensitive documents.
Automatically refresh your index on a schedule.
Vector search and semantic search options.
View your chat history in the deployed web app

July 2023

Support for function calling

Azure OpenAI now supports function calling to enable you to work with functions in the chat completions API.

Embedding input array increase

Azure OpenAI now supports arrays with up to 16 inputs per API request with text-embedding-ada-002 Version 2.

New Regions

Azure OpenAI is now also available in the Canada East, East US 2, Japan East, and North Central US regions. Check the models page, for the latest information on model availability in each region.

June 2023

Use Azure OpenAI on your own data (preview)

Azure OpenAI On Your Data is now available in preview, enabling you to chat with OpenAI models such as GPT-35-Turbo and GPT-4 and receive responses based on your data.

New versions of gpt-35-turbo and gpt-4 models

gpt-35-turbo (version 0613)
gpt-35-turbo-16k (version 0613)
gpt-4 (version 0613)
gpt-4-32k (version 0613)

UK South

Azure OpenAI is now available in the UK South region. Check the models page, for the latest information on model availability in each region.

Content filtering & annotations (Preview)

How to configure content filters with Azure OpenAI Service.
Enable annotations to view content filtering category and severity information as part of your GPT based Completion and Chat Completion calls.

Quota

Quota provides the flexibility to actively manage the allocation of rate limits across the deployments within your subscription.

May 2023

Java & JavaScript SDK support

NEW Azure OpenAI preview SDKs offering support for JavaScript and Java.

Azure OpenAI Chat Completion General Availability (GA)

General availability support for:
- Chat Completion API version 2023-05-15.
- GPT-35-Turbo models.
- GPT-4 model series.

If you are currently using the 2023-03-15-preview API, we recommend migrating to the GA 2023-05-15 API. If you are currently using API version 2022-12-01 this API remains GA, but does not include the latest Chat Completion capabilities.

Important

Using the current versions of the GPT-35-Turbo models with the completion endpoint remains in preview.

France Central

Azure OpenAI is now available in the France Central region. Check the models page, for the latest information on model availability in each region.

April 2023

DALL-E 2 public preview. Azure OpenAI Service now supports image generation APIs powered by OpenAI's DALL-E 2 model. Get AI-generated images based on the descriptive text you provide. To learn more, check out the quickstart. To request access, existing Azure OpenAI customers can apply by filling out this form.
Inactive deployments of customized models will now be deleted after 15 days; models will remain available for redeployment. If a customized (fine-tuned) model is deployed for more than fifteen (15) days during which no completions or chat completions calls are made to it, the deployment will automatically be deleted (and no further hosting charges will be incurred for that deployment). The underlying customized model will remain available and can be redeployed at any time. To learn more check out the how-to-article.

March 2023

GPT-4 series models are now available in preview on Azure OpenAI. To request access, existing Azure OpenAI customers can apply by filling out this form. These models are currently available in the East US and South Central US regions.
New Chat Completion API for GPT-35-Turbo and GPT-4 models released in preview on 3/21. To learn more checkout the updated quickstarts and how-to article.
GPT-35-Turbo preview. To learn more checkout the how-to article.
Increased training limits for fine-tuning: The max training job size (tokens in training file) x (# of epochs) is 2 Billion tokens for all models. We have also increased the max training job from 120 to 720 hours.
Adding additional use cases to your existing access. Previously, the process for adding new use cases required customers to reapply to the service. Now, we're releasing a new process that allows you to quickly add new use cases to your use of the service. This process follows the established Limited Access process within Azure AI services. Existing customers can attest to any and all new use cases here. Please note that this is required anytime you would like to use the service for a new use case you did not originally apply for.

February 2023

New Features

.NET SDK(inference) preview release | Samples
Terraform SDK update to support Azure OpenAI management operations.
Inserting text at the end of a completion is now supported with the suffix parameter.

Updates

Content filtering is on by default.

New articles on:

New training course:

Intro to Azure OpenAI

January 2023

New Features

Service GA. Azure OpenAI Service is now generally available.
New models: Addition of the latest text model, text-davinci-003 (East US, West Europe), text-ada-embeddings-002 (East US, South Central US, West Europe)

December 2022

New features

The latest models from OpenAI. Azure OpenAI provides access to all the latest models including the GPT-3.5 series.
New API version (2022-12-01). This update includes several requested enhancements including token usage information in the API response, improved error messages for files, alignment with OpenAI on fine-tuning creation data structure, and support for the suffix parameter to allow custom naming of fine-tuned jobs.
Higher request per second limits. 50 for non-Davinci models. 20 for Davinci models.
Faster fine-tune deployments. Deploy an Ada and Curie fine-tuned models in under 10 minutes.
Higher training limits: 40M training tokens for Ada, Babbage, and Curie. 10M for Davinci.
Process for requesting modifications to the abuse & miss-use data logging & human review. Today, the service logs request/response data for the purposes of abuse and misuse detection to ensure that these powerful models aren't abused. However, many customers have strict data privacy and security requirements that require greater control over their data. To support these use cases, we're releasing a new process for customers to modify the content filtering policies or turn off the abuse logging for low-risk use cases. This process follows the established Limited Access process within Azure AI services and existing OpenAI customers can apply here.
Customer managed key (CMK) encryption. CMK provides customers greater control over managing their data in Azure OpenAI by providing their own encryption keys used for storing training data and customized models. Customer-managed keys (CMK), also known as bring your own key (BYOK), offer greater flexibility to create, rotate, disable, and revoke access controls. You can also audit the encryption keys used to protect your data. Learn more from our encryption at rest documentation.
Lockbox support
SOC-2 compliance
Logging and diagnostics through Azure Resource Health, Cost Analysis, and Metrics & Diagnostic settings.
Studio improvements. Numerous usability improvements to the Studio workflow including Azure AD role support to control who in the team has access to create fine-tuned models and deploy.

Changes (breaking)

Fine-tuning create API request has been updated to match OpenAI’s schema.

Preview API versions:

{
    "training_file": "file-XGinujblHPwGLSztz8cPS8XY",
    "hyperparams": { 
        "batch_size": 4,
        "learning_rate_multiplier": 0.1,
        "n_epochs": 4,
        "prompt_loss_weight": 0.1,
    }
}

API version 2022-12-01:

{
    "training_file": "file-XGinujblHPwGLSztz8cPS8XY",
    "batch_size": 4,
    "learning_rate_multiplier": 0.1,
    "n_epochs": 4,
    "prompt_loss_weight": 0.1,
}

Content filtering is temporarily off by default. Azure content moderation works differently than Azure OpenAI. Azure OpenAI runs content filters during the generation call to detect harmful or abusive content and filters them from the response. Learn More

These models will be re-enabled in Q1 2023 and be on by default.

Customer actions

Contact Azure Support if you would like these turned on for your subscription.
Apply for filtering modifications, if you would like to have them remain off. (This option will be for low-risk use cases only.)

Next steps

Learn more about the underlying models that power Azure OpenAI.

Share via

What's new in Azure OpenAI Service

June 2024

Retirement date updates

Token based billing for fine-tuning

GPT-4o released in new regions

Customer-managed key (CMK) support for Assistants

May 2024

GPT-4o provisioned deployments

Assistants v2 (preview)

GPT-4o model general availability (GA)

Global standard deployment type (preview)

Fine-tuning updates

DALL-E and GPT-4 Turbo Vision GA configurable content filters

Asynchronous Filter available for all Azure OpenAI customers

Prompt Shields

2024-05-01-preview API release

GPT-4 Turbo model general availability (GA)

Differences between OpenAI and Azure OpenAI GPT-4 Turbo GA Models

Differences from gpt-4 vision-preview

GPT-4 Turbo provisioned managed availability

Region availability

Deploying GPT-4 Turbo with Vision GA

April 2024

Fine-tuning is now supported in two new regions East US 2 and Switzerland West

East US 2

Switzerland West

Multi-turn chat training examples

GPT-4 (0125) is available for Azure OpenAI On Your Data

March 2024

Risks & Safety monitoring in Azure OpenAI Studio

Azure OpenAI On Your Data updates

2024-02-01 general availability (GA) API released

Whisper general availability (GA)

DALL-E 3 general availability (GA)

New regional support for DALL-E 3

Model deprecations and retirements

2024-03-01-preview API released

Update to GPT-4-1106-Preview upgrade plans

February 2024

GPT-3.5-turbo-0125 model available

Third generation embeddings models available

GPT-3.5 Turbo quota consolidation

GPT-4-0125-preview model available

Assistants API public preview

OpenAI text to speech voices public preview

New Fine-tuning capabilities and model support

New regional support for Azure OpenAI On Your Data

Azure OpenAI On Your Data general availability

December 2023

Azure OpenAI On Your Data

GPT-4 Turbo with Vision now available

November 2023

New data source support in Azure OpenAI On Your Data

GPT-4 Turbo Preview & GPT-3.5-Turbo-1106 released

DALL-E 3 public preview

Responsible AI

October 2023

New fine-tuning models (preview)

Azure OpenAI On Your Data

September 2023

GPT-4

GPT-3.5 Turbo Instruct

Whisper public preview

New Regions

Regional quota limits increases

August 2023

Azure OpenAI on your own data (preview) updates

July 2023

Support for function calling

Embedding input array increase

New Regions

June 2023

Use Azure OpenAI on your own data (preview)

New versions of gpt-35-turbo and gpt-4 models

UK South

Content filtering & annotations (Preview)

Quota

May 2023

Java & JavaScript SDK support