Set scaling rules in Azure Container Apps
Azure Container Apps manages automatic horizontal scaling through a set of declarative scaling rules. As a container app revision scales out, new instances of the revision are created on-demand. These instances are known as replicas.
Adding or editing scaling rules creates a new revision of your container app. A revision is an immutable snapshot of your container app. To learn which types of changes trigger a new revision, see revision change types.
Event-driven Container Apps jobs use scaling rules to trigger executions based on events.
Scale definition
Scaling is the combination of limits, rules, and behavior.
Limits define the minimum and maximum possible number of replicas per revision as your container app scales.
Scale limit | Default value | Min value | Max value |
---|---|---|---|
Minimum number of replicas per revision | 0 | 0 | Maximum replicas configurable are 1,000. |
Maximum number of replicas per revision | 10 | 1 | Maximum replicas configurable are 1,000. |
Rules are the criteria used by Container Apps to decide when to add or remove replicas.
Scale rules are implemented as HTTP, TCP (Transmission Control Protocol), or custom.
Behavior is the combination of rules and limits to determine scale decisions over time.
Scale behavior explains how scale decisions are made.
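As a rough illustration of the replica limits described above, the following Python sketch checks a pair of limits before they're written into a template. The helper name `validate_scale_limits` and the constant are hypothetical and aren't part of any Azure SDK. The bounds and defaults come from the limits table, and rejecting a minimum that's larger than the maximum is an assumption.

```python
MAX_CONFIGURABLE_REPLICAS = 1000  # upper bound taken from the limits table


def validate_scale_limits(min_replicas: int = 0, max_replicas: int = 10) -> None:
    """Raise ValueError if the limits fall outside the documented ranges.

    Hypothetical helper for illustration only. The defaults mirror the
    documented default values (minimum 0, maximum 10).
    """
    if not 0 <= min_replicas <= MAX_CONFIGURABLE_REPLICAS:
        raise ValueError("minReplicas must be between 0 and 1,000")
    if not 1 <= max_replicas <= MAX_CONFIGURABLE_REPLICAS:
        raise ValueError("maxReplicas must be between 1 and 1,000")
    if min_replicas > max_replicas:
        # Assumption: a minimum above the maximum is treated as invalid.
        raise ValueError("minReplicas can't exceed maxReplicas")


validate_scale_limits()      # the defaults pass
validate_scale_limits(0, 5)  # the limits used in the examples in this article
```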
As you define your scaling rules, it's important to consider the following items:
- You aren't billed usage charges if your container app scales to zero.
- Replicas that aren't processing but remain in memory might be billed at a lower "idle" rate. For more information, see Billing.
- If you want to ensure that an instance of your revision is always running, set the minimum number of replicas to 1 or higher.
Scale rules
Scaling is driven by three different categories of triggers:
- HTTP: Based on the number of concurrent HTTP requests to your revision.
- TCP: Based on the number of concurrent TCP connections to your revision.
- Custom: Based on CPU, memory, or supported event-driven data sources such as:
- Azure Service Bus
- Azure Event Hubs
- Apache Kafka
- Redis
If you define more than one scale rule, the container app begins to scale as soon as the condition of any one rule is met.
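A minimal Python sketch of that "any rule" behavior follows. The function name, the rule names used as dictionary keys, and the observed values are hypothetical; the sketch only models the decision to start scaling, not how many replicas are added.

```python
def any_rule_triggered(observed: dict, thresholds: dict) -> bool:
    """Return True when at least one rule's observed value exceeds its threshold.

    Both dicts are keyed by hypothetical rule names. A rule with no
    observation is treated as not triggered.
    """
    return any(
        name in observed and observed[name] > limit
        for name, limit in thresholds.items()
    )


thresholds = {"http-rule": 100, "azure-servicebus-queue-rule": 5}

# Only the queue rule is over its threshold, which is enough to start scaling.
print(any_rule_triggered({"http-rule": 40, "azure-servicebus-queue-rule": 12}, thresholds))  # True

# No rule is over its threshold.
print(any_rule_triggered({"http-rule": 40, "azure-servicebus-queue-rule": 3}, thresholds))   # False
```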
HTTP
With an HTTP scaling rule, you have control over the threshold of concurrent HTTP requests that determines how your container app revision scales. Every 15 seconds, the number of concurrent requests is calculated as the number of requests in the past 15 seconds divided by 15. Container Apps jobs don't support HTTP scaling rules.
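The calculation described above can be sketched in Python as follows. The function names are hypothetical and the request counts are made-up inputs; the 15-second window comes from the description above, and the default threshold of 10 mirrors the documented default for concurrentRequests.

```python
WINDOW_SECONDS = 15  # evaluation interval described above


def concurrent_requests(requests_in_window: int) -> float:
    """Requests seen in the past 15 seconds divided by 15."""
    return requests_in_window / WINDOW_SECONDS


def exceeds_threshold(requests_in_window: int, threshold: int = 10) -> bool:
    """True when the computed concurrency is above the rule's threshold."""
    return concurrent_requests(requests_in_window) > threshold


print(concurrent_requests(1800))     # 120.0
print(exceeds_threshold(1800, 100))  # True
print(exceeds_threshold(1500, 100))  # False, because 100.0 doesn't exceed 100
```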
In the following example, the revision scales out up to five replicas and can scale in to zero. The scaling property is set to 100 concurrent requests per second.
Example
The http section defines an HTTP scale rule.
Scale property | Description | Default value | Min value | Max value |
---|---|---|---|---|
concurrentRequests | When the number of HTTP requests exceeds this value, then another replica is added. Replicas continue to add to the pool up to the maxReplicas amount. | 10 | 1 | n/a |
{
  ...
  "resources": {
    ...
    "properties": {
      ...
      "template": {
        ...
        "scale": {
          "minReplicas": 0,
          "maxReplicas": 5,
          "rules": [{
            "name": "http-rule",
            "http": {
              "metadata": {
                "concurrentRequests": "100"
              }
            }
          }]
        }
      }
    }
  }
}
Note
Set the properties.configuration.activeRevisionsMode property of the container app to single, when using non-HTTP event scale rules.
Define an HTTP scale rule using the --scale-rule-http-concurrency parameter in the create or update commands.
CLI parameter | Description | Default value | Min value | Max value |
---|---|---|---|---|
--scale-rule-http-concurrency | When the number of concurrent HTTP requests exceeds this value, then another replica is added. Replicas continue to add to the pool up to the max-replicas amount. | 10 | 1 | n/a |
az containerapp create \
  --name <CONTAINER_APP_NAME> \
  --resource-group <RESOURCE_GROUP> \
  --environment <ENVIRONMENT_NAME> \
  --image <CONTAINER_IMAGE_LOCATION> \
  --min-replicas 0 \
  --max-replicas 5 \
  --scale-rule-name azure-http-rule \
  --scale-rule-type http \
  --scale-rule-http-concurrency 100
Go to your container app in the Azure portal.
Select Scale.
Select Edit and deploy.
Select the Scale tab.
Select the minimum and maximum replica range.
Select Add.
In the Rule name box, enter a rule name.
From the Type dropdown, select HTTP Scaling.
In the Concurrent requests box, enter your desired number of concurrent requests for your container app.
TCP
With a TCP scaling rule, you have control over the threshold of concurrent TCP connections that determines how your app scales. Every 15 seconds, the number of concurrent connections is calculated as the number of connections in the past 15 seconds divided by 15. Container Apps jobs don't support TCP scaling rules.
In the following example, the container app revision scales out up to five replicas and can scale in to zero. The scaling threshold is set to 100 concurrent connections per second.
Example
The tcp section defines a TCP scale rule.
Scale property | Description | Default value | Min value | Max value |
---|---|---|---|---|
concurrentConnections | When the number of concurrent TCP connections exceeds this value, then another replica is added. Replicas continue to be added up to the maxReplicas amount as the number of concurrent connections increase. | 10 | 1 | n/a |
{
  ...
  "resources": {
    ...
    "properties": {
      ...
      "template": {
        ...
        "scale": {
          "minReplicas": 0,
          "maxReplicas": 5,
          "rules": [{
            "name": "tcp-rule",
            "tcp": {
              "metadata": {
                "concurrentConnections": "100"
              }
            }
          }]
        }
      }
    }
  }
}
Define a TCP scale rule using the --scale-rule-tcp-concurrency parameter in the create or update commands.
CLI parameter | Description | Default value | Min value | Max value |
---|---|---|---|---|
--scale-rule-tcp-concurrency | When the number of concurrent TCP connections exceeds this value, then another replica is added. Replicas continue to be added up to the max-replicas amount as the number of concurrent connections increase. | 10 | 1 | n/a |
az containerapp create \
  --name <CONTAINER_APP_NAME> \
  --resource-group <RESOURCE_GROUP> \
  --environment <ENVIRONMENT_NAME> \
  --image <CONTAINER_IMAGE_LOCATION> \
  --min-replicas 0 \
  --max-replicas 5 \
  --transport tcp \
  --ingress <external/internal> \
  --target-port <CONTAINER_TARGET_PORT> \
  --scale-rule-name azure-tcp-rule \
  --scale-rule-type tcp \
  --scale-rule-tcp-concurrency 100
Not supported in the Azure portal. Use the Azure CLI or Azure Resource Manager to configure a TCP scale rule.
Custom
You can create a custom Container Apps scaling rule based on any ScaledObject-based KEDA scaler with these defaults:
Defaults | Seconds |
---|---|
Polling interval | 30 |
Cool down period | 300 |
For event-driven Container Apps jobs, you can create a custom scaling rule based on any ScaledJob-based KEDA scaler.
The following example demonstrates how to create a custom scale rule.
Example
This example shows how to convert an Azure Service Bus scaler to a Container Apps scale rule, but you use the same process for any other ScaledObject-based KEDA scaler specification.
For authentication, KEDA scaler authentication parameters convert into Container Apps secrets.
The following procedure shows you how to convert a KEDA scaler to a Container Apps scale rule. This snippet is an excerpt of an ARM template that shows where each section fits in the context of the overall template.
{
  ...
  "resources": {
    ...
    "properties": {
      ...
      "configuration": {
        ...
        "secrets": [
          {
            "name": "<NAME>",
            "value": "<VALUE>"
          }
        ]
      },
      "template": {
        ...
        "scale": {
          "minReplicas": 0,
          "maxReplicas": 5,
          "rules": [
            {
              "name": "<RULE_NAME>",
              "custom": {
                "metadata": {
                  ...
                },
                "auth": [
                  {
                    "secretRef": "<NAME>",
                    "triggerParameter": "<PARAMETER>"
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
Refer to this excerpt for context on how the examples below fit in the ARM template.
First, you define the type and metadata of the scale rule.
From the KEDA scaler specification, find the type value.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the ARM template, enter the scaler type value into the custom.type property of the scale rule.
...
"rules": [
  {
    "name": "azure-servicebus-queue-rule",
    "custom": {
      "type": "azure-servicebus",
      "metadata": {
        "queueName": "my-queue",
        "namespace": "service-bus-namespace",
        "messageCount": "5"
      }
    }
  }
]
...
From the KEDA scaler specification, find the metadata values.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the ARM template, add all metadata values to the custom.metadata section of the scale rule.
...
"rules": [
  {
    "name": "azure-servicebus-queue-rule",
    "custom": {
      "type": "azure-servicebus",
      "metadata": {
        "queueName": "my-queue",
        "namespace": "service-bus-namespace",
        "messageCount": "5"
      }
    }
  }
]
...
Authentication
A KEDA scaler supports using secrets in a TriggerAuthentication that is referenced by the authenticationRef property. You can map the TriggerAuthentication object to the Container Apps scale rule.
Note
Container Apps scale rules only support secret references. Other authentication types such as pod identity are not supported.
Find the TriggerAuthentication object referenced by the KEDA ScaledObject specification.
From the KEDA specification, find each secretTargetRef of the TriggerAuthentication object and its associated secret.
apiVersion: v1
kind: Secret
metadata:
  name: my-secrets
  namespace: my-project
type: Opaque
data:
  connection-string-secret: <SERVICE_BUS_CONNECTION_STRING>
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-servicebus-auth
spec:
  secretTargetRef:
  - parameter: connection
    name: my-secrets
    key: connection-string-secret
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: azure-servicebus-queue-rule
  namespace: default
spec:
  scaleTargetRef:
    name: my-scale-target
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: my-queue
      namespace: service-bus-namespace
      messageCount: "5"
    authenticationRef:
      name: azure-servicebus-auth
In the ARM template, add all entries to the auth array of the scale rule.
Add a secret to the container app's secrets array containing the secret value.
Set the value of the triggerParameter property to the value of the TriggerAuthentication's parameter property.
Set the value of the secretRef property to the name of the Container Apps secret.
{
  ...
  "resources": {
    ...
    "properties": {
      ...
      "configuration": {
        ...
        "secrets": [
          {
            "name": "connection-string-secret",
            "value": "<SERVICE_BUS_CONNECTION_STRING>"
          }
        ]
      },
      "template": {
        ...
        "scale": {
          "minReplicas": 0,
          "maxReplicas": 5,
          "rules": [
            {
              "name": "azure-servicebus-queue-rule",
              "custom": {
                "type": "azure-servicebus",
                "metadata": {
                  "queueName": "my-queue",
                  "namespace": "service-bus-namespace",
                  "messageCount": "5"
                },
                "auth": [
                  {
                    "secretRef": "connection-string-secret",
                    "triggerParameter": "connection"
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
Some scalers support metadata with the FromEnv suffix to reference a value in an environment variable. Container Apps looks at the first container listed in the ARM template for the environment variable.
Refer to the considerations section for more security-related information.
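The conversion walked through in this procedure can be approximated with a short Python sketch. The function name `keda_to_container_apps_rule` is hypothetical, and the sketch assumes that each Container Apps secret is named after the KEDA secret key, as in the Service Bus example. It builds only the scale rule object and doesn't create secrets or deploy anything.

```python
import json


def keda_to_container_apps_rule(rule_name, trigger, secret_target_refs=()):
    """Build a Container Apps custom scale rule from a KEDA trigger.

    trigger: one entry of the KEDA `triggers` list, as a dict.
    secret_target_refs: the TriggerAuthentication `secretTargetRef` entries.
    Assumption: each Container Apps secret is named after the KEDA secret key.
    """
    custom = {
        "type": trigger["type"],                # KEDA type -> custom.type
        "metadata": dict(trigger["metadata"]),  # KEDA metadata -> custom.metadata
    }
    if secret_target_refs:
        custom["auth"] = [
            {"secretRef": ref["key"], "triggerParameter": ref["parameter"]}
            for ref in secret_target_refs
        ]
    return {"name": rule_name, "custom": custom}


# Values copied from the Service Bus example in this article.
trigger = {
    "type": "azure-servicebus",
    "metadata": {
        "queueName": "my-queue",
        "namespace": "service-bus-namespace",
        "messageCount": "5",
    },
}
refs = [{"parameter": "connection", "name": "my-secrets", "key": "connection-string-secret"}]

rule = keda_to_container_apps_rule("azure-servicebus-queue-rule", trigger, refs)
print(json.dumps(rule, indent=2))
```

The printed object has the same shape as the entry in the rules array of the ARM excerpt.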
From the KEDA scaler specification, find the type value.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the CLI command, set the --scale-rule-type parameter to the specification type value.
az containerapp create \
  --name <CONTAINER_APP_NAME> \
  --resource-group <RESOURCE_GROUP> \
  --environment <ENVIRONMENT_NAME> \
  --image <CONTAINER_IMAGE_LOCATION> \
  --min-replicas 0 \
  --max-replicas 5 \
  --secrets "connection-string-secret=<SERVICE_BUS_CONNECTION_STRING>" \
  --scale-rule-name azure-servicebus-queue-rule \
  --scale-rule-type azure-servicebus \
  --scale-rule-metadata "queueName=my-queue" \
                        "namespace=service-bus-namespace" \
                        "messageCount=5" \
  --scale-rule-auth "connection=connection-string-secret"
From the KEDA scaler specification, find the metadata values.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the CLI command, set the --scale-rule-metadata parameter to the metadata values.
You need to transform the values from a YAML format to a key/value pair for use on the command line. Separate each key/value pair with a space.
az containerapp create \
  --name <CONTAINER_APP_NAME> \
  --resource-group <RESOURCE_GROUP> \
  --environment <ENVIRONMENT_NAME> \
  --image <CONTAINER_IMAGE_LOCATION> \
  --min-replicas 0 \
  --max-replicas 5 \
  --secrets "connection-string-secret=<SERVICE_BUS_CONNECTION_STRING>" \
  --scale-rule-name azure-servicebus-queue-rule \
  --scale-rule-type azure-servicebus \
  --scale-rule-metadata "queueName=my-queue" \
                        "namespace=service-bus-namespace" \
                        "messageCount=5" \
  --scale-rule-auth "connection=connection-string-secret"
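The transformation from YAML metadata to space-separated key/value pairs can be sketched in Python. The helper name `scale_rule_metadata_args` is hypothetical; the sketch only builds the argument list for the --scale-rule-metadata parameter and doesn't call the Azure CLI.

```python
import shlex


def scale_rule_metadata_args(metadata: dict) -> list:
    """Turn KEDA metadata into arguments for --scale-rule-metadata.

    Each entry becomes one key=value argument, in the order given.
    """
    return ["--scale-rule-metadata"] + [f"{key}={value}" for key, value in metadata.items()]


# Metadata copied from the KEDA scaler specification in this article.
args = scale_rule_metadata_args({
    "queueName": "my-queue",
    "namespace": "service-bus-namespace",
    "messageCount": "5",
})

# shlex.join adds shell quoting only where a value needs it.
print(shlex.join(args))
# --scale-rule-metadata queueName=my-queue namespace=service-bus-namespace messageCount=5
```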
Authentication
A KEDA scaler supports using secrets in a TriggerAuthentication that is referenced by the authenticationRef property. You can map the TriggerAuthentication object to the Container Apps scale rule.
Note
Container Apps scale rules only support secret references. Other authentication types such as pod identity are not supported.
Find the TriggerAuthentication object referenced by the KEDA ScaledObject specification. Identify each secretTargetRef of the TriggerAuthentication object.
apiVersion: v1
kind: Secret
metadata:
  name: my-secrets
  namespace: my-project
type: Opaque
data:
  connection-string-secret: <SERVICE_BUS_CONNECTION_STRING>
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-servicebus-auth
spec:
  secretTargetRef:
  - parameter: connection
    name: my-secrets
    key: connection-string-secret
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: azure-servicebus-queue-rule
  namespace: default
spec:
  scaleTargetRef:
    name: my-scale-target
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: my-queue
      namespace: service-bus-namespace
      messageCount: "5"
    authenticationRef:
      name: azure-servicebus-auth
In your container app, create the secrets that match the secretTargetRef properties.
In the CLI command, set parameters for each secretTargetRef entry.
Create a secret entry with the --secrets parameter. If there are multiple secrets, separate them with a space.
Create an authentication entry with the --scale-rule-auth parameter. If there are multiple entries, separate them with a space.
az containerapp create \
  --name <CONTAINER_APP_NAME> \
  --resource-group <RESOURCE_GROUP> \
  --environment <ENVIRONMENT_NAME> \
  --image <CONTAINER_IMAGE_LOCATION> \
  --min-replicas 0 \
  --max-replicas 5 \
  --secrets "connection-string-secret=<SERVICE_BUS_CONNECTION_STRING>" \
  --scale-rule-name azure-servicebus-queue-rule \
  --scale-rule-type azure-servicebus \
  --scale-rule-metadata "queueName=my-queue" \
                        "namespace=service-bus-namespace" \
                        "messageCount=5" \
  --scale-rule-auth "connection=connection-string-secret"
Go to your container app in the Azure portal.
Select Scale.
Select Edit and deploy.
Select the Scale and replicas tab.
Select the minimum and maximum replica range.
Select Add.
In the Rule name box, enter a rule name.
From the Type dropdown, select Custom.
From the KEDA scaler specification, find the type value.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the Custom rule type box, enter the scaler type value.
From the KEDA scaler specification, find the metadata values.
triggers:
- type: azure-servicebus
  metadata:
    queueName: my-queue
    namespace: service-bus-namespace
    messageCount: "5"
In the portal, find the Metadata section and select Add. Enter the name and value for each item in the KEDA ScaledObject specification metadata section.
Authentication
A KEDA scaler supports using secrets in a TriggerAuthentication that is referenced by the authenticationRef property. You can map the TriggerAuthentication object to the Container Apps scale rule.
Note
Container Apps scale rules only support secret references. Other authentication types such as pod identity are not supported.
In your container app, create the secrets that you want to reference.
Find the TriggerAuthentication object referenced by the KEDA ScaledObject specification. Identify each secretTargetRef of the TriggerAuthentication object.
apiVersion: v1
kind: Secret
metadata:
  name: my-secrets
  namespace: my-project
type: Opaque
data:
  connection-string-secret: <SERVICE_BUS_CONNECTION_STRING>
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-servicebus-auth
spec:
  secretTargetRef:
  - parameter: connection
    name: my-secrets
    key: connection-string-secret
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: azure-servicebus-queue-rule
  namespace: default
spec:
  scaleTargetRef:
    name: my-scale-target
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: my-queue
      namespace: service-bus-namespace
      messageCount: "5"
    authenticationRef:
      name: azure-servicebus-auth
In the Authentication section, select Add to create an entry for each KEDA secretTargetRef parameter.
Default scale rule
If you don't create a scale rule, the default scale rule is applied to your container app.
Trigger | Min replicas | Max replicas |
---|---|---|
HTTP | 0 | 10 |
Important
Make sure you create a scale rule or set minReplicas to 1 or more if you don't enable ingress. If ingress is disabled and you don't define a minReplicas or a custom scale rule, then your container app will scale to zero and have no way of starting back up.
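The warning above can be expressed as a small pre-deployment check. This is a sketch under stated assumptions: the function name and its boolean inputs are hypothetical, and it only encodes the rule of thumb from the note rather than any validation that Container Apps performs.

```python
def can_start_from_zero(ingress_enabled: bool, min_replicas: int, has_custom_rule: bool) -> bool:
    """Return False for the configuration the note warns about.

    An app with at least one replica never reaches zero. Otherwise it
    needs either ingress (so the default HTTP rule can react) or a
    custom scale rule to be started again.
    """
    if min_replicas >= 1:
        return True
    return ingress_enabled or has_custom_rule


# Ingress disabled, no minimum, no custom rule: the app would stay at zero.
print(can_start_from_zero(ingress_enabled=False, min_replicas=0, has_custom_rule=False))  # False

# Setting minReplicas to 1 avoids the problem.
print(can_start_from_zero(ingress_enabled=False, min_replicas=1, has_custom_rule=False))  # True
```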
Scale behavior
Scaling behavior has the following defaults:
Parameter | Value |
---|---|
Polling interval | 30 seconds |
Cool down period | 300 seconds |
Scale up stabilization window | 0 seconds |
Scale down stabilization window | 300 seconds |
Scale up step | 1, 4, 100% of current |
Scale down step | 100% of current |
Scaling algorithm | desiredReplicas = ceil(currentMetricValue / targetMetricValue) |
- Polling interval is how frequently event sources are queried by KEDA. This value doesn't apply to HTTP and TCP scale rules.
- Cool down period is how long after the last event was observed before the application scales down to its minimum replica count.
- Scale up stabilization window is how long to wait before performing a scale up decision once scale up conditions were met.
- Scale down stabilization window is how long to wait before performing a scale down decision once scale down conditions were met.
- Scale up step is the rate new instances are added at. It starts with 1, 4, 8, 16, 32, ... up to the configured maximum replica count.
- Scale down step is the rate at which replicas are removed. By default 100% of replicas that need to shut down are removed.
- Scaling algorithm is the formula used to calculate the current desired number of replicas.
Example
For the following scale rule:
"minReplicas": 0,
"maxReplicas": 20,
"rules": [
  {
    "name": "azure-servicebus-queue-rule",
    "custom": {
      "type": "azure-servicebus",
      "metadata": {
        "queueName": "my-queue",
        "namespace": "service-bus-namespace",
        "messageCount": "5"
      }
    }
  }
]
As your app scales out, KEDA starts with an empty queue and performs the following steps:
1. Check my-queue every 30 seconds.
2. If the queue length equals 0, go back to (1).
3. If the queue length is > 0, scale the app to 1.
4. If the queue length is 50, calculate desiredReplicas = ceil(50/5) = 10.
5. Scale app to min(maxReplicaCount, desiredReplicas, max(4, 2*currentReplicaCount)).
6. Go back to (1).
If the app was scaled to the maximum replica count of 20, scaling goes through the same previous steps. Scale down only happens if the condition was satisfied for 300 seconds (scale down stabilization window). Once the queue length is 0, KEDA waits for 300 seconds (cool down period) before scaling the app to 0.
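To make the scale-out steps concrete, here's a minimal Python simulation of the loop for a queue that stays at 50 messages. The function names are hypothetical, and the sketch models scale-out only: the polling delay, the scale down stabilization window, and the cool down period aren't simulated.

```python
import math


def desired_replicas(metric_value: float, target: float) -> int:
    """desiredReplicas = ceil(currentMetricValue / targetMetricValue)."""
    return math.ceil(metric_value / target)


def next_replica_count(current: int, queue_length: int, target: int, max_replicas: int) -> int:
    """Apply one polling cycle of the scale-out steps listed above."""
    if queue_length == 0:
        return current  # nothing to do; scale-in timing isn't modeled here
    if current == 0:
        return 1  # the first activation scales the app to 1
    desired = desired_replicas(queue_length, target)
    return min(max_replicas, desired, max(4, 2 * current))


replicas = 0
history = []
for _ in range(5):  # five polling cycles with 50 messages waiting
    replicas = next_replica_count(replicas, queue_length=50, target=5, max_replicas=20)
    history.append(replicas)

print(history)  # [1, 4, 8, 10, 10]
```

With messageCount set to 5 and 50 messages in the queue, the replica count follows the 1, 4, 8 step pattern until it reaches the desired value of 10, which is below the maximum of 20.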
Considerations
In "multiple revisions" mode, adding a new scale trigger creates a new revision of your application but your old revision remains available with the old scale rules. Use the Revision management page to manage traffic allocations.
No usage charges are incurred when an application scales to zero. For more pricing information, see Billing in Azure Container Apps.
You need to enable data protection for all .NET apps on Azure Container Apps. See Deploying and scaling an ASP.NET Core app on Azure Container Apps for details.
Known limitations
Vertical scaling isn't supported.
Replica quantities are a target amount, not a guarantee.
If you're using Dapr actors to manage states, you should keep in mind that scaling to zero isn't supported. Dapr uses virtual actors to manage asynchronous calls, which means their in-memory representation isn't tied to their identity or lifetime.