Short answer
────────────────
In the public Azure App Service, the value you put in siteConfig.loadBalancing is ignored. Internally the platform always uses the same algorithm: a weighted round-robin in which all instances have the same weight, with an optional ARR affinity cookie for stickiness.
The other algorithms that appear in the ARM schema exist only because the same SiteConfig object is also used by the on-premises versions of the service (Azure Pack / Azure Stack Hub). They have never been implemented in the multi-tenant service or in the App Service Environment (ASE) versions that run in Azure. Consequently, you will see exactly the behaviour you described, no matter which value you set.
Details
───────
How requests reach your workers
• A farm of “front-end” (FE) VMs running IIS+ARR takes the HTTPS/TCP connection from the internet.
• ARR resolves the worker pool for the site, selects one worker and forwards the request.
• If ARR affinity is ON, the cookie determines the worker; if it is OFF, the FE selects a worker for every request.
• The only selection algorithm that is enabled in the Azure implementation is weighted round-robin (WRR) with equal weights.
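The selection step described above can be sketched as a toy model. This is an illustration only, not platform code: the FrontEnd class, the worker names, and the idea that the cookie carries the worker name directly are assumptions made for the example (the real ARRAffinity cookie holds an opaque value).

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional

AFFINITY_COOKIE = "ARRAffinity"  # cookie name; its value is simplified here


class FrontEnd:
    """Toy model of one front end choosing a worker for each request."""

    def __init__(self, workers: List[str], affinity_enabled: bool) -> None:
        self.workers = list(workers)
        self.affinity_enabled = affinity_enabled
        # With equal weights, weighted round-robin reduces to a plain rotation.
        self._rotation: Iterator[str] = cycle(self.workers)

    def select_worker(self, cookies: Optional[Dict[str, str]] = None) -> str:
        cookies = cookies or {}
        pinned = cookies.get(AFFINITY_COOKIE)
        # With affinity ON, a cookie that names a known worker wins.
        if self.affinity_enabled and pinned in self.workers:
            return pinned
        # Otherwise take the next worker in the rotation.
        return next(self._rotation)


fe = FrontEnd(["worker-a", "worker-b"], affinity_enabled=True)
print([fe.select_worker() for _ in range(4)])
# ['worker-a', 'worker-b', 'worker-a', 'worker-b']
print(fe.select_worker({AFFINITY_COOKIE: "worker-b"}))
# worker-b
```

Each front end keeps its own rotation in this model, which is why the split is even per FE and not necessarily in lockstep across the whole farm.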
What loadBalancing was meant for
• The enum values correspond to load-balancing algorithms of the IIS Application Request Routing module:
– WeightedRoundRobin
– LeastRequests
– LeastResponseTime
– WeightedTotalTraffic
– RequestHash
• When Microsoft created “Azure Pack Web Sites” (and later Azure App Service on Azure Stack Hub), it exposed that ARR setting through an API property called loadBalancing.
• The ARM schema that you see today is shared between those products and Azure App Service, so the enum is present in the public cloud API even though the Azure FE role does not look at it.
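For reference, here is a minimal sketch of how the property is set in a request body. The helper name site_config_patch is hypothetical, the value list is just the five values named in this answer (the live ARM schema may list others), and the properties wrapper around siteConfig is assumed to follow the usual ARM resource layout.

```python
import json

# Enum values as listed in this answer; the live ARM schema may differ.
LOAD_BALANCING_VALUES = (
    "WeightedRoundRobin",
    "LeastRequests",
    "LeastResponseTime",
    "WeightedTotalTraffic",
    "RequestHash",
)


def site_config_patch(load_balancing: str) -> str:
    """Return a JSON body that sets siteConfig.loadBalancing (hypothetical helper)."""
    if load_balancing not in LOAD_BALANCING_VALUES:
        raise ValueError(f"unknown loadBalancing value: {load_balancing!r}")
    body = {"properties": {"siteConfig": {"loadBalancing": load_balancing}}}
    return json.dumps(body, indent=2)


print(site_config_patch("LeastRequests"))
```

According to this answer, the public cloud accepts such a body but the routing behaviour does not change.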
Behaviour in the different offerings
• Multi-tenant Basic/Standard/Premium plans ➜ always weighted-round-robin.
• App Service Environment v1 / v2 ➜ still uses the same FE code; no alternative algorithms.
• Azure Stack Hub / former Azure Pack ➜ the setting really works, because each cloud operator can decide to enable the ARR algorithms on their own hardware.
Your test results
• Because only WRR is active, each FE distributes requests 50 / 50 between the two healthy workers.
• The FE does not measure response time or outstanding requests per site, so LeastResponseTime and LeastRequests cannot influence the choice. What you observed is therefore expected.
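A small simulation makes the difference visible. It is a sketch under stated assumptions, not a model of the real front end: the two worker names, the tick-based service times, and the simulate function are all invented for the illustration. It sends one request per tick to a fast and a slow worker and counts where the requests land under each policy.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List

# Hypothetical per-request durations, in ticks.
SERVICE_TICKS = {"fast": 2, "slow": 8}


def simulate(policy: str, requests: int = 1000) -> Counter:
    """Send one request per tick and count how many each worker receives."""
    rotation = cycle(SERVICE_TICKS)
    # Finish ticks of the requests currently running on each worker.
    in_flight: Dict[str, List[int]] = {w: [] for w in SERVICE_TICKS}
    served: Counter = Counter()
    for tick in range(requests):
        # Drop requests that have finished by this tick.
        for worker in in_flight:
            in_flight[worker] = [t for t in in_flight[worker] if t > tick]
        if policy == "round_robin":
            worker = next(rotation)
        elif policy == "least_requests":
            # Fewest outstanding requests wins; ties go to the first worker.
            worker = min(in_flight, key=lambda w: len(in_flight[w]))
        else:
            raise ValueError(f"unknown policy: {policy!r}")
        in_flight[worker].append(tick + SERVICE_TICKS[worker])
        served[worker] += 1
    return served


print(simulate("round_robin"))     # 500 requests to each worker
print(simulate("least_requests"))  # skews toward the fast worker
```

An even split between a fast and a slow instance, as in the test described in the question, is the round-robin signature; a policy that looked at outstanding requests would shift load toward the faster instance.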
What you can actually tune
• Turn ARR affinity on or off (the site-level clientAffinityEnabled property, or per response with the Arr-Disable-Session-Affinity header).
• Scale the number of instances (so the pool from which WRR chooses is larger/smaller).
• Use a service in front of App Service (e.g. Azure Front Door, Application Gateway) if you need different routing logic.
So, to answer the original questions explicitly:
• Does the loadBalancing setting work in Azure App Service? → No. The platform behaves as WeightedRoundRobin regardless of the value you set; all other values are ignored.
• In what configuration is it respected? → Only in on-premises variants of the service (Azure Pack / Azure Stack Hub) where the cloud operator has enabled those ARR algorithms, not in the public Azure service or in App Service Environments.
version: o3-2025-04-16