Auto Scaling

With the help of monitoring solutions, we can measure the performance parameters of an infrastructure. This is usually in the form of performance SLAs. Auto Scaling gives us the ability to increase or decrease the resources available to the system, based on performance thresholds.

The Auto Scaling feature adds additional resources to cater to increased load. It works in reverse, as well. If the load is reduced, then Auto Scaling reduces the number of resources available to perform the task. Auto Scaling does it all without pre-provisioning the resources, and does this in an automated way.

Auto Scaling can scale in both ways—vertically (adding more resources to the existing resource type) or horizontally (adding resources by creating another instance of that type of resource).

The Auto Scaling feature makes a decision regarding adding or removing resources based on two strategies. One is based on the available metrics of the resource or on meeting some system threshold value. The other type of strategy is based on time, for example, between 9 a.m. and 5 p.m. IST, instead of three web servers; the system needs 30 web servers.

Azure monitoring instruments every resource; all the metric-related data is collected and monitored. Based on the data collected, Auto Scaling makes decisions.

Azure Monitor autoscale applies only to virtual machine scale sets, cloud services, and app services (for example, web apps).