Autoscaling

Autoscaling, also spelled auto scaling or auto-scaling, and sometimes called automatic scaling, is a method used in cloud computing that dynamically adjusts the amount of computational resources in a server farm, typically measured by the number of active servers, based on the load on the farm. For example, the number of servers running behind a web application may be increased or decreased automatically based on the number of active users on the site. Since such metrics may change dramatically over the course of a day, and servers are a limited resource that costs money to run even while idle, there is often an incentive to run "just enough" servers to support the current load while still being able to absorb sudden, large spikes in activity. Autoscaling addresses this need by reducing the number of active servers when activity is low and launching new servers when activity is high. Autoscaling is closely related to, and builds upon, the idea of load balancing.^[1]^[2]
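
The scaling decision described above can be illustrated with a minimal sketch. It assumes a single load metric (active users), a fixed per-server capacity, and a headroom margin kept in reserve for spikes. The function name `desired_server_count` and all of its parameters are hypothetical and do not correspond to the API of any particular cloud provider.

```python
def desired_server_count(active_users, users_per_server,
                         min_servers=1, max_servers=20,
                         headroom_percent=20):
    """Return how many servers to run for the current load.

    All names and default values here are illustrative assumptions:
    - users_per_server: load one server is assumed to handle
    - headroom_percent: spare capacity kept for sudden spikes
    - min_servers / max_servers: bounds on the size of the farm
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    if active_users < 0:
        raise ValueError("active_users must be non-negative")

    # Inflate the observed load by the headroom margin, then divide by
    # per-server capacity, rounding up. Integer arithmetic avoids
    # floating-point rounding surprises in the ceiling step.
    numerator = active_users * (100 + headroom_percent)
    denominator = 100 * users_per_server
    needed = -(-numerator // denominator)  # ceiling division

    # Never scale below the minimum or above the maximum farm size.
    return max(min_servers, min(max_servers, needed))


# Low activity: scale in to the minimum.
print(desired_server_count(50, 100))       # 1
# Moderate activity: 1000 users plus 20% headroom needs 12 servers.
print(desired_server_count(1000, 100))     # 12
# Large spike: capped at the configured maximum.
print(desired_server_count(100000, 100))   # 20
```

In practice an autoscaler would re-evaluate such a rule periodically and usually add a cooldown between changes, so that short-lived fluctuations do not cause servers to be launched and terminated repeatedly.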

[1] "Above the Clouds: A Berkeley View of Cloud Computing" (PDF). Berkeley EECS. February 10, 2009. Retrieved March 21, 2015.
[2] "Auto Scaling". Amazon Web Services. Retrieved March 21, 2015.
