Operation Mode
This page introduces the operation modes of cloud hosting. Currently, the following types are available:
- Always-on Auto-scaling
- Continuous Operation
- Continuous operation during the day, Auto-scaling at night
- Custom
Operation Mode Introduction
Always-on Auto-scaling
When the Cloud Hosting service is set to Always-on Auto-scaling, it automatically adjusts the number of instances based on CPU or memory usage. The maximum number of instances is 16, and the minimum is 0.
When selecting Always-on Auto-scaling, you can set scaling conditions based on CPU usage, memory usage, or both metrics simultaneously.
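The limits above can be written down as a small validation sketch. The class name `AutoScalingPolicy` and its field names are hypothetical and are not part of any Cloud Hosting API; only the bounds (0 to 16 instances) and the choice of CPU, memory, or both metrics come from this page.

```python
from dataclasses import dataclass
from typing import Optional

MAX_INSTANCES = 16  # upper bound stated on this page
MIN_INSTANCES = 0   # lower bound stated on this page


@dataclass(frozen=True)
class AutoScalingPolicy:
    """Hypothetical container for Always-on Auto-scaling settings."""

    min_instances: int = MIN_INSTANCES
    max_instances: int = MAX_INSTANCES
    cpu_threshold_pct: Optional[float] = None     # None: CPU metric not used
    memory_threshold_pct: Optional[float] = None  # None: memory metric not used

    def __post_init__(self) -> None:
        # Instance bounds must stay inside the documented 0..16 range.
        if not MIN_INSTANCES <= self.min_instances <= self.max_instances <= MAX_INSTANCES:
            raise ValueError("instance bounds must satisfy 0 <= min <= max <= 16")
        # At least one of the two metrics has to drive scaling.
        if self.cpu_threshold_pct is None and self.memory_threshold_pct is None:
            raise ValueError("set a CPU threshold, a memory threshold, or both")
        # Assumption: thresholds are expressed as percentages.
        for value in (self.cpu_threshold_pct, self.memory_threshold_pct):
            if value is not None and not 0 < value <= 100:
                raise ValueError("thresholds are percentages in (0, 100]")
```

For example, `AutoScalingPolicy(cpu_threshold_pct=60, memory_threshold_pct=70)` uses both metrics, while `AutoScalingPolicy(max_instances=17, cpu_threshold_pct=60)` raises `ValueError`.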
Continuous Operation
In the following scenarios, you may need the service to run continuously without auto-scaling:
- Service traffic is relatively stable, without sudden increases or decreases
In that case, switch the operation mode to Continuous Operation. The service starts the number of instances you configured and keeps them running. It does not add or remove instances in response to traffic volume or CPU/memory usage.
Continuous Operation During the Day, Auto-scaling at Night
From 8:00 to 24:00, the service runs a fixed number of instances. From 0:00 to 8:00, it uses auto-scaling and can scale in to zero instances.
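As a rough illustration of the schedule, the helper below maps a time of day to the behavior in effect. The function name `mode_at` is hypothetical, and treating 8:00 sharp as the start of the fixed-size window is an assumption; the page does not say which time zone applies.

```python
from datetime import time

DAY_START = time(8, 0)  # fixed-size window runs from 8:00 to 24:00


def mode_at(now: time) -> str:
    """Return 'fixed' from 8:00 to 24:00 and 'auto-scaling' from 0:00 to 8:00."""
    # Assumption: the boundary instant 8:00 belongs to the daytime window.
    return "fixed" if now >= DAY_START else "auto-scaling"
```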
Custom
Custom mode provides configuration options for auto-scaling and scheduled scaling:
- Auto-scaling: As in Always-on Auto-scaling, configure the maximum and minimum number of instances; instances automatically scale out and in based on business request volume and CPU or memory usage.
- Scheduled scaling: Starts and maintains a minimum number of instances during specified time periods. This scheduled minimum instance count must exceed the auto-scaling minimum. During the scheduled period, instances can still scale out automatically, subject to the maximum replica limit.
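The interaction between the two options can be sketched as follows. `ScheduledWindow`, `validate_windows`, and `effective_minimum` are hypothetical names for illustration. The rule that a scheduled minimum must exceed the auto-scaling minimum comes from this page; requiring it to stay at or below the maximum, disallowing windows that cross midnight, and taking the largest minimum when windows overlap are assumptions.

```python
from dataclasses import dataclass
from datetime import time
from typing import Sequence


@dataclass(frozen=True)
class ScheduledWindow:
    start: time         # inclusive
    end: time           # exclusive; assumed not to cross midnight
    min_instances: int  # instances kept running during the window


def validate_windows(windows: Sequence[ScheduledWindow], auto_min: int, auto_max: int) -> None:
    """Raise ValueError if any window breaks the documented constraint."""
    for w in windows:
        if w.start >= w.end:
            raise ValueError("window start must be before its end")
        # Documented: scheduled minimum must exceed the auto-scaling minimum.
        # Assumed: it cannot exceed the configured maximum.
        if not auto_min < w.min_instances <= auto_max:
            raise ValueError("scheduled minimum must be > auto_min and <= auto_max")


def effective_minimum(now: time, windows: Sequence[ScheduledWindow], auto_min: int) -> int:
    """Minimum instance count in force at `now`."""
    active = [w.min_instances for w in windows if w.start <= now < w.end]
    # Outside every window, the ordinary auto-scaling minimum applies.
    return max(active, default=auto_min)
```

With a single window from 9:00 to 18:00 that holds 3 instances and an auto-scaling minimum of 0, `effective_minimum` yields 3 at 12:00 and 0 at 20:00. Scale-out above that floor still follows the auto-scaling rules.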
Instance Scaling Introduction
Instance Scale-out Instructions
Instance scale-out consists of two phases:
- Scaling from 0 to 1: After a period of no requests, the service scales in completely and no instances are running. When a request arrives, a new instance is started and begins processing requests once it has started successfully. This is the cold start phase of the instance. The startup duration depends on platform resources, image size, and the startup time of the business code within the image.
- Scaling from 1 to more: After scale-out conditions are configured, the service starts a new instance when the average CPU usage or average memory usage of the running instances reaches or exceeds the configured threshold. Once started, the new instance shares the request load with the existing instances. If, after the detection period, the average CPU or memory usage across instances still meets the scale-out conditions, the service continues launching new instances to handle the traffic.
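The two phases can be read as a single decision made once per detection period. The sketch below is one interpretation of that description, not the platform's actual algorithm: the function name and parameters are hypothetical, and adding exactly one instance per period is an assumption.

```python
from statistics import mean
from typing import Optional, Sequence


def next_instance_count(
    cpu_usage_pct: Sequence[float],
    memory_usage_pct: Sequence[float],
    max_instances: int,
    cpu_threshold_pct: Optional[float] = None,
    memory_threshold_pct: Optional[float] = None,
    has_pending_request: bool = False,
) -> int:
    """Return the instance count to run after one detection period.

    The two usage sequences hold one sample per running instance.
    """
    running = len(cpu_usage_pct)
    if len(memory_usage_pct) != running:
        raise ValueError("need one CPU and one memory sample per instance")

    if running == 0:
        # Phase 1 (0 -> 1): a cold start, triggered only by an incoming request.
        return 1 if has_pending_request and max_instances >= 1 else 0

    # Phase 2 (1 -> more): compare the average across instances to each
    # configured threshold; either metric reaching it triggers scale-out.
    cpu_hit = cpu_threshold_pct is not None and mean(cpu_usage_pct) >= cpu_threshold_pct
    mem_hit = memory_threshold_pct is not None and mean(memory_usage_pct) >= memory_threshold_pct
    if cpu_hit or mem_hit:
        # Assumption: one new instance per period, capped at the maximum.
        return min(running + 1, max_instances)
    return running
```

Calling the function again after each detection period reproduces the "continue launching new instances" behavior: as long as the averages stay at or above a threshold, the count grows by one until it reaches `max_instances`.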
Instance Scale-in Instructions
Instances that receive no traffic are reclaimed and destroyed after being idle for 10 minutes. When the minimum number of instances is 0, every idle instance is scaled in, leaving none running. Once no instances remain, the next incoming request triggers the scale-out process from 0 to 1 instance.
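A minimal sketch of the idle-reclaim rule follows. The 10-minute timeout comes from this page; the function name `reclaimable`, the per-instance "last request" bookkeeping, and the choice to reclaim the longest-idle instances first are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List

IDLE_TIMEOUT = timedelta(minutes=10)  # idle period stated on this page


def reclaimable(
    last_request_at: Dict[str, datetime],
    now: datetime,
    min_instances: int,
) -> List[str]:
    """Return ids of instances that may be destroyed, longest-idle first."""
    # Instances that have seen no traffic for at least the idle timeout.
    idle = sorted(
        (iid for iid, seen in last_request_at.items() if now - seen >= IDLE_TIMEOUT),
        key=lambda iid: last_request_at[iid],
    )
    # Never drop below the configured minimum number of instances.
    removable = max(len(last_request_at) - min_instances, 0)
    return idle[:removable]
```

With a minimum of 0 and every instance idle for more than 10 minutes, the function returns all instance ids, which matches the scale-to-zero case described above.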