Operation Mode

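This page describes the operation modes of Cloud Hosting. Four modes are currently available: Always-on Auto-scaling; Continuous Operation; Continuous Operation During the Day, Auto-scaling at Night; and Custom.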

Operation Mode Introduction

Always-on Auto-scaling

When the Cloud Hosting service is set to Always-on Auto-scaling, it automatically adjusts the number of instances based on CPU or memory usage. The maximum number of instances is 16, and the minimum is 0.

When selecting Always-on Auto-scaling, you can set scaling conditions based on CPU usage or memory usage, or use both CPU and memory scaling metrics simultaneously.
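The constraints above can be illustrated with a small validation sketch. It is not the platform's API: the class name, field names, and the assumption that thresholds are percentages are all hypothetical. Only the 0 and 16 instance limits and the choice of CPU, memory, or both come from this page.

```python
from dataclasses import dataclass
from typing import Optional

MIN_INSTANCES = 0   # lower limit stated on this page
MAX_INSTANCES = 16  # upper limit stated on this page


@dataclass(frozen=True)
class AutoScalingPolicy:
    """Hypothetical policy object; field names are illustrative only."""
    min_instances: int
    max_instances: int
    cpu_threshold_percent: Optional[float] = None
    memory_threshold_percent: Optional[float] = None

    def validate(self) -> None:
        # Instance bounds must stay inside the platform limits.
        if not (MIN_INSTANCES <= self.min_instances
                <= self.max_instances <= MAX_INSTANCES):
            raise ValueError("instance bounds must satisfy 0 <= min <= max <= 16")
        thresholds = (self.cpu_threshold_percent, self.memory_threshold_percent)
        # At least one metric (CPU, memory, or both) must be configured.
        if all(t is None for t in thresholds):
            raise ValueError("configure a CPU threshold, a memory threshold, or both")
        # Assumption: thresholds are expressed as percentages.
        for t in thresholds:
            if t is not None and not (0 < t <= 100):
                raise ValueError("thresholds are percentages in (0, 100]")
```

For example, `AutoScalingPolicy(0, 16, cpu_threshold_percent=60).validate()` passes, while a maximum of 17 instances or a policy with no metric raises `ValueError`.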

Continuous Operation

You may need the service to run continuously without auto-scaling in scenarios such as the following:

  • Service traffic is relatively stable, without sudden increases or decreases

You can switch your operation mode to Continuous Operation. The service will start the corresponding number of instances based on your configuration and keep them running continuously. It will not automatically increase or decrease the number of instances based on traffic volume or CPU/memory usage.

Continuous Operation During the Day, Auto-scaling at Night

From 8:00 to 24:00, the service runs a fixed number of instances. From 0:00 to 8:00, it uses auto-scaling and can scale in to zero instances.
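As a rough sketch of this schedule, the function below returns the minimum number of instances kept running at a given time of day. The function names and the `fixed_count` parameter are hypothetical; only the 8:00 boundary and the scale-to-zero behavior at night come from this page.

```python
from datetime import time

DAY_START = time(8, 0)  # the fixed-instance window runs from 8:00 to 24:00


def is_fixed_window(now: time) -> bool:
    """Return True from 8:00 up to midnight, False from 0:00 up to 8:00."""
    return now >= DAY_START


def minimum_instances(now: time, fixed_count: int) -> int:
    """Instances guaranteed at `now`; at night auto-scaling may reach zero."""
    return fixed_count if is_fixed_window(now) else 0
```

With `fixed_count=3`, a call at 7:59 returns 0 and a call at 8:00 returns 3.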

Custom

Custom mode provides configuration options for auto-scaling and scheduled scaling:

  • Auto-scaling: As in Always-on Auto-scaling, you configure the maximum and minimum number of instances; instances then scale out and in automatically based on request volume and CPU or memory usage.
  • Scheduled scaling: Starts and maintains a minimum number of instances during specified time periods. This scheduled minimum must exceed the auto-scaling minimum. During the scheduled period, instances can still scale out automatically, up to the configured maximum number of instances.
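The interaction between the two settings can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: `ScheduledWindow` and `effective_minimum` are hypothetical names, windows are assumed not to cross midnight, and overlapping windows are assumed to resolve to the larger minimum.

```python
from dataclasses import dataclass
from datetime import time
from typing import Sequence


@dataclass(frozen=True)
class ScheduledWindow:
    """Hypothetical scheduled-scaling entry."""
    start: time         # inclusive
    end: time           # exclusive; assumed not to cross midnight
    min_instances: int  # minimum kept running inside the window


def effective_minimum(now: time, base_min: int,
                      windows: Sequence[ScheduledWindow]) -> int:
    """Return the instance floor in force at `now`."""
    floor = base_min
    for window in windows:
        # Rule from this page: the scheduled minimum must exceed
        # the auto-scaling minimum.
        if window.min_instances <= base_min:
            raise ValueError("scheduled minimum must exceed the auto-scaling minimum")
        if window.start <= now < window.end:
            # Assumption: overlapping windows take the larger minimum.
            floor = max(floor, window.min_instances)
    return floor
```

Auto-scaling still applies on top of this floor, so the running count can rise above it until it reaches the configured maximum.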

Instance Scaling Introduction

Instance Scale-out Instructions

Instance scale-out consists of two phases:

  • Scaling from 0 to 1: After a period with no requests, the service scales in completely, leaving no running instances. When a request arrives, a new instance is started and begins processing requests once it has started successfully. This is the cold start phase of the instance. The startup duration depends on platform resources, image size, and the startup time of the business code within the image.
  • Scaling from 1 to more: After scale-out conditions are configured, the service starts a new instance when the average CPU usage or average memory usage of the running instances reaches or exceeds the configured threshold. Once it has started successfully, the new instance shares the request load with the existing instances. If, after the detection period, the average CPU usage or average memory usage across the instances still meets the scale-out conditions, the service continues launching new instances to handle the traffic.
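The "1 to more" check can be sketched as below. The function names and parameters are hypothetical, and adding exactly one instance per detection period is an assumption; this page only states that averages across running instances are compared with the thresholds and that new instances keep launching while the condition holds.

```python
from statistics import mean
from typing import Optional, Sequence


def should_scale_out(cpu_usage: Sequence[float],
                     memory_usage: Sequence[float],
                     cpu_threshold: Optional[float] = None,
                     memory_threshold: Optional[float] = None) -> bool:
    """Compare per-instance usage averages with the configured thresholds.

    A threshold of None means that metric is not configured.
    Either metric reaching or exceeding its threshold triggers scale-out.
    """
    cpu_hit = cpu_threshold is not None and mean(cpu_usage) >= cpu_threshold
    mem_hit = memory_threshold is not None and mean(memory_usage) >= memory_threshold
    return cpu_hit or mem_hit


def next_instance_count(current: int, max_instances: int, scale_out: bool) -> int:
    """Assumption: add one instance per detection period, capped at the maximum."""
    if scale_out and current < max_instances:
        return current + 1
    return current
```

For two instances at 70% and 50% CPU with a 60% CPU threshold, the average is exactly 60%, which reaches the threshold, so `should_scale_out` returns `True`.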

Instance Scale-in Instructions

Instances that receive no traffic are reclaimed and destroyed after being idle for 10 minutes. When the minimum number of instances is 0, all idle instances are scaled in, leaving none running. After that, the next incoming request triggers the scale-out process from 0 to 1 instance.
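A minimal sketch of the idle check follows. The 10-minute timeout comes from this page; the function name, the per-instance timestamp map, keeping at least the configured minimum running, and reclaiming the longest-idle instances first are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List

IDLE_TIMEOUT = timedelta(minutes=10)  # idle period stated on this page


def instances_to_reclaim(last_request_at: Dict[str, datetime],
                         now: datetime,
                         min_instances: int) -> List[str]:
    """Return the names of idle instances that may be destroyed.

    `last_request_at` maps a hypothetical instance name to the time
    of its most recent request.
    """
    # Instances idle for at least the timeout, longest idle first.
    idle = sorted(
        (name for name, ts in last_request_at.items() if now - ts >= IDLE_TIMEOUT),
        key=lambda name: last_request_at[name],
    )
    # Assumption: never drop below the configured minimum.
    removable = max(len(last_request_at) - min_instances, 0)
    return idle[:removable]
```

With a minimum of 0, every instance idle for 10 minutes or more is returned, which matches the scale-to-zero behavior described above.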