Provisioned Concurrency

What Is Concurrency

QPS (Queries Per Second) is a key metric for measuring system performance, indicating the number of requests processed per second.

In cloud functions, a single service instance processes only one event at any given time.

Suppose a request takes an average of 0.02 seconds to process. A single service instance can then handle 1/0.02 = 50 requests per second, so its QPS is 50. If 100 requests arrive per second, at least two service instances must run simultaneously to keep up; each service instance counts as one unit of concurrency.

Concurrency Calculation Formula

Estimate concurrency using the following formula:

Concurrency = QPS × Function Execution Time (s)

Example: For a service with 2000 requests per second and an average request duration of 20ms (0.02 seconds), the concurrency is 2000 × 0.02 = 40, meaning 40 service instances are required to process the requests simultaneously.
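The formula can be checked with a few lines of Python. This is a minimal sketch, not part of any cloud function SDK: the function names `per_instance_qps` and `estimate_concurrency` are hypothetical, and rounding up to a whole instance is an assumption made here because a fractional instance cannot be provisioned.

```python
import math


def per_instance_qps(avg_duration_ms: float) -> float:
    """Requests one instance can serve per second when it handles one event at a time."""
    if avg_duration_ms <= 0:
        raise ValueError("avg_duration_ms must be positive")
    return 1000 / avg_duration_ms


def estimate_concurrency(qps: float, avg_duration_ms: float) -> int:
    """Concurrency = QPS x execution time (s), rounded up to whole instances (assumption)."""
    if qps < 0 or avg_duration_ms < 0:
        raise ValueError("qps and avg_duration_ms must be non-negative")
    # Work in milliseconds and divide once to convert the duration to seconds.
    return math.ceil(qps * avg_duration_ms / 1000)


print(per_instance_qps(20))            # 50.0 requests per second per instance
print(estimate_concurrency(100, 20))   # 2 instances
print(estimate_concurrency(2000, 20))  # 40 instances
```

The duration is taken in milliseconds because that is how the example above states it; converting to seconds happens inside the function.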

Tip

You can view the average duration per request under "Execution Time" in the monitoring information.

Provisioned Concurrency

To conserve resources, cloud functions reclaim service instances when there are no requests. The next request must then start a new instance, which incurs cold start latency, and a log line similar to the following appears in the cloud function logs:

Coldstart: xxxms

If you wish to avoid cold start latency, you can reserve a resident instance through provisioned concurrency.

Provisioned concurrency keeps a set number of instances pre-launched, which avoids cold starts and improves function response speed.
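To judge how often cold starts occur before reserving instances, one option is to scan exported log text for the cold start line shown above. The sketch below assumes each such line contains `Coldstart: <number>ms`, matching the example log. The helper name `cold_start_durations` and the sample lines are hypothetical, and real log lines may carry additional fields.

```python
import re

# Assumed pattern, based on the "Coldstart: xxxms" example line.
COLDSTART_RE = re.compile(r"Coldstart:\s*(\d+(?:\.\d+)?)\s*ms")


def cold_start_durations(log_lines):
    """Return the cold start durations (in ms) found in the given log lines."""
    durations = []
    for line in log_lines:
        match = COLDSTART_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


# Made-up sample lines for illustration only.
sample_logs = [
    "Coldstart: 350ms",
    "request handled",
    "Coldstart: 120ms",
]

durations = cold_start_durations(sample_logs)
print(f"cold starts: {len(durations)}, slowest: {max(durations)} ms")
```

If cold starts are frequent or slow relative to the function's normal execution time, provisioned concurrency is more likely to be worth its cost.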

Procedures

  1. Log in to the Tencent Cloud Development Platform/Cloud Functions
  2. In the function list, click Provisioned Management in the Operation column of the target function
  3. Click Add Provisioned Concurrency
  4. Choose Function Version, click Next
  5. Set Concurrent Instances, click Confirm

Note

Versions are published from the $LATEST version. A version is a snapshot of the function at the moment it is published, containing both code and configuration (timeout duration, environment variables, etc.).