Provisioned Concurrency

What Is Concurrency

QPS (queries per second) is a key metric of system performance: it is the number of requests a system handles per second.

In cloud function, a service instance processes only one event at a time.

Suppose a request takes 0.02s on average to process. One service instance can then handle 1/0.02 = 50 requests per second, i.e., QPS = 50. If the load is 100 requests per second, at least two service instances must run simultaneously to keep up. Each service instance counts as one unit of concurrency.
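The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of any cloud function SDK, and it assumes each instance handles exactly one request at a time.

```python
import math

def per_instance_qps(avg_duration_ms: float) -> float:
    """Requests one instance can finish per second when it handles one at a time."""
    return 1000 / avg_duration_ms

def instances_needed(target_qps: float, avg_duration_ms: float) -> int:
    """Smallest whole number of instances that can sustain target_qps."""
    return math.ceil(target_qps / per_instance_qps(avg_duration_ms))

print(per_instance_qps(20))       # 50.0
print(instances_needed(100, 20))  # 2
```

Durations are passed in milliseconds so that the common case (whole-millisecond averages) avoids floating-point rounding surprises.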

Concurrency Calculation Formula

You can estimate the concurrency using the following formula:

Concurrency = QPS × Function Runtime Duration (seconds)

Example: For a business with a QPS of 2000 and an average request duration of 20ms (0.02s), the concurrency is calculated as 2000 × 0.02 = 40. This requires 40 service instances to process requests simultaneously.
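As a quick way to apply the formula to your own numbers, here is a minimal Python sketch. The helper name `estimate_concurrency` is hypothetical, and rounding up to a whole instance is an assumption made here because a fraction of an instance cannot be provisioned.

```python
import math

def estimate_concurrency(qps: float, avg_duration_ms: float) -> int:
    """Concurrency = QPS x function runtime duration (seconds), rounded up."""
    return math.ceil(qps * avg_duration_ms / 1000)

print(estimate_concurrency(2000, 20))  # 40, matching the example above
print(estimate_concurrency(130, 35))   # 5 (4.55 rounded up)
```

The result is an estimate of steady-state load only; traffic spikes or uneven request durations may call for some headroom on top of it.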

Tip

You can view the average duration of each request in the "Runtime" section of the monitoring information.

Provisioned Concurrency

To conserve resources, cloud function reclaims service instances when there are no requests. The next request then has to start a new instance, which incurs cold start overhead. The logs record it in entries such as:

Coldstart: xxxms

To avoid cold start overhead, you can keep instances resident by configuring provisioned concurrency.

Provisioned concurrency is a set of pre-launched instances that avoid cold starts and improve function response speed.
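Before reserving instances, it can help to check how often cold starts occur and how long they take. The sketch below scans exported log lines for the `Coldstart: xxxms` pattern shown above. It assumes the duration is a plain number directly followed by `ms`; the sample lines and their values are hypothetical, and the helper name is not part of any official tooling.

```python
import re

# Matches entries such as "Coldstart: 312ms" (assumed numeric duration in ms).
COLDSTART_RE = re.compile(r"Coldstart:\s*(\d+(?:\.\d+)?)\s*ms")

def coldstart_durations_ms(log_lines):
    """Return the cold start durations (in ms) found in the given log lines."""
    durations = []
    for line in log_lines:
        match = COLDSTART_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations

# Hypothetical log lines for illustration only.
sample = [
    "Coldstart: 312ms",
    "request handled",
    "Coldstart: 287ms",
]
durations = coldstart_durations_ms(sample)
print(f"{len(durations)} cold starts, slowest {max(durations):.0f}ms")
```

If cold starts are rare or short relative to your request duration, provisioned concurrency may not be worth the reserved resources.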

Procedure

  1. Log in to TCB/cloud function
  2. In the function list, click Provisioned Management in the Operation column of the target function
  3. Click Add Provisioned Concurrency
  4. Select function version and click Next
  5. Set the number of concurrent instances and click Confirm

Note

When you publish a new version from the $LATEST version, the new version is a snapshot of the function at the time of publishing, including its code and configuration (timeout, environment variables, etc.).