Observability Guide

CloudBase provides built-in observability capabilities based on OpenTelemetry and OpenInference standards, helping developers track and monitor the complete execution chain of AI Agents.

Prerequisites

  • Agent application created (LangChain / LangGraph / CrewAI)
  • Corresponding SDK installed (cloudbase-agent-server / @cloudbase/agent-server)
  • Understanding of OpenTelemetry basics (optional)

Install Dependencies

# Basic dependencies
pip install cloudbase-agent-server cloudbase-agent-observability

# If you need to export to OTLP backend (e.g., Langfuse)
pip install opentelemetry-exporter-otlp

Overview

What is Observability

Observability is the ability to understand a system's internal state through signals output by the system (logs, metrics, traces). For AI Agent applications, observability helps you:

  • Track Execution Chains: View the complete call chain from the Agent receiving a request to returning a response
  • Identify Performance Bottlenecks: Pinpoint time-consuming LLM calls or tool executions
  • Debug Issues: Analyze the Agent's decision process and tool call parameters
  • Optimize Costs: Calculate token usage and analyze model invocation frequency
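For cost analysis, the token counts recorded on each LLM span can be turned into a dollar estimate directly. A minimal sketch, assuming hypothetical per-1K-token prices (check your provider's actual pricing):

```python
# Hypothetical per-1K-token prices; substitute your provider's real rates.
PROMPT_PRICE_PER_1K = 0.03
COMPLETION_PRICE_PER_1K = 0.06

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one LLM call from its token counts."""
    return (prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
            + completion_tokens / 1000 * COMPLETION_PRICE_PER_1K)

# Token counts come from span attributes such as
# llm.token_count.prompt / llm.token_count.completion.
print(estimate_cost(5, 7))
```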

Observability Features

  • Out-of-the-Box: Enable with one line of code or one environment variable, no complex configuration
  • Full-Chain Tracing: Automatically links Server layer → Adapter layer → Agent SDK layer call chains
  • Standardized: Follows OpenTelemetry and OpenInference semantic conventions
  • Multiple Export Targets: Supports console output (debugging) and OTLP export (Langfuse, Jaeger, etc.)

Architecture Principles

Span Hierarchy Example

Using a LangGraph workflow as an example, a typical Span hierarchy looks like:

AG-UI.Server (Request entry point)
└─ Adapter.LangGraph (Agent adapter layer)
   └─ LangGraph
      ├─ node_a (LangGraph node)
      │  └─ ChatOpenAI (LLM call)
      ├─ node_b (LangGraph node)
      │  ├─ ChatOpenAI (LLM call)
      │  └─ calculator (Tool call)
      └─ synthesizer (LangGraph node)
         └─ ChatOpenAI (LLM call)

Span Type Description

| Type | Icon | Description | Examples |
| --- | --- | --- | --- |
| CHAIN | ⛓️ | Chained calls | Adapter.LangGraph, LangGraph nodes |
| LLM | 💬 | LLM calls | ChatOpenAI, ChatAnthropic |
| TOOL | 🔧 | Tool calls | calculator, get_weather |
| AGENT | 🤖 | Agent calls | Multi-Agent orchestration scenarios |

Standards Followed

  • OpenTelemetry: Standard framework for distributed tracing, providing concepts like Span, Trace, Context
  • OpenInference: Semantic conventions for AI applications, defining attribute specifications for Span types like LLM, TOOL, CHAIN

Key attributes include:

  • input.value / output.value: Input/output content
  • llm.model_name: Model identifier
  • llm.token_count.prompt / llm.token_count.completion: Token usage
  • tool.name: Tool function name
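Put together, these conventions mean a single LLM span carries a flat attribute map. A minimal sketch of what that map looks like (the values are illustrative, mirroring the console output shown later, not taken from a real trace):

```python
# Illustrative OpenInference attributes for one LLM span.
llm_span_attributes = {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-4",
    "input.value": "Hello, how are you?",
    "output.value": "I'm doing well, thank you!",
    "llm.token_count.prompt": 5,
    "llm.token_count.completion": 7,
}

# Backends typically aggregate token usage from these two attributes.
total_tokens = (llm_span_attributes["llm.token_count.prompt"]
                + llm_span_attributes["llm.token_count.completion"])
print(total_tokens)
```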

Quick Start

Method 1: Enable via Environment Variables

This is the simplest approach: no code changes are required, just set an environment variable.

# Enable console output (for local development debugging)
AUTO_TRACES_STDOUT=true

# Disable observability
AUTO_TRACES_STDOUT=false

Example:

# app.py - No code changes needed
from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.langgraph import LangGraphAgent

app = AgentServiceApp() # Automatically reads AUTO_TRACES_STDOUT environment variable
app.run(lambda: {"agent": agent})

Method 2: Enable via Code Configuration

For finer control (e.g., OTLP export configuration), you can explicitly configure via code.

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import ConsoleTraceConfig, OTLPTraceConfig

# Option A: Console output (local debugging)
app = AgentServiceApp(observability=ConsoleTraceConfig())

# Option B: Export to Langfuse
app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="https://your-langfuse.com/api/public/otel/v1/traces",
        headers={"Authorization": "Basic your-credentials"}
    )
)

app.run(lambda: {"agent": agent})

Exporter Configuration

Console Export (Local Debugging)

The console exporter outputs Span information in JSON format to the console, suitable for local development and debugging.

Output Example:

{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "span_id": "a1b2c3d4e5f67890",
  "parent_span_id": "0987654321fedcba",
  "name": "ChatOpenAI",
  "kind": "SPAN_KIND_INTERNAL",
  "start_time": "2025-01-15T08:30:00.123456Z",
  "end_time": "2025-01-15T08:30:01.234567Z",
  "attributes": {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-4",
    "input.value": "Hello, how are you?",
    "output.value": "I'm doing well, thank you!",
    "llm.token_count.prompt": 5,
    "llm.token_count.completion": 7
  }
}

Viewing Tips:

# Save to file (assumes one JSON object per line)
python app.py 2>&1 | grep '^{' > spans.jsonl

# Parse and pretty-print each line
python app.py 2>&1 | grep '^{' | python -m json.tool --json-lines
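Once spans are saved to a file, a few lines of Python can summarize latency per span. A sketch, assuming one JSON object per line with the start_time/end_time fields shown in the output example above:

```python
import json
from datetime import datetime

def parse_time(ts: str) -> datetime:
    # fromisoformat() before Python 3.11 doesn't accept a trailing "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def span_durations(lines):
    """Map span name -> duration in seconds, one JSON span per line."""
    durations = {}
    for line in lines:
        span = json.loads(line)
        delta = parse_time(span["end_time"]) - parse_time(span["start_time"])
        durations[span["name"]] = delta.total_seconds()
    return durations

# Example with a single record shaped like the console output above:
record = json.dumps({
    "name": "ChatOpenAI",
    "start_time": "2025-01-15T08:30:00.123456Z",
    "end_time": "2025-01-15T08:30:01.234567Z",
})
print(span_durations([record]))
```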

OTLP Export (Production)

OTLP (OpenTelemetry Protocol) is the standard transmission protocol for OpenTelemetry and can export trace data to various backends like Langfuse, Jaeger, Zipkin, etc.

Langfuse Configuration Example

Langfuse is an open-source LLM observability platform with native OpenTelemetry support.

import base64

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import OTLPTraceConfig

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="https://cloud.langfuse.com/api/public/otel/v1/traces",
        headers={
            "Authorization": "Basic " + base64.b64encode(
                f"{public_key}:{secret_key}".encode()
            ).decode()
        }
    )
)

Jaeger Configuration Example

Jaeger is an open-source distributed tracing system by Uber.

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import OTLPTraceConfig

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="http://localhost:4318/v1/traces"
    )
)
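To try this locally, you can run Jaeger's all-in-one image with Docker. A sketch assuming Docker is installed and a recent Jaeger version (which accepts OTLP on port 4318 by default):

```shell
# Start a local Jaeger all-in-one instance
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest
# UI at http://localhost:16686, OTLP/HTTP ingest on port 4318
```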

Complete Examples

Python + LangGraph + Langfuse

import os
import base64

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.langgraph import LangGraphAgent
from cloudbase_agent.observability.server import OTLPTraceConfig
from langgraph.graph import StateGraph, MessagesState
from langchain_openai import ChatOpenAI

# Create Agent
def create_agent():
    model = ChatOpenAI(
        api_key=os.getenv("OPENAI_API_KEY"),
        base_url=os.getenv("OPENAI_BASE_URL"),
        model="gpt-4"
    )

    async def chat_node(state):
        response = await model.ainvoke(state["messages"])
        return {"messages": [response]}

    workflow = StateGraph(MessagesState)
    workflow.add_node("chat", chat_node)
    workflow.add_edge("__start__", "chat")
    workflow.add_edge("chat", "__end__")

    return {
        "agent": LangGraphAgent(
            name="chatbot",
            graph=workflow.compile()
        )
    }

# Configure Langfuse
credentials = base64.b64encode(
    f"{os.getenv('LANGFUSE_PUBLIC_KEY')}:{os.getenv('LANGFUSE_SECRET_KEY')}".encode()
).decode()

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint=f"{os.getenv('LANGFUSE_HOST')}/api/public/otel/v1/traces",
        headers={"Authorization": f"Basic {credentials}"}
    )
)

app.run(create_agent)

TypeScript + LangGraph + Console

import { createExpressRoutes } from "@cloudbase/agent-server";
import { LanggraphAgent } from "@cloudbase/agent-adapter-langgraph";
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import express from "express";
import { ExporterType } from "@cloudbase/agent-observability/server";

const app = express();

const createAgent = () => {
  const model = new ChatOpenAI({
    modelName: "gpt-4",
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const StateAnnotation = Annotation.Root({
    messages: Annotation<string[]>({
      reducer: (x, y) => x.concat(y),
      default: () => [],
    }),
  });

  const graph = new StateGraph(StateAnnotation)
    .addNode("chat", async (state) => {
      const response = await model.invoke(state.messages);
      return { messages: [response.content] };
    })
    .addEdge("__start__", "chat")
    .addEdge("chat", "__end__");

  return {
    agent: new LanggraphAgent({
      name: "chatbot",
      compiledWorkflow: graph.compile(),
    }),
  };
};

createExpressRoutes({
  createAgent,
  express: app,
  observability: { type: ExporterType.Console },
});

app.listen(9000);

Best Practices

1. Use Console Export in Development Environments

During local development, using ConsoleTraceConfig or ExporterType.Console allows you to view trace data in real-time without configuring external services.

2. Use OTLP Export in Production Environments

For production environments, it's recommended to export trace data to professional platforms like Langfuse or Jaeger for long-term storage and analysis.
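One way to combine both practices is to choose the exporter from the environment at startup. A minimal sketch of the selection logic (the returned dict stands in for ConsoleTraceConfig / OTLPTraceConfig, and OTLP_ENDPOINT is a hypothetical variable name, unlike AUTO_TRACES_STDOUT, which the SDK reads):

```python
def observability_config(env: dict) -> dict:
    """Pick exporter settings from environment variables.

    Pass os.environ in real code; a plain dict is used here so the
    logic is easy to test in isolation.
    """
    if env.get("OTLP_ENDPOINT"):
        # Production: export to an OTLP backend such as Langfuse or Jaeger.
        return {"type": "otlp", "endpoint": env["OTLP_ENDPOINT"]}
    if env.get("AUTO_TRACES_STDOUT", "false").lower() == "true":
        # Development: print spans to the console.
        return {"type": "console"}
    return {"type": "disabled"}

print(observability_config({"AUTO_TRACES_STDOUT": "true"}))
```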