Observability Guide
CloudBase provides built-in observability capabilities based on OpenTelemetry and OpenInference standards, helping developers track and monitor the complete execution chain of AI Agents.
Prerequisites
- Agent application created (LangChain / LangGraph / CrewAI)
- Corresponding SDK installed (`cloudbase-agent-server` / `@cloudbase/agent-server`)
- Understanding of OpenTelemetry basics (optional)
Install Dependencies
- Python

```shell
# Basic dependencies
pip install cloudbase-agent-server cloudbase-agent-observability

# If you need to export to an OTLP backend (e.g., Langfuse)
pip install opentelemetry-exporter-otlp
```

- TypeScript

```shell
# Basic dependencies
npm install @cloudbase/agent-server @cloudbase/agent-observability

# If you need to export to an OTLP backend
npm install @opentelemetry/exporter-trace-otlp-http
```
Overview
What is Observability
Observability is the ability to understand a system's internal state through signals output by the system (logs, metrics, traces). For AI Agent applications, observability helps you:
- Track Execution Chains: View the complete call chain from the Agent receiving a request to returning a response
- Identify Performance Bottlenecks: Pinpoint time-consuming LLM calls or tool executions
- Debug Issues: Analyze the Agent's decision process and tool call parameters
- Optimize Costs: Calculate token usage and analyze model invocation frequency
Observability Features
- Out-of-the-Box: Enable with one line of code or one environment variable, no complex configuration
- Full-Chain Tracing: Automatically links Server layer → Adapter layer → Agent SDK layer call chains
- Standardized: Follows OpenTelemetry and OpenInference semantic conventions
- Multiple Export Targets: Supports console output (debugging) and OTLP export (Langfuse, Jaeger, etc.)
Architecture Principles
Span Hierarchy Example
Using a LangGraph workflow as an example, a typical Span hierarchy looks like this:

```
AG-UI.Server (request entry point)
└─ Adapter.LangGraph (Agent adapter layer)
   └─ LangGraph
      ├─ node_a (LangGraph node)
      │  └─ ChatOpenAI (LLM call)
      ├─ node_b (LangGraph node)
      │  ├─ ChatOpenAI (LLM call)
      │  └─ calculator (tool call)
      └─ synthesizer (LangGraph node)
         └─ ChatOpenAI (LLM call)
```
Span Type Description
| Type | Icon | Description | Examples |
|---|---|---|---|
| CHAIN | ⛓️ | Chained calls | Adapter.LangGraph, LangGraph nodes |
| LLM | 💬 | LLM calls | ChatOpenAI, ChatAnthropic |
| TOOL | 🔧 | Tool calls | calculator, get_weather |
| AGENT | 🤖 | Agent calls | Multi-Agent orchestration scenarios |
Standards Followed
- OpenTelemetry: Standard framework for distributed tracing, providing concepts like Span, Trace, Context
- OpenInference: Semantic conventions for AI applications, defining attribute specifications for Span types like LLM, TOOL, CHAIN
Key attributes include:
- `input.value` / `output.value`: Input/output content
- `llm.model_name`: Model identifier
- `llm.token_count.prompt` / `llm.token_count.completion`: Token usage
- `tool.name`: Tool function name
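As a concrete illustration, the attributes carried by a single LLM span (using the values from the console-output example later in this guide) can be modeled as a flat dictionary; the `total_tokens` helper below is illustrative, not part of the SDK:

```python
# Illustrative sketch of the flat attribute map an LLM span carries
# under the OpenInference semantic conventions listed above.
llm_span_attributes = {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-4",
    "input.value": "Hello, how are you?",
    "output.value": "I'm doing well, thank you!",
    "llm.token_count.prompt": 5,
    "llm.token_count.completion": 7,
}

def total_tokens(attrs: dict) -> int:
    """Sum the prompt and completion token counts recorded on a span."""
    return (attrs.get("llm.token_count.prompt", 0)
            + attrs.get("llm.token_count.completion", 0))
```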
Quick Start
Method 1: Enable via Environment Variables (Recommended)
This is the simplest approach: no code changes are required, just set an environment variable.
```shell
# Enable console output (for local development debugging)
AUTO_TRACES_STDOUT=true

# Disable observability
AUTO_TRACES_STDOUT=false
```
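How the SDK parses this flag is internal, but a boolean environment toggle of this kind is typically read along these lines (the `traces_enabled` helper below is a sketch, not part of the SDK):

```python
import os

def traces_enabled() -> bool:
    # Illustrative only: treat common truthy spellings as "enabled";
    # anything else (including unset) leaves console tracing off.
    return os.getenv("AUTO_TRACES_STDOUT", "").strip().lower() in ("1", "true", "yes")
```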
Example:
- Python

```python
# app.py - no code changes needed
from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.langgraph import LangGraphAgent

app = AgentServiceApp()  # Automatically reads the AUTO_TRACES_STDOUT environment variable
app.run(lambda: {"agent": agent})
```

- TypeScript

```javascript
// index.js - no code changes needed
import { createExpressRoutes } from "@cloudbase/agent-server";

createExpressRoutes({
  createAgent,
  express: app,
  // Don't pass the observability parameter; AUTO_TRACES_STDOUT is read automatically
});
```
Method 2: Enable via Code Configuration
For finer control (e.g., OTLP export configuration), you can explicitly configure via code.
- Python

```python
from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import ConsoleTraceConfig, OTLPTraceConfig

# Option A: Console output (local debugging)
app = AgentServiceApp(observability=ConsoleTraceConfig())

# Option B: Export to Langfuse
app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="https://your-langfuse.com/api/public/otel/v1/traces",
        headers={"Authorization": "Basic your-credentials"}
    )
)

app.run(lambda: {"agent": agent})
```

- TypeScript

```typescript
import { createExpressRoutes } from "@cloudbase/agent-server";
import { ExporterType } from "@cloudbase/agent-observability/server";

// Option A: Console output (local debugging)
createExpressRoutes({
  createAgent,
  express: app,
  observability: { type: ExporterType.Console }
});

// Option B: Export to an OTLP backend
createExpressRoutes({
  createAgent,
  express: app,
  observability: {
    type: ExporterType.OTLP,
    url: "https://your-langfuse.com/api/public/otel/v1/traces",
    headers: { "Authorization": "Basic your-credentials" }
  }
});
```
Exporter Configuration
Console Export (Local Debugging)
The console exporter outputs Span information in JSON format to the console, suitable for local development and debugging.
Output Example:
```json
{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "span_id": "a1b2c3d4e5f67890",
  "parent_span_id": "0987654321fedcba",
  "name": "ChatOpenAI",
  "kind": "SPAN_KIND_INTERNAL",
  "start_time": "2025-01-15T08:30:00.123456Z",
  "end_time": "2025-01-15T08:30:01.234567Z",
  "attributes": {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-4",
    "input.value": "Hello, how are you?",
    "output.value": "I'm doing well, thank you!",
    "llm.token_count.prompt": 5,
    "llm.token_count.completion": 7
  }
}
```
Viewing Tips:
```shell
# Save spans to a file (one JSON object per line)
python app.py 2>&1 | grep '^{' > spans.jsonl

# Parse and pretty-print each span (requires Python 3.8+ for --json-lines)
python app.py 2>&1 | grep '^{' | python -m json.tool --json-lines
```
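Because each line of `spans.jsonl` is a standalone span object, the file can be post-processed with a few lines of stdlib Python, for example to total token usage per model (a sketch assuming the span schema shown above):

```python
import json
from collections import defaultdict

def summarize_token_usage(lines):
    """Aggregate prompt/completion token counts per model from JSONL span records."""
    usage = defaultdict(lambda: {"prompt": 0, "completion": 0})
    for line in lines:
        span = json.loads(line)
        attrs = span.get("attributes", {})
        if attrs.get("openinference.span.kind") != "LLM":
            continue  # only LLM spans carry token counts
        model = attrs.get("llm.model_name", "unknown")
        usage[model]["prompt"] += attrs.get("llm.token_count.prompt", 0)
        usage[model]["completion"] += attrs.get("llm.token_count.completion", 0)
    return dict(usage)

if __name__ == "__main__":
    with open("spans.jsonl") as f:
        print(json.dumps(summarize_token_usage(f), indent=2))
```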
OTLP Export (Production)
OTLP (OpenTelemetry Protocol) is OpenTelemetry's standard transmission protocol; it can export trace data to a variety of backends such as Langfuse, Jaeger, and Zipkin.
Langfuse Configuration Example
Langfuse is an open-source LLM observability platform with native OpenTelemetry support.
- Python

```python
import base64

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import OTLPTraceConfig

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="https://cloud.langfuse.com/api/public/otel/v1/traces",
        headers={
            "Authorization": "Basic " + base64.b64encode(
                f"{public_key}:{secret_key}".encode()
            ).decode()
        }
    )
)
```

- TypeScript

```typescript
import { createExpressRoutes } from "@cloudbase/agent-server";
import { ExporterType } from "@cloudbase/agent-observability/server";

createExpressRoutes({
  createAgent,
  express: app,
  observability: {
    type: ExporterType.OTLP,
    url: "https://cloud.langfuse.com/api/public/otel/v1/traces",
    headers: {
      "Authorization": `Basic ${Buffer.from(`${publicKey}:${secretKey}`).toString("base64")}`
    }
  }
});
```
Jaeger Configuration Example
Jaeger is an open-source distributed tracing system originally developed at Uber.
- Python

```python
from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.observability.server import OTLPTraceConfig

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint="http://localhost:4318/v1/traces"
    )
)
```

- TypeScript

```typescript
import { createExpressRoutes } from "@cloudbase/agent-server";
import { ExporterType } from "@cloudbase/agent-observability/server";

createExpressRoutes({
  createAgent,
  express: app,
  observability: {
    type: ExporterType.OTLP,
    url: "http://localhost:4318/v1/traces"
  }
});
```
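For local testing, a Jaeger all-in-one container can expose the OTLP/HTTP endpoint on port 4318 (assuming Docker is available; `COLLECTOR_OTLP_ENABLED` is required on older Jaeger releases and harmless on recent ones, where OTLP ingest is on by default):

```shell
docker run --rm --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest
# UI: http://localhost:16686
# OTLP/HTTP ingest: http://localhost:4318/v1/traces
```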
Complete Examples
Python + LangGraph + Langfuse
```python
import os
import base64

from cloudbase_agent.server import AgentServiceApp
from cloudbase_agent.langgraph import LangGraphAgent
from cloudbase_agent.observability.server import OTLPTraceConfig
from langgraph.graph import StateGraph, MessagesState
from langchain_openai import ChatOpenAI


# Create Agent
def create_agent():
    model = ChatOpenAI(
        api_key=os.getenv("OPENAI_API_KEY"),
        base_url=os.getenv("OPENAI_BASE_URL"),
        model="gpt-4"
    )

    async def chat_node(state):
        response = await model.ainvoke(state["messages"])
        return {"messages": [response]}

    workflow = StateGraph(MessagesState)
    workflow.add_node("chat", chat_node)
    workflow.add_edge("__start__", "chat")
    workflow.add_edge("chat", "__end__")

    return {
        "agent": LangGraphAgent(
            name="chatbot",
            graph=workflow.compile()
        )
    }


# Configure Langfuse
credentials = base64.b64encode(
    f"{os.getenv('LANGFUSE_PUBLIC_KEY')}:{os.getenv('LANGFUSE_SECRET_KEY')}".encode()
).decode()

app = AgentServiceApp(
    observability=OTLPTraceConfig(
        endpoint=f"{os.getenv('LANGFUSE_HOST')}/api/public/otel/v1/traces",
        headers={"Authorization": f"Basic {credentials}"}
    )
)

app.run(create_agent)
```
TypeScript + LangGraph + Console
```typescript
import { createExpressRoutes } from "@cloudbase/agent-server";
import { LanggraphAgent } from "@cloudbase/agent-adapter-langgraph";
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import express from "express";
import { ExporterType } from "@cloudbase/agent-observability/server";

const app = express();

const createAgent = () => {
  const model = new ChatOpenAI({
    modelName: "gpt-4",
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const StateAnnotation = Annotation.Root({
    messages: Annotation<string[]>({
      reducer: (x, y) => x.concat(y),
      default: () => [],
    }),
  });

  const graph = new StateGraph(StateAnnotation)
    .addNode("chat", async (state) => {
      const response = await model.invoke(state.messages);
      return { messages: [response.content] };
    })
    .addEdge("__start__", "chat")
    .addEdge("chat", "__end__");

  return {
    agent: new LanggraphAgent({
      name: "chatbot",
      compiledWorkflow: graph.compile(),
    }),
  };
};

createExpressRoutes({
  createAgent,
  express: app,
  observability: { type: ExporterType.Console },
});

app.listen(9000);
Best Practices
1. Use Console Export in Development Environments
During local development, using ConsoleTraceConfig or ExporterType.Console lets you view trace data in real time without configuring any external service.
2. Use OTLP Export in Production Environments
For production environments, it's recommended to export trace data to professional platforms like Langfuse or Jaeger for long-term storage and analysis.