Collect telemetry for MCP workloads

In this tutorial, you'll set up comprehensive observability for your MCP workloads using OpenTelemetry with Jaeger for distributed tracing, Prometheus for metrics collection, and Grafana for visualization.

By the end, you'll have a complete, industry-standard observability solution that captures detailed traces and metrics, giving you visibility into your MCP server performance and usage patterns.

Grafana dashboard showing MCP telemetry

Choose your deployment path

This tutorial offers two paths for MCP observability:

ToolHive CLI + Docker observability stack

Use the ToolHive CLI to run MCP servers locally, with Jaeger and Prometheus running in Docker containers. This approach is perfect for:

  • Local development and testing
  • Quick setup and experimentation
  • Individual developer workflows
  • Learning OpenTelemetry concepts

Choose one path

Select your preferred deployment method using the tabs above. All subsequent steps will show instructions for your chosen path.

What you'll learn

  • How to deploy Jaeger and Prometheus for your chosen environment
  • How to configure OpenTelemetry collection for ToolHive MCP servers
  • How to analyze traces in Jaeger and metrics in Prometheus
  • How to set up queries and monitoring for MCP workloads
  • Best practices for observability in your deployment environment

Prerequisites

Before starting this tutorial, make sure you have:

Overview

The architecture for each deployment method:

Your setup will include the following components; the sketch after this list shows how telemetry flows between them:

  • ToolHive CLI managing MCP servers in containers
  • Jaeger for distributed tracing with built-in UI
  • Prometheus for metrics collection with web UI
  • OpenTelemetry Collector forwarding data to both backends
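
Here's a simplified sketch of the data flow you'll build (ports match the Docker Compose and collector configuration below):

AI client --> ToolHive proxy (MCP server container)
                   |  OTLP over HTTP (localhost:4318)
                   v
          OpenTelemetry Collector
             |                      |
             | OTLP (jaeger:4317)   | /metrics on 8889
             v                      v  (scraped by Prometheus)
          Jaeger                Prometheus <-- Grafana (queries)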

Step 1: Deploy the observability stack

First, set up the observability infrastructure for your chosen environment.

Create Docker Compose configuration

Create a Docker Compose file for the observability stack:

observability-stack.yml
services:
  jaeger:
    image: jaegertracing/jaeger:latest
    container_name: jaeger
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - '16686:16686' # Jaeger UI
    networks:
      - observability

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
      - '--enable-feature=native-histograms'
    ports:
      - '9090:9090'
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    networks:
      - observability

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - '3000:3000'
    volumes:
      - ./grafana-prometheus.yml:/etc/grafana/provisioning/datasources/prometheus.yml
      - grafana-data:/var/lib/grafana
    networks:
      - observability

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    command: ['--config=/etc/otel-collector-config.yml']
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - '4318:4318' # OTLP HTTP receiver (ToolHive sends here)
      - '8889:8889' # Prometheus exporter metrics
    depends_on:
      - jaeger
      - prometheus
    networks:
      - observability

volumes:
  prometheus-data:
  grafana-data:

networks:
  observability:
    driver: bridge
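
If you want to sanity-check the file before starting it, Docker Compose can parse and print the resolved configuration; any YAML or schema error is reported here instead of at startup:

# Render the merged configuration to catch syntax mistakes early
docker compose -f observability-stack.yml config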

Configure the OpenTelemetry Collector

Create the collector configuration to export to both Jaeger and Prometheus:

otel-collector-config.yml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  # Export traces to Jaeger
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

  # Expose metrics for Prometheus
  prometheus:
    endpoint: 0.0.0.0:8889
    const_labels:
      service: 'toolhive-mcp-proxy'

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
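
Recent collector releases also ship a validate subcommand. If your image version supports it, you can check the configuration without starting the whole stack (a quick sketch, assuming the config file sits in your current directory):

# Parse the collector config and report errors without starting any pipelines
docker run --rm \
  -v "$(pwd)/otel-collector-config.yml:/etc/otel-collector-config.yml" \
  otel/opentelemetry-collector-contrib:latest \
  validate --config=/etc/otel-collector-config.yml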

Configure Prometheus and Grafana

Create a Prometheus configuration to scrape the OpenTelemetry Collector:

prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']

Create the Prometheus data source configuration for Grafana:

grafana-prometheus.yml
apiVersion: 1

datasources:
  - name: prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true
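
Optionally, you can provision a Jaeger data source as well so traces are browsable from Grafana alongside your metrics. This is an extra file that the Compose configuration above doesn't reference; if you use it, mount it into /etc/grafana/provisioning/datasources/ the same way as the Prometheus file. The filename below is just a suggestion:

grafana-jaeger.yml
apiVersion: 1

datasources:
  - name: jaeger
    type: jaeger
    access: proxy
    url: http://jaeger:16686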

Start the observability stack

Deploy the stack and verify it's running:

# Start the stack
docker compose -f observability-stack.yml up -d

# Verify Jaeger is running
curl http://localhost:16686/api/services

# Verify Prometheus is running
curl http://localhost:9090/-/healthy

# Verify the OpenTelemetry Collector is ready
curl -I http://localhost:8889/metrics

Access the interfaces:

  • Jaeger UI: http://localhost:16686
  • Prometheus Web UI: http://localhost:9090
  • Grafana: http://localhost:3000 (login: admin/admin)

Step 2: Configure MCP server telemetry

Now configure your MCP servers to send telemetry data to the observability stack.

Set global telemetry configuration

Configure the ToolHive CLI with default telemetry settings so that MCP servers send data to the OpenTelemetry Collector:

# Configure the OpenTelemetry endpoint (collector, not directly to Jaeger)
thv config otel set-endpoint localhost:4318

# Enable both metrics and tracing
thv config otel set-metrics-enabled true
thv config otel set-tracing-enabled true

# Set 100% sampling for development
thv config otel set-sampling-rate 1.0

# Use insecure connection for local development
thv config otel set-insecure true
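
You can confirm the settings took effect with the matching get commands:

# Print the current telemetry configuration
thv config otel get-endpoint
thv config otel get-metrics-enabled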

Run an MCP server with telemetry

Start an MCP server with enhanced telemetry configuration:

thv run \
  --otel-service-name "mcp-fetch-server" \
  --otel-env-vars "USER,HOST" \
  --otel-enable-prometheus-metrics-path \
  fetch

Verify the server started and is exporting telemetry:

# Check server status
thv list

# Check Prometheus metrics are available on the MCP server
PORT=$(thv list | grep fetch | awk '{print $5}')
curl http://localhost:$PORT/metrics
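
Because --otel-enable-prometheus-metrics-path is set, Prometheus could also scrape the MCP server's /metrics endpoint directly instead of going through the collector. A minimal sketch, assuming Docker Desktop (where host.docker.internal resolves to the host; on Linux you'd typically add an extra_hosts entry or use the host's IP) and substituting the port reported by thv list:

# Add under scrape_configs: in prometheus.yml, then restart or reload Prometheus
  - job_name: 'mcp-fetch-server'
    static_configs:
      - targets: ['host.docker.internal:<PORT>']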

Step 3: Generate telemetry data

Create some MCP interactions to generate traces and metrics for analysis.

Connect your AI client

Your MCP server is already configured to work with your AI client from the CLI quickstart. Simply use your client to make requests that will generate telemetry data.

Generate sample data

Make several requests using your AI client to create diverse telemetry:

  1. Basic fetch request: "Fetch the content from https://toolhive.dev and summarize it"
  2. Multiple requests: Make 3-4 more fetch requests with different URLs
  3. Error generation: Try an invalid URL to generate error traces

Each interaction creates rich telemetry data including:

  • Request traces with timing information sent to Jaeger
  • Tool call details with sanitized arguments
  • Performance metrics sent to Prometheus

The CLI and Kubernetes deployments will both generate similar telemetry data, with the Kubernetes setup including additional Kubernetes-specific attributes.

Step 4: Access and analyze telemetry data

Now examine your telemetry data using Jaeger and Prometheus to understand MCP server performance.

Access Jaeger for traces

Open Jaeger in your browser at http://localhost:16686.

Explore traces in Jaeger

  1. In the Service dropdown, select mcp-fetch-server
  2. Click Find Traces to see recent traces
  3. Click on individual traces to see detailed spans

Look for traces with HTTP and MCP-specific attributes like:

{
  "serviceName": "mcp-fetch-server",
  "http.duration_ms": "307.8",
  "http.status_code": 200,
  "mcp.method": "tools/call",
  "mcp.tool.name": "fetch",
  "mcp.tool.arguments": "url=https://toolhive.dev",
  "mcp.transport": "streamable-http",
  "service.version": "v0.3.6"
}
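
Query metrics in Prometheus

You can also query Prometheus directly to confirm that metrics from the collector are arriving. The HTTP API calls below are standard Prometheus endpoints; the exact ToolHive metric names depend on your ToolHive version, so list them first and adjust your queries accordingly:

# Confirm the otel-collector scrape target is up (value should be 1)
curl -s -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{job="otel-collector"}'

# List ingested metric names and look for ToolHive/MCP metrics
curl -s 'http://localhost:9090/api/v1/label/__name__/values' | tr ',' '\n' | grep -i -E 'toolhive|mcp'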

Access Grafana for visualization

Open http://localhost:3000 in your browser and log in using the default credentials (admin / admin).

Import the ToolHive dashboard

  1. Click the + icon in the top-right of the Grafana interface and select Import dashboard
  2. In the Import via dashboard JSON model input box, paste the contents of this example dashboard file
  3. Click Load, then Import

Make some requests to your MCP server again and watch the dashboard update in real-time.

You can also explore other metrics in Grafana by creating custom panels and queries. See the Observability guide for examples.

Step 5: Cleanup

When you're finished exploring, clean up your resources.

Stop MCP servers

# Stop and remove the MCP server
thv rm fetch

# Clear telemetry configuration (optional)
thv config otel unset-endpoint
thv config otel unset-metrics-enabled
thv config otel unset-tracing-enabled
thv config otel unset-sampling-rate
thv config otel unset-insecure

Stop observability stack

# Stop all containers
docker compose -f observability-stack.yml down

# Remove all data (optional)
docker compose -f observability-stack.yml down -v

# Clean up provisioning directories (optional)
rm -rf grafana/

What's next?

Congratulations! You've successfully set up comprehensive observability for ToolHive MCP workloads using Jaeger and Prometheus.

To learn more about ToolHive's telemetry capabilities and best practices, see the Observability concepts guide.

Here are some next steps to explore:

  • Custom dashboards: Create Grafana dashboards that query both Jaeger and Prometheus
  • Alerting: Set up Prometheus AlertManager for performance and error alerts
  • Performance optimization: Use telemetry data to optimize MCP server performance
  • Distributed tracing: Understand request flows across multiple MCP servers

CLI-specific next steps

Troubleshooting

Docker containers won't start

Check Docker daemon and container logs:

# Verify Docker is running
docker info

# Check container logs
docker compose -f observability-stack.yml logs jaeger
docker compose -f observability-stack.yml logs prometheus
docker compose -f observability-stack.yml logs otel-collector

Common issues:

  • Port conflicts with existing services (see the check below)
  • Insufficient Docker memory allocation
  • Missing configuration files
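
For example, to check whether another process is already listening on one of the ports the stack needs (adjust the list to your setup):

# Any output means the port is already in use by another process
lsof -iTCP -sTCP:LISTEN -nP | grep -E ':(16686|9090|3000|4318|8889)\b'
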
ToolHive CLI not sending telemetry

Verify telemetry configuration:

# Check current config
thv config otel get-endpoint
thv config otel get-metrics-enabled

Check the ToolHive CLI logs for telemetry export errors (an example search follows the list):

  • macOS: ~/Library/Application Support/toolhive/logs/fetch.log
  • Windows: %LOCALAPPDATA%\toolhive\logs\fetch.log
  • Linux: ~/.local/share/toolhive/logs/fetch.log
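
For example, on Linux (adjust the path for your OS from the list above); the exact error wording varies between versions, so search broadly:

# Look for telemetry or OTLP export errors in the server's log
grep -i -E 'otel|otlp|telemetry|export' ~/.local/share/toolhive/logs/fetch.log
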
No traces in Jaeger

Check the telemetry pipeline:

  1. Verify the collector is receiving data: curl http://localhost:8888/metrics (this is the collector's own telemetry port; add '8888:8888' to the otel-collector ports in the Compose file to expose it)
  2. Check collector logs: docker logs otel-collector
  3. Verify Jaeger connectivity: curl http://localhost:16686/api/services

No metrics in Prometheus

Common troubleshooting steps:

  1. Verify Prometheus targets: Check http://localhost:9090/targets to ensure the otel-collector target is UP (or use the API call shown below)
  2. Check collector metrics endpoint: curl http://localhost:8889/metrics (CLI) or port-forward and check in K8s
  3. Review collector configuration: Ensure the Prometheus exporter is properly configured
  4. Check Prometheus config: Verify the scrape configuration includes the collector endpoint
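
If you prefer the command line to the targets page, the same health information is available from the Prometheus HTTP API:

# Show each scrape target's health ("up" means Prometheus can reach it)
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"'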