Skip to content

Commit

Permalink
Add monitoring page
Browse files Browse the repository at this point in the history
Signed-off-by: Michel Hollands <[email protected]>
  • Loading branch information
MichelHollands committed Jan 10, 2025
1 parent c5ad332 commit f253fc6
Showing 1 changed file with 115 additions and 0 deletions.
115 changes: 115 additions & 0 deletions src/content/doc-surrealdb/reference-guide/observability.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
---
sidebar_position: 6
sidebar_label: Observability
title: Observability | Reference guides
description: In SurrealDB, metrics and traces are available if enabled⁠.
---

# Observability
SurrealDB can be monitored by enabling the built in observability.

## Enable observability
To enable observability the `SURREAL_TELEMETRY_PROVIDER` environment variable has to be set to `otlp`. If set to anything else no observability will be available.

If enabled, SurrealDB will send metrics and/or traces to an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/). Configuration of the collector is done via [environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). The most important one is [OTEL_EXPORTER_OTLP_ENDPOINT](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/#otel_exporter_otlp_endpoint). By default this is set to localhost. It should be set to the GRPC endpoint of your OTEL collector. For example if your OTEL collector named `my-collector` is running in Kubernetes in the `monitoring` namespace the following can be used:

```
OTEL_EXPORTER_OTLP_ENDPOINT="http://my-collector.monitoring.svc.cluster.local:4317"
```

Metrics can be disabled (even if `SURREAL_TELEMETRY_PROVIDER` is set to `otlp`) by setting the `SURREAL_TELEMETRY_DISABLE_METRICS` environment variable to `true`. Similarly traces can be disabled by setting `SURREAL_TELEMETRY_DISABLE_TRACING` to `true`.

## Metrics

Metrics are gathered every minute and sent to the collector. The following metrics are present:

<table>
<thead>
<tr>
<th colspan="1" scope="col">Name</th>
<th colspan="1" scope="col">[Instrument](https://opentelemetry.io/docs/concepts/signals/metrics/#metric-instruments)</th>
<th colspan="1" scope="col">Explanation</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
rpc.server.duration
</td>
<td colspan="1" scope="row" data-label="Type">
histogram
</td>
<td colspan="1" scope="row" data-label="Explanation">
Measures duration of inbound RPC requests in milliseconds
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
rpc.server.active_connections
</td>
<td colspan="1" scope="row" data-label="Type">
counter
</td>
<td colspan="1" scope="row" data-label="Explanation">
The number of active WebSocket connections
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
rpc.server.response.size
</td>
<td colspan="1" scope="row" data-label="Type">
histogram
</td>
<td colspan="1" scope="row" data-label="Explanation">
Measures the size of HTTP response messages
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
http.server.duration
</td>
<td colspan="1" scope="row" data-label="Type">
histogram
</td>
<td colspan="1" scope="row" data-label="Explanation">
The HTTP server duration in milliseconds
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
http.server.active_requests
</td>
<td colspan="1" scope="row" data-label="Type">
counter
</td>
<td colspan="1" scope="row" data-label="Explanation">
The number of active HTTP requests
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
http.server.request.size
</td>
<td colspan="1" scope="row" data-label="Type">
histogram
</td>
<td colspan="1" scope="row" data-label="Explanation">
Measures the size of HTTP request messages
</td>
</tr>
<tr>
<td colspan="1" scope="row" data-label="Metric name">
http.server.response.size
</td>
<td colspan="1" scope="row" data-label="Type">
histogram
</td>
<td colspan="1" scope="row" data-label="Explanation">
Measures the size of HTTP response messages
</td>
</tr>
</tbody>
</table>

The metrics are shown here in the form required by the [OpenTelemetry Metrics Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/general/metrics/) with a `.` separator. When ingested into Prometheus the `.` separator will be [replaced](https://prometheus.io/blog/2024/03/14/commitment-to-opentelemetry/#support-utf-8-metric-and-label-names) with an `_`. For example `rpc.server.active.connections` will be transformed into `rpc_server_active_connections`.

0 comments on commit f253fc6

Please sign in to comment.