Improve v2 docs
Signed-off-by: Yuri Shkuro <[email protected]>
yurishkuro committed Nov 24, 2024
1 parent 9445d0b commit 64eaaf1
Showing 11 changed files with 87 additions and 64 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/ci-test.yml
@@ -40,12 +40,10 @@ jobs:
go install github.com/wjdp/htmltest@latest
- name: Strict link check for newer versions
run: |
htmltest -c .htmltest.yml
run: make check-links

- name: Relaxed link check for newer versions
run: |
htmltest -c .htmltest.old-versions.yml
run: make check-links-older

spellcheck:
runs-on: ubuntu-latest
14 changes: 9 additions & 5 deletions Makefile
@@ -4,6 +4,7 @@
HTMLPROOFER = bundle exec htmlproofer
HUGO_THEME = jaeger-docs
THEME_DIR := themes/$(HUGO_THEME)
HTMLTEST := htmltest

# generate currently doesn't do anything, but can be useful in the future.
generate:
@@ -40,13 +41,16 @@ build: clean generate
link-checker-setup:
curl https://raw.githubusercontent.com/wjdp/htmltest/master/godownloader.sh | bash

run-link-checker:
bin/htmltest
check-links:
$(HTMLTEST) --conf .htmltest.yml

check-internal-links: clean build link-checker-setup run-link-checker
check-links-older:
$(HTMLTEST) --conf .htmltest.old-versions.yml

check-all-links: clean build link-checker-setup
bin/htmltest --conf .htmltest.external.yml
check-links-external:
$(HTMLTEST) --conf .htmltest.external.yml

check-links-all: check-links check-links-older check-links-external

spellcheck:
cat scripts/cspell/project-names.txt | grep -v '^#' | grep -v '^\s*$$' | tr ' ' '\n' > scripts/cspell/project-names-parsed.txt
25 changes: 9 additions & 16 deletions content/docs/next-release-v2/_index.md
@@ -16,7 +16,7 @@ If you are new to distributed tracing, please take a look at the [Related Links]

## About

Jaeger is a distributed tracing platform released as open source by [Uber Technologies][ubeross] and donated to [Cloud Native Computing Foundation](https://cncf.io/) where it is a graduated project.
Jaeger is a distributed tracing platform released as open source by [Uber Technologies][ubeross] in 2016 and donated to the [Cloud Native Computing Foundation](https://cncf.io/), where it is a graduated project.

With Jaeger you can:

@@ -31,26 +31,19 @@ Uber published a blog post, [Evolving Distributed Tracing at Uber](https://eng.u

* [OpenTracing](https://opentracing.io/)-inspired data model
* [OpenTelemetry](https://opentelemetry.io/) compatible
* Multiple built-in storage backends: Cassandra, Elasticsearch, OpenSearch, and in-memory
* Community supported external storage backends via the gRPC plugin: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse)
* System topology graphs
* Adaptive sampling
* Service Performance Monitoring (SPM)
* Post-collection data processing

See [Features](./features/) page for more details.

## Technical Specs

* Backend components implemented in Go
* React/Javascript UI
* Supported storage backends:
* Multiple built-in storage backends:
* [Cassandra 4+](./cassandra/)
* [Elasticsearch 7.x, 8.x](./elasticsearch/)
* [Badger](./badger/)
* [Kafka](./kafka/) - as an intermediate buffer
* [Memory storage](./memory/)
* Custom backends via [Remote Storage API](./storage/#remote-storage)
* Extensibility with custom backends via [Remote Storage API](./storage/#remote-storage)
* System topology / service dependencies graphs
* Adaptive sampling
* Service Performance Monitoring (SPM)
* Post-collection data processing

See [Features](./features/) page for more details.

## Quick Start

40 changes: 30 additions & 10 deletions content/docs/next-release-v2/apis.md
@@ -45,7 +45,9 @@ The following tables list the default ports used by Jaeger components. They can

Jaeger can receive trace data in multiple formats on different ports.

### OpenTelemetry Protocol (stable)
### OpenTelemetry Protocol

**Status**: Stable

Jaeger can receive trace data from the OpenTelemetry SDKs in their native [OpenTelemetry Protocol (OTLP)][otlp]. The OTLP data is accepted in these formats:
* binary gRPC
@@ -57,19 +59,25 @@ Only tracing data is accepted, since Jaeger does not store other telemetry types
[otlp-rcvr]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/otlpreceiver/README.md
[otlp]: https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md

### Legacy Protobuf via gRPC (stable)
### Legacy Protobuf via gRPC

**Status**: Stable, Deprecated

Jaeger's legacy Protobuf format is defined in the [collector.proto] IDL file. Support for this format has been removed from OpenTelemetry SDKs; it is only maintained for backwards compatibility.

### Legacy Thrift over HTTP (stable)
### Legacy Thrift over HTTP

**Status**: Stable, Deprecated

Jaeger's legacy Thrift format is defined in [jaeger.thrift] IDL file, and is only maintained for backwards compatibility. The Thrift payload can be submitted in an HTTP POST request to the `/api/traces` endpoint, for example, `https://jaeger-collector:14268/api/traces`. The `Batch` struct needs to be encoded using Thrift's `binary` encoding, and the HTTP request should specify the content type header:

```
Content-Type: application/vnd.apache.thrift.binary
```
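
As a rough illustration of the request described above, the following Python sketch builds (but does not send) the HTTP POST. The helper name `build_thrift_request` is hypothetical, and the payload is assumed to be a `Batch` struct already serialized with Thrift's `binary` encoding by a Thrift library of your choice; only the endpoint path and the content type are taken from the text above.

```python
import urllib.request

# Content type required for Thrift binary payloads, as stated in the docs.
THRIFT_BINARY_CONTENT_TYPE = "application/vnd.apache.thrift.binary"


def build_thrift_request(collector_url: str, payload: bytes) -> urllib.request.Request:
    """Build a POST request for the collector's /api/traces endpoint.

    `payload` is assumed to be a Thrift-binary-encoded Batch struct;
    producing it is outside the scope of this sketch.
    """
    return urllib.request.Request(
        url=collector_url.rstrip("/") + "/api/traces",
        data=payload,
        headers={"Content-Type": THRIFT_BINARY_CONTENT_TYPE},
        method="POST",
    )


# Placeholder bytes stand in for a real encoded Batch.
request = build_thrift_request("https://jaeger-collector:14268", b"\x00")
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would submit it to the collector.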

### Zipkin Formats (stable)
### Zipkin Formats

**Status**: Stable

Jaeger can accept spans in several Zipkin data formats, namely JSON v1/v2 and Thrift. **jaeger-collector** needs to be configured to enable Zipkin HTTP server, e.g. on port 9411 used by Zipkin collectors. The server enables two endpoints that expect POST requests:

@@ -80,23 +88,33 @@ Jaeger can accept spans in several Zipkin data formats, namely JSON v1/v2 and Th

Traces saved in the storage can be retrieved by calling **jaeger-query** Service.

### gRPC/Protobuf (stable)
### gRPC/Protobuf

**Status**: Stable

The recommended way for programmatically retrieving traces and other data is via the `jaeger.api_v3.QueryService` gRPC endpoint defined in [api_v3/query_service.proto](https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v3/query_service.proto) IDL file. In the default configuration this endpoint is accessible on port `:16685`. The legacy [api_v2](https://github.com/jaegertracing/jaeger-idl/tree/main/proto/api_v2) is also supported.

### HTTP JSON (internal)
### HTTP JSON

**Status**: Internal

Jaeger UI communicates with **jaeger-query** Service via JSON API. For example, a trace can be retrieved via a GET request to `https://jaeger-query:16686/api/traces/{trace-id-hex-string}`. This JSON API is intentionally undocumented and subject to change.
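
For illustration only, here is a small Python sketch that assembles the URL shown above from a trace ID. The function name `trace_url` and the hex-only validation are assumptions made for this example; since the JSON API is internal, nothing here should be treated as a stable contract.

```python
import re

# A trace ID is passed as a hex string; this sketch only checks the character set.
_HEX_RE = re.compile(r"[0-9a-fA-F]+")


def trace_url(query_base_url: str, trace_id: str) -> str:
    """Return the URL of the (internal) JSON endpoint for a single trace."""
    if not _HEX_RE.fullmatch(trace_id):
        raise ValueError(f"trace ID must be a hex string, got {trace_id!r}")
    return f"{query_base_url.rstrip('/')}/api/traces/{trace_id}"


# The trace ID below is an arbitrary example value.
print(trace_url("https://jaeger-query:16686", "0af7651916cd43dd8448eb211c80319c"))
```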

## Remote Storage API (stable)
## Remote Storage API

**Status**: Stable

When using the `grpc` storage type (a.k.a. [remote storage](../storage/#remote-storage)), Jaeger components can use custom storage backends as long as those backends implement the gRPC [Remote Storage API][storage.proto].

## Remote Sampling Configuration (stable)
## Remote Sampling Configuration

**Status**: Stable

This API supports Jaeger's [Remote Sampling](../sampling/#remote-sampling) protocol, defined in the [sampling.proto] IDL file. See [Remote Sampling](../sampling/#remote-sampling) for details on how to configure Jaeger with sampling strategies.

## Service dependencies graph (internal)
## Service dependencies graph

**Status**: Internal

The service dependencies graph can be retrieved from the `/api/dependencies` endpoint. The GET request expects two parameters:

@@ -107,7 +125,9 @@ The returned JSON is a list of edges represented as tuples `(caller, callee, cou

For programmatic access to the service graph, the recommended API is gRPC/Protobuf described above.
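
As a hedged sketch of how a client might consume the edge list, the Python snippet below folds `(caller, callee, count)` tuples into a nested mapping. It assumes the edges have already been decoded from the response into plain tuples; the actual JSON field names are not shown here, and the service names and the helper `build_call_graph` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One edge of the service graph: (caller, callee, call count).
Edge = Tuple[str, str, int]


def build_call_graph(edges: Iterable[Edge]) -> Dict[str, Dict[str, int]]:
    """Group edges by caller, summing counts for repeated caller/callee pairs."""
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for caller, callee, count in edges:
        graph[caller][callee] += count
    # Convert to plain dicts so the result compares and prints cleanly.
    return {caller: dict(callees) for caller, callees in graph.items()}


# Hypothetical services, for illustration only.
sample_edges = [
    ("frontend", "driver", 10),
    ("frontend", "route", 5),
    ("frontend", "driver", 2),
]
print(build_call_graph(sample_edges))
```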

## Service Performance Monitoring (internal)
## Service Performance Monitoring

**Status**: Internal

Please refer to the [SPM Documentation](../spm/#api).

26 changes: 19 additions & 7 deletions content/docs/next-release-v2/architecture.md
@@ -12,24 +12,36 @@ children:
url: terminology
---

Jaeger can be deployed either as an **all-in-one** binary, where all Jaeger backend components
run in a single process, or as a scalable distributed system. There are two main deployment options discussed below.
Jaeger v2 is designed to be a versatile and flexible tracing platform. It can be deployed as a single binary that can be configured to perform different roles within the Jaeger architecture, such as:
* **collector**: Receives incoming trace data from applications and writes it into a storage backend.
* **query**: Serves the APIs and the user interface for querying and visualizing traces.
* **ingester**: Ingests spans from Kafka and writes them into a storage backend; useful when running in a [split collector-Kafka-ingester configuration](./#via-kafka).
* **all-in-one**: Collector and query roles in a single process.
* **agent**: A host agent or a sidecar that runs next to the application and forwards trace data to the collector. While Jaeger can be configured for this role, we recommend using the standard [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) instead, because you will likely need it to process other types of telemetry (metrics & logs).

## Direct to storage
Choosing between the **all-in-one** and the **collector**/**query** configurations is a matter of preference. When using an external storage backend, both configurations are horizontally scalable, but the **collector**/**query** configuration lets you separate the read and write traffic, scale them independently, and apply different access and security policies.

In this deployment Jaeger receives the data from traced applications and writes it directly to storage. The storage must be able to handle both average and peak traffic. Collectors use an in-memory queue to smooth short-term traffic peaks, but a sustained traffic spike may result in dropped data if the storage is not able to keep up.
The **all-in-one** configuration with in-memory storage is most suitable for development and testing, but it is not recommended for production since the data is lost on restarts. **all-in-one** with the [Badger](../badger/) backend _can_ be used in production, but only for modest data volumes since it is limited to a single instance and cannot be scaled horizontally.

## Architecture choices

The two most common deployment options for a scalable Jaeger backend are direct-to-storage and using Kafka as a buffer.

### Direct to storage

In this deployment the **collector**s receive the data from traced applications and write it directly to storage. The storage must be able to handle both average and peak traffic. The **collector**s may use an in-memory queue to smooth short-term traffic peaks, but a sustained traffic spike may result in dropped data if the storage is not able to keep up.

![Architecture](/img/architecture-v2-2024.png)

## Via Kafka
### Via Kafka

To prevent data loss between collectors and storage, Kafka can be used as an intermediary, persistent queue. Jaeger can be deployed with OpenTelemetry to handle writing the data to Kafka and pulling it off the queue and writing the data to the storage. Multiple Jaeger instances can be deployed to scale up ingestion; they will automatically partition the load across them.
To prevent data loss between **collector**s and storage, Kafka can be used as an intermediary, persistent queue. The **collector**s are configured with Kafka exporters. An additional component, **ingester**, needs to be deployed to read data from Kafka and save it to storage. Multiple **ingester**s can be deployed to scale up ingestion; they will automatically partition the load across them. In practice, an **ingester** is very similar to a **collector**, only configured with a Kafka receiver instead of RPC-based receivers.

![Architecture](/img/architecture-v2-kafka-2024.png)

## With OpenTelemetry Collector

You **do not need to use OpenTelemetry Collector**, because **Jaeger** is a customized distribution of the OpenTelemetry Collector with different roles. However, if you already use the OpenTelemetry Collectors, for gathering other types of telemetry or for pre-processing / enriching the tracing data, it __can be placed before__ **Jaeger**. The OpenTelemetry Collectors can be run as an application sidecar, as a host agent / daemon, or as a central cluster.
You **do not need** to use the OpenTelemetry Collector to operate Jaeger, because Jaeger is a customized distribution of the OpenTelemetry Collector with different roles. However, if you already use OpenTelemetry Collectors for gathering other types of telemetry or for pre-processing / enriching the tracing data, they can be placed _in front of_ Jaeger in the pipeline. The OpenTelemetry Collectors can be run as an application sidecar, as a host agent / daemon, or as a central cluster.

The OpenTelemetry Collector supports Jaeger's Remote Sampling protocol and can either serve static configurations from config files directly, or proxy the requests to the Jaeger backend (e.g., when using adaptive sampling).

2 changes: 1 addition & 1 deletion content/docs/next-release-v2/configuration.md
@@ -64,7 +64,7 @@ Of note here is the `storage` section, which references by name the storage back

### Remote sampling

`remote_sampling` extension is responsible for running HTTP/gRPC servers that expose the [Remote Sampling API](../apis/#remote-sampling-configuration-stable).
`remote_sampling` extension is responsible for running HTTP/gRPC servers that expose the [Remote Sampling API](../apis/#remote-sampling-configuration).

```yaml
remote_sampling:
8 changes: 4 additions & 4 deletions content/docs/next-release-v2/features.md
@@ -5,11 +5,11 @@ hasparent: true

## High Scalability

Jaeger backend is designed to have no single points of failure and to scale with the business needs. For example, any given Jaeger installation at Uber is typically processing several billion spans per day.
The Jaeger backend is designed to have no single points of failure and to scale with the business needs. For example, the Jaeger installation at Uber typically processes several billion spans per day.

## Cloud Native

Jaeger backend is distributed as a Docker image or a raw binary, available for multiple platforms. The behavior of the binary can be customized via YAML configuration file. Deployment to Kubernetes clusters is assisted by a [Kubernetes operator](https://github.com/jaegertracing/jaeger-operator) and a [Helm chart](https://github.com/kubernetes/charts/tree/master/incubator/jaeger).
Jaeger backend is distributed as a container image or a raw binary, available for multiple platforms. The behavior of the binary can be customized via YAML configuration file. Deployment to Kubernetes clusters is assisted by a [Kubernetes operator](https://github.com/jaegertracing/jaeger-operator) and a [Helm chart](https://github.com/kubernetes/charts/tree/master/incubator/jaeger).

## OpenTelemetry

@@ -18,13 +18,13 @@ Jaeger backend and Web UI have been designed from ground up to support the OpenT
* Represent traces as directed acyclic graphs (not just trees) via [span references](https://github.com/opentracing/specification/blob/master/specification.md#references-between-spans)
* Support strongly typed span _tags_ and _structured logs_

Jaeger can receive trace data from the OpenTelemetry SDKs in their native [OpenTelemetry Protocol (OTLP)][otlp]. However, the internal data representation and the UI still follow the OpenTracing specification's model.
Jaeger can receive trace data in the standard [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/specs/otel/protocol/). However, the internal data representation and the UI still follow the OpenTracing specification's model.

## Multiple storage backends

Jaeger can be used with a growing number of storage backends:
* It natively supports popular open source NoSQL databases as trace storage backends: Cassandra 4.0+, Elasticsearch 7.x/8.x, and OpenSearch 1.0+.
* It integrates via a gRPC API with other well known databases that have been certified to be Jaeger compliant: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse).
* It can be extended via the [Remote Storage gRPC API](../apis/#remote-storage-api) to work with other well-known databases that have been certified to be Jaeger compliant: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse).
* There is embedded database support using [Badger](https://github.com/dgraph-io/badger) and simple in-memory storage for testing setups.
* There are ongoing community experiments using other databases; you can find more in [this issue](https://github.com/jaegertracing/jaeger/issues/638).

22 changes: 9 additions & 13 deletions content/docs/next-release-v2/getting-started.md
@@ -4,27 +4,25 @@ hasparent: true
weight: 2
---

## In Docker
## All-in-one

The easiest way to run Jaeger is by starting a Docker container:
The easiest way to run Jaeger is by starting it in a container:

```
docker run --rm --name jaeger \
-p 5778:5778 \
-p 16686:16686 \
-p 4317:4317 \
-p 4318:4318 \
-p 14250:14250 \
-p 14268:14268 \
-p 5778:5778 \
-p 9411:9411 \
jaegertracing/jaeger:{{< currentVersion >}}
```

This runs the "all-in-one" configuration of Jaeger (using a configuration file embedded in the binary) that combines collector and query components in a single process and uses a transient in-memory storage for trace data. You can navigate to `http://localhost:16686` to access the Jaeger UI. See the [APIs page](../apis/) for a list of other exposed ports.
This runs the **all-in-one** configuration of Jaeger that combines collector and query components in a single process and uses a transient in-memory storage for trace data. You can navigate to `http://localhost:16686` to access the Jaeger UI. See the [APIs page](../apis/) for a full list of exposed ports.

## Instrumentation

Your applications must be instrumented before they can send tracing data to Jaeger. We recommend using the [OpenTelemetry][otel] instrumentation and SDKs.
{{< warning >}}
Your applications must be instrumented before they can send tracing data to Jaeger. We recommend using the [OpenTelemetry](https://opentelemetry.io/) instrumentation and SDKs.
{{< /warning >}}

## 🚗 HotROD Demo

@@ -40,15 +38,13 @@ Using this application you can:
- Use open source libraries from `opentelemetry-contrib` to get vendor-neutral instrumentation
for free.

### Running

We recommend running Jaeger and HotROD together via `docker compose`:

```bash
git clone git@github.com:jaegertracing/jaeger.git jaeger
git clone https://github.com/jaegertracing/jaeger.git jaeger
cd jaeger/examples/hotrod
docker compose -f docker-compose-v2.yml up
# Ctrl-C to stop
# press Ctrl-C to exit
```

Then navigate to `http://localhost:8080`. See the [README](https://github.com/jaegertracing/jaeger/blob/main/examples/hotrod/README.md) for other ways to run the demo.
2 changes: 1 addition & 1 deletion content/docs/next-release-v2/sampling.md
@@ -31,7 +31,7 @@ These are two basic types of head samplers that are used by the remote sampling:
* **Probabilistic** sampler makes a random sampling decision with a pre-configured probability. For example, with `probability=0.1` approximately 1 in 10 traces will be sampled.
* **Rate Limiting** sampler uses a leaky bucket rate limiter to ensure that traces are sampled with a certain constant rate. For example, when `rate=2.0` it will sample requests with the rate of 2 traces per second.
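
The two samplers above can be sketched in Python as follows. This is a simplified illustration, not Jaeger's or any SDK's actual implementation: the class names, the injectable `rng` and `clock` parameters, and the credit-based (token-bucket-style) way of realizing the rate limiter are all assumptions made for the example.

```python
import random
import time
from typing import Callable, Optional


class ProbabilisticSampler:
    """Samples each trace independently with a fixed probability."""

    def __init__(self, probability: float, rng: Optional[random.Random] = None):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0.0 and 1.0")
        self.probability = probability
        self._rng = rng or random.Random()

    def is_sampled(self) -> bool:
        # random() returns a value in [0.0, 1.0), so 0.0 never samples
        # and 1.0 always samples.
        return self._rng.random() < self.probability


class RateLimitingSampler:
    """Samples at most `rate` traces per second on average.

    Credits accrue at `rate` per second up to a cap; each sampled
    trace spends one credit.
    """

    def __init__(self, rate: float, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self._clock = clock
        self._max_balance = max(rate, 1.0)
        self._balance = self._max_balance
        self._last = clock()

    def is_sampled(self) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill credits for the elapsed time, never exceeding the cap.
        self._balance = min(self._max_balance, self._balance + elapsed * self.rate)
        if self._balance >= 1.0:
            self._balance -= 1.0
            return True
        return False


sampler = ProbabilisticSampler(probability=0.1, rng=random.Random(42))
sampled = sum(sampler.is_sampled() for _ in range(10_000))
print(f"sampled {sampled} of 10000 traces")
```

Injecting the clock and the random generator keeps both samplers deterministic under test, which is the main reason for those extra parameters in this sketch.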

[remote-sampling-api]: ../apis/#remote-sampling-configuration-stable
[remote-sampling-api]: ../apis/#remote-sampling-configuration

### File-based Sampling Configuration

2 changes: 1 addition & 1 deletion content/docs/next-release-v2/troubleshooting.md
@@ -36,7 +36,7 @@ The Jaeger backend supports [Remote Sampling](../sampling/#remote-sampling), i.e

If you suspect the remote sampling is not working correctly, try these steps:

1. Make sure that the SDK is actually configured to use remote sampling, points to the correct sampling service address (see [APIs](../apis/#remote-sampling-configuration-stable)), and that address is reachable from your application's [networking namespace](#networking-namespace).
1. Make sure that the SDK is actually configured to use remote sampling, points to the correct sampling service address (see [APIs](../apis/#remote-sampling-configuration)), and that address is reachable from your application's [networking namespace](#networking-namespace).
1. Look at the root span of the traces that are captured in Jaeger. If you are using Jaeger SDKs, the root span will contain the tags `sampler.type` and `sampler.param`, which indicate which strategy was used. (TBD - do OpenTelemetry SDKs record that?)
1. Verify that the server is returning the appropriate sampling strategy for your service:
```