Improve v2 docs

Signed-off-by: Yuri Shkuro <[email protected]>
jaegertracing · Nov 24, 2024 · 64eaaf1
1 parent 9445d0b
commit 64eaaf1

Showing 11 changed files with 87 additions and 64 deletions.
diff --git a/.github/workflows/ci-test.yml b/.github/workflows/ci-test.yml
@@ -40,12 +40,10 @@ jobs:
         go install github.com/wjdp/htmltest@latest
 
     - name: Strict link check for newer versions
-      run: |
-        htmltest -c .htmltest.yml
+      run: make check-links
 
     - name: Relaxed link check for newer versions
-      run: |
-        htmltest -c .htmltest.old-versions.yml
+      run: make check-links-older
 
   spellcheck:
     runs-on: ubuntu-latest

diff --git a/Makefile b/Makefile
@@ -4,6 +4,7 @@
 HTMLPROOFER  = bundle exec htmlproofer
 HUGO_THEME   = jaeger-docs
 THEME_DIR    := themes/$(HUGO_THEME)
+HTMLTEST     := htmltest
 
 # generate currently doesn't do anything, but can be useful in the future.
 generate:
@@ -40,13 +41,16 @@ build: clean generate
 link-checker-setup:
 	curl https://raw.githubusercontent.com/wjdp/htmltest/master/godownloader.sh | bash
 
-run-link-checker:
-	bin/htmltest
+check-links:
+	$(HTMLTEST) --conf .htmltest.yml
 
-check-internal-links: clean build link-checker-setup run-link-checker
+check-links-older:
+	$(HTMLTEST) --conf .htmltest.old-versions.yml
 
-check-all-links: clean build link-checker-setup
-	bin/htmltest --conf .htmltest.external.yml
+check-links-external:
+	$(HTMLTEST) --conf .htmltest.external.yml
+
+check-links-all: check-links check-links-older check-links-external
 
 spellcheck:
 	cat scripts/cspell/project-names.txt | grep -v '^#' | grep -v '^\s*$$' | tr ' ' '\n' > scripts/cspell/project-names-parsed.txt

diff --git a/content/docs/next-release-v2/_index.md b/content/docs/next-release-v2/_index.md
@@ -16,7 +16,7 @@ If you are new to distributed tracing, please take a look at the [Related Links]
 
 ## About
 
-Jaeger is a distributed tracing platform released as open source by [Uber Technologies][ubeross] and donated to [Cloud Native Computing Foundation](https://cncf.io/) where it is a graduated project.
+Jaeger is a distributed tracing platform released as open source by [Uber Technologies][ubeross] in 2016 and donated to the [Cloud Native Computing Foundation](https://cncf.io/), where it is a graduated project.
 
 With Jaeger you can:
 
@@ -31,26 +31,19 @@ Uber published a blog post, [Evolving Distributed Tracing at Uber](https://eng.u
 
   * [OpenTracing](https://opentracing.io/)-inspired data model
   * [OpenTelemetry](https://opentelemetry.io/) compatible
-  * Multiple built-in storage backends: Cassandra, Elasticsearch, OpenSearch, and in-memory
-  * Community supported external storage backends via the gRPC plugin: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse)
-  * System topology graphs
-  * Adaptive sampling
-  * Service Performance Monitoring (SPM)
-  * Post-collection data processing
-
-See [Features](./features/) page for more details.
-
-## Technical Specs
-
-  * Backend components implemented in Go
-  * React/Javascript UI
-  * Supported storage backends:
+  * Multiple built-in storage backends:
     * [Cassandra 4+](./cassandra/)
     * [Elasticsearch 7.x, 8.x](./elasticsearch/)
     * [Badger](./badger/)
     * [Kafka](./kafka/) - as an intermediate buffer
     * [Memory storage](./memory/)
-    * Custom backends via [Remote Storage API](./storage/#remote-storage)
+  * Extensibility with custom backends via [Remote Storage API](./storage/#remote-storage)
+  * System topology / service dependencies graphs
+  * Adaptive sampling
+  * Service Performance Monitoring (SPM)
+  * Post-collection data processing
+
+See [Features](./features/) page for more details.
 
 ## Quick Start
 

diff --git a/content/docs/next-release-v2/apis.md b/content/docs/next-release-v2/apis.md
@@ -45,7 +45,9 @@ The following tables list the default ports used by Jaeger components. They can
 
 Jaeger can receive trace data in multiple formats on different ports.
 
-### OpenTelemetry Protocol (stable)
+### OpenTelemetry Protocol
+
+**Status**: Stable
 
 Jaeger can receive trace data from the OpenTelemetry SDKs in their native [OpenTelemetry Protocol (OTLP)][otlp]. The OTLP data is accepted in these formats: 
   * binary gRPC
@@ -57,19 +59,25 @@ Only tracing data is accepted, since Jaeger does not store other telemetry types
 [otlp-rcvr]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/otlpreceiver/README.md
 [otlp]: https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md
 
-### Legacy Protobuf via gRPC (stable)
+### Legacy Protobuf via gRPC
+
+**Status**: Stable, Deprecated
 
 Jaeger's legacy Protobuf format is defined in [collector.proto] IDL file. Support for this format has been removed from OpenTelemetry SDKs, it's only maintained for backwards compatibility.
 
-### Legacy Thrift over HTTP (stable)
+### Legacy Thrift over HTTP
+
+**Status**: Stable, Deprecated
 
 Jaeger's legacy Thrift format is defined in [jaeger.thrift] IDL file, and is only maintained for backwards compatibility. The Thrift payload can be submitted in an HTTP POST request to the  `/api/traces` endpoint, for example, `https://jaeger-collector:14268/api/traces`. The `Batch` struct needs to be encoded using Thrift's `binary` encoding, and the HTTP request should specify the content type header:
 
 ```
 Content-Type: application/vnd.apache.thrift.binary
 ```
 
-### Zipkin Formats (stable)
+### Zipkin Formats
+
+**Status**: Stable
 
 Jaeger can accept spans in several Zipkin data formats, namely JSON v1/v2 and Thrift. **jaeger-collector** needs to be configured to enable Zipkin HTTP server, e.g. on port 9411 used by Zipkin collectors. The server enables two endpoints that expect POST requests:
 
@@ -80,23 +88,33 @@ Jaeger can accept spans in several Zipkin data formats, namely JSON v1/v2 and Th
 
 Traces saved in the storage can be retrieved by calling **jaeger-query** Service.
 
-### gRPC/Protobuf (stable)
+### gRPC/Protobuf
+
+**Status**: Stable
 
 The recommended way for programmatically retrieving traces and other data is via the `jaeger.api_v3.QueryService` gRPC endpoint defined in [api_v3/query_service.proto](https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v3/query_service.proto) IDL file. In the default configuration this endpoint is accessible on port `:16685`. The legacy [api_v2](https://github.com/jaegertracing/jaeger-idl/tree/main/proto/api_v2) is also supported.
 
-### HTTP JSON (internal)
+### HTTP JSON
+
+**Status**: Internal
 
 Jaeger UI communicates with **jaeger-query** Service via JSON API. For example, a trace can be retrieved via a GET request to `https://jaeger-query:16686/api/traces/{trace-id-hex-string}`. This JSON API is intentionally undocumented and subject to change.
 
-## Remote Storage API (stable)
+## Remote Storage API
+
+**Status**: Stable
 
 When using the `grpc` storage type (a.k.a. [remote storage](../storage/#remote-storage)), Jaeger components can use custom storage backends as long as those backends implement the gRPC [Remote Storage API][storage.proto].
 
-## Remote Sampling Configuration (stable)
+## Remote Sampling Configuration
+
+**Status**: Stable
 
 This API supports Jaeger's [Remote Sampling](../sampling/#remote-sampling) protocol, defined in the [sampling.proto] IDL file. See [Remote Sampling](../sampling/#remote-sampling) for details on how to configure Jaeger  with sampling strategies.
 
-## Service dependencies graph (internal)
+## Service dependencies graph
+
+**Status**: Internal
 
 Can be retrieved from `/api/dependencies` endpoint. The GET request expects two parameters:
 
@@ -107,7 +125,9 @@ The returned JSON is a list of edges represented as tuples `(caller, callee, cou
 
 For programmatic access to the service graph, the recommended API is gRPC/Protobuf described above.
 
-## Service Performance Monitoring (internal)
+## Service Performance Monitoring
+
+**Status**: Internal
 
 Please refer to the [SPM Documentation](../spm/#api)
 

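Note on the "Legacy Thrift over HTTP" section in the `apis.md` diff above: the request shape it describes can be sketched in Python roughly as follows. This is a minimal sketch, not Jaeger client code. `build_thrift_request` is a hypothetical helper, and the payload is assumed to be a `Batch` struct that has already been serialized with Thrift's `binary` encoding (serialization is not shown).

```python
import urllib.request

# Content type required by the docs for Thrift binary payloads.
THRIFT_CONTENT_TYPE = "application/vnd.apache.thrift.binary"


def build_thrift_request(collector_base_url, payload):
    """Build (but do not send) an HTTP POST for the /api/traces endpoint.

    `payload` is assumed to be a Thrift-binary-encoded Batch (bytes).
    """
    url = collector_base_url.rstrip("/") + "/api/traces"
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": THRIFT_CONTENT_TYPE},
        method="POST",
    )


# The address is the example from the docs; the bytes are a placeholder.
request = build_thrift_request(
    "https://jaeger-collector:14268", b"<thrift-binary-batch>"
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be inspected without a running collector.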
diff --git a/content/docs/next-release-v2/architecture.md b/content/docs/next-release-v2/architecture.md
@@ -12,24 +12,36 @@ children:
   url: terminology
 ---
 
-Jaeger can be deployed either as an **all-in-one** binary, where all Jaeger backend components
-run in a single process, or as a scalable distributed system. There are two main deployment options discussed below.
+Jaeger v2 is designed to be a versatile and flexible tracing platform. It can be deployed as a single binary that can be configured to perform different roles within the Jaeger architecture, such as:
+  * **collector**: Receives incoming trace data from applications and writes it into a storage backend.
+  * **query**: Serves the APIs and the user interface for querying and visualizing traces.
+  * **ingester**: Ingests spans from Kafka and writes them into a storage backend; useful when running in a [split collector-Kafka-ingester configuration](./#via-kafka).
+  * **all-in-one**: Collector and query roles in a single process.
+  * **agent**: A host agent or a sidecar that runs next to the application and forwards trace data to the collector. While Jaeger can be configured for this role, we recommend using the standard [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) instead because you will likely need it to process other types of telemetry (metrics & logs).
 
-## Direct to storage
+Choosing between the **all-in-one** and the **collector**/**query** configurations is a matter of preference. When using an external storage backend, both configurations are horizontally scalable, but the **collector**/**query** configuration allows separating the read and write traffic and scaling them independently, as well as applying different access and security policies.
 
-In this deployment Jaeger receives the data from traced applications and writes it directly to storage. The storage must be able to handle both average and peak traffic. Collectors use an in-memory queue to smooth short-term traffic peaks, but a sustained traffic spike may result in dropped data if the storage is not able to keep up.
+The **all-in-one** configuration with in-memory storage is most suitable for development and testing, but it is not recommended for production since the data is lost on restarts. **all-in-one** with the [Badger](../badger/) backend _can_ be used in production, but only for modest data volumes since it is limited to a single instance and cannot be scaled horizontally.
+
+## Architecture choices
+
+The two most common deployment options for a scalable Jaeger backend are direct-to-storage and using Kafka as a buffer.
+
+### Direct to storage
+
+In this deployment the **collector**s receive the data from traced applications and write it directly to storage. The storage must be able to handle both average and peak traffic. The **collector**s may use an in-memory queue to smooth short-term traffic peaks, but a sustained traffic spike may result in dropped data if the storage is not able to keep up.
 
 ![Architecture](/img/architecture-v2-2024.png)
 
-## Via Kafka
+### Via Kafka
 
-To prevent data loss between collectors and storage, Kafka can be used as an intermediary, persistent queue. Jaeger can be deployed with OpenTelemetry to handle writing the data to Kafka and pulling it off the queue and writing the data to the storage. Multiple Jaeger instances can be deployed to scale up ingestion; they will automatically partition the load across them.
+To prevent data loss between **collector**s and storage, Kafka can be used as an intermediary, persistent queue. The **collector**s are configured with Kafka exporters. An additional component, **ingester**, needs to be deployed to read data from Kafka and save it to storage. Multiple **ingester**s can be deployed to scale up ingestion; they will automatically partition the load across them. In practice, an **ingester** is very similar to a **collector**, only configured with a Kafka receiver instead of RPC-based receivers.
 
 ![Architecture](/img/architecture-v2-kafka-2024.png)
 
 ## With OpenTelemetry Collector
 
-You **do not need to use OpenTelemetry Collector**, because **Jaeger** is a customized distribution of the OpenTelemetry Collector with different roles. However, if you already use the OpenTelemetry Collectors, for gathering other types of telemetry or for pre-processing / enriching the tracing data, it __can be placed before__  **Jaeger**. The OpenTelemetry Collectors can be run as an application sidecar, as a host agent / daemon, or as a central cluster.
+You **do not need** to use the OpenTelemetry Collector to operate Jaeger, because Jaeger is a customized distribution of the OpenTelemetry Collector with different roles. However, if you already use the OpenTelemetry Collector for gathering other types of telemetry or for pre-processing / enriching the tracing data, it can be placed _in front of_ Jaeger in the pipeline. The OpenTelemetry Collectors can be run as an application sidecar, as a host agent / daemon, or as a central cluster.
 
 The OpenTelemetry Collector supports Jaeger's Remote Sampling protocol and can either serve static configurations from config files directly, or proxy the requests to the Jaeger backend (e.g., when using adaptive sampling).
 

diff --git a/content/docs/next-release-v2/configuration.md b/content/docs/next-release-v2/configuration.md
@@ -64,7 +64,7 @@ Of note here is the `storage` section, which references by name the storage back
 
 ### Remote sampling
 
-`remote_sampling` extension is responsible for running HTTP/gRPC servers that expose the [Remote Sampling API](../apis/#remote-sampling-configuration-stable).
+`remote_sampling` extension is responsible for running HTTP/gRPC servers that expose the [Remote Sampling API](../apis/#remote-sampling-configuration).
 
 ```yaml
 remote_sampling:

diff --git a/content/docs/next-release-v2/features.md b/content/docs/next-release-v2/features.md
@@ -5,11 +5,11 @@ hasparent: true
 
 ## High Scalability
 
-Jaeger backend is designed to have no single points of failure and to scale with the business needs. For example, any given Jaeger installation at Uber is typically processing several billion spans per day.
+Jaeger backend is designed to have no single points of failure and to scale with the business needs. For example, the Jaeger installation at Uber typically processes several billion spans per day.
 
 ## Cloud Native
 
-Jaeger backend is distributed as a Docker image or a raw binary, available for multiple platforms. The behavior of the binary can be customized via YAML configuration file. Deployment to Kubernetes clusters is assisted by a [Kubernetes operator](https://github.com/jaegertracing/jaeger-operator) and a [Helm chart](https://github.com/kubernetes/charts/tree/master/incubator/jaeger).
+Jaeger backend is distributed as a container image or a raw binary, available for multiple platforms. The behavior of the binary can be customized via YAML configuration file. Deployment to Kubernetes clusters is assisted by a [Kubernetes operator](https://github.com/jaegertracing/jaeger-operator) and a [Helm chart](https://github.com/kubernetes/charts/tree/master/incubator/jaeger).
 
 ##  OpenTelemetry
 
@@ -18,13 +18,13 @@ Jaeger backend and Web UI have been designed from ground up to support the OpenT
 * Represent traces as directed acyclic graphs (not just trees) via [span references](https://github.com/opentracing/specification/blob/master/specification.md#references-between-spans)
 * Support strongly typed span _tags_ and _structured logs_
 
-Jaeger can receive trace data from the OpenTelemetry SDKs in their native [OpenTelemetry Protocol (OTLP)][otlp]. However, the internal data representation and the UI still follow the OpenTracing specification's model.
+Jaeger can receive trace data in the standard [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/specs/otel/protocol/). However, the internal data representation and the UI still follow the OpenTracing specification's model.
 
 ## Multiple storage backends
 
 Jaeger can be used with a growing number of storage backends:
 * It natively supports popular open source NoSQL databases as trace storage backends: Cassandra 4.0+, Elasticsearch 7.x/8.x, and OpenSearch 1.0+.
-* It integrates via a gRPC API with other well known databases that have been certified to be Jaeger compliant: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse).
+* It is extensible via a [Remote Storage gRPC API](../apis/#remote-storage-api) to work with other well-known databases that have been certified to be Jaeger compliant: [ClickHouse](https://github.com/jaegertracing/jaeger-clickhouse).
 * There is embedded database support using [Badger](https://github.com/dgraph-io/badger) and simple in-memory storage for testing setups.
 * There are ongoing community experiments using other databases; you can find more in [this issue](https://github.com/jaegertracing/jaeger/issues/638).
 

diff --git a/content/docs/next-release-v2/getting-started.md b/content/docs/next-release-v2/getting-started.md
@@ -4,27 +4,25 @@ hasparent: true
 weight: 2
 ---
 
-## In Docker
+## All-in-one
 
-The easiest way to run Jaeger is by starting a Docker container:
+The easiest way to run Jaeger is by starting it in a container:
 
 ```
 docker run --rm --name jaeger \
-  -p 5778:5778 \
   -p 16686:16686 \
   -p 4317:4317 \
   -p 4318:4318 \
-  -p 14250:14250 \
-  -p 14268:14268 \
+  -p 5778:5778 \
   -p 9411:9411 \
   jaegertracing/jaeger:{{< currentVersion >}}
 ```
 
-This runs the "all-in-one" configuration of Jaeger (using a configuration file embedded in the binary) that combines collector and query components in a single process and uses a transient in-memory storage for trace data. You can navigate to `http://localhost:16686` to access the Jaeger UI. See the [APIs page](../apis/) for a list of other exposed ports.
+This runs the **all-in-one** configuration of Jaeger that combines collector and query components in a single process and uses a transient in-memory storage for trace data. You can navigate to `http://localhost:16686` to access the Jaeger UI. See the [APIs page](../apis/) for a full list of exposed ports.
 
-## Instrumentation
-
-Your applications must be instrumented before they can send tracing data to Jaeger. We recommend using the [OpenTelemetry][otel] instrumentation and SDKs.
+{{< warning >}}
+Your applications must be instrumented before they can send tracing data to Jaeger. We recommend using the [OpenTelemetry](https://opentelemetry.io/) instrumentation and SDKs.
+{{< /warning >}}
 
 ## 🚗 HotROD Demo
 
@@ -40,15 +38,13 @@ Using this application you can:
 - Use open source libraries from `opentelemetry-contrib` to get vendor-neutral instrumentation 
 for free.
 
-### Running
-
 We recommend running Jaeger and HotROD together via `docker compose`:
 
 ```bash
-git clone git@github.com:jaegertracing/jaeger.git jaeger
+git clone https://github.com/jaegertracing/jaeger.git jaeger
 cd jaeger/examples/hotrod
 docker compose -f docker-compose-v2.yml up
-# Ctrl-C to stop
+# press Ctrl-C to exit
 ```
 
 Then navigate to `http://localhost:8080`. See the [README](https://github.com/jaegertracing/jaeger/blob/main/examples/hotrod/README.md) for other ways to run the demo.

diff --git a/content/docs/next-release-v2/sampling.md b/content/docs/next-release-v2/sampling.md
@@ -31,7 +31,7 @@ These are two basic types of head samplers that are used by the remote sampling:
 * **Probabilistic** sampler makes a random sampling decision with a pre-configured probability. For example, with `probability=0.1` approximately 1 in 10 traces will be sampled.
 * **Rate Limiting** sampler uses a leaky bucket rate limiter to ensure that traces are sampled with a certain constant rate. For example, when `rate=2.0` it will sample requests with the rate of 2 traces per second.
 
-[remote-sampling-api]: ../apis/#remote-sampling-configuration-stable
+[remote-sampling-api]: ../apis/#remote-sampling-configuration
 
 ### File-based Sampling Configuration
 

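Note on the two head samplers described in the `sampling.md` context lines above: their behavior can be illustrated with a small Python sketch. The class names are hypothetical and this is not Jaeger's implementation. In particular, the rate limiter is modeled here as a simple credit balance that refills over time, which is an assumption about how a leaky-bucket limiter might be approximated.

```python
import random
import time


class ProbabilisticSampler:
    """Samples each trace independently with a fixed probability."""

    def __init__(self, probability, rng=None):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be in [0, 1]")
        self.probability = probability
        self._rng = rng or random.Random()

    def is_sampled(self):
        # random() is uniform in [0, 1), so this is true with
        # probability `self.probability`.
        return self._rng.random() < self.probability


class RateLimitingSampler:
    """Samples roughly `rate` traces per second using a credit balance."""

    def __init__(self, rate, clock=time.monotonic):
        self.rate = rate
        # Allow at least one credit so that rates below 1.0 can still sample.
        self._max_balance = max(rate, 1.0)
        self._balance = self._max_balance
        self._clock = clock
        self._last = clock()

    def is_sampled(self):
        now = self._clock()
        # Refill credits for the elapsed time, capped at the maximum balance.
        earned = (now - self._last) * self.rate
        self._balance = min(self._max_balance, self._balance + earned)
        self._last = now
        if self._balance >= 1.0:
            self._balance -= 1.0
            return True
        return False
```

With `probability=0.1` the first sampler keeps about 1 in 10 traces over a long run. With `rate=2.0` the second one allows a short burst of two traces and then about two per second, matching the examples in the text.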
diff --git a/content/docs/next-release-v2/troubleshooting.md b/content/docs/next-release-v2/troubleshooting.md
@@ -36,7 +36,7 @@ The Jaeger backend supports [Remote Sampling](../sampling/#remote-sampling), i.e
 
 If you suspect the remote sampling is not working correctly, try these steps:
 
-1. Make sure that the SDK is actually configured to use remote sampling, points to the correct sampling service address (see [APIs](../apis/#remote-sampling-configuration-stable)), and that address is reachable from your application's [networking namespace](#networking-namespace).
+1. Make sure that the SDK is actually configured to use remote sampling, points to the correct sampling service address (see [APIs](../apis/#remote-sampling-configuration)), and that address is reachable from your application's [networking namespace](#networking-namespace).
 1. Look at the root span of the traces that are captured in Jaeger. If you are using Jaeger SDKs, the root span will contain the tags `sampler.type` and `sampler.param`, which indicate which strategy was used. (TBD - do OpenTelemetry SDKs record that?)
 1. Verify that the server is returning the appropriate sampling strategy for your service:
 ```