diff --git a/docs/understanding-tracking-design/defining-the-data-to-collect-with-data-poducts/index.md b/docs/understanding-tracking-design/defining-the-data-to-collect-with-data-poducts/index.md
index 6a520cd7c4..235ccfae5b 100644
--- a/docs/understanding-tracking-design/defining-the-data-to-collect-with-data-poducts/index.md
+++ b/docs/understanding-tracking-design/defining-the-data-to-collect-with-data-poducts/index.md
@@ -1,7 +1,7 @@
 ---
 title: "Defining the Data to collect with Data Products"
 sidebar_position: 2
-sidebar_label: "🆕 Defining the Data to collect with Data Products"
+sidebar_label: "Defining the Data to collect with Data Products"
 ---
 
 As described in [Data Products Introduction](/docs/understanding-your-pipeline/data-products/index.md), a data product is a logical grouping of the data you collect as an organisation by domain, with an explicit owner.
@@ -23,6 +23,7 @@ With data products, you can:
 - **Description;** a description of the data that the data product captures
 - **Owner;** the individual responsible for the data product
 - **Domain;** the team or business domain that owns the data product
+- **Source Application;** the [source application/s](../organize-data-sources-with-source-applications/index.md) the Data Product is implemented in
 - **Event specifications**
   * **Name;** a descriptive name for the event
   * **Description;** a description to help people understand what action the event is capturing
diff --git a/docs/understanding-tracking-design/managing-event-specifications/index.md b/docs/understanding-tracking-design/managing-event-specifications/index.md
index f98437b8fc..95a139128e 100644
--- a/docs/understanding-tracking-design/managing-event-specifications/index.md
+++ b/docs/understanding-tracking-design/managing-event-specifications/index.md
@@ -1,6 +1,6 @@
 ---
 title: "Managing Event Specifications"
-sidebar_label: "🆕 Managing Event Specifications"
+sidebar_label: "Managing Event Specifications"
 sidebar_position: 95
 sidebar_custom_props:
   offerings:
diff --git a/docs/understanding-tracking-design/organize-data-sources-with-source-applications/index.md b/docs/understanding-tracking-design/organize-data-sources-with-source-applications/index.md
new file mode 100644
index 0000000000..3a3c8c4d20
--- /dev/null
+++ b/docs/understanding-tracking-design/organize-data-sources-with-source-applications/index.md
@@ -0,0 +1,31 @@
+---
+title: "Organize Data Sources with Source Applications"
+sidebar_position: 1
+sidebar_label: "🆕 Source Applications"
+---
+
+For data collection, you will often have different sources of information that correspond to applications designed for a particular purpose. These are what we will refer to as Source Applications.
+
+To illustrate, let's consider Snowplow. We can identify several applications designed for distinct purposes, each serving as a separate data source for behavioral data, or in other words, a Source Application:
+
+- The Snowplow website that corresponds to the application served under www.snowplow.io.
+- The BDP Console application that is served under console.snowplowanalytics.com.
+- The documentation website, our information hub for all things related to our product, served under docs.snowplow.io.
+
+Source Applications are a foundational component that lets you establish the relationships that connect application IDs, [Application Entities](../../collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/custom-tracking-using-schemas/global-context/index.md), and [Data Products](../defining-the-data-to-collect-with-data-poducts/index.md).
+
+## Application IDs
+
+For each of these applications you would set up a unique application ID using the [app_id](../../collecting-data/collecting-from-own-applications/snowplow-tracker-protocol/ootb-data/app-information/index.md#atomic-event-properties) field to distinguish them later on in analysis.
+
+:::tip
+We often see, and recommend as a best practice, setting up a unique application ID for each deployment environment you are using. For example, `${appId}-qa` for staging and `${appId}-dev` for development environments.
+:::
+
+## Application Context
+
+Application Context, also referred to as [Global Context](../../collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/custom-tracking-using-schemas/global-context/index.md), is a set of entities that can be sent with every event recorded in the application. Using Source Applications, you can document which Application Contexts are expected. This helps with tracking implementation and data discovery, and prevents information duplication in Data Products.
+
+:::info
+Since Application Entities can also be set conditionally, you can mark any of them as optional and add a note explaining the condition or any extra information required. An Application Context can be added conditionally through [rulesets](../../collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/custom-tracking-using-schemas/global-context/index.md#rulesets), [filter functions](../../collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/custom-tracking-using-schemas/global-context/index.md#filter-functions) and [context generators](../../collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/custom-tracking-using-schemas/global-context/index.md#context-generators).
+:::
diff --git a/docs/understanding-your-pipeline/data-products/index.md b/docs/understanding-your-pipeline/data-products/index.md
index 77083283f8..160ac7adbe 100644
--- a/docs/understanding-your-pipeline/data-products/index.md
+++ b/docs/understanding-your-pipeline/data-products/index.md
@@ -36,6 +36,12 @@ Examples of data products:
 
 ## Key elements of a Data Product
 
+**The Source Application/s it is part of**; a data product references the [Source Application/s](/docs/understanding-tracking-design/organize-data-sources-with-source-applications/index.md) that it spans.
+
+**Benefits:**
+
+* Have a clear view of which application the data product is implemented in, which domains it spans, and the related application context information it will have available by default in the dataset.
+
 **An owner**; data products are typically split by domain with each data product having an explicit owner that is responsible for the maintenance and evolution of that data.
 
 **Benefits:**