Skip to content

Commit

Permalink
Add setup instructions
Browse files Browse the repository at this point in the history
  • Loading branch information
stanch committed Jan 17, 2025
1 parent dbadbdf commit 4c7d830
Show file tree
Hide file tree
Showing 15 changed files with 459 additions and 319 deletions.
50 changes: 27 additions & 23 deletions docs/api-reference/failed-events/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@ title: "Failed event types"
sidebar_position: 15
---

## Where do Failed Events originate?
This page lists all the possible types of [failed events](/docs/fundamentals/failed-events/index.md).

## Where do failed events originate?

While an event is being processed by the pipeline it is checked to ensure it meets the specific formatting or configuration expectations; these include checks like: does it match the schema it is associated with, were Enrichments successfully applied and was the payload sent by the tracker acceptable.

Expand All @@ -17,22 +19,22 @@ Once the Collector payload successfully reaches the validation and enrichment st

:::

## Schema Violation
## Schema violation

This failure type is produced during the process of [validation and enrichment](/docs/pipeline/enrichments/what-is-enrichment/index.md). It concerns the [self-describing events](/docs/fundamentals/events/index.md#self-describing-events) and [entities](/docs/fundamentals/entities/index.md) which can be attached to your snowplow event.

<details>

In order for an event to be processed successfully:

1. There must be a schema in an [iglu repository](/docs/api-reference/iglu/iglu-repositories/index.md) corresponding to each self-describing event or entity. The enrichment app must be able to look up the schema in order to validate the event.
1. There must be a schema in an [Iglu repository](/docs/api-reference/iglu/iglu-repositories/index.md) corresponding to each self-describing event or entity. The enrichment app must be able to look up the schema in order to validate the event.
2. Each self-describing event or entity must conform to the structure described in the schema. For example, all required fields must be present, and all fields must be of the expected type.

If your pipeline is generating schema violations, it might mean there is a problem with your tracking, or a problem with your [iglu resolver](/docs/api-reference/iglu/iglu-resolver/index.md) which lists where schemas should be found. The error details in the schema violation JSON object should give you a hint about what the problem might be.
If your pipeline is generating schema violations, it might mean there is a problem with your tracking, or a problem with your [Iglu resolver](/docs/api-reference/iglu/iglu-resolver/index.md) which lists where schemas should be found. The error details in the schema violation JSON object should give you a hint about what the problem might be.

Snowplow BDP customers should check in the Snowplow BDP Console that all data structures are correct and have been [promoted to production](/docs/data-product-studio/data-structures/manage/ui/index.md). Snowplow Community Edition users should check that the Enrichment app is configured with an [iglu resolver file](/docs/api-reference/iglu/iglu-resolver/index.md) that points to a repository containing the schemas.
Snowplow BDP customers should check in the Snowplow BDP Console that all data structures are correct and have been [promoted to production](/docs/data-product-studio/data-structures/manage/ui/index.md). Snowplow Community Edition users should check that the Enrichment app is configured with an [Iglu resolver file](/docs/api-reference/iglu/iglu-resolver/index.md) that points to a repository containing the schemas.

Next, check the tracking code in your custom application, and make sure the entities you are sending conform the schema definition.
Next, check the tracking code in your custom application, and make sure the entities you are sending conform to the schema definition.

Once you have fixed your tracking, you might want to also [recover the failed events](/docs/data-product-studio/data-quality/failed-events/recovering-failed-events/index.md), to avoid any data loss.

Expand All @@ -54,7 +56,7 @@ There are many reasons why an enrichment will fail, but here are some examples:
- You are using the [IP lookup enrichment](/docs/pipeline/enrichments/available-enrichments/ip-lookup-enrichment/index.md) but have mis-configured the location of the MaxMind database.
- You are using the [custom API request enrichment](/docs/pipeline/enrichments/available-enrichments/custom-api-request-enrichment/index.md) but the API server is not responding.
- The raw event contained an unstructured event field or a context field which was not valid JSON.
- An iglu server responded with an unexpected error response, so the event schema could not be resolved.
- An Iglu server responded with an unexpected error response, so the event schema could not be resolved.

If your pipeline is generating enrichment failures, it might mean there is a problem with your enrichment configuration. The error details in the enrichment failure JSON object should give you a hint about what the problem might be.

Expand All @@ -66,7 +68,7 @@ Enrichment failure schema can be found [here](https://github.com/snowplow/iglu-c

</details>

## Collector Payload Format Violation
## Collector payload format violation

This failure type is produced by the [enrichment](/docs/pipeline/enrichments/what-is-enrichment/index.md) application, when Collector payloads from the raw stream are deserialized from thrift format.

Expand All @@ -87,7 +89,7 @@ Collector payload format violation schema can be found [here](https://github.com

</details>

## Adaptor Failure
## Adaptor failure

This failure type is produced by the [enrichment](/docs/pipeline/enrichments/what-is-enrichment/index.md) application, when it tries to interpret a Collector payload from the raw stream as a http request from a [3rd party webhook](/docs/sources/webhooks/index.md).

Expand All @@ -112,7 +114,7 @@ Adapter failure schema can be found [here](https://github.com/snowplow/iglu-cent

</details>

## Tracker Protocol Violation
## Tracker protocol violation

This failure type is produced by the [enrichment](/docs/pipeline/enrichments/what-is-enrichment/index.md) application, when a http request does not conform to our [Snowplow Tracker Protocol](/docs/sources/trackers/snowplow-tracker-protocol/index.md).

Expand All @@ -130,49 +132,51 @@ Tracker protocol violation schema can be found [here](https://github.com/snowplo

</details>

## Size Violation
## Size violation

This failure type can be produced either by the [Collector](/docs/api-reference/stream-collector/index.md) or by the [enrichment](/docs/pipeline/enrichments/what-is-enrichment/index.md) application. It happens when the size of the raw event or enriched event is too big for the output message queue. In this case it will be truncated and wrapped in a size violation failed event instead.

<details>

Failures of this type cannot be [recovered](/docs/data-product-studio/data-quality/failed-events/recovering-failed-events/index.md). The best you can do is to fix any application that is sending over-sized events.

Size violation schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/size_violation/jsonschema/1-0-0).
Because this failure is handled during collection or enrichment, events in the real time good stream are free of this violation type.

The size violation schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/size_violation/jsonschema/1-0-0).

</details>

## Loader Parsing Error
## Loader parsing error

This failure type can be produced by [any loader](/docs/api-reference/loaders-storage-targets/index.md), if the enriched event in the real time good stream cannot be parsed as a canonical TSV event format. For example, if line has not enough columns (not 131) or event_id is not UUID. This error type is uncommon and unexpected, because it can only be caused by an invalid message in the stream of validated enriched events.
This failure type can be produced by [any loader](/docs/api-reference/loaders-storage-targets/index.md), if the enriched event in the real time good stream cannot be parsed as a canonical TSV event format. For example, if the row does not have enough columns (131 are expected) or the `event_id` is not a UUID. This error type is uncommon and unexpected, because it can only be caused by an invalid message in the stream of validated enriched events.

<details>

This failure type cannot be [recovered](/docs/data-product-studio/data-quality/failed-events/recovering-failed-events/index.md).

Loader parsing error schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/loader_parsing_error/jsonschema/2-0-0).
The loader parsing error schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/loader_parsing_error/jsonschema/2-0-0).

</details>

## Loader Iglu Error
## Loader Iglu error

This failure type can be produced by [any loader](/docs/api-reference/loaders-storage-targets/index.md) and describes an error using the [Iglu](/docs/api-reference/iglu/index.md) subsystem.

<details>

For example:

- A schema is not available in any of the repositories listed in the [iglu resolver](/docs/api-reference/iglu/iglu-resolver/index.md).
- Some loaders (e.g. [RDB loader](/docs/api-reference/loaders-storage-targets/snowplow-rdb-loader/index.md) and [Postgres loader](/docs/api-reference/loaders-storage-targets/snowplow-postgres-loader/index.md)) make use of the "schema list" api endpoints, which are only implemented for an [iglu-server](/docs/api-reference/iglu/iglu-repositories/iglu-server/index.md) repository. A loader iglu error will be generated if the schema is in a [static repo](/docs/api-reference/iglu/iglu-repositories/static-repo/index.md) or [embedded repo](/docs/api-reference/iglu/iglu-repositories/jvm-embedded-repo/index.md).
- A schema is not available in any of the repositories listed in the [Iglu resolver](/docs/api-reference/iglu/iglu-resolver/index.md).
- Some loaders (e.g. [RDB loader](/docs/api-reference/loaders-storage-targets/snowplow-rdb-loader/index.md) and [Postgres loader](/docs/api-reference/loaders-storage-targets/snowplow-postgres-loader/index.md)) make use of the "schema list" api endpoints, which are only implemented for an [Iglu server](/docs/api-reference/iglu/iglu-repositories/iglu-server/index.md) repository. A loader Iglu error will be generated if the schema is in a [static repo](/docs/api-reference/iglu/iglu-repositories/static-repo/index.md) or [embedded repo](/docs/api-reference/iglu/iglu-repositories/jvm-embedded-repo/index.md).
- The loader cannot auto-migrate a database table. If a schema version is incremented from `1-0-0` to `1-0-1` then it is expected to be [a non-breaking change](/docs/api-reference/iglu/common-architecture/schemaver/index.md), and many loaders (e.g. RDB loader) attempt to execute a `ALTER TABLE` statement to facilitate the new schema in the warehouse. But if the schema change is breaking (e.g. string field changed to integer field) then the database migration is not possible.

This failure type cannot be [recovered](/docs/data-product-studio/data-quality/failed-events/recovering-failed-events/index.md).

Loader iglu error schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/loader_iglu_error/jsonschema/2-0-0).
Loader Iglu error schema can be found [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.badrows/loader_iglu_error/jsonschema/2-0-0).

</details>

## Loader Recovery Error
## Loader recovery error

Currently only the [BigQuery repeater](/docs/api-reference/loaders-storage-targets/bigquery-loader/index.md#block-8db848d4-0265-4ffa-97db-0211f4e2293d) generates this error. We call it "loader recovery error" because the purpose of the repeater is to recover from previously failed inserts. It represents the case when the software could not re-insert the row into the database due to a runtime failure or invalid data in a source.

Expand All @@ -184,7 +188,7 @@ Loader recovery error schema can be found [here](https://github.com/snowplow/igl

</details>

## Loader Runtime Error
## Loader runtime error

This failure type can be produced by any loader and describes generally any runtime error that we did not catch. For example, a DynamoDB outage, or a null pointer exception. This error type is uncommon and unexpected, and it probably indicates a mistake in the configuration or a bug in the software.

Expand All @@ -196,7 +200,7 @@ Loader runtime error schema can be found [here](https://github.com/snowplow/iglu

</details>

## Relay Failure
## Relay failure

This failure type is only produced by relay jobs, which transfer Snowplow data into a 3rd party platform. This error type is uncommon and unexpected, and it probably indicates a mistake in the configuration or a bug in the software.

Expand All @@ -208,7 +212,7 @@ Relay failure schema can be found [here](https://github.com/snowplow/iglu-centra

</details>

## Generic Error
## Generic error

This is a failure type for anything that does not fit into the other categories, and is unlikely enough that we have not created a special category. The failure error messages should give you a hint about what has happened.

Expand Down
Loading

0 comments on commit 4c7d830

Please sign in to comment.