GITBOOK-969: Reporting Update + Keycloak Client Audience update
Lalith Kota authored and gitbook-bot committed Aug 15, 2024
1 parent f5b8d25 commit 605673a
Showing 10 changed files with 356 additions and 35 deletions.
7 changes: 6 additions & 1 deletion SUMMARY.md
@@ -373,7 +373,12 @@
* [Registration Tool Kit](utilities-and-tools/registration-tool-kit.md)
* [Monitoring and Reporting](monitoring-and-reporting/README.md)
* [Apache Superset](monitoring-and-reporting/apache-superset.md)
* [Reporting Framework](monitoring-and-reporting/reporting-framework.md)
* [Reporting Framework](monitoring-and-reporting/reporting-framework/README.md)
* [📔 User Guides](monitoring-and-reporting/reporting-framework/user-guides/README.md)
* [Connector Creation Guide](monitoring-and-reporting/reporting-framework/user-guides/connector-creation-guide.md)
* [Dashboards Creation Guide](monitoring-and-reporting/reporting-framework/user-guides/dashboards-creation-guide.md)
* [Installation & Troubleshooting](monitoring-and-reporting/reporting-framework/user-guides/installation-and-troubleshooting.md)
* [Kafka Connect Transform Reference](monitoring-and-reporting/reporting-framework/kafka-connect-transform-reference.md)
* [System Logging](monitoring-and-reporting/logging.md)
* [System Health](monitoring-and-reporting/system-health.md)
* [Privacy and Security](privacy-and-security/README.md)
18 changes: 11 additions & 7 deletions deployment/deployment-guide/keycloak-client-creation.md
@@ -31,13 +31,17 @@ The steps to create a Keycloak client are given below.
* Authentication flow: Select the `Standard flow` and `Service accounts roles`
* Valid redirect URIs: `*`
4. Save the changes and click the _**Credentials**_ tab above. You must note down the client ID and secret to add while installing the OpenG2P modules.
5. Click the _**Client Scopes**_ tab.
6. Select the client that you created in the _**Client Scopes**_.
7. Select _**From Predefined Mappers**_ from the _**Add Mapper**_ drop-down.
8. In the _**Add Predefined Mapper**_ screen, check all the mappers under the _**Name**_ column, and click the _**Add**_ button.
9. After adding the predefined mappers, search for _**Client**_ in the filter, select _**Client Roles**_, update, and save the changes below.
8. In the _**Add Predefined Mapper**_ screen, choose to show all mappers on a single page, check all the mappers under the _**Name**_ column, and click the _**Add**_ button.
9. Search for and remove the "Audience Resolve" mapper from the added mappers list. Click **Add Mapper** -> **By configuration** and select the **Audience** mapper on the **Configure new mapper** page. Configure the audience mapper with the following details.
* Client ID: `select your Client ID from the drop-down`
* Token Claim Name: `client_roles`
* Add to ID token: `ON`
* Add to userinfo: `ON` 
10. After the successful creation of the client, you can use this client for the OpenG2P module installation from the Rancher UI.
* Add to Access Token: `ON`
* Add to ID token: `ON`
10. After adding the predefined mappers, search for "client" in the filter, select the _**Client Roles**_ mapper, update, and save the changes below.
* Client ID: `select your Client ID from the drop-down`
* Token Claim Name: `client_roles`
* Add to ID token: `ON`
* Add to userinfo: `ON` 
11. After the successful creation of the client, you can use this client for the OpenG2P module installation from the Rancher UI.
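
For reference, after the Audience and Client Roles mappers above are configured, a decoded access token issued to this client should carry the client ID in the `aud` claim and the client roles under the `client_roles` claim. The following is only an illustrative sketch; the client ID `openg2p-module-client` and the role name `Admin` are placeholder values.

```json
{
  "aud": ["openg2p-module-client"],
  "azp": "openg2p-module-client",
  "client_roles": ["Admin"]
}
```
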
4 changes: 2 additions & 2 deletions monitoring-and-reporting/README.md
@@ -23,11 +23,11 @@ Monitoring the status of programs and registries is vital for program administra
The following tools are provided:

* [Apache Superset](https://superset.apache.org/) for visual pre-configured **dashboards.** 
* [Reporting Framework](reporting-framework.md) for real-time updates and **slicing and dicing of data.** 
* [Reporting Framework](reporting-framework/) for real-time updates and **slicing and dicing of data.** 
* [Logging pipeline](logging.md) for **system logs** monitoring.
* [Prometheus and Grafana](system-health.md) for **system health** monitoring. 



<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td></td><td><mark style="color:purple;">Dashboards</mark></td><td></td><td><a href="../.gitbook/assets/apache-superset-dashboard.png">apache-superset-dashboard.png</a></td><td><a href="apache-superset.md">apache-superset.md</a></td></tr><tr><td></td><td><mark style="color:purple;">Reporting Framework</mark></td><td></td><td><a href="../.gitbook/assets/reporting-dashboard (1).png">reporting-dashboard (1).png</a></td><td><a href="reporting-framework.md">reporting-framework.md</a></td></tr><tr><td></td><td><mark style="color:purple;">Logging</mark></td><td></td><td><a href="../.gitbook/assets/opensearch-log-dashboard.png">opensearch-log-dashboard.png</a></td><td><a href="logging.md">logging.md</a></td></tr><tr><td></td><td><mark style="color:purple;">System Health</mark></td><td></td><td><a href="../.gitbook/assets/prometheus-grafana.png">prometheus-grafana.png</a></td><td><a href="system-health.md">system-health.md</a></td></tr></tbody></table>
<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td></td><td><mark style="color:purple;">Dashboards</mark></td><td></td><td><a href="../.gitbook/assets/apache-superset-dashboard.png">apache-superset-dashboard.png</a></td><td><a href="apache-superset.md">apache-superset.md</a></td></tr><tr><td></td><td><mark style="color:purple;">Reporting Framework</mark></td><td></td><td><a href="../.gitbook/assets/reporting-dashboard (1).png">reporting-dashboard (1).png</a></td><td><a href="reporting-framework/">reporting-framework</a></td></tr><tr><td></td><td><mark style="color:purple;">Logging</mark></td><td></td><td><a href="../.gitbook/assets/opensearch-log-dashboard.png">opensearch-log-dashboard.png</a></td><td><a href="logging.md">logging.md</a></td></tr><tr><td></td><td><mark style="color:purple;">System Health</mark></td><td></td><td><a href="../.gitbook/assets/prometheus-grafana.png">prometheus-grafana.png</a></td><td><a href="system-health.md">system-health.md</a></td></tr></tbody></table>

@@ -32,35 +32,15 @@ The salient features of the framework are the following:

## Installation

Reporting framework is installed as part of modules' installation via the Helm chart that installs the respective module. Note that during installation you need to specify the GitHub location and branch for both the Debezium and Kafka connectors. For example: [https://github.com/OpenG2P/openg2p-reporting/tree/develop/scripts/social-registry](https://github.com/OpenG2P/openg2p-reporting/tree/develop/scripts/social-registry)

If you would like to update these connectors for your dashboards, update the files on GitHub.

## Post installation check

To ensure that all Kafka connectors are working, log in to the Kafka UI (the domain name is set during installation) and check the connectors' status.

<figure><img src="../.gitbook/assets/kafka-ui-kafka-connect.png" alt=""><figcaption></figcaption></figure>





## Configuring the pipeline for specific dashboards

### Debezium connector

* Inspect the Debezium connector for fields that are shunted to OpenSearch. See example connector: [https://github.com/OpenG2P/openg2p-reporting/blob/develop/scripts/social-registry/debezium-connectors/default.json](https://github.com/OpenG2P/openg2p-reporting/blob/develop/scripts/social-registry/debezium-connectors/default.json)
* Carefully inspect the `column.exclude.list` field and make sure you add the fields from Social Registry that must NOT be indexed, specifically PII fields such as name, address, and phone number. As a general rule, fields that are not required for dashboards must be excluded explicitly.
* To see trend data and changes in field values over time, the old data should be preserved. Refer to this guide. (_TBD_)
Refer to [Installation Guide and Post Installation Check](user-guides/installation-and-troubleshooting.md).

## Accessing OpenSearch dashboards

* Pick the URL provided during the installation of the Helm chart of the module (like SR, PBMS).
* Add Keycloak roles to the user who is accessing the dashboard (as given [here](../social-registry/deployment/#post-installation)).
* Add Keycloak roles to the user who is accessing the dashboard (as given [here](user-guides/installation-and-troubleshooting.md#assigning-roles-to-users)).
* Confirm that the number of indexed records in OpenSearch matches the number of rows in the DB (_guide TBD_). This check confirms that the reporting pipeline is working fine.

## Creating dashboards

* On OpenSearch Dashboards, create an index pattern and create dashboards. [Learn more>>](https://opensearch.org/docs/latest/dashboards/dashboard/index/)
* If you have relational queries across tables, the connectors need to be written in a certain way. Refer to this guide. _(TBD)_
* [Create connectors](user-guides/connector-creation-guide.md).
* [Create dashboards](user-guides/dashboards-creation-guide.md).
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# Kafka Connect Transform Reference

This document is the configuration reference for the Kafka SMTs (Single Message Transforms) developed by OpenG2P that can be used on [OpenSearch Sink Connectors](https://github.com/OpenG2P/openg2p-reporting).

Apart from the ones developed by OpenG2P, the following transformations are also available on the OpenSearch connectors:

* [Apache Kafka Connect SMTs](https://kafka.apache.org/documentation/#connect\_included\_transformation).
* [Debezium Kafka Connect Transformations](https://debezium.io/documentation/reference/stable/transformations/index.html).

## Transformations

### DynamicNewField

#### Class name:

* `org.openg2p.reporting.kafka.connect.DynamicNewField$Key` - Applies transform only to the _Key_ of Kafka Connect Record.
* `org.openg2p.reporting.kafka.connect.DynamicNewField$Value` - Applies transform only to the _Value_ of Kafka Connect Record.

#### Description:

* This transformation can be used to query external data sources to retrieve new fields and add them to the current record, based on the values of some existing fields of this record.
* Currently, only Elasticsearch-based queries are supported. This means any index on Elasticsearch (or OpenSearch) can be queried, and new fields can be populated based on fields from the current record.
* Selected fields from the current record are taken, ES is queried for documents whose values match those fields, the top response is picked, and fields from that response can be added back to the current record.

#### Configuration:

<table><thead><tr><th width="210">Field name</th><th width="138">Field title</th><th>Description</th><th width="100">Default Value</th></tr></thead><tbody><tr><td>query.type</td><td>Query Type</td><td><p>This is the type of query made to retrieve new field values.</p><p>Supported values:</p><ul><li><code>es</code> (Elasticsearch based).</li></ul></td><td>es</td></tr><tr><td>input.fields</td><td>Input Fields</td><td><p>List of comma-separated fields that will be considered as input fields in the current record.</p><p>Nested input fields are supported, like: (where profile is json that contains name and birthdate fields)</p><pre class="language-json"><code class="lang-json">profile.name,profile.birthdate
</code></pre></td><td></td></tr><tr><td>output.fields</td><td>Output Fields</td><td>List of comma-separated fields to be added to this record.</td><td></td></tr><tr><td>input.default.values</td><td>Input Default Values</td><td>List of comma-separated values to give in place of the input fields when an input field is empty or null.<br>Length of this has to match that of <code>input.fields</code>.</td><td></td></tr><tr><td>es.index</td><td>ES Index</td><td>Elasticsearch(or OpenSearch) index to query for.</td><td></td></tr><tr><td>es.input.fields</td><td>ES Input Fields</td><td>List of comma-separated fields, to be queried on the ES index, each of which maps to the fields on <code>input.fields</code>.<br>Length of this has to match that of <code>input.fields</code>.</td><td></td></tr><tr><td>es.output.fields</td><td>ES Output Fields</td><td>List of comma-separated fields, to be retrieved from the ES query response document, each of which maps to the fields on <code>output.fields</code>. <br>Length of this has to match that of <code>output.fields</code>.</td><td></td></tr><tr><td>es.input.query.add.keyword</td><td>ES Input Query Add Keyword</td><td>Whether or not to add <code>.keyword</code> to the <code>es.input.fields</code> during the term query. Supported values: <code>true</code> / <code>false</code> .</td><td>false</td></tr><tr><td>es.security.enabled</td><td>ES Security Enabled</td><td>If this value is given as <code>true</code>, then Security is enabled on ES.</td><td></td></tr><tr><td>es.url</td><td>ES Url</td><td>Elasticsearch/OpenSearch base URL.</td><td></td></tr><tr><td>es.username</td><td>ES Username</td><td></td><td></td></tr><tr><td>es.password</td><td>ES Password</td><td></td><td></td></tr></tbody></table>
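
As an illustration, a connector configuration snippet that uses this transform to enrich each record with a region looked up from another index might look like the following. This is only a hypothetical sketch: the alias `enrich`, the index name, the field names, and the URL are placeholder values, not part of any shipped connector.

```json
{
  "transforms": "enrich",
  "transforms.enrich.type": "org.openg2p.reporting.kafka.connect.DynamicNewField$Value",
  "transforms.enrich.query.type": "es",
  "transforms.enrich.input.fields": "registrant_id",
  "transforms.enrich.output.fields": "registrant_region",
  "transforms.enrich.es.index": "res_partner",
  "transforms.enrich.es.input.fields": "id",
  "transforms.enrich.es.output.fields": "region",
  "transforms.enrich.es.input.query.add.keyword": "false",
  "transforms.enrich.es.security.enabled": "true",
  "transforms.enrich.es.url": "http://opensearch:9200",
  "transforms.enrich.es.username": "admin",
  "transforms.enrich.es.password": "<opensearch-password>"
}
```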

### StringToJson

#### Class name:

* `org.openg2p.reporting.kafka.connect.StringToJson$Key` - Applies transform only to the _Key_ of Kafka Connect Record.
* `org.openg2p.reporting.kafka.connect.StringToJson$Value` - Applies transform only to the _Value_ of Kafka Connect Record.

#### Description:

* This transformation can be used to convert a JSON string, present in a field of the record, into a JSON object. Example:

```json
{"profile": "{\"name\":\"Temp\"}"} -> {"profile": {"name": "Temp"}}
```
* Currently, this transform only works in schemaless mode (`value.converter.schemas.enable=false`).

#### Configuration

<table><thead><tr><th width="210">Field name</th><th width="138">Field title</th><th>Description</th><th width="100">Default Value</th></tr></thead><tbody><tr><td>input.field</td><td>Input Field</td><td>Input Field that contains JSON string.</td><td></td></tr></tbody></table>
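
For example, to apply this transform to the `profile` field shown above, a connector configuration could include a snippet like the following (the alias `profiletojson` is just an illustrative name):

```json
{
  "transforms": "profiletojson",
  "transforms.profiletojson.type": "org.openg2p.reporting.kafka.connect.StringToJson$Value",
  "transforms.profiletojson.input.field": "profile"
}
```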

### TimestampConverterAdv

#### Class name:

* `org.openg2p.reporting.kafka.connect.TimestampConverterAdv$Key` - Applies transform only to the _Key_ of Kafka Connect Record.
* `org.openg2p.reporting.kafka.connect.TimestampConverterAdv$Value` - Applies transform only to the _Value_ of Kafka Connect Record.

#### Description:

* This transformation can be used to convert a timestamp, present in a field of the record, to another format. Example:

```json
{"create_date": 1723667415000} -> {"create_date": "2024-08-14T20:30:15.000Z"}
```
* Currently, the output can only be in the form of a string.

#### Configuration

<table><thead><tr><th width="210">Field name</th><th width="138">Field title</th><th>Description</th><th width="100">Default Value</th></tr></thead><tbody><tr><td>field</td><td>Input Field</td><td>Input Field that contains the Timestamp.</td><td></td></tr><tr><td>input.type</td><td>Input Type</td><td><p>Supported values:</p><ul><li>milli_sec (Input is present as milliseconds since epoch)</li><li>micro_sec (Input is present as microseconds since epoch. Useful for converting Datetime field of PostgreSQL)</li><li>days_epoch (Input is present as days since epoch. Useful for converting Date field of PostgreSQL)</li></ul></td><td>milli_sec</td></tr><tr><td>output.type</td><td>Output Type</td><td><p>Supported values:</p><ul><li>string (Gives output as string)</li></ul></td><td>string</td></tr><tr><td>output.format</td><td>Output Format</td><td>Format of string output</td><td><p></p><pre class="language-json"><code class="lang-json">yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
</code></pre></td></tr></tbody></table>
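
For instance, a hypothetical snippet that converts a PostgreSQL datetime column, arriving as microseconds since epoch, into the default string format could look like this (the alias `tsconvert` and the field name `create_date` are placeholders):

```json
{
  "transforms": "tsconvert",
  "transforms.tsconvert.type": "org.openg2p.reporting.kafka.connect.TimestampConverterAdv$Value",
  "transforms.tsconvert.field": "create_date",
  "transforms.tsconvert.input.type": "micro_sec",
  "transforms.tsconvert.output.type": "string",
  "transforms.tsconvert.output.format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
}
```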

### TimestampSelector

#### Class name:

* `org.openg2p.reporting.kafka.connect.TimestampSelector$Key` - Applies transform only to the _Key_ of Kafka Connect Record.
* `org.openg2p.reporting.kafka.connect.TimestampSelector$Value` - Applies transform only to the _Value_ of Kafka Connect Record.

#### Description:

* This transformation can be used to create a new timestamp field whose value is selected from other fields, taking the first field in the given order that is not empty. Example (when `ts.order` is `profile.write_date,profile.create_date`):

```json
{"profile": {"create_date": 7415, "write_date": null}} -> {"@timestamp_gen": 7415, "profile": {"create_date": 7415, "write_date": null}}
{"profile": {"create_date": 2945, "write_date": 3442}} -> {"@timestamp_gen": 3442, "profile": {"create_date": 2945, "write_date": 3442}}
```

#### Configuration

<table><thead><tr><th width="210">Field name</th><th width="138">Field title</th><th>Description</th><th width="100">Default Value</th></tr></thead><tbody><tr><td>ts.order</td><td>Timestamp order</td><td>List of comma-separated fields to select output from. The output will be selected based on whichever field in the order is not null first. Nested fields are supported.</td><td></td></tr><tr><td>output.field</td><td>Output Field</td><td>Name of the output field into which the selected timestamp is put.</td><td><p></p><pre class="language-json"><code class="lang-json">@ts_generated
</code></pre></td></tr></tbody></table>
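
Continuing the example above, a hypothetical snippet that generates `@timestamp_gen` from the profile dates could look like this (the alias `tsselect` is an illustrative name):

```json
{
  "transforms": "tsselect",
  "transforms.tsselect.type": "org.openg2p.reporting.kafka.connect.TimestampSelector$Value",
  "transforms.tsselect.ts.order": "profile.write_date,profile.create_date",
  "transforms.tsselect.output.field": "@timestamp_gen"
}
```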

## Source Code

[https://github.com/OpenG2P/openg2p-reporting/tree/develop/opensearch-kafka-connector](https://github.com/OpenG2P/openg2p-reporting/tree/develop/opensearch-kafka-connector)
@@ -0,0 +1,2 @@
# 📔 User Guides

