Releases/sle/2023 10 (#168)
* copy template and update release number

* update container:version

* update help versions

* notice and sbom files

* update sbom and notice files

* Update README.md

Added Software Behavior Changes for 2023-10

* Update README.md

Updated Helm Chart Breaking Changes

* Update README.md

* Update README.md

Updated Non Container/Chart Artifacts to say that 'There aren't any Non Container/Chart Artifacts.'

* Update README.md

* Update README.md

* Create Perform-a-hard-reset-on-the-redis-cluster.md

* Update README.md

Linking from readme to [Perform-a-hard-reset-on-the-redis-cluster](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/Perform-a-hard-reset-on-the-redis-cluster.md)

* Create How-to-reset-Dremio.md

* Update README.md

* Update How-to-reset-Dremio.md

* Update README.md

* Adding closed-bugs-sle-2023-10

* Update README.md

Linked to closed bugs, sbom, notices

* Update How-to-reset-Dremio.md

Removing the text that links to this internal page:

"For instructions on how to install OpenLens and get Kubernetes access to the NI-internal AWS clusters, follow the instructions in
[Getting started visualizing the cluster](https://github.com/ni/install-systemlink-enterprise/blob/sle-release-notes-2023-10/Tutorials/Getting-started-visualizing-the-cluster.md)"

* Update README.md

Removed the following:

<!-- This file should be renamed to README.md and placed in the directory for the release. -->

<!-- This section should link to the excel document that list customer facing bugs, fixed in the current release. The URL for the release (tag) should be used. -->

* Update README.md

Removed this:


### Non Container/Chart Artifacts

There aren't any Non Container/Chart Artifacts.

* Update README.md

Using ordered list for these bullets:

Option #1: Set webserver.redis-cluster.redis.update-strategy.type = OnDelete

Option #2: Prior to upgrade, run: kubectl -n <namespace> delete statefulset <release>-webserver-redis

* Update README.md

Fixed `code` display for these:

`kubectl -n <namespace> delete pods <release>-webserver-redis-0 <release>-webserver-redis-1 <release>-webserver-redis-2 <release>-webserver-redis-3 <release>-webserver-redis-4 <release>-webserver-redis-5`

`kubectl -n <namespace> delete statefulset <release>-webserver-redis`

* Update release-notes/2023-10/README.md

Co-authored-by: Mark Black <[email protected]>

* Update README.md

* Update release-notes/2023-10/README.md

Co-authored-by: Mark Black <[email protected]>

* Update README.md

* Update README.md

* Update release-notes/2023-10/README.md

Co-authored-by: Mark Black <[email protected]>

* Update README.md

Added 'View this configuration' link for continuationTokenEncryptionKey, to:

- https://github.com/ni/install-systemlink-enterprise/blob/4da6c60d63ef48a663e78efd9b393e41b6c40ba4/getting-started/templates/systemlink-secrets.yaml#L566

* Update README.md

For this section,
- `dataframeservice.requestBodySizeLimitMegabytes` has been renamed to `dataframeservice.requestBodySizeLimit`. It now accepts units in "MiB" (Mebibytes, 1024 KiB) or in "MB" (Megabytes, 1000 KB).

Added 'View this configuration' link, that links to:
- https://github.com/ni/install-systemlink-enterprise/blob/4da6c60d63ef48a663e78efd9b393e41b6c40ba4/getting-started/templates/systemlink-values.yaml#L579

* Update README.md

I've added a 'View this configuration' bullet that links to:
- https://github.com/ni/install-systemlink-enterprise/blob/4da6c60d63ef48a663e78efd9b393e41b6c40ba4/getting-started/templates/systemlink-secrets.yaml#L549
for the link above, for `Before running the taghistorian service, please configure the values according to the instructions from the helm chart.`

We'll need to replace the link so it uses the tag for the 2023-10 release.

* Update README.md

Split info between these sections: New Features & Behavior Changes, Upgrade from -09 to -10, and Upgrade Considerations.

* Update README.md

Updated webserver version to 0.13.12, as in the list of containers (from 0.13.4 where the behavior/breaking change was introduced)

* Create Remove-Kafka-from-the-cluster.md

* Update Remove-Kafka-from-the-cluster.md

* Update README.md

Moved into a separate .md file the steps for Removing Kafka from the cluster.

* Update Remove-Kafka-from-the-cluster.md

* Update Remove-Kafka-from-the-cluster.md

* update configuration link with 2023-10 tag

* 2023 10 release note mblack refactor (#169)

* Update how to reset dremio

* Update Perform-a-hard-reset-on-the-redis-cluster.md

* Update Remove-Kafka-from-the-cluster.md

* Update README.md

* Update to release notes

* apply suggestion

---------

Co-authored-by: PriyadarshiniGopal <[email protected]>

* resolve lint

* Update link to mongo_db migration doc

* Update Helm versions

* Address review suggestions

* Update breaking changes lists to sections

* fix lint errors

* update redis section based on review comments

* fix linting errors

* Update how to reset dremio doc

* fix lint issue

* Address feedback from review

* Update helm version to include patch

---------

Co-authored-by: PriyadarshiniGopal <[email protected]>
Co-authored-by: Fred Visser <[email protected]>
Co-authored-by: CIakab-NI <[email protected]>
Co-authored-by: Mark Black <[email protected]>
Co-authored-by: Ram Ganesh Ramalingam <[email protected]>
Co-authored-by: Ram Ganesh Ramalingam <[email protected]>
Co-authored-by: Christian Nunnally <[email protected]>
8 people authored Oct 25, 2023
1 parent 4da6c60 commit 0a1eb4c
Showing 102 changed files with 1,152,967 additions and 0 deletions.
31 changes: 31 additions & 0 deletions release-notes/2023-10/How-to-reset-Dremio.md
@@ -0,0 +1,31 @@
# How to reset Dremio

This document describes how to reset the deployment of Dremio if you want to reduce overall load and lower resource utilization.

## Prerequisites

Before you begin, ensure you have permission to delete pods and Persistent Volume Claims (PVCs) in the Kubernetes cluster.

## Resetting Dremio

To reset Dremio, you must delete all
PVCs with `dremio` in the name, delete all Dremio pods, and delete a DataFrame Service (DFS) pod to trigger
logic in the DFS that reinitializes Dremio. Follow these steps to reset Dremio:

1. Connect to your Kubernetes cluster with kubectl or another tool of your choosing.
1. Run `kubectl get pvc` to list the PVCs in the cluster.
1. Note the PVCs with `dremio` in the name.
1. Run `kubectl delete pvc <dremio pvc name> <dremio pvc name> ...`. Ensure all dremio PVCs are included in the command.
1. Run `kubectl get pods` to list the deployed pods and note pods with `dremio` in the name.
1. Run `kubectl delete pod <dremio pod> <dremio pod> ...`. Ensure all dremio pods are included in the command.
1. Run `kubectl get pvc` to verify that the **Age** of each `dremio` PVC
is less than a few minutes. If the PVCs are older than
expected, repeat steps 4-6.
1. Run `kubectl get pods` to list the deployed pods and note pods with `dataframeservice` in the name.
1. Locate one of the pods that belongs to the DFS C# service. The names of
these pods will contain `dataframeservice`, but won't have `mongodb`,
`dremio`, or `zk` in them.
1. Run `kubectl delete pod <DFS pod>` to delete one of the DFS pods identified in the previous step.
1. Wait up to a minute for the new DFS pod to become ready. When the pod
becomes ready, Dremio has been reset, and queries for row data in the DFS
should now succeed.
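
The pod selection in steps 8 and 9 can be sketched as a name filter. This is a minimal sketch, assuming resource names arrive one per line (for example from `kubectl get pods -o name`). The function names `select_dfs_service_pods` and `select_dremio_resources` are hypothetical, not part of any SystemLink tooling.

```bash
# Keep names containing "dataframeservice" but not "mongodb", "dremio", or "zk".
# Reads one name per line on stdin.
# Caveat: this is a plain substring match, so a pod whose generated suffix
# happens to contain "zk" would also be excluded. Review the result by eye.
select_dfs_service_pods() {
  grep 'dataframeservice' | grep -v -e 'mongodb' -e 'dremio' -e 'zk'
}

# Keep names containing "dremio" (applies to both PVCs and pods).
select_dremio_resources() {
  grep 'dremio'
}

# Example usage against a live cluster (not run here):
#   kubectl get pvc -o name | select_dremio_resources
#   kubectl get pods -o name | select_dfs_service_pods
```

Filtering by name only narrows the candidates; confirm the list before passing any of it to `kubectl delete`.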
50 changes: 50 additions & 0 deletions release-notes/2023-10/Perform-a-hard-reset-on-the-redis-cluster.md
@@ -0,0 +1,50 @@
# Perform a hard reset of a Redis cluster without deleting the Helm deployment

The following procedure can be used when Redis is in a failure mode that cannot
be recovered using the Redis CLI. This procedure causes downtime for the SystemLink application.

To perform a hard reset of Redis:

1. Determine the name of the stateful set used to manage this cluster.

The stateful set name should be `<helm_release_name>-<service_name>-redis`

2. Determine the name of each persistent volume claim associated with a Redis
node.

The volume claim names should be
`<helm_release_name>-<service_name>-redis-<N>` for each node N in the
cluster.

3. Delete the Redis stateful set.

```bash
kubectl delete statefulset <name_of_stateful_set> -n <namespace>
```

4. The previous command should have terminated all pods in the cluster. Wait for
them to stop.
5. Delete each persistent volume claim.

```bash
kubectl delete pvc <name_of_volume> -n <namespace>
```

6. Determine the current revision number for the helm release.

```bash
helm status <helm_release_name> -n <namespace>
```

7. Restore the Helm deployment to the previous state by rolling back to the
current revision.

```bash
helm rollback <helm_release_name> <current_revision_number> -n <namespace>
```

Alternatively, run a `helm upgrade` to force redeployment.

8. The Helm rollback or upgrade will recreate the stateful set. Because the
volumes were deleted, the Redis cluster will be re-initialized in a clean
state with no data.
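
Steps 6 and 7 can be sketched as follows. This is a minimal sketch, assuming `helm status` prints a line of the form `REVISION: <number>`, as Helm 3 does. The function names `parse_revision` and `print_rollback_command` are hypothetical helpers, and the rollback command is printed rather than executed so that you can review it first.

```bash
# Extract the revision number from `helm status` output read on stdin.
parse_revision() {
  awk '/^REVISION:/ { print $2; exit }'
}

# Print (do not run) the rollback command.
# $1 = Helm release name, $2 = revision number, $3 = namespace.
print_rollback_command() {
  printf 'helm rollback %s %s -n %s\n' "$1" "$2" "$3"
}

# Example usage against a live cluster (not run here):
#   rev=$(helm status <helm_release_name> -n <namespace> | parse_revision)
#   print_rollback_command <helm_release_name> "$rev" <namespace>
```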
242 changes: 242 additions & 0 deletions release-notes/2023-10/README.md
@@ -0,0 +1,242 @@
# SystemLink Enterprise 2023-10 Release Notes

The 2023-10 release for SystemLink Enterprise has been published to <https://downloads.artifacts.ni.com>. This update includes new features, bug fixes, and security updates. Work with your account representative to obtain credentials to access these artifacts. If you are not upgrading from the previous release, refer to past release notes to ensure you have addressed all required configuration changes.

## New Features and Behavior Changes

- You can view the current and historical values of tags in dashboards.
- Data tables have improved reliability and scalability and can support thousands of concurrent writers.
- You can visualize systems data in dashboards.
- You can change the version of a package installed on a managed system.
- You can view all tracked assets on the Assets page.
- You can add comments to a test result.
- You can connect to external MongoDB instances. This is a breaking change. Refer to **Helm Chart Breaking Changes** for details on how to opt out of this capability or migrate to a new MongoDB instance.
- A new Test Analytics privilege category is available.
  - The category includes the Query Measurements privilege. This privilege is not currently functional and is being added in support of features that will release in a future version.

## Upgrading from the release 2023-09 to the release 2023-10

### Redis upgrade from 7.0 to 7.2

This release upgrades Redis from 7.0 to 7.2. This is a breaking change. You must upgrade the Redis cluster manually because Helm will not perform this upgrade automatically. To perform the upgrade, complete the following steps:
1. Set `webserver.redis-cluster.redis.update-strategy.type = OnDelete`
1. Run the Helm command to upgrade your deployment to this release.
1. Run `kubectl -n <namespace> delete pods <release>-webserver-redis-0 <release>-webserver-redis-1 <release>-webserver-redis-2 <release>-webserver-redis-3 <release>-webserver-redis-4 <release>-webserver-redis-5`. The pods of the stateful set will be deleted and automatically recreated.
1. Remove the override of the Redis update-strategy from the configuration. You can redeploy to apply this change, but it is not required.
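
The pod names in step 3 follow a fixed pattern, so they can be generated instead of typed by hand. This is a minimal sketch, assuming a six-node Redis cluster with pod indices 0 through 5, as in the command in step 3. The function name `redis_pod_names` is hypothetical.

```bash
# Print the webserver Redis pod names for a release, one per line.
# $1 = Helm release name, $2 = number of Redis nodes (defaults to 6).
redis_pod_names() {
  release=$1
  count=${2:-6}
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s-webserver-redis-%s\n' "$release" "$i"
    i=$((i + 1))
  done
}

# Example usage against a live cluster (not run here):
#   kubectl -n <namespace> delete pods $(redis_pod_names <release>)
```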

If the above upgrade fails, reset the Redis deployment using the following steps:
1. Run `kubectl -n <namespace> delete statefulset <release>-webserver-redis`. This will delete the Redis cluster, preventing UI access to the application.
1. Run the Helm command to upgrade your deployment to this release. The Redis cluster will be recreated and deployed in parallel.

Once upgraded, Redis storage will be incompatible with older versions of the software. If it is necessary to downgrade to an older version of SystemLink Enterprise, you must perform a hard reset on the Redis cluster. These steps are not required if you are only upgrading to the latest release.
- Refer to [Perform-a-hard-reset-on-the-redis-cluster.md](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/Perform-a-hard-reset-on-the-redis-cluster.md) for steps to reset Redis.

## Helm Chart Breaking Changes

### Support for single external MongoDB instance

The systemlink Helm chart now supports an external MongoDB instance.

If you have an existing installation of SLE, you should set `global.mongodb.install` to `true` in order to maintain the same behavior in future versions of the Helm chart.

If you want to use a single external MongoDB instance:
- Consult the [Configuring SystemLink Enterprise documentation](https://www.ni.com/docs/en-US/bundle/systemlink-enterprise/page/config-systemlink-enterprise.html#GUID-125A1E48-1B3B-4EC8-99FF-808E36EF1586).
- Migrate your existing data to the external MongoDB instance. See the [MongoDB_Migration README file](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/MongoDB_Migration) for more information.
- Configure `global.mongodb.install` to `false`.
- Provide the connection string in `global.mongodb.connection_string`.

### MongoDB connection string global value override

You can specify the username and password in the global connection string (`mongodb+srv://user:pass@host/<database>`). `<database>` will be replaced during per-service Helm install/upgrade. This forces SystemLink Enterprise to use the same username and password for all databases hosted in your MongoDB instance.

You can also use per-service username and password combinations (`mongodb+srv://<username>:<password>@host/<database>`). `<username>`, `<password>`, and `<database>` will be replaced during per-service Helm install/upgrade. This forces SystemLink Enterprise to use your specified usernames and passwords for each database hosted in your MongoDB instance.
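
The placeholder substitution described above can be illustrated with a small sketch. It only mimics the replacement for a single service and is not the chart's actual implementation. The function name `render_connection_string` and the sample user, password, and database values are hypothetical.

```bash
# Replace <username>, <password>, and <database> in a connection string template.
# $1 = template, $2 = username, $3 = password, $4 = database name.
# Values containing "|", "&", or "\" would need escaping before reaching sed.
render_connection_string() {
  template=$1
  user=$2
  pass=$3
  db=$4
  printf '%s\n' "$template" |
    sed -e "s|<username>|$user|g" -e "s|<password>|$pass|g" -e "s|<database>|$db|g"
}

# Per-service credentials: all three placeholders are replaced.
render_connection_string 'mongodb+srv://<username>:<password>@host/<database>' \
  'svc-user' 'svc-pass' 'exampledb'

# Global credentials: only <database> is present, so only it is replaced.
render_connection_string 'mongodb+srv://user:pass@host/<database>' \
  'svc-user' 'svc-pass' 'exampledb'
```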

### Data Frame Service

`dataframeservice.requestBodySizeLimitMegabytes` has been renamed to `dataframeservice.requestBodySizeLimit`. It now accepts units in "MiB" (Mebibytes, 1024 KiB) or in "MB" (Megabytes, 1000 KB).
- [View this configuration](https://github.com/ni/install-systemlink-enterprise/blob/2023-10/getting-started/templates/systemlink-values.yaml#L579)
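
The two units differ by about 5%, which matters when the limit has to line up with a proxy or ingress setting. The sketch below works through the arithmetic. It assumes the value is written as a whole number immediately followed by the unit (for example `256MiB`); check the linked values file for the exact syntax the chart accepts. The function name `size_to_bytes` and the number 256 are illustrative only.

```bash
# Convert a size such as "256MiB" or "256MB" to bytes.
size_to_bytes() {
  case $1 in
    *MiB) echo $(( ${1%MiB} * 1024 * 1024 )) ;;
    *MB)  echo $(( ${1%MB} * 1000 * 1000 )) ;;
    *)    echo "unsupported unit: $1" >&2; return 1 ;;
  esac
}

size_to_bytes 256MiB   # prints 268435456
size_to_bytes 256MB    # prints 256000000
```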

### Tag Historian service

The Tag Historian service is included in the SystemLink Enterprise top-level Helm chart.
- You must configure the secrets for MongoDB required by this service.
  - [View this configuration](https://github.com/ni/install-systemlink-enterprise/blob/2023-10/getting-started/templates/systemlink-secrets.yaml#L549)
- You must also configure a `continuationTokenEncryptionKey`. When creating the `continuationTokenEncryptionKey`, use a 32-byte cryptographically random value that is base64 encoded.
  - [View this configuration](https://github.com/ni/install-systemlink-enterprise/blob/2023-10/getting-started/templates/systemlink-secrets.yaml#L566)
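
One way to produce a suitable `continuationTokenEncryptionKey` value is sketched below, assuming a Unix-like shell with `head`, `base64`, and `/dev/urandom` available. The function name `generate_token_key` is hypothetical. `openssl rand -base64 32` is an equivalent alternative where OpenSSL is installed.

```bash
# Read 32 random bytes from the kernel's random source and base64-encode them.
# The encoded result is 44 characters long, including one "=" padding character.
generate_token_key() {
  head -c 32 /dev/urandom | base64
}

generate_token_key
```

Store the generated value as a secret and avoid committing it to source control.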

## Upgrade Considerations

- DataFrame Service Kafka dependency has been removed
  - The DataFrame Service now uses a more efficient method for writing data to new tables, replacing Kafka. The DataFrame Service will still use Kafka for data ingestion for tables created before the 2023-10 release, while tables created after upgrading to the 2023-10 release will have data written directly to S3 storage. This change leads to greatly reduced resource utilization.
  - After upgrading to the 2023-10 release, you can safely remove Kafka from your cluster once all pre-upgrade tables are set to readonly. Disabling Kafka may lead to data loss if pre-upgrade tables are not readonly, because any buffered data may not get written to storage.
  - Refer to [Remove-Kafka-from-the-cluster.md](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/Remove-Kafka-from-the-cluster.md) for detailed instructions.
- DataFrame Service Dremio refresh interval
  - The Dremio data set refresh job interval was increased from 2 minutes to 1 hour. This reduces overall load on Dremio.
  - Customers are required to adopt this change. It prevents Dremio's volumes from filling up, which can put Dremio into a bad state.
  - Refer to [How-to-reset-Dremio.md](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/How-to-reset-Dremio.md) to adopt this change.
- DataFrame Service increased memory limit
  - The default memory request and limit increased from 2GB per DataFrame Service pod to 4GB.

### RabbitMQ Version

SystemLink Enterprise includes a deployment of the [RabbitMQ](https://www.rabbitmq.com/) message bus. Since you cannot skip minor versions when updating RabbitMQ, you may not be able to upgrade directly between versions of the SystemLink Enterprise product. The table below shows the version of the RabbitMQ dependency for each released version of SystemLink Enterprise. Refer to [Updating SystemLink Enterprise](https://www.ni.com/docs/en-US/bundle/systemlink-enterprise/page/updating-systemlink-enterprise.html) for detailed update instructions.

| RabbitMQ Version | First SystemLink Enterprise Version | Last SystemLink Enterprise Version |
|------------------|-------------------------------------|------------------------------------|
| 3.11.x | 0.12.x | 0.15.x |
| 3.12.x | 0.16.x | current |

Refer to [Updating SystemLink Enterprise](https://www.ni.com/docs/en-US/bundle/systemlink-enterprise/page/updating-systemlink-enterprise.html) for detailed instructions on how to safely upgrade the version of the RabbitMQ dependency.
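
The "no skipped minor versions" rule can be expressed as a small check. This is a sketch of the rule as stated above only; it assumes both versions share the same major version and ignores any other upgrade requirements RabbitMQ may impose. The function name `is_direct_rabbitmq_upgrade` is hypothetical.

```bash
# Succeed if moving from version $1 to version $2 (major.minor[.patch])
# keeps the same major version and advances by at most one minor version.
is_direct_rabbitmq_upgrade() {
  from_major=${1%%.*}
  from_rest=${1#*.}
  from_minor=${from_rest%%.*}
  to_major=${2%%.*}
  to_rest=${2#*.}
  to_minor=${to_rest%%.*}
  [ "$from_major" -eq "$to_major" ] &&
    [ "$to_minor" -ge "$from_minor" ] &&
    [ $((to_minor - from_minor)) -le 1 ]
}

is_direct_rabbitmq_upgrade 3.11 3.12 && echo "3.11 -> 3.12: direct upgrade"
is_direct_rabbitmq_upgrade 3.11 3.13 || echo "3.11 -> 3.13: upgrade through 3.12 first"
```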

## Bugs Fixed

Only customer-facing bugs are included in this list.

- [closed-bugs-sle-2023-10](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/closed-bugs-sle-2023-10.xlsx)

## Software Bill of Materials and Notices

- [SBOM](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/sbom)
- [Notices](https://github.com/ni/install-systemlink-enterprise/tree/2023-10/release-notes/2023-10/notices)

## Versions

**Top Level Helm Chart:** `systemlink 0.18.76`

**Admin Helm Chart:** `systemlink-admin 0.18.9`

### NI Containers

assetservice:0.4.64

assetui:0.3.48

comments:0.2.34

dashboardsui:0.6.38

dataframeservice:0.14.49

dremio-ee:24.1.2

executionsui:0.6.42

filesui:0.7.47

grafana-auth-proxy:20230404.4

grafana-plugins:3.3.0

grafana-rbac-integrator:0.6.10

helium-dataservices:0.5.21

helium-fileingestionservices:0.9.20

helium-salt-master:1.3.18

helium-serviceregistry:0.6.20

helium-taghistoriandataretention:0.1.94

helium-taghistorianservices:0.1.94

helium-userservices:0.6.35

helium-webappservices:0.5.13

helium-webserver:0.13.12

jupyter-notebook-userpod:20230928.21

jupyterui:0.6.36

landingpageui:0.6.45

license:0.6.27

licensesui:0.3.47

nbexec-execution-helpers:20230911.5

nbexec-notebook-runner:20230922.2

nbexecservice:0.7.26

nbparsingservice:0.6.7

ni-grafana:v9.5.8-ed05e1eca2-ni

notification:0.6.19

repository:0.2.16

routineeventtrigger:0.7.6

routineexecutor:0.7.4

routinescheduletrigger:0.7.8

routineservice:0.8.9

routinesui:0.7.30

securityui:0.6.39

session-manager-service:0.7.18

sl-configurable-http-proxy:20230823.1

sl-k8s-hub:20230825.2

smtp:0.6.13

sysmgmtevent:0.7.19

systemsmanagementservice:0.6.23

systemsui:0.7.87

tagsui:0.2.48

testinsightsui:0.6.112

testmonitorservice:0.15.21

userdata:0.6.19

userservice-setup:0.7.3

### 3rd Party Containers

alpine:3.18.3

argoproj/argocli:v3.3.8-linux-amd64

argoproj/argoexec:v3.3.8-linux-amd64

argoproj/workflow-controller:v3.3.8-linux-amd64

bitnami/minio:2023.9.30-debian-11-r0

bitnami/mongodb:5.0.21-debian-11-r12

bitnami/rabbitmq:3.12.6-debian-11-r4

bitnami/redis-cluster:7.2.1-debian-11-r0

busybox:stable@sha256:023917ec6a886d0e8e15f28fb543515a5fcd8d938edb091e8147db4efed388ee

busybox:stable@sha256:51de9138b0cc394c813df84f334d638499333cac22edd05d0300b2c9a2dc80dd

jupyterhub/k8s-image-awaiter:2.0.0

kiwigrid/k8s-sidecar:1.25.1

kube-scheduler:v1.23.10

pause:3.8

swaggerapi/swagger-ui:v5.7.2

zookeeper:3.8.1-temurin
31 changes: 31 additions & 0 deletions release-notes/2023-10/Remove-Kafka-from-the-cluster.md
@@ -0,0 +1,31 @@
# Remove Kafka from the cluster

The 2023-10 release removes SystemLink Enterprise's dependency on Kafka. Use the following procedure to remove Kafka after upgrading to 2023-10.

To remove Kafka from the cluster:

1. Upgrade to the 2023-10 release.
1. Confirm that you do not have any remaining appendable tables that were created prior to upgrade that you do not want to be made readonly. To check if any remaining appendable tables exist:
   1. Use kubectl or a GUI like Lens to find the pod containing `dfs-kafka-ui`.
   1. Enable port forwarding for port `8080` to access the Kafka UI on your localhost.
   1. Open the Kafka UI in a browser using the port forwarded in the previous step.
   1. Navigate to "Topics" in the left-hand navigation.
   1. Search for topics starting with `dfs` followed by a data table ID.
   1. If no Kafka topics exist for data tables, it's safe to proceed with disabling and removing Kafka from the cluster.

   The presence of `dfs` topics in Kafka indicates that the associated tables are still open for writing data. By default, newly-created data tables have "SupportsAppend" set to "true". To mark a data table as readonly, use the route `POST /nidataframe/v1/tables/{id}/data` with the table's ID and `endOfData: true` in the JSON request body. This action sets the data table's "SupportsAppend" field to "false", making it readonly. Once a table is readonly, it cannot be reopened, so ensure you've finished appending data before setting `endOfData: true`.
1. Remove the DataFrame Service Kafka pods from the cluster.
   1. Set the following three Helm values to `false` in the `systemlink-values.yaml` file:
      - `dataframeservice.ingestion.kafkaBackend.enabled`
      - `dataframeservice.kafkacluster.kafka.enabled`
      - `dataframeservice.schema-registry.enabled`
   1. Run a Helm upgrade.
   1. Wait for the `dfs-kafka` pods to disappear from the cluster.
1. Remove the Strimzi Kafka Operator from the cluster.
   1. Set the following Helm value to `false` in the `systemlink-admin-values.yaml` file:
      - `strimzi-kafka-operator.enabled`
   1. Run a Helm upgrade.
1. Remove the CRDs for the Strimzi Kafka Operator from the cluster. By design, these are not removed when the operator is uninstalled, so they need to be cleaned up manually. Run `kubectl delete -f systemlinkadmin/charts/strimzi-kafka-operator/crds` to delete the CRDs.
1. Delete the Persistent Volume Claims (PVCs) for the Kafka-related pods. Look for PVCs containing `dfs-kafka` in Lens.

After completing these steps, if you need to update SystemLink Enterprise again, you should skip steps 2 and 3 of the updating instructions for updating the Strimzi Kafka Operator CRDs, to avoid recreating the unneeded CRDs.
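
Marking a table as readonly, as described in step 2, can be sketched as below. The sketch prints the `curl` command instead of sending it, because the change cannot be undone. The server URL, the table ID, the `x-ni-api-key` authentication header, and sending `endOfData` as the only field in the body are all assumptions; confirm them against the DataFrame Service API documentation before use. The function name `print_end_of_data_request` is hypothetical.

```bash
# Print (do not run) a curl command that sets endOfData on a data table.
# $1 = base URL of the SystemLink server (assumption), $2 = data table ID.
# The API key is read from $API_KEY when the printed command is run.
print_end_of_data_request() {
  cat <<EOF
curl -X POST '$1/nidataframe/v1/tables/$2/data' \\
  -H 'Content-Type: application/json' \\
  -H "x-ni-api-key: \$API_KEY" \\
  -d '{"endOfData": true}'
EOF
}

# Example with placeholder values:
print_end_of_data_request 'https://systemlink.example.com' '<table id>'
```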