From 4b1bb4f681475958b8bb72f0de1ba7431cc70607 Mon Sep 17 00:00:00 2001
From: Ryan Abernathey
Date: Tue, 12 Apr 2022 15:38:07 -0400
Subject: [PATCH 1/8] first draft user storage guide

---
 index.md | 11 ++
 user/storage.md | 267 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 278 insertions(+)
 create mode 100644 user/storage.md

diff --git a/index.md b/index.md
index c067128..8936596 100644
--- a/index.md
+++ b/index.md
@@ -26,6 +26,17 @@ About the JupyterHub Service
 Get a hub
 ```

+## Hub User Guide
+
+This user guide explains how users should interact with their hub environment.
+
+```{toctree}
+:maxdepth: 1
+:caption: Hub User Guide
+
+user/storage
+```
+
 ## Hub Administration topics

 These guides have information on how hub admins can perform specific

diff --git a/user/storage.md b/user/storage.md
new file mode 100644
index 0000000..8345ec1
--- /dev/null
+++ b/user/storage.md
@@ -0,0 +1,267 @@
# Files and Data in the Cloud

This page describes how files and data storage are handled in 2i2c Hubs.
The high-level summary of recommendations is:
- Use your home directory to store code, notebooks, and small data files (<1 GB)
  for personal use
- Use cloud object storage to store larger datasets and to share data across your team
- Consider whether your project would benefit from other cloud-native data storage
  solutions such as a database, data warehouse, or data lake

:::{attribution}
The following material was adapted from the
[Pangeo Cloud User Guide](https://pangeo.io/cloud.html)
:::

## Your Home Directory

Your notebook server is a Linux "virtual machine" with its own filesystem.
You are not on a shared server; you are on your own private server.
Your username is ``jovyan``, and your home directory is ``/home/jovyan``.
This is the same for all users.

Your home directory is intended only for notebooks, analysis scripts, and small datasets (< 1 GB).
It is not an appropriate place to store large datasets.
No one else can see or access the files in your home directory.

The easiest way to move files in and out of your home directory is via the JupyterLab web interface.
Drag a file into the file browser to upload, and right-click to download back out.
You can also open a terminal via the JupyterLab launcher and use this to ssh / scp / ftp to remote systems.
However, you can’t ssh in!
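For example, here is how you could copy a file from your hub home directory to a remote
machine over SSH (a sketch; the hostname, username, and paths are placeholders):

```
scp ~/my-results.csv your-username@remote.example.edu:/home/your-username/
```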
## The `shared` Directory

All users have a directory called `shared` in their home directory.
This is a *read-only* directory - anybody on the hub can *access* and *read from* the `shared` directory.
The hub administrator may choose to distribute shared materials via this directory.
The `shared` directory is not intended as a way for hub users to share data with each other.

## Using Git / GitHub

The recommended way to move code in and out of the hub is via git / GitHub.
You should clone your project repo from the terminal and use git pull / git push to update and push changes.
In order to push data to GitHub from the hub, you will need to set up GitHub authentication.
This is a very quick guide to getting your GitHub authentication set up,
adapted from the [Carpentries GitHub Remotes lesson](https://swcarpentry.github.io/git-novice/07-github/index.html#ssh-background-and-setup).

1. Open a terminal in JupyterHub
1. Type the command
   ```
   ssh-keygen -t ed25519 -C "YOUR EMAIL ADDRESS GOES HERE"
   ```
   (Don't just copy this text; you have to put in your actual email address in between the quotes.)
   This command will create an ssh public / private key pair.
1. Enter a password for your new SSH key and record it in a safe place.
   This password is used to "lock" the SSH key. It can't be used without the password.
1. Type the command
   ```
   cat ~/.ssh/id_ed25519.pub
   ```
   and copy the result. It should look something like `ssh-ed25519 {long random string} {your email address}`.
1. Go to <https://github.com/settings/keys>. Click the green button that says "New SSH Key".
   Give your key the title "JupyterHub SSH Key for Research Computing" and paste the
   public key from the previous step into the "Key" box.
1. Verify that your key works by typing
   ```
   ssh -T git@github.com
   ```
   on the command line of the Hub. (Note you will have to enter your SSH key password from step 3.)
   This will return a message of the following form
   ```
   Hi {username}! You've successfully authenticated, but GitHub does not provide shell access.
   ```
   If you see that, it works! 🚀

You should now be able to push to GitHub from the hub.

## Cloud Object Storage

Your hub lives in the cloud.
The preferred way to store data in the cloud is using cloud object storage, such as Amazon S3 or Google Cloud Storage.
Cloud object storage is essentially a key/value storage system.
The keys are strings, and the values are bytes of data.
Data is read and written using HTTP calls.

The performance of object storage is very different from file storage.
On one hand, each individual read / write to object storage has a high overhead (10-100 ms), since it has to go over the network.
On the other hand, object storage “scales out” nearly infinitely, meaning that we can make hundreds, thousands, or millions of concurrent reads / writes.
This makes object storage well suited for distributed data analytics.
However, data analysis software must be adapted to take advantage of these properties.

### Cloud-Native Formats

Cloud-native file formats are formats that are designed from the beginning to
work well with cloud object storage.
These formats permit exploration of data and metadata without downloading of the
entire file / dataset and work well with distributed parallel computing.
Here we enumerate some popular cloud-native formats and their use cases:

| Format | Use Case | Python Libraries |
|--|--|--|
| [Apache Parquet](https://parquet.apache.org/) | Column-oriented data file format designed for efficient data storage and retrieval. Suitable for tabular-style data (rows and columns). | pandas, dask.dataframe, vaex, pyarrow |
| [Zarr](http://zarr.dev/) | Storage of large multidimensional arrays | zarr, numpy, dask.array, xarray |
| [Cloud Optimized GeoTIFF](https://www.cogeo.org/) | Geospatial raster data | rasterio, rioxarray |

There are other more specialized cloud-optimized formats for specific scientific domains.

It is recommended to use cloud-native formats when working with big data in cloud object storage.

### Working with Object Storage

From a user perspective, the main challenge of working with object storage is the need
to use more specialized tools, rather than just simple files / filenames, to manage data.
Fortunately, excellent tools exists to make working with object storage easy and familiar.

For python users, the main tool is [filesystem spec](https://filesystem-spec.readthedocs.io/en/latest/)
(fsspec), a set of packages which enable us to work with many different types of storage.
Separate fsspec packages exist for each type of object storage:

- **[s3fs](https://s3fs.readthedocs.io/en/latest/)** - for working with AWS S3
  (Simple Storage Service) and compatible APIs. Most third-party object storage
  services (e.g. [Wasabi](https://wasabi.com/) and [Open Storage Network](https://www.openstoragenetwork.org/))
  are compatible with S3.
- **[gcsfs](https://gcsfs.readthedocs.io/en/latest/)** - for working with Google
  Cloud Storage.
- **[adlfs](https://github.com/fsspec/adlfs)** - for working with Azure Data Lake
  and Azure BLOB Storage.

Each system has its own unique mechanisms for authentication and authorization;
consult the documentation links above for more details.
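As a quick sketch of what the fsspec interface looks like, here is how you could list
the contents of a public bucket anonymously with s3fs (this uses the NASA MUR SST
bucket from the example below; `anon=True` requests unauthenticated access):

```python
import s3fs

fs = s3fs.S3FileSystem(anon=True)  # anonymous access to public data
print(fs.ls("mur-sst"))            # list the top of the bucket like a directory
```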
#### Reading Data

When reading data from cloud object storage, you have two general options:
- Download the data to the local filesystem; this is fine for small data, but not suitable
  large data or cloud-optimized datasets. Downloads can be managed with
  [Pooch](https://www.fatiando.org/pooch/latest/) or fsspec.
- Open the data with an application that understands how to stream data
  over HTTP directly from object storage. This is suitable for large data and
  cloud-native formats.

As an example of the latter use case, here is how you would open the
[NASA Multi-Scale Ultra High Resolution (MUR) Sea Surface Temperature (SST)](https://registry.opendata.aws/mur/)
dataset from the AWS Public Data program using Xarray:

```python
import xarray as xr
ds = xr.open_dataset("s3://mur-sst/zarr/", engine="zarr", storage_options={"anon": True})
```

#### Writing Data

Writing data (and reading private data) requires credentials for authentication.
2i2c does not provide credentials to individual users.
Instead you 2i2c customers should manage their own cloud storage directly.

On S3-type storage, you will have a client key and client secret associated with your account.
The following code creates a writeable filesystem:

```python
import s3fs
fs = s3fs.S3FileSystem(key='<key>', secret='<secret>')
```

Non-AWS S3 services may also require passing `client_kwargs={'endpoint_url': ...}`
to `S3FileSystem`.

For Google Cloud Storage, the best practice is to create a
[service account](https://cloud.google.com/iam/docs/service-accounts) with
appropriate permissions to read / write your private bucket.
You upload your service account key (a .json file) to your hub
home directory and then use it as follows:

```python
import json
import gcsfs
with open('<your token file>.json') as token_file:
    token = json.load(token_file)
gcs = gcsfs.GCSFileSystem(token=token)
```

You can then read / write private files with the ``gcs`` object.
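As a usage sketch, here is how you might write a small text file with the authenticated
filesystem from above (the bucket name is a placeholder for one your account can access):

```python
# write a small text file to your private bucket
with gcs.open('<your-bucket>/hello.txt', 'w') as f:
    f.write('Hello from the hub!')
```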
### Scratch Bucket

Some 2i2c environments are configured with a "scratch bucket," which
allows you to temporarily store data. Credentials to write to the scratch
bucket are pre-loaded into your Pangeo Cloud environment.

:::{warning}
Any data in scratch buckets will be deleted once it is 7 days old.
Do not use scratch buckets to store data permanently.
:::

The location of your scratch bucket is contained in the environment variable ``PANGEO_SCRATCH``.

And an example, here is how you would write Xarray data to the scratch bucket
in Zarr format.


```python
import os
import xarray as xr
PANGEO_SCRATCH = os.environ['PANGEO_SCRATCH'] # -> gs://pangeo-scratch/
ds = xr.tutorial.open_dataset("rasm") # load example data
ds.to_zarr(f'{PANGEO_SCRATCH}/rasm.zarr') # write data
```

:::{warning}
A common set of credentials is currently used for accessing scratch buckets.
This means users can read, and potentially remove / overwrite, each others'
data. You can avoid this problem by always using ``PANGEO_SCRATCH`` as a prefix.
Still, you should not store any sensitive or mission-critical data in
the scratch bucket.
:::

### Data Catalogs

To make it easier to discover and share data in your project, it is recommended to use
data catalogs.
[Intake](https://intake.readthedocs.io/en/latest/) is a popular tool for making
data catalogs in python.

Below is an example of an intake data catalog for loading Zarr data in Xarray from
OpenStorageNetwork.
(This example is borrowed from the [Ocean Eddy CPT project](https://github.com/ocean-eddy-cpt/cpt-data/blob/master/catalog.yaml).)

```yaml
plugins:
  source:
    - module: intake_xarray

sources:

  neverworld_five_day_averages:
    description: Five-day-average fields from Neverworld2
    driver: zarr
    args:
      urlpath: s3://Pangeo/ocean-eddy-cpt/5-day-averages/
      consolidated: True
      storage_options:
        anon: True
        client_kwargs:
          endpoint_url: 'https://ncsa.osn.xsede.org'

  neverworld_quarter_degree_snapshots:
    description: snapshots of fields from Neverworld2
    driver: zarr
    args:
      urlpath: s3://Pangeo/ocean-eddy-cpt/quarter-degree/snapshots/
      consolidated: True
      storage_options:
        anon: True
        client_kwargs:
          endpoint_url: 'https://ncsa.osn.xsede.org'
```

To use this catalog, place it online and share the URL with your team.

Here is an example of how to use this catalog file:

```python
import intake
cat_url = "https://raw.githubusercontent.com/ocean-eddy-cpt/cpt-data/master/catalog.yaml"
cat = intake.open_catalog(cat_url)
list(cat)  # discover what is in the catalog
ds = cat['neverworld_five_day_averages'].to_dask()  # open lazily with Xarray
```

From a745cae820dba9c4430be10b5c00114e10477b81 Mon Sep 17 00:00:00 2001
From: Ryan Abernathey
Date: Tue, 12 Apr 2022 15:45:55 -0400
Subject: [PATCH 2/8] fix admonition block

---
 user/storage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/user/storage.md b/user/storage.md
index 8345ec1..f8e01b3 100644
--- a/user/storage.md
+++ b/user/storage.md
@@ -8,7 +8,7 @@ The high-level summary of recommendations is:
- Consider whether your project would benefit from other cloud-native data storage
  solutions such as a database, data warehouse, or data lake

-:::{attribution}
+:::{admonition} Attribution
The following material was adapted from the
[Pangeo Cloud User Guide](https://pangeo.io/cloud.html)
:::

From c7712b9d7b539c4c3bf9fbd8660cffd0dae0f121 Mon Sep 17 00:00:00 2001
From: YuviPanda
Date: Wed, 20 Apr 2022 02:45:07 -0700
Subject: [PATCH 3/8] Replace ssh-keygen with gh-scoped-creds

Much secure, such power
---
 user/storage.md | 47 ++++++++++++++++-------------------------------
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/user/storage.md b/user/storage.md
index f8e01b3..18f5f52 100644
--- a/user/storage.md
+++ b/user/storage.md
@@ -41,37 +41,22 @@ The `shared` directory is not intended as a way for hub users to share data with each other.
The recommended way to move code in and out of the hub is via git / GitHub.
You should clone your project repo from the terminal and use git pull / git push to update and push changes.
In order to push data to GitHub from the hub, you will need to set up GitHub authentication.
-This is a very quick guide to getting your GitHub authentication set up,
-adapted from the [Carpentries GitHub Remotes lesson](https://swcarpentry.github.io/git-novice/07-github/index.html#ssh-background-and-setup).
-
-1. Open a terminal in JupyterHub
-1. Type the command
-   ```
-   ssh-keygen -t ed25519 -C "YOUR EMAIL ADDRESS GOES HERE"
-   ```
-   (Don't just copy this text; you have to put in your actual email address in between the quotes.)
-   This command will create an ssh public / private key pair.
-1. Enter a password for your new SSH key and record it in a safe place.
-   This password is used to "lock" the SSH key. It can't be used without the password.
-1. Type the command
-   ```
-   cat ~/.ssh/id_ed25519.pub
-   ```
-   and copy the result. It should look something like `ssh-ed25519 {long random string} {your email address}`.
-1. Go to <https://github.com/settings/keys>. Click the green button that says "New SSH Key".
-   Give your key the title "JupyterHub SSH Key for Research Computing" and paste the
-   public key from the previous step into the "Key" box.
-1. Verify that your key works by typing
-   ```
-   ssh -T git@github.com
-   ```
-   on the command line of the Hub. (Note you will have to enter your SSH key password from step 3.)
-   This will return a message of the following form
-   ```
-   Hi {username}! You've successfully authenticated, but GitHub does not provide shell access.
-   ```
-   If you see that, it works! 🚀
-
-You should now be able to push to GitHub from the hub.
+[gh-scoped-creds](https://github.com/yuvipanda/gh-scoped-creds/) should already be set up
+on your 2i2c-managed JupyterHub, and we shall use that to authenticate to GitHub for
+push / pull access.
+
+Open a terminal in JupyterHub, run `gh-scoped-creds` and follow the prompts.
+
+Alternatively, in a notebook, run the following code and follow the prompts:
+
+```
+import gh_scoped_creds
+%ghscopedcreds
+```
+
+You should now be able to push to GitHub from the hub! These credentials will expire after
+8 hours (or whenever your JupyterHub server stops), and you'll have to repeat these steps
+to fetch a fresh set of credentials.

From 61eb28dd54996997e816091daac98e7494908a04 Mon Sep 17 00:00:00 2001
From: YuviPanda
Date: Wed, 20 Apr 2022 17:17:35 -0700
Subject: [PATCH 4/8] Add a little more info about github app

---
 user/storage.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/user/storage.md b/user/storage.md
index 18f5f52..3c9c140 100644
--- a/user/storage.md
+++ b/user/storage.md
@@ -56,7 +56,13 @@ import gh_scoped_creds

You should now be able to push to GitHub from the hub! These credentials will expire after
8 hours (or whenever your JupyterHub server stops), and you'll have to repeat these steps
-to fetch a fresh set of credentials.
+to fetch a fresh set of credentials. Once you authenticate, you'll be provided with a link
+to a [GitHub App](https://docs.github.com/en/developers/apps/getting-started-with-apps/about-apps)
+that you have to [install](https://docs.github.com/en/developers/apps/managing-github-apps/installing-github-apps)
+on the repositories you want to be able to push to from this particular JupyterHub. You only
+need to do this once per JupyterHub, and can revoke access any time. You can always provide
+access to your own personal repositories, but might need approval from admins of GitHub
+organizations if you want to push to repos in those organizations.

From 4a625ebbbc066c685c0681bfb2f0b774d2f2230d Mon Sep 17 00:00:00 2001
From: Ryan Abernathey
Date: Thu, 21 Apr 2022 09:34:36 -0400
Subject: [PATCH 5/8] Apply suggestions from code review

Co-authored-by: Chris Holdgraf
---
 user/storage.md | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/user/storage.md b/user/storage.md
index 3c9c140..23a7c74 100644
--- a/user/storage.md
+++ b/user/storage.md
@@ -67,7 +67,7 @@ organizations if you want to push to repos in those organizations.

## Cloud Object Storage

Your hub lives in the cloud.
-The preferred way to store data in the cloud is using cloud object storage, such as Amazon S3 or Google Cloud Storage.
+The preferred way to store data in the cloud is using [cloud object storage](https://aws.amazon.com/what-is-cloud-object-storage/), such as Amazon S3 or Google Cloud Storage.
Cloud object storage is essentially a key/value storage system.
The keys are strings, and the values are bytes of data.

@@ -100,7 +100,7 @@ It is recommended to use cloud-native formats when working with big data in cloud object storage.
From a user perspective, the main challenge of working with object storage is the need
to use more specialized tools, rather than just simple files / filenames, to manage data.
-Fortunately, excellent tools exists to make working with object storage easy and familiar.
+Fortunately, excellent tools exist to make working with object storage easy and familiar.

For python users, the main tool is [filesystem spec](https://filesystem-spec.readthedocs.io/en/latest/)
(fsspec), a set of packages which enable us to work with many different types of storage.

#### Reading Data

When reading data from cloud object storage, you have two general options:
-- Download the data to the local filesystem; this is fine for small data, but not suitable
+- Download the data to the local filesystem; this is fine for small data, but not suitable for
  large data or cloud-optimized datasets. Downloads can be managed with
  [Pooch](https://www.fatiando.org/pooch/latest/) or fsspec.
- Open the data with an application that understands how to stream data

Writing data (and reading private data) requires credentials for authentication.
2i2c does not provide credentials to individual users.
-Instead you 2i2c customers should manage their own cloud storage directly.
+Instead, 2i2c customers should manage their own cloud storage directly.
+See [the Amazon S3](https://aws.amazon.com/s3/getting-started/), [Google Cloud Storage](https://cloud.google.com/storage), and [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/) instructions for information on getting started.
+
+:::{note}
+This section refers to "S3 Storage" in a generic sense.
+Amazon S3 is the most well-known form of S3 storage, but something like it exists across each major cloud provider as well.
+:::
+
On S3-type storage, you will have a client key and client secret associated with your account.
The following code creates a writeable filesystem:

to `S3FileSystem`.

For Google Cloud Storage, the best practice is to create a
[service account](https://cloud.google.com/iam/docs/service-accounts) with
-appropriate permissions to read / write your private bucket.
-You upload your service account key (a .json file) to your hub
+appropriate permissions to read / write to your private bucket.
+You upload your service account key (a `.json` file) to your hub
home directory and then use it as follows:

```python

You can then read / write private files with the ``gcs`` object.

### Scratch Bucket

Some 2i2c environments are configured with a "scratch bucket," which
-allows you to temporarily store data. Credentials to write to the scratch
-bucket are pre-loaded into your Pangeo Cloud environment.
+allows you to temporarily store data (for example, when you need to store intermediate files during data transformations).
+Credentials to write to the scratch
+bucket are pre-loaded into your Hub's user environment.

:::{warning}
Any data in scratch buckets will be deleted once it is 7 days old.
Do not use scratch buckets to store data permanently.
:::

-The location of your scratch bucket is contained in the environment variable ``PANGEO_SCRATCH``.
+The location of your scratch bucket is contained in the environment variable ``SCRATCH_BUCKET``.

-And an example, here is how you would write Xarray data to the scratch bucket
+For example, here is how you would write Xarray data to the scratch bucket
in Zarr format.


```python
import os
import xarray as xr
-PANGEO_SCRATCH = os.environ['PANGEO_SCRATCH'] # -> gs://pangeo-scratch/
+SCRATCH_BUCKET = os.environ['SCRATCH_BUCKET']
ds = xr.tutorial.open_dataset("rasm") # load example data
-ds.to_zarr(f'{PANGEO_SCRATCH}/rasm.zarr') # write data
+ds.to_zarr(f'{SCRATCH_BUCKET}/rasm.zarr') # write data
```
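+
+To read your data back later, you can lazily re-open the store you just wrote.
+This is a sketch, assuming the `rasm.zarr` store from the example above:
+
+```python
+ds2 = xr.open_zarr(f'{SCRATCH_BUCKET}/rasm.zarr')  # open lazily with Xarray
+```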
:::{warning}
A common set of credentials is currently used for accessing scratch buckets.
This means users can read, and potentially remove / overwrite, each others'
-data. You can avoid this problem by always using ``PANGEO_SCRATCH`` as a prefix.
+data. You can avoid this problem by always using ``SCRATCH_BUCKET`` as a prefix.
Still, you should not store any sensitive or mission-critical data in
the scratch bucket.
:::

From 9f852224531b52243d59e7c5e6eaeafd65a53718 Mon Sep 17 00:00:00 2001
From: Chris Holdgraf
Date: Thu, 21 Apr 2022 06:49:38 -0700
Subject: [PATCH 6/8] Update user/storage.md

---
 user/storage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/user/storage.md b/user/storage.md
index 23a7c74..6e36f15 100644
--- a/user/storage.md
+++ b/user/storage.md
@@ -108,7 +108,7 @@
- **[s3fs](https://s3fs.readthedocs.io/en/latest/)** - for working with AWS S3
  (Simple Storage Service) and compatible APIs. Most third-party object storage
-  services (e.g. [Wasabi](https://wasabi.com/) and [Open Storage Network](https://www.openstoragenetwork.org/))
+  services (e.g. [Wasabi](https://wasabi.com/) and [Open Storage Network](https://openstoragenetwork.org/))
  are compatible with S3.
- **[gcsfs](https://gcsfs.readthedocs.io/en/latest/)** - for working with Google
  Cloud Storage.

From ee45403c4475a835a9b7a0c6b9b5638a9eb057ec Mon Sep 17 00:00:00 2001
From: Chris Holdgraf
Date: Thu, 21 Apr 2022 15:53:53 +0200
Subject: [PATCH 7/8] Fix linkcheck

---
 conf.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/conf.py b/conf.py
index 8b9c4ec..20a9cb0 100644
--- a/conf.py
+++ b/conf.py
@@ -65,6 +65,9 @@
# Disable linkcheck for anchors because it throws false errors for any JS anchors
linkcheck_anchors = False
+linkcheck_ignore = [
+    "*openstoragenetwork.org*",  # It incorrectly fails with `Max retries exceeded with url`
+]

def setup(app):

From 316424d4ce0c51c88290738d03bde6ad33931531 Mon Sep 17 00:00:00 2001
From: Chris Holdgraf
Date: Thu, 21 Apr 2022 07:10:55 -0700
Subject: [PATCH 8/8] Update conf.py

---
 conf.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/conf.py b/conf.py
index f9e21c8..df6620b 100644
--- a/conf.py
+++ b/conf.py
@@ -66,7 +66,7 @@
# Disable linkcheck for anchors because it throws false errors for any JS anchors
linkcheck_anchors = False
linkcheck_ignore = [
-    "*openstoragenetwork.org*",  # It incorrectly fails with `Max retries exceeded with url`
+    "https://openstoragenetwork.org*",  # It incorrectly fails with `Max retries exceeded with url`
    "https://docs.github.com*",  # Because docs.github.com returns 403 Forbidden errors
]