DAOS-14226 docker: deployment of vcluster with md-on-ssd (#13087)
Update the DAOS docker vcluster scripts so that a minimal docker DAOS system can be deployed with the md-on-ssd feature.
This PR also fixes miscellaneous minor issues such as the default DAOS RPM repos, variable naming, etc.
The Doc-only pragma has been used because nothing related to these docker scripts is tested by the CI yet.

Signed-off-by: Cedric Koch-Hofer <[email protected]>
knard38 authored Mar 27, 2024
1 parent f0837e4 commit 6c59aa0
Showing 25 changed files with 488 additions and 459 deletions.
178 changes: 37 additions & 141 deletions docs/QSG/docker.md
@@ -2,14 +2,12 @@

This section describes how to build and deploy Docker images that simulate a small cluster
using DAOS as backend storage. This small cluster is composed of the following three nodes:

- The `daos-server` node running a DAOS server daemon managing data storage devices such as SCM or
NVMe disks.
- The `daos-admin` node, used to manage the DAOS server with the `dmg` command.
- The `daos-client` node using the DAOS server to store data.

At this time, only emulated storage hardware is supported by this Docker platform:

- SCM (i.e. Storage Class Memory) is emulated with standard RAM.
- NVMe disks are emulated with a file device.

@@ -45,7 +43,7 @@ The platform was tested and validated with the following dependencies:
 [RPMs](https://download.docker.com/linux/centos/docker-ce.repo)
 - [DAOS 2.6](https://docs.daos.io/v2.6/) local RPMS builds from [DAOS master
 branch](https://github.com/daos-stack/daos/tree/master)
-- [rockylinux/rockylinux:8.6](https://hub.docker.com/r/rockylinux/rockylinux/) official docker
+- [rockylinux/rockylinux:8.9](https://hub.docker.com/r/rockylinux/rockylinux/) official docker
 images.

### Configuring HugePages
@@ -75,93 +73,58 @@ $ sysctl -p
### Base DAOS Image

The first image to create is the `daos-base` image which is not intended to be used as is, but as
a base image for building the other three daos images. This first image could be built directly
from GitHub with the following command:

```bash
$ docker build --tag daos-base:rocky8.6 \
https://github.com/daos-stack/daos.git#master:utils/docker/vcluster/daos-base/el8
```

This Docker file accept the following arguments:

- `RHEL_BASE_IMAGE`: Base docker image to use (default "rockylinux/rockylinux")
- `RHEL_BASE_VERSION`: Version of the base docker image to use (default "8.6")
a base image for building the other three daos images. The easiest way is to use the `docker
compose` subcommand from a local DAOS source file tree. The first step is to update the docker
environment file "utils/docker/examples/.env" according to the targeted DAOS system. The following
environment variables allow customizing the Docker image to build:
- `LINUX_DISTRO`: Linux distribution identifier (default "el8")
- `DAOS_DOCKER_IMAGE_NSP`: Namespace identifier of the base DAOS docker image (default "daos")
- `DAOS_DOCKER_IMAGE_TAG`: Tag identifier of the base DAOS docker image (default "v2.4.1")
- `BUST_CACHE`: Manage docker building cache (default ""). To invalidate the cache, a random value
such as the date of the day shall be given.
- `DAOS_AUTH`: Enable DAOS authentication when set to "yes" (default "yes")
such as the date of day shall be given.
- `LINUX_IMAGE_NAME`: Base docker image name to use (default "rockylinux/rockylinux")
- `LINUX_IMAGE_TAG`: Tag identifier of the base docker image to use (default "8.9")
- `DAOS_REPOS`: Space separated list of repos needed to install DAOS (default
"https://packages.daos.io/v2.2/EL8/packages/x86\_64/")
"https://packages.daos.io/v2.4.1/EL8/packages/x86\_64/")
- `DAOS_GPG_KEYS`: Space separated list of GPG keys associated with DAOS repos (default
"https://packages.daos.io/RPM-GPG-KEY")
"https://packages.daos.io/v2.4.1/RPM-GPG-KEY-2023")
- `DAOS_REPOS_NOAUTH`: Space separated list of repos to use without GPG authentication
(default "")

For example, building a DAOS base image, with authentication disabled, could be done with the
following command:

```bash
$ docker build --tag daos-base:rocky8.6 --build-arg DAOS_AUTH=no \
https://github.com/daos-stack/daos.git#master:utils/docker/vcluster/daos-base/el8
```

It is also possible to build the `daos-base` image from a local tree with the following command:

```bash
$ docker build --tag daos-base:rocky8.6 utils/docker/vcluster/daos-base/el8
```

From a local tree, a more straightforward way to build these images could be done with
`docker compose`:
(default "")
- `DAOS_VERSION`: Version of DAOS to use (default "2.4.1-2.el8")
- `DAOS_AUTH`: Enable DAOS authentication when set to "yes" (default "yes")

When the environment file has been properly filled, run the following command to build the base DAOS
docker image.
```bash
$ docker compose --file utils/docker/vcluster/docker-compose.yml build daos_base
```
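
Filling the environment file by hand works, but for repeated builds it can be convenient to script the update. The following is a minimal sketch under stated assumptions: `set_env_var` is a hypothetical helper, not part of the DAOS scripts, and it assumes the environment file uses the simple `KEY="value"` form shown in this commit. It rewrites the file so that the given key appears exactly once, at the end.

```shell
#!/bin/bash
# Hypothetical helper (not part of the DAOS tree): set KEY="VALUE" in an
# environment file that uses simple KEY="value" lines.
set_env_var() {
    local file="$1" key="$2" value="$3" tmp
    tmp="$(mktemp)"
    # Keep every line except an existing assignment of KEY.
    # grep exits non-zero when nothing is left to print, hence the "|| true".
    grep -v "^${key}=" "$file" > "$tmp" 2>/dev/null || true
    # Append the new assignment, quoted like the other entries.
    printf '%s="%s"\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the original file permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For instance, `set_env_var path/to/.env BUST_CACHE "$(date +%F)"` would store the date of the day as the cache-busting value before the build command is run; the path is a placeholder for whichever environment file the build actually reads.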

The same arguments are accepted but they have to be defined in the Docker Compose environment file
`utils/docker/vcluster/.env`.
!!! warning
To work properly, DAOS authentication has to be enabled in all the images (i.e. node
images and base image).

### DAOS Nodes Images

The three images `daos-server`, `daos-admin` and `daos-client` could be built directly from GitHub
or from a local tree in the same way as for the `daos-base` image. Following command could be used
to build directly the three images from GitHub:

```bash
$ for image in daos-server daos-admin daos-client ; do \
docker build --tag "$image:rocky8.6" \
"https://github.com/daos-stack/daos.git#master:utils/docker/vcluster/$image/el8"; \
done
```

The Docker file of the `daos-server` image accept the following arguments:

- `DAOS_BASE_IMAGE`: Base docker image to use (default "daos-base")
- `DAOS_BASE_VERSION`: Version of the base docker image to use (default "rocky8.6")
To build the three docker images `daos-server`, `daos-admin` and `daos-client`, the first step
is to update the docker environment file "utils/docker/examples/.env" according to the targeted DAOS
system. The `daos-server`, `daos-client` and `daos-admin` images can be customized with the following
environment variables:
- `DAOS_DOCKER_IMAGE_TAG`: Tag identifier of the base DAOS docker image to use (default "v2.4.1")
- `DAOS_VERSION`: Version of DAOS to use (default "2.4.1-2.el8")
- `DAOS_AUTH`: Enable DAOS authentication when set to "yes" (default "yes")

The `daos-server` image also uses the following environment variables:
- `DAOS_HUGEPAGES_NBR`: Number of huge pages to allocate for SPDK (default 4096)
- `DAOS_SCM_SIZE`: Size in GB of the RAM emulating SCM devices (default 4)
- `DAOS_BDEV_SIZE`: Size in GB of the file created to emulate NVMe devices (default 16)
- `DAOS_IFACE_NAME`: Fabric network interface used by the DAOS engine (default "eth0")
- `DAOS_MD_ON_SSD`: Enable DAOS MD-on-SSD feature when set to "yes" (default "no")

!!! note
The IP address of the network interface referenced by the `DAOS_IFACE_NAME` argument will be
required when starting DAOS.
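
As an illustration of the server variables listed above, an environment file enabling MD-on-SSD might contain the entries below. This is only a sketch: the variable names and default values are taken from the list above, while the quoting style is an assumption copied from the other entries of the `.env` file in this commit.

```shell
# Hypothetical excerpt of the Docker Compose environment file.
# Enable the MD-on-SSD feature (documented default is "no").
DAOS_MD_ON_SSD="yes"
# Size in GB of the RAM emulating SCM devices (documented default).
DAOS_SCM_SIZE="4"
# Size in GB of the file emulating NVMe devices (documented default).
DAOS_BDEV_SIZE="16"
# Fabric network interface used by the DAOS engine (documented default).
DAOS_IFACE_NAME="eth0"
```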

The Dockerfile of the `daos-client` and `daos-admin` images accept the following arguments:

- `DAOS_BASE_IMAGE`: Base docker image to use (default "daos-base")
- `DAOS_BASE_VERSION`: Version of the base docker image to use (default "rocky8.6")
- `DAOS_AUTH`: Enable DAOS authentication when set to "yes" (default "yes")
- `DAOS_ADMIN_USER`: Name or uid of the daos administrator user (default "root")
- `DAOS_ADMIN_GROUP`: Name or gid of the daos administrator group (default "root")

!!! warning
To work properly, DAOS authentication has to be enabled in all the images (i.e. node
images and base image).

The Dockerfile of the `daos-client` image accept the following arguments:

The `daos-client` image also uses the following environment variables:
- `DAOS_AGENT_IFACE_CFG`: Enable manual configuration of the interface used by the agent (default
"yes")
- `DAOS_AGENT_IFACE_NUMA_NODE`: NUMA node of the interface used by the agent (default "0").
@@ -175,88 +138,21 @@ The Dockerfile of the `daos-client` image accept the following arguments:
On most systems, `DAOS_IFACE_CFG` should be enabled: the DAOS network interface
auto-detection cannot yet be done properly inside a DAOS Agent Docker container.

From a local tree, a more straightforward way to build these images could be done with
`docker compose`:

When the environment file has been properly filled, run the following command to build the docker
images:
```bash
$ docker compose --file utils/docker/vcluster/docker-compose.yml build daos_server daos_admin daos_client
```

The same arguments are accepted but they have to be defined in the Docker Compose environment file
`utils/docker/vcluster/.env`.

!!! warning
To work properly, DAOS authentication has to be enabled in all the images (i.e. node
images and base image).

## Running the DAOS Containers

### Via Docker Commands

Once the images are created, the containers could be directly started with docker with the following
commands:

```bash
$ export DAOS_IFACE_IP=x.x.x.x
$ docker run --detach --privileged --name=daos-server --hostname=daos-server \
--add-host "daos-server:$DAOS_IFACE_IP" --add-host "daos-admin:$DAOS_IFACE_IP" \
--add-host "daos-client:$DAOS_IFACE_IP" --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume=/dev/hugepages:/dev/hugepages --tmpfs=/run --network=host \
daos-server:rocky8.6
$ docker run --detach --privileged --name=daos-agent --hostname=daos-agent \
--add-host "daos-server:$DAOS_IFACE_IP" --add-host "daos-admin:$DAOS_IFACE_IP" \
--add-host "daos-client:$DAOS_IFACE_IP" --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
--tmpfs=/run --network=host daos-agent:rocky8.6
$ docker run --detach --privileged --name=daos-client --hostname=daos-client \
--add-host "daos-server:$DAOS_IFACE_IP" --add-host "daos-admin:$DAOS_IFACE_IP" \
--add-host "daos-client:$DAOS_IFACE_IP" --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
--tmpfs=/run --network=host daos-client:rocky8.6
```

The value of the `DAOS_IFACE_IP` shall be replaced with the one of the network interface which was
provided when the images have been built.

Once started, the DAOS server waits for the administrator to format the system.
This can be done using the following command:

```bash
$ docker exec daos-admin dmg -i storage format
```

Upon successful completion of the format, the storage engine is started, and pools
can be created using the daos admin tool. For more advanced configurations and usage refer to the
section [DAOS Tour](https://docs.daos.io/v2.6/QSG/tour/).


### Via Docker Compose

From a local tree, a more straightforward way to start the containers could be done with
`docker compose`:

```bash
$ docker compose --file utils/docker/vcluster/docker-compose.yml up --detach
```

!!! note
Before starting the containers with `docker compose`, the IP address of the network interface,
which was provided when the images have been built, shall be defined in the Docker
Compose environment file `utils/docker/vcluster/.env`.

As with the docker command, the system shall be formatted, pools created, etc..


### Via Custom Scripts

From a local tree, the bash script `utils/docker/vcluster/daos-cm.sh` could be used to start the
containers and setup a simple DAOS system composed of the following elements:

Once the images are created, the bash script `utils/docker/vcluster/daos-cm.sh` can be used to
start the containers and set up a simple DAOS system composed of the following elements:
- 1 DAOS pool of 10GB (the size of the pool is configurable)
- 1 DAOS POSIX container mounted on /mnt/daos-posix-fs
This script can also be used to stop and to monitor the containers.

This script could also be used to respectively stop and monitor the containers.

More details on the usage of `daos-cm.sh` command could be found with running the following command:

To get more details on the usage of `daos-cm.sh` run the following command:
```bash
$ utils/docker/vcluster/daos-cm.sh --help
```
10 changes: 5 additions & 5 deletions utils/docker/examples/.env
@@ -21,10 +21,10 @@ DAOS_AGENT_CERTS_TXZ="secrets/daos_agent-certs.txz"
 BUST_CACHE=""
 LINUX_DISTRO="el8"
 LINUX_IMAGE_NAME="rockylinux/rockylinux"
-LINUX_IMAGE_TAG="8.8"
-DAOS_REPOS="https://packages.daos.io/v2.4/EL8/packages/x86_64/"
-DAOS_GPG_KEYS="https://packages.daos.io/v2.4.0/RPM-GPG-KEY-2023"
+LINUX_IMAGE_TAG="8.9"
+DAOS_REPOS="https://packages.daos.io/v2.4.1/EL8/packages/x86_64/"
+DAOS_GPG_KEYS="https://packages.daos.io/v2.4.1/RPM-GPG-KEY-2023"
 DAOS_REPOS_NOAUTH=""
-DAOS_VERSION="2.4.0-2.el8"
+DAOS_VERSION="2.4.1-2.el8"
 DAOS_DOCKER_IMAGE_NSP="daos"
-DAOS_DOCKER_IMAGE_TAG="v2.4.0"
+DAOS_DOCKER_IMAGE_TAG="v2.4.1"
12 changes: 6 additions & 6 deletions utils/docker/examples/README.md
@@ -15,7 +15,7 @@ The platform was tested and validated with the following dependencies:
 - [Docker CE](https://docs.docker.com/engine/install/centos/) latest
 [RPMs](https://download.docker.com/linux/centos/docker-ce.repo)
 - [DAOS 2.4](https://docs.daos.io/v2.4/) official [RPMS](https://packages.daos.io/v2.4/)
-- [rockylinux/rockylinux:8.8](https://hub.docker.com/r/rockylinux/rockylinux/) official docker
+- [rockylinux/rockylinux:8.9](https://hub.docker.com/r/rockylinux/rockylinux/) official docker
 images.


@@ -34,18 +34,18 @@ properly build a docker image:
 The following environment variables allow to customize the Docker image to build:
 - `LINUX_DISTRO`: Linux distribution identifier (default "el8")
 - `DAOS_DOCKER_IMAGE_NSP`: Namespace identifier of the base DAOS docker image (default "daos")
-- `DAOS_DOCKER_IMAGE_TAG`: Tag identifier of the base DAOS docker image (default "v2.4.0")
+- `DAOS_DOCKER_IMAGE_TAG`: Tag identifier of the base DAOS docker image (default "v2.4.1")
 - `BUST_CACHE`: Manage docker building cache (default ""). To invalidate the cache, a random value
 such as the date of day shall be given.
 - `LINUX_IMAGE_NAME`: Base docker image name to use (default "rockylinux/rockylinux")
-- `LINUX_IMAGE_TAG`: Tag identifier of the base docker image to use (default "8.8")
+- `LINUX_IMAGE_TAG`: Tag identifier of the base docker image to use (default "8.9")
 - `DAOS_REPOS`: Space separated list of repos needed to install DAOS (default
-"https://packages.daos.io/v2.4/EL8/packages/x86\_64/")
+"https://packages.daos.io/v2.4.1/EL8/packages/x86\_64/")
 - `DAOS_GPG_KEYS`: Space separated list of GPG keys associated with DAOS repos (default
-"https://packages.daos.io/v2.4.0/RPM-GPG-KEY-2023")
+"https://packages.daos.io/v2.4.1/RPM-GPG-KEY-2023")
 - `DAOS_REPOS_NOAUTH`: Space separated list of repos to use without GPG authentication
 (default "")
-- `DAOS_VERSION`: Version of DAOS to use (default "2.4.0-2.el8")
+- `DAOS_VERSION`: Version of DAOS to use (default "2.4.1-2.el8")

 When the environment file has been properly filled, run the following command to build the base DAOS
 docker image.
26 changes: 16 additions & 10 deletions utils/docker/examples/daos-admin/el8/Dockerfile
@@ -1,24 +1,30 @@
-# Copyright 2021-2023 Intel Corporation
+# Copyright 2021-2024 Intel Corporation
 # All rights reserved.
 #
 # 'recipe' for building a base RHEL DAOS admin image
 #
 # This Dockerfile accept the following input build arguments:
-# - LINUX_DISTRO Linux distribution identifier (default "el8")
-# - DAOS_DOCKER_IMAGE_NSP Namespace identifier of the base DAOS docker image (default "daos")
-# - DAOS_DOCKER_IMAGE_TAG Tag identifier of the DAOS client docker image (default "v2.4.0")
-# - DAOS_VERSION Version of DAOS to use (default "2.4.0-2.el8")
+# - LINUX_DISTRO Linux distribution identifier (mandatory)
+# - DAOS_DOCKER_IMAGE_NSP Namespace identifier of the base DAOS docker image (mandatory)
+# - DAOS_DOCKER_IMAGE_TAG Tag identifier of the DAOS client docker image (mandatory)
+# - DAOS_VERSION Version of DAOS to use (mandatory)

 # Pull base image
-ARG LINUX_DISTRO="el8"
-ARG DAOS_DOCKER_IMAGE_NSP="daos"
-ARG DAOS_DOCKER_IMAGE_TAG="v2.4.0"
+ARG LINUX_DISTRO=""
+ARG DAOS_DOCKER_IMAGE_NSP=""
+ARG DAOS_DOCKER_IMAGE_TAG=""
 FROM "$DAOS_DOCKER_IMAGE_NSP/daos-base-$LINUX_DISTRO:$DAOS_DOCKER_IMAGE_TAG"
 LABEL maintainer="[email protected]"

 # Install DAOS package
-ARG DAOS_VERSION="2.4.0-2.el8"
-RUN echo "[INFO] Installing DAOS containerization dependencies" ; \
+ARG DAOS_VERSION=""
+RUN for it in DAOS_VERSION ; do \
+if eval "[[ -z \$$it ]]" ; then \
+echo "[ERROR] Docker build argument $it is not defined" ; \
+exit 1 ; \
+fi ; \
+done && \
+echo "[INFO] Installing DAOS containerization dependencies" ; \
 dnf install \
 sudo \
 xz && \
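The new `RUN` step above makes `DAOS_VERSION` mandatory by looping over argument names and using `eval` to test whether the variable of that name is empty. The same check can be sketched as a standalone bash function. Note that this sketch swaps `eval` for bash indirect expansion (`${!name}`), and that `require_vars` is a hypothetical name introduced here for illustration, not something defined in the DAOS tree.

```shell
#!/bin/bash
# Hypothetical helper: fail when any of the named variables is empty or unset.
require_vars() {
    local name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable
        # whose name is stored in $name. ":-" keeps this safe under "set -u".
        if [[ -z "${!name:-}" ]]; then
            echo "[ERROR] Docker build argument $name is not defined" >&2
            return 1
        fi
    done
}
```

Called as `require_vars DAOS_VERSION`, the function returns a non-zero status when the variable is empty, which mirrors the `exit 1` branch of the Dockerfile loop. Like the `[[ ... ]]` test in the Dockerfile, it requires bash rather than a plain POSIX shell.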
2 changes: 1 addition & 1 deletion utils/docker/examples/daos-admin/el8/daos-bash.sh
@@ -1,7 +1,7 @@
 #!/bin/bash

 # set -x
-set -e -o pipefail
+set -u -e -o pipefail

 if [[ "$(id -u)" != "0" ]] ; then
 echo "[ERROR] daos-bash can only be run as root"
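The only change in this script is the added `-u` option, which makes bash treat the expansion of an unset variable as an error instead of silently substituting an empty string. A small sketch of the difference follows; `run_with_opts` and `DAOS_DEMO_UNSET_VAR` are hypothetical names used only for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a snippet in a child bash with the given "set"
# options and print the exit status of that child shell.
run_with_opts() {
    local opts="$1" snippet="$2" status=0
    bash -c "set $opts; $snippet" >/dev/null 2>&1 || status=$?
    echo "$status"
}

# Without -u, the unset variable expands to an empty string and echo succeeds.
run_with_opts '-e -o pipefail' 'echo "$DAOS_DEMO_UNSET_VAR"'

# With -u, bash reports an unbound variable and the child shell fails.
run_with_opts '-u -e -o pipefail' 'echo "$DAOS_DEMO_UNSET_VAR"'
```

The first call prints `0`, while the second prints a non-zero status, assuming `DAOS_DEMO_UNSET_VAR` is not set in the calling environment.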
26 changes: 16 additions & 10 deletions utils/docker/examples/daos-agent/el8/Dockerfile
@@ -1,24 +1,30 @@
-# Copyright 2021-2023 Intel Corporation
+# Copyright 2021-2024 Intel Corporation
 # All rights reserved.
 #
 # 'recipe' for building a base RHEL DAOS client docker image
 #
 # This Dockerfile accept the following input build arguments:
-# - LINUX_DISTRO Linux distribution identifier (default "el8")
-# - DAOS_DOCKER_IMAGE_NSP Namespace identifier of the base DAOS docker image (default "daos")
-# - DAOS_DOCKER_IMAGE_TAG Tag identifier of the DAOS client docker image (default "v2.4.0")
-# - DAOS_VERSION Version of DAOS to use (default "2.4.0-2.el8")
+# - LINUX_DISTRO Linux distribution identifier (mandatory)
+# - DAOS_DOCKER_IMAGE_NSP Namespace identifier of the base DAOS docker image (mandatory)
+# - DAOS_DOCKER_IMAGE_TAG Tag identifier of the DAOS client docker image (mandatory)
+# - DAOS_VERSION Version of DAOS to use (mandatory)

 # Pull base image
-ARG LINUX_DISTRO="el8"
-ARG DAOS_DOCKER_IMAGE_NSP="daos"
-ARG DAOS_DOCKER_IMAGE_TAG="v2.4.0"
+ARG LINUX_DISTRO=""
+ARG DAOS_DOCKER_IMAGE_NSP=""
+ARG DAOS_DOCKER_IMAGE_TAG=""
 FROM "$DAOS_DOCKER_IMAGE_NSP/daos-base-$LINUX_DISTRO:$DAOS_DOCKER_IMAGE_TAG"
 LABEL maintainer="[email protected]"

 # Install DAOS package
-ARG DAOS_VERSION="2.4.0-2.el8"
-RUN echo "[INFO] Installing DAOS containerization dependencies" ; \
+ARG DAOS_VERSION=""
+RUN for it in DAOS_VERSION ; do \
+if eval "[[ -z \$$it ]]" ; then \
+echo "[ERROR] Docker build argument $it is not defined" ; \
+exit 1 ; \
+fi ; \
+done && \
+echo "[INFO] Installing DAOS containerization dependencies" ; \
 dnf install \
 sudo \
 xz && \
2 changes: 1 addition & 1 deletion utils/docker/examples/daos-agent/el8/run-daos_agent.sh
@@ -1,7 +1,7 @@
 #!/bin/bash

 # set -x
-set -e -o pipefail
+set -u -e -o pipefail

 if [[ "$(id -u)" != "0" ]] ; then
 echo "[ERROR] run-daos_agent can only be run as root"