Running Nuclio Over Kubernetes in Production

After familiarizing yourself with Nuclio and deploying it over Kubernetes, you might find yourself in need of more information pertaining to running Nuclio in production. Nuclio is integrated, for example, within the Iguazio Data Science Platform, which is used extensively in production, both by Iguazio and its customers, running various workloads. This document describes advanced configuration options and best-practice guidelines for using Nuclio in a production environment.

The preferred deployment method

There are several alternatives to deploying (installing) Nuclio in production, but the recommended method is by using Helm charts. This is currently the preferred deployment method at Iguazio as it's the most tightly maintained, it's best suited for "heavy lifting" over Kubernetes, and it's often used to roll out new production-oriented features.

Following is a quick example of how to use Helm charts to set up a specific stable version of Nuclio.

  1. Create a namespace for your Nuclio functions:

    kubectl create namespace nuclio
  2. Create a secret with valid credentials for logging into your target container (Docker) registry:

    read -s mypassword
    <enter your password>
    
    kubectl --namespace nuclio create secret docker-registry registry-credentials \
        --docker-username <username> \
        --docker-password "$mypassword" \
        --docker-server <URL> \
        --docker-email <some email>
    
    unset mypassword

    Note: If you are using Amazon ECR, see the "Using Kaniko with Amazon Elastic Container Registry (ECR)" section.

  3. Add the Nuclio Helm repository and install the chart:

    helm repo add nuclio https://nuclio.github.io/nuclio/charts
    helm install nuclio \
        --set registry.secretName=registry-credentials \
        --set registry.pushPullUrl=<your registry URL> \
        nuclio/nuclio

Note: For a full list of configuration parameters, see the Helm values file (values.yaml).

Multi-Tenancy

Multi-tenancy can be implemented in many ways and to various degrees. The experience of the Nuclio team has led to the adoption of the Kubernetes approach of tenant isolation using namespaces. Note the following:

  • To achieve tenant separation for various Nuclio projects and functions, and to avoid cross-tenant contamination and resource races, a fully functioning Nuclio deployment is used in each namespace and the Nuclio controller is configured to be namespaced. This means that the controller handles Nuclio resources (functions, function events, and projects) only within its own namespace. This is supported by using the controller.namespace and rbac.crdAccessMode Helm values configurations.
  • To provide ample separation at the level of the container registry, it's highly recommended that the Nuclio deployments of multiple tenants either don't share container registries, or that they don't share a tenant when using a multi-tenant registry (such as registry.hub.docker.com or quay.io).
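
As a rough sketch of the per-namespace approach described above, the following loop writes one Helm values file per tenant. The tenant names (`tenant-a`, `tenant-b`) and the file names are hypothetical, and the `namespaced` value for rbac.crdAccessMode is an assumption that should be verified against the chart's values.yaml. The kubectl and helm commands are left as comments, so the snippet only generates files:

```sh
# Hypothetical tenant namespaces; replace with your own.
TENANTS="tenant-a tenant-b"

for tenant in ${TENANTS}; do
    # One values file per tenant, so that each Nuclio deployment is
    # scoped to its own namespace.
    cat > "nuclio-values-${tenant}.yaml" << EOF
controller:
  # Restrict the controller to resources in this namespace.
  namespace: ${tenant}
rbac:
  # Assumed value; verify against the chart's values.yaml.
  crdAccessMode: namespaced
EOF
    # Then, for each tenant (requires kubectl, helm, and cluster access):
    #   kubectl create namespace ${tenant}
    #   helm install nuclio nuclio/nuclio --namespace ${tenant} \
    #       --values nuclio-values-${tenant}.yaml
done
```

Keeping a separate values file per tenant also makes it easier to give each tenant its own registry settings, in line with the registry-separation recommendation.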

Freezing a qualified version

When working in production, you need reproducibility and consistency. It's therefore recommended that you don't use the latest stable version, but rather qualify a specific Nuclio version and "freeze" it in your configuration. Stick with this version until you qualify a newer version for your system. Because Nuclio adheres to backwards-compatibility standards between patch versions, and even minor version updates don't typically break major functionality, the process of qualifying a newer Nuclio version should generally be short and easy.

To use Helm to freeze a specific Nuclio version, set all of the *.image.repository and *.image.tag Helm values to the names and tags that represent the images for your chosen version. Note that the configured images must be accessible to your Kubernetes deployment (which is especially relevant for air-gapped deployments).
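
A minimal sketch of such a freeze, assuming that only the controller and dashboard images need pinning and that the tags follow the `<version>-amd64` form used elsewhere in this document. The file name is hypothetical, and the *.image.repository values are left at the chart defaults here:

```sh
# Replace <version> with the Nuclio version that you qualified.
NUCLIO_VERSION="<version>"

cat > nuclio-frozen-values.yaml << EOF
controller:
  image:
    tag: "${NUCLIO_VERSION}-amd64"
dashboard:
  image:
    tag: "${NUCLIO_VERSION}-amd64"
EOF

# Apply with (requires helm and cluster access):
#   helm upgrade --install --reuse-values nuclio nuclio/nuclio \
#       --values nuclio-frozen-values.yaml
```

Keeping the pinned tags in a version-controlled values file, rather than in ad hoc --set flags, makes it easier to review the change when you qualify a newer version.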

Air-gapped deployment

Nuclio is fully compatible with execution in air-gapped environments ("dark sites") and supports the configuration needed to avoid any outside access. The following guidelines refer to more advanced use cases and assume that you can handle the related DevOps tasks. Such implementations can get tricky; for a fully managed, air-gap friendly, "batteries-included" Nuclio deployment, which also offers many other tools and features, check out the enterprise-grade Iguazio Data Science Platform. If you choose to handle the implementation yourself, follow these guidelines; the referenced configurations are all Helm values:

  • Set *.image.repository and *.image.tag to freeze a qualified version, and ensure that the configured images are accessible to the Kubernetes deployment.

  • Set *.image.pullPolicy to Never or to IfNotPresent to ensure that Kubernetes doesn't try to fetch the images from the web.

  • Set offline to true to put Nuclio in "offline" mode.

  • Set dashboard.baseImagePullPolicy to Never.

  • Set registry.pushPullUrl to a registry URL that's reachable from your system.

  • Ensure that base, "onbuild", and processor images are accessible to the dashboard in your environment, as they're required for the build process (either by docker build or Kaniko). You can achieve this using either of the following methods:

    • Make the images available on the host Docker daemon (local cache).
    • Preload the images to a registry that's accessible to your system, to allow pulling the images from the registry. When using this method, set registry.dependantImageRegistryURL to the URL of an accessible local registry that contains the preloaded images (thus overriding the default location of quay.io/nuclio, which isn't accessible in air-gapped environments).

      Note: To save yourself some work, you can use the prebaked Nuclio registry, either as-is or as a reference for creating your own local registry with preloaded images.

  • To use the Nuclio templates library (optional), package the templates into an archive; serve the templates archive via a local server whose address is accessible to your system; and set dashboard.templatesArchiveAddress to the address of this local server.
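
The guidelines above can be collected into a single values file. The following is a sketch only: the registry host, the function repository path, the templates address, and the file name are all hypothetical placeholders, and only the controller and dashboard pull policies are shown:

```sh
# Hypothetical addresses; replace with hosts reachable inside your network.
LOCAL_REGISTRY="registry.example.internal:5000"
TEMPLATES_ADDRESS="http://files.example.internal/nuclio-templates-archive"

cat > nuclio-airgap-values.yaml << EOF
# Put Nuclio in "offline" mode.
offline: true
controller:
  image:
    pullPolicy: IfNotPresent
dashboard:
  image:
    pullPolicy: IfNotPresent
  baseImagePullPolicy: Never
  # Optional: only needed when serving the templates library locally.
  templatesArchiveAddress: ${TEMPLATES_ADDRESS}
registry:
  pushPullUrl: ${LOCAL_REGISTRY}/nuclio-functions
  # Local registry that holds the preloaded base, "onbuild", and processor images.
  dependantImageRegistryURL: ${LOCAL_REGISTRY}
EOF

# Apply with (requires helm and cluster access):
#   helm upgrade --install nuclio nuclio/nuclio --values nuclio-airgap-values.yaml
```

Combine this file with the frozen *.image.repository and *.image.tag values for your qualified version, so that every image reference resolves inside your network.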

Using Kaniko as an image builder

When dealing with production deployments, you should avoid bind-mounting the Docker socket to the service pod of the Nuclio dashboard; doing so would allow the dashboard access to the host machine's Docker daemon, which is akin to giving it root access to your machine. This is understandably a concern for real production use cases. Ideally, no pod should access the Docker daemon directly, but because Nuclio is a container-based serverless framework, it needs the ability to build OCI images at run time. While there are several alternatives to bind-mounting the Docker socket, starting with version 1.3.15, Nuclio integrates Kaniko as a production-ready method of building OCI images securely. Kaniko is well maintained, stable, easy to use, and provides an extensive set of features. Nuclio currently supports Kaniko only on Kubernetes.

To deploy Nuclio and direct it to use the Kaniko engine to build images, use the following Helm values parameters; replace the <...> placeholders with your specific values:

helm upgrade --install --reuse-values nuclio \
    --set registry.secretName=<your secret name> \
    --set registry.pushPullUrl=<your registry URL> \
    --set dashboard.containerBuilderKind=kaniko \
    --set controller.image.tag=<version>-amd64 \
    --set dashboard.image.tag=<version>-amd64 \
    nuclio/nuclio

This is rather straightforward; however, note the following:

  • When running in an air-gapped environment, Kaniko's executor image must also be available to your Kubernetes cluster.
  • Kaniko requires a registry to which it can push the resulting function images; it doesn't support accessing images on the host Docker daemon. Therefore, you must set registry.pushPullUrl to the URL of the registry to which Kaniko should push the resulting images, and in air-gapped environments, you must also set registry.defaultBaseRegistryURL and registry.defaultOnbuildRegistryURL to the URL of an accessible local registry that contains the preloaded base, "onbuild", and processor images (see Air-gapped deployment).
  • quay.io doesn't support nested repositories. If you're using Kaniko as a container builder and quay.io as a registry (--set registry.pushPullUrl=quay.io/<repo name>), add the following to your configuration to allow Kaniko caching to push successfully (replace the <repo name> placeholder with the name of your repository):
    --set dashboard.kaniko.cacheRepo=quay.io/<repo name>/cache
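
For the air-gapped case in the notes above, the Kaniko-related values might be grouped as in the following sketch. The registry host, the function repository path, and the file name are hypothetical, and it is assumed that a single local registry holds both the base and the "onbuild" images:

```sh
# Hypothetical local registry; replace with one reachable from your cluster.
LOCAL_REGISTRY="registry.example.internal:5000"

cat > nuclio-kaniko-values.yaml << EOF
dashboard:
  containerBuilderKind: kaniko
registry:
  # Kaniko pushes the built function images here.
  pushPullUrl: ${LOCAL_REGISTRY}/nuclio-functions
  # Preloaded base and "onbuild" images are pulled from here.
  defaultBaseRegistryURL: ${LOCAL_REGISTRY}
  defaultOnbuildRegistryURL: ${LOCAL_REGISTRY}
EOF

# Apply with (requires helm and cluster access):
#   helm upgrade --install --reuse-values nuclio nuclio/nuclio \
#       --values nuclio-kaniko-values.yaml
```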

Using Kaniko with Amazon Elastic Container Registry (ECR)

To work with ECR, you must create one secret with your AWS credentials and another with an ECR token, and provide both secret names to the helm install command. This applies to instances running without attached IAM roles; for instances that run with attached IAM roles, you can skip creating the AWS credentials and ECR token secrets.

Before you begin, make sure that an IAM policy with the following permissions is attached to your user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:CreateRepository",
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:CompleteLayerUpload",
                "ecr:GetDownloadUrlForLayer",
                "ecr:InitiateLayerUpload",
                "ecr:PutImage",
                "ecr:UploadLayerPart"
            ],
            "Resource": "*"
        }
    ]
}

Common environment variables:

export AWS_REGION=<Your AWS region>
export AWS_ACCOUNT=<Your AWS account ID>
export ECR_PASSWORD=$(aws ecr get-login-password --region ${AWS_REGION})

Create the AWS credentials secret in the format of an .aws/credentials file, populated with your access key ID and secret access key, by using the following command (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set in your shell):

cat << EOF | kubectl --namespace nuclio create secret generic aws-credentials --save-config \
--dry-run=client --from-file=credentials=/dev/stdin -o yaml | kubectl apply -f -
[default]
aws_access_key_id = ${AWS_ACCESS_KEY_ID}
aws_secret_access_key = ${AWS_SECRET_ACCESS_KEY}
EOF

Note: This is needed to allow Kaniko to create the image repository before pushing the function image. Otherwise, Kaniko fails to push the image to ECR, because the image name is determined only during the build process.

Create the ECR token secret to be used as the imagePullSecret of the function pods. Because ECR tokens expire after 12 hours, the secret must be refreshed periodically (this can be done with a cron job, as described in this blogpost).

kubectl -n nuclio create secret docker-registry ecr-registry-credentials \
  --docker-server=${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
  --docker-username=AWS \
  --docker-password="${ECR_PASSWORD}"
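
Because the token expires, the same command has to be repeatable. The following is a sketch of a refresh routine that a scheduled job could call; the function name is hypothetical, and it assumes that the aws and kubectl CLIs are installed and authenticated and that AWS_REGION and AWS_ACCOUNT are exported as shown above:

```sh
# Hypothetical helper: regenerate the ECR token and update the secret in place.
refresh_ecr_secret() {
    local password
    password="$(aws ecr get-login-password --region "${AWS_REGION}")" || return 1

    # Rendering the secret with --dry-run=client and piping it to
    # "kubectl apply" updates an existing secret instead of failing on it.
    kubectl -n nuclio create secret docker-registry ecr-registry-credentials \
        --docker-server="${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com" \
        --docker-username=AWS \
        --docker-password="${password}" \
        --dry-run=client -o yaml | kubectl apply -f -
}

# Call this on a schedule shorter than the 12-hour token lifetime:
#   refresh_ecr_secret
```

The create-then-apply pattern mirrors the one used for the AWS credentials secret, so the routine works both for the first run and for every later refresh.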

Finally, install the chart with the following command:

helm repo add nuclio https://nuclio.github.io/nuclio/charts
helm install nuclio \
    --set dashboard.kaniko.registryProviderSecretName=<aws-secret-name> \
    --set registry.secretName=<ecr-secret-name> \
    --set registry.pushPullUrl=<your registry URL> \
    nuclio/nuclio