Unreal Pixel Streaming on OKE: a container-based Pixel Streaming runtime on Oracle Container Engine for Kubernetes (OKE).
WebRTC is a web peering technology for real-time media and data streaming. To establish the peer-to-peer connection, WebRTC uses a four-way handshake in which the peers' network configurations and firewalls are traversed. The basic design of a WebRTC system includes the following:
- A signalling service that establishes an initial connection with the streaming application and exchanges Interactive Connectivity Establishment (ICE) candidate configurations.
- Session Traversal Utilities for NAT (STUN) and Traversal Using Relays around NAT (TURN) servers, which provide ICE candidates to the signalling service.
- A network path described by the Session Description Protocol (SDP) is discovered through ICE negotiation, and the peer-to-peer connection is created.
- The STUN/TURN server relays the encoded media stream between the streaming application and the browser.
This setup uses three distinct node pools. The node shapes below are suggested as baseline starting points and can be customized to requirements.
Name | Description | Node Shape | Node Count |
---|---|---|---|
Default | General cluster workloads | VM.Standard.E4.Flex | 3+ |
Turn | Deploy coturn as a DaemonSet with host networking | VM.Standard.E4.Flex | 1+ |
GPU | Pixel Streaming runtime with signal server as sidecar | * | 1+ |
\* The specific GPU shape can vary depending on the application and scaling demands. Evaluate performance and adjust settings accordingly.
The default (or general) node pool is intended for multipurpose installations and cluster-wide resources such as the ingress controller, telemetry services, applications, etc.
For the purposes of this example, the standard Quick Create workflow with a public API endpoint and private worker nodes is adequate. Select alternatives or customize as necessary.
Once created, note that the worker node subnet will have a `10.0.10.0/24` CIDR range.
This node pool is used exclusively for the STUN/TURN services running coturn. While coturn is the most common choice for hosting your own TURN services, alternatives such as Pion TURN may also be viable.
STUN and TURN are network-bound services, so pay specific attention to network bandwidth and the associated compute sizing.
For public access, the nature of STUN/TURN dictates that the node pool is created in a public subnet, with the associated security list rules and a public route table so that it works within OKE. To leverage host networking, the services run as a DaemonSet on this specific node pool. The following setup is used to achieve a single-node deployment:
- Create a public subnet (regional) in the OKE cluster VCN. (This example uses a `10.0.30.0/24` CIDR block.)
- Assign the default DHCP options for the cluster VCN.
- Assign the public route table to the public subnet (the default from OKE is fine).
- Assign/update the existing node security list for the TURN subnet CIDR block:
  Dir | Source/Dest | Protocol | Src Port | Dest Port | Type/Code | Description
  ---|---|---|---|---|---|---
  Ingress | 10.0.30.0/24 | All Protocols | * | * | | Allow pods on turn nodes to communicate with pods on other worker nodes
  Egress | 10.0.30.0/24 | All Protocols | * | * | | Allow pods on turn nodes to communicate with pods on other worker nodes
  Ingress | 0.0.0.0/0 | TCP | * | 3478 | | STUN TCP
  Ingress | 0.0.0.0/0 | UDP | * | 3478 | | TURN UDP
  Ingress | 0.0.0.0/0 | TCP | * | 49152-65535 | | STUN Connection ports
  Ingress | 0.0.0.0/0 | UDP | * | 49152-65535 | | TURN Connection ports

- Update the Kubernetes API endpoint security list to include ingress/egress for the TURN CIDR block:
  Dir | Source/Dest | Protocol | Src Port | Dest Port | Type/Code | Description
  ---|---|---|---|---|---|---
  Ingress | 10.0.30.0/24 | TCP | * | 6443 | | turn worker to k8s API endpoint
  Ingress | 10.0.30.0/24 | TCP | * | 12250 | | turn worker to OKE control plane
  Ingress | 10.0.30.0/24 | ICMP | | | 3, 4 | path discovery turn
  Egress | 10.0.30.0/24 | ICMP | | | 3, 4 | path discovery turn
  Egress | 10.0.30.0/24 | TCP | * | * | | TURN traffic from worker nodes
- Create the node pool, using Advanced Options to specify an additional Kubernetes key-value label: `app.pixel/turn=true`
- Taint each node after it starts to ensure selective node assignment/affinity:

  ```shell
  # assuming node pool was labeled with 'app.pixel/turn=true' in provisioning
  kubectl taint nodes $(kubectl get nodes -l app.pixel/turn=true --no-headers | awk '{print $1}') \
    app.pixel/turn=true:NoSchedule
  ```
  NOTE: this is done automatically as part of the `turn` DaemonSet.
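For orientation, a minimal sketch of how a coturn DaemonSet can target these labeled and tainted nodes with host networking is shown below. The object names and image are illustrative; the DaemonSet manifest shipped in this repo is the authoritative version.

```yaml
# illustrative sketch only - see the turn DaemonSet manifest in this repo for the real spec
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: turn
spec:
  selector:
    matchLabels:
      app: turn
  template:
    metadata:
      labels:
        app: turn
    spec:
      hostNetwork: true               # bind STUN/TURN ports directly on the node
      nodeSelector:
        app.pixel/turn: "true"        # run only on the labeled TURN node pool
      tolerations:
        - key: "app.pixel/turn"       # tolerate the NoSchedule taint applied above
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      containers:
        - name: coturn
          image: coturn/coturn:latest # assumed public coturn image
          ports:
            - containerPort: 3478
              protocol: UDP
            - containerPort: 3478
              protocol: TCP
```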
This is the node pool for the Pixel Streaming GPU workloads. As part of the design, each Pixel Streaming runtime is directly associated with a corresponding Node.js signal server known as "cirrus" (./signalserver here). As such, each Pixel Streaming container runs with cirrus as a sidecar in the same pod.
It is necessary to differentiate the GPU pool from the others with a Kubernetes label. Create the node pool using Advanced Options to specify an additional label: `app.pixel/gpu=true`. Read more on this under GPU Allocation.
The architecture used here for Pixel Streaming does not require any specific network/subnet beyond the general OKE node subnet. A sketch of how a streamer pod targets these GPU nodes follows.
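For illustration, a streamer pod targets this pool with a node selector and a toleration for the default GPU node taint. The sketch below uses placeholder names and images; stream-runtime.yaml is the actual spec and relies on CPU requests (rather than whole-GPU requests) so that multiple streams can share one GPU.

```yaml
# illustrative sketch only - see stream-runtime.yaml for the real deployment spec
apiVersion: v1
kind: Pod
metadata:
  name: pixelstream-example
spec:
  nodeSelector:
    app.pixel/gpu: "true"          # schedule onto the labeled GPU node pool
  tolerations:
    - key: "nvidia.com/gpu"        # tolerate the default OKE GPU node taint
      operator: "Exists"
      effect: "NoSchedule"
  containers:
    - name: streamer               # placeholder: Unreal Pixel Streaming container
      image: iad.ocir.io/mytenancy/pixeldemo/my-pixelstream:latest
      # resources:
      #   limits:
      #     nvidia.com/gpu: 1      # optional: request a whole GPU for a strict one-stream-per-GPU model
    - name: cirrus                 # placeholder: signal server sidecar
      image: iad.ocir.io/mytenancy/pixeldemo/signalserver:latest
```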
As with many Kubernetes systems, there are several choices for ancillary services such as ingress controllers, certificate management, metrics, etc. This solution aims to offer viability using the most basic/standard dependencies:
- Ingress Controller: documentation

  ```shell
  helm upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx --create-namespace
  ```
- Cert Manager: documentation

  - Install CRDs (if not already installed):

    ```shell
    kubectl apply -f \
      https://github.com/jetstack/cert-manager/releases/download/v1.6.0/cert-manager.crds.yaml
    ```

  - Install chart:

    ```shell
    helm upgrade --install cert-manager cert-manager \
      --repo https://charts.jetstack.io \
      --namespace cert-manager --create-namespace \
      --version v1.6.0
    ```

  - Create the associated `ClusterIssuer` or `Issuer` resources depending on needs (an illustrative example follows this list). As an example:

    ```shell
    # adjust as needed
    kubectl apply -f ./support/misc/issuer.yaml
    ```

- Metrics Server: documentation

  - Install chart:

    ```shell
    helm upgrade --install metrics-server metrics-server \
      --repo https://kubernetes-sigs.github.io/metrics-server/ \
      --namespace metrics --create-namespace
    ```
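The issuer referenced above could be as simple as a self-signed ClusterIssuer, sketched below under that assumption; ./support/misc/issuer.yaml in this repo is the authoritative version and may use ACME/Let's Encrypt instead.

```yaml
# illustrative sketch only - ./support/misc/issuer.yaml is the authoritative version
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
```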
Prebuilt images are included with this repo, along with a demo Pixel Streaming image. With a cluster configured per the instructions above, you can deploy the entire runtime with the following:
```shell
kubectl create ns demo
kubectl apply -f demo.yaml
```

See demo.yaml for complete details.
All of the services/constructs are contained within this repo with the exception of the Unreal project source code. See more on this below.
As a convenience, all service images can be built with the following command:

```shell
# from project root
docker compose -f ../docker-compose.yml build
```
Each service image should be built and pushed to the respective OCIR registry. Image tags can be found in the ./k8s/kustomization.yaml file; however, any tag name can be used, so long as its repo/tag is known prior to deployment.
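For orientation, the images section of a kustomization.yaml typically looks like the sketch below; the service names and registry paths here are placeholders, so consult ./k8s/kustomization.yaml for the actual entries.

```yaml
# illustrative sketch only - see ./k8s/kustomization.yaml for the actual image list
images:
  - name: signalserver                # placeholder service image name
    newName: iad.ocir.io/mytenancy/pixeldemo/signalserver
    newTag: latest
  - name: matchmaker                  # placeholder service image name
    newName: iad.ocir.io/mytenancy/pixeldemo/matchmaker
    newTag: latest
```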
For this piece, an example Dockerfile is provided in the unreal directory. In this example, it is expected that the ./project relative path contains the full project source, which would be ./project/PixelStreamingDemo.uproject in this case; update as necessary.
NOTE: Access to the official Unreal Engine docker images (hosted on `ghcr.io/epicgames/unreal-engine`) is restricted, so it is necessary to sign up and register for access. Instructions are here.
Once repo access is obtained, the basic build process is as follows:
- Authenticate to `ghcr`
- Build the unreal project:

  ```shell
  # change to the project directory containing the dockerfile described above
  cd path/to/ue4/project
  # docker build (in current directory '.')
  docker build -t my-pixelstream:latest .
  ```

- Tag and push to OCIR per documentation, as shown in the sketch below.
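For example, tagging and pushing to OCIR generally follows the pattern below; the region key, tenancy namespace, and repository path are placeholders to adjust per the OCIR documentation.

```shell
# placeholders: <region-key>, <tenancy-namespace>, and the repo path must match your OCIR setup
docker login <region-key>.ocir.io -u '<tenancy-namespace>/<oci-username>'   # password: OCI auth token
docker tag my-pixelstream:latest <region-key>.ocir.io/<tenancy-namespace>/pixeldemo/my-pixelstream:latest
docker push <region-key>.ocir.io/<tenancy-namespace>/pixeldemo/my-pixelstream:latest
```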
Although we've used helm to install various objects in the Kubernetes environment, this Pixel Streaming demo deployment is designed using plain kubectl and kustomize commands directly, to demystify the k8s manifests in our runtime and offer greater transparency to readers :).
- As a first step, create a namespace for our application and its respective systems:

  ```shell
  export NAMESPACE=pixel
  kubectl create ns $NAMESPACE
  ```
- Create an OCIR registry secret (refer to documentation):

  ```shell
  kubectl create secret docker-registry ocirsecret \
    --docker-server=<region-key>.ocir.io \
    --docker-username='<tenancy-namespace>/<oci-username>' \
    --docker-password='<oci-auth-token>' \
    --docker-email='<email-address>' \
    --namespace $NAMESPACE
  ```
- Optionally locate an ingress controller IP address for use with a wildcard DNS name from nip.io:

  ```shell
  # get public ip address
  kubectl get svc -A | grep ingress | grep LoadBalancer | awk '{print $5}' | head -n 1
  # or use a hex format
  printf '%02x' $(kubectl get svc -A | grep LoadBalancer | grep ingress-nginx | awk '{print $5}' | head -n 1 | tr '.' ' '); echo
  ```

  Set the IP DNS name in `.env` below.
- Create a `.env` file in this directory, or set environment variables like the following:

  ```shell
  # kubernetes namespace for pixel streaming
  NAMESPACE=pixel
  # container registry/repo path
  REPO=iad.ocir.io/mytenancy/pixeldemo
  # container registry secret (optional)
  REPO_SECRET=
  # tag version (all services use same)
  IMAGE_TAG=latest
  # unreal image container registry
  UNREAL_REPO=iad.ocir.io/mytenancy/pixeldemo
  # name of the unreal container in OCIR
  UNREAL_IMAGE_NAME=my-pixelstream
  # unreal container registry secret (optional)
  UNREAL_REPO_SECRET=
  # version for the streamer image (can differ from the services)
  UNREAL_IMAGE_TAG=latest
  # a hostname to use (nip.io ip example)
  INGRESS_HOST=my-pixelstream.<load balancer ip>.nip.io
  # optionally specify ingress path prefix (example: /game)
  INGRESS_PATH=
  # specify initial TURN service username
  TURN_USER=userx0000
  # also specify a turn password
  TURN_PASS=passx1111
  # specify whether or not to enable the pod proxy
  PROXY_ENABLE=false
  # configure proxy prefix
  PROXY_PATH_PREFIX=/proxy
  # configure basic auth users (unreal/demo) https://doc.traefik.io/traefik/middlewares/http/basicauth/
  PROXY_AUTH_USERS=
  ```
- Use the ./configure.sh wrapper to generate a `kustomization` overlay and (optionally) apply:

  ```shell
  # run to generate ./overlay and output manifests
  ./configure.sh
  # run to generate ./overlay and output manifests with a different env path
  ./configure.sh path/to/.env
  # generate ./overlay AND apply the manifests
  ./configure.sh | kubectl apply -f -
  ```

  NOTE: to delete, just run `./configure.sh | kubectl delete -f -`
- Inspect the objects created in the cluster in the `pixel` namespace:

  ```shell
  kubectl get all -n pixel
  ```
- Check out the traefik proxy dashboard:

  ```shell
  kubectl -n pixel port-forward service/router 8080
  ```
GPU telemetry is collected with Prometheus and the DCGM exporter. Setup and configuration details can be found in the NVIDIA documentation.
A values file for the prometheus stack is provided with settings to include GPU metrics from dcgm-exporter. These values are unmodified from the NVIDIA installation guide:
```shell
helm upgrade --install prometheus-stack kube-prometheus-stack \
  --repo https://prometheus-community.github.io/helm-charts \
  --namespace prometheus --create-namespace \
  --values ./support/prometheus.values.yaml \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
```
Although Prometheus has been installed, it won't collect any GPU metrics until the Data Center GPU Manager (DCGM) exporter is deployed, which exposes the metrics endpoint scraped by Prometheus. Again, a custom values file is defined to ensure the DaemonSet is properly deployed on GPU nodes:
```shell
helm upgrade --install dcgm-exporter dcgm-exporter \
  --repo https://nvidia.github.io/dcgm-exporter/helm-charts \
  --namespace prometheus \
  --values ./support/dcgm.values.yaml
```
The values applied for this helm chart release are specific to this use case:
```yaml
# Establish known gpu node selections
nodeSelector:
  app.pixel/gpu: "true"
# ensure scheduling is allowed based on OKE GPU node taints
tolerations:
  - key: "nvidia.com/gpu"
    effect: "NoSchedule"
    operator: "Exists"
```
Grafana is installed automatically as part of the kube-prometheus-stack chart. The installation is pre-loaded with several useful Kubernetes dashboards. In order to see GPU metrics, we'll add a dashboard related specifically to the dcgm-exporter metrics.
- Get the grafana `admin` password:

  ```shell
  kubectl get secret prometheus-stack-grafana \
    -n prometheus \
    -o jsonpath="{.data.admin-password}" | base64 --decode; echo
  ```

  The `admin` account password defaults to `prom-operator` in the prometheus helm chart.
- In order to access the grafana user interface, you can enable ingress through the `kube-prometheus-stack` grafana settings or define it separately.

  - Based on the prometheus installation, the grafana service will be named `prometheus-stack-grafana`. For now, simply open a local port-forward to the service and load the dashboard:

    ```shell
    kubectl port-forward svc/prometheus-stack-grafana -n prometheus 8000:80
    ```

  - Open localhost:8000 and use the admin credentials found above.
- Once Grafana is opened, import the relevant dashboards:

  - Custom pixel streaming dashboard included as json
  - DCGM exporter dashboard for overall GPU metrics
In order to achieve the desired autoscaling scenario of reactively scaling the GPU streaming application, it is necessary to leverage the Prometheus Adapter with custom metrics on the streamer Horizontal Pod Autoscaler.
Each signal server produces a metric that indicates whether or not (1 or 0) its stream is allocated to a client. By scaling on this metric with a target total value of `1`, the ReplicaSet will be adjusted to hit this goal. It's worth noting that the GPU pool/shape should be chosen such that cluster autoscaling happens infrequently enough so as not to impact the user experience.
Install the prometheus adapter:
```shell
helm upgrade --install prometheus-adapter prometheus-adapter \
  --repo https://prometheus-community.github.io/helm-charts \
  --namespace prometheus \
  --values ./support/prometheus-adapter.values.yaml
```
See the custom values file for the custom metric configurations; an illustrative rule sketch follows.
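As a rough illustration, a rule in those adapter values can take the shape below; the underlying Prometheus series name is hypothetical, and ./support/prometheus-adapter.values.yaml holds the real rules.

```yaml
# illustrative sketch only - ./support/prometheus-adapter.values.yaml holds the real rules
rules:
  custom:
    - seriesQuery: 'streamer_client_connected{namespace!="",pod!=""}'   # hypothetical series name
      resources:
        overrides:
          namespace: { resource: "namespace" }
          pod: { resource: "pod" }
      name:
        matches: "^(.*)$"
        as: "stream_player_connections"                                 # custom metric name queried below
      metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```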
Test the custom metrics:
```shell
# average player connections
kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/pixel/pods/*/stream_player_connections' | jq .
# total free streams
kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/pixel/services/*/pixelstream_available_count' | jq .
# ratio of number of players (channels) to streams
kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/pixel/services/*/player_stream_pool_ratio' | jq .
```
With the Prometheus Adapter deployed, custom metrics are used to establish horizontal pod autoscaling on the streaming app deployment.
Using the `player_stream_pool_ratio` metric, the following target logic applies:

- A value of `1` adjusts so that the number of players equals the number of streams.
- A value `< 1` (such as `900m`) means that streams increase proactively to accommodate future player sessions.
Refer to stream-hpa.yaml for the full specification.
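A minimal sketch of such an HPA is shown below, assuming an autoscaling/v2 Object metric sourced from a service; the deployment and service names are placeholders, and stream-hpa.yaml remains the authoritative spec.

```yaml
# illustrative sketch only - stream-hpa.yaml is the authoritative spec
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream              # placeholder: the streamer deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Object
      object:
        metric:
          name: player_stream_pool_ratio
        describedObject:
          apiVersion: v1
          kind: Service
          name: matchmaker    # placeholder: the service exposing the metric
        target:
          type: Value
          value: "900m"       # <1 keeps spare streams available ahead of player demand
```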
Scaling down applies the `pod-deletion-cost` heuristic for ReplicaSet scale-down order/preference. Note that this feature requires Kubernetes 1.22 or later. Without it, active pods (with connected players) may be terminated indiscriminately.
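For reference, pod-deletion-cost is a pod annotation: when the ReplicaSet scales down, pods with a lower cost are removed first. A hypothetical example of marking a pod with a connected player as more expensive to delete:

```shell
# hypothetical example: raise the deletion cost on a stream pod with a connected player
# so that idle streams are preferred for removal during scale-down
kubectl annotate pod <stream-pod-name> -n pixel \
  controller.kubernetes.io/pod-deletion-cost="100" --overwrite
```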
Once the total number of requested pods exceeds available cluster resources, it is necessary to configure Cluster Autoscaling. Refer to this guide for details.
This architecture is partially based on original sample code from Epic Games to support Unreal Engine Pixel Streaming (signalserver and matchmaker). There are some associated limitations with those services, many of which are described here, as well as some introduced by this design. This section is meant to call attention to some known shortcomings that may require additional work.
- Authentication is not included. Users should consider adding upstream auth, or extending the router configurations.
- Streamer availability is done via broadcast to matchmaker rather than using service discovery from endpoints.
- Player websocket connections are queued through matchmaker and forwarded to matched streams.
- Matchmaker replicas do not share state, therefore stream availability and player session affinity may be unpredictable.
- Each WebRTC session establishes a peer-to-peer mesh, so the number of connections is n² where `n` is the number of participants.
- The static browser code in `src/player` is mostly original, but slightly adapted for this runtime. It is meant purely as a starting point and is not a model for modern web apps.
- The demo applies some defaults to the pixel streaming runtime, including a maximum of 30 frames per second. This is an arbitrary selection for demo performance and may be adjusted in the env ConfigMap. Refer to documentation.
- Containers (and Pods) normally do not share GPUs; that is, there is no overcommitting of GPUs. Each container can request one or more GPUs, but it is not possible to request a fraction of a GPU.
- This demo uses the `app.pixel/gpu` label for affinity and proportionate CPU requests to allow more than one stream on a single GPU, which may not be suitable in production. See stream-runtime.yaml for more information.
- MIG support (multi-instance GPU partitioning) will require testing with A100 shapes.
- Add STUN/TURN metrics and define approach for autoscaling
- Revisit autoscale (nodes and pods) based on GPU availability and app design
- Define a k8s service for the streamer, allowing matchmaker (or similar) to perform endpoint discovery and manage player affinity
- Support distributed matchmaker state (with `redis`, for example) to widen the player-to-stream broker system
Below is a list of helpful references with concepts applied within this architecture:
Link | About |
---|---|
TURN servers for cloud | has some good information about coturn in docker |
GCP WebRTC + GPU | perhaps the holy grail of related examples. It does not relate to pixel streaming, but much of the architecture is derived from this example |
Trickle ICE | Tests STUN/TURN |
Pion TURN | Alternate to coturn |
- | |
UE4 Containers | Unreal Engine official docs on general container usage |
unrealcontainers.com | resource created by Adam Rehn - AKA God of Unreal running in Linux/Containers |
Unreal Engine Images | requires permissions with Epic Games, but this is where the ghcr.io base images from Unreal live |
Unreal Image EULA | Information on how Unreal Engine EULA restricts the distribution of Unreal Engine container images |
- | |
NVIDIA containers | Information from NVIDIA on GPUs in cloud native |
NVIDIA GPU Monitoring | How to collect GPU metrics for prometheus in k8s (Data Center GPU Metrics exporter) |
GPU Monitoring Tools | Helm charts for GPU Telemetry |
MIG Support | Multi-instance GPU partitioning support (NVIDIA A100) |
Oracle GPU | Oracle Cloud Infrastructure NVIDIA GPU Instances |