Google Kubernetes Engine setup and useful commands
- Create the GKE cluster at https://console.cloud.google.com/ under "Kubernetes Engine", or create it through gcloud.
  - Most default cluster configuration values work for this setup. The values you usually want to adapt are:
    - Cluster basics tab:
      - Zone: usually you place the cluster "close" to the storage element.
    - Node pools:
      - Number of nodes, and whether you require autoscaling.
      - Nodes: series, size of your nodes, boot disk size, and special needs such as Local SSDs. You can also choose whether to use preemptible nodes.
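For the gcloud route, the console choices above map onto flags of `gcloud container clusters create`. The following is a minimal sketch, not the setup the authors necessarily use: the cluster name, zone, machine type, disk size and node counts are assumptions borrowed from the node pool example further down this page, so adapt them. The script only prints the command, so that you can review it before running it.

```shell
#!/bin/sh
# Assumed example values (taken from the node pool example on this page).
CLUSTER_NAME="${CLUSTER_NAME:-panda-gke}"
ZONE="${ZONE:-europe-west1-b}"
MACHINE_TYPE="${MACHINE_TYPE:-n2-standard-8}"
DISK_SIZE_GB="${DISK_SIZE_GB:-60}"

# Print the gcloud command that would create the cluster.
# Zone, node size, boot disk size, autoscaling and preemptible nodes
# are the settings usually adapted in the console.
cluster_create_cmd() {
  echo "gcloud container clusters create $CLUSTER_NAME" \
       "--zone $ZONE" \
       "--machine-type $MACHINE_TYPE" \
       "--disk-size $DISK_SIZE_GB" \
       "--num-nodes 1" \
       "--enable-autoscaling --min-nodes 1 --max-nodes 2" \
       "--preemptible"
}

cluster_create_cmd
# After reviewing the printed command, it can be executed with:
#   eval "$(cluster_create_cmd)"
```

All other settings are left at their defaults, in line with the advice above.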
- Create a service account at https://console.cloud.google.com/ under "IAM".
  - Under "Service Accounts", create a service account. A possible role is "Kubernetes Engine" > "Kubernetes Engine Developer".
  - When the account is ready, create and download the key (a small JSON file).
  - You can fine-tune the permissions further through a custom role. As of today, the required permissions are:
container.clusters.get
container.clusters.list
container.daemonSets.create
container.daemonSets.delete
container.daemonSets.get
container.daemonSets.list
container.events.list
container.jobs.create
container.jobs.delete
container.jobs.list
container.namespaces.create
container.nodes.get
container.nodes.list
container.persistentVolumeClaims.create
container.persistentVolumeClaims.list
container.persistentVolumes.create
container.persistentVolumes.list
container.pods.exec
container.pods.get
container.pods.getLogs
container.pods.list
container.secrets.create
container.secrets.list
container.secrets.update
container.serviceAccounts.create
container.storageClasses.create
resourcemanager.projects.get
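A custom role with exactly these permissions can also be created from the command line. The sketch below assumes a project ID `my-project` and a role ID `harvesterGke`; both names are hypothetical placeholders. It joins the permission list with commas and prints a `gcloud iam roles create` command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own project and role ID.
PROJECT_ID="${PROJECT_ID:-my-project}"
ROLE_ID="${ROLE_ID:-harvesterGke}"

# The permission list from this page, one permission per line.
PERMISSIONS="
container.clusters.get
container.clusters.list
container.daemonSets.create
container.daemonSets.delete
container.daemonSets.get
container.daemonSets.list
container.events.list
container.jobs.create
container.jobs.delete
container.jobs.list
container.namespaces.create
container.nodes.get
container.nodes.list
container.persistentVolumeClaims.create
container.persistentVolumeClaims.list
container.persistentVolumes.create
container.persistentVolumes.list
container.pods.exec
container.pods.get
container.pods.getLogs
container.pods.list
container.secrets.create
container.secrets.list
container.secrets.update
container.serviceAccounts.create
container.storageClasses.create
resourcemanager.projects.get
"

# Drop empty lines and join the permissions with commas.
permissions_csv() {
  printf '%s\n' "$PERMISSIONS" | sed '/^$/d' | paste -s -d ',' -
}

# Print the command that would create the custom role.
role_create_cmd() {
  echo "gcloud iam roles create $ROLE_ID --project $PROJECT_ID" \
       "--title $ROLE_ID --permissions $(permissions_csv)"
}

role_create_cmd
```

The role still has to be granted to the service account afterwards, for example in the console under "IAM".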
- Install gcloud on the harvester host
- Log in with the service account
gcloud auth activate-service-account --key-file <LOCATION OF THE SERVICE ACCOUNT KEY>
gcloud init --console-only
- Set where the kubeconfig file will be written. Careful: if this variable points to an existing file, the new configuration is added to that file!
export KUBECONFIG=<WHERE YOU WANT THE KUBECONFIG FILE>
gcloud container clusters get-credentials <YOUR CLUSTER NAME> --region <YOUR REGION>
For a zonal cluster, use --zone <YOUR ZONE> instead of --region <YOUR REGION>.
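Because an existing kubeconfig file is extended rather than replaced, a small guard can help avoid mixing clusters by accident. This is only a sketch; the function name `kubeconfig_target_is_fresh` is made up for illustration and is not part of gcloud or Harvester.

```shell
#!/bin/sh
# Succeed (return 0) only if the given path is non-empty and does not
# exist yet, so that get-credentials writes a fresh file.
kubeconfig_target_is_fresh() {
  target="$1"
  if [ -z "$target" ]; then
    echo "KUBECONFIG is not set" >&2
    return 1
  fi
  if [ -e "$target" ]; then
    echo "$target already exists; new credentials would be added to it" >&2
    return 1
  fi
  return 0
}

# Only suggest the download when the target file is new.
if kubeconfig_target_is_fresh "${KUBECONFIG:-}"; then
  echo "OK to run: gcloud container clusters get-credentials <YOUR CLUSTER NAME> --region <YOUR REGION>"
fi
```

If you do want a merged file, skip the check and run `get-credentials` directly.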
Follow the CVMFS installation section here
The cluster is ready to be configured in the Harvester queue config file
If you are using preemptible nodes, you can follow GCE preemption actions through Operations > Logging > Logs Explorer. Enter the following query to see the preemptions in your cluster:
resource.type="gce_instance"
protoPayload.methodName="compute.instances.preempted"
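The same query can be run from the harvester host with `gcloud logging read`. In the sketch below the two lines of the Logs Explorer query are joined with an explicit `AND`. The one-day window and the limit of 20 entries are arbitrary assumptions, and the service account needs permission to read logs, which is not part of the permission list above.

```shell
#!/bin/sh
# The Logs Explorer query from above as a single filter expression.
FILTER='resource.type="gce_instance" AND protoPayload.methodName="compute.instances.preempted"'

# Print the command that would list recent preemptions.
preemption_log_cmd() {
  echo "gcloud logging read '$FILTER' --freshness 1d --limit 20"
}

preemption_log_cmd
# After reviewing the printed command, it can be executed with:
#   eval "$(preemption_log_cmd)"
```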
Autoscaling will not scale down your cluster unless you set PodDisruptionBudgets (PDBs) on the kube-system pods. Do this at your own risk.
kubectl create poddisruptionbudget konnectivity-agent --namespace=kube-system --selector k8s-app=konnectivity-agent --max-unavailable 1
kubectl create poddisruptionbudget kube-dns --namespace=kube-system --selector k8s-app=kube-dns --max-unavailable 1
kubectl create poddisruptionbudget event-exporter-gke --namespace=kube-system --selector k8s-app=event-exporter-gke --max-unavailable 1
kubectl create poddisruptionbudget metrics-server --namespace=kube-system --selector k8s-app=metrics-server --max-unavailable 1
kubectl create poddisruptionbudget konnectivity-agent-autoscaler --namespace=kube-system --selector k8s-app=konnectivity-agent-autoscaler --max-unavailable 1
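The five commands differ only in the application name, so they can be generated in a loop. This sketch prints the same commands as above, which keeps the list of kube-system applications in one place if it changes between GKE versions.

```shell
#!/bin/sh
# kube-system applications that get a PodDisruptionBudget (same list as above).
PDB_APPS="konnectivity-agent kube-dns event-exporter-gke metrics-server konnectivity-agent-autoscaler"

# Print one "kubectl create poddisruptionbudget" command per application.
pdb_cmds() {
  for app in $PDB_APPS; do
    echo "kubectl create poddisruptionbudget $app --namespace=kube-system --selector k8s-app=$app --max-unavailable 1"
  done
}

pdb_cmds
# After reviewing the printed commands, they can be applied with:
#   pdb_cmds | sh
```

Afterwards, `kubectl get poddisruptionbudgets --namespace=kube-system` lists what was created.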
There are two autoscaling profiles. The "Resource optimization" profile scales the cluster down almost immediately, while the default profile is a bit slower.
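The profile can also be switched with gcloud. The sketch below assumes that the profile described above corresponds to the value `optimize-utilization` of the `--autoscaling-profile` flag and that the default one is `balanced`. The cluster name and zone are the ones from the node pool example on this page. The command is printed for review, not executed.

```shell
#!/bin/sh
# Assumed example values (taken from the node pool example on this page).
CLUSTER_NAME="${CLUSTER_NAME:-panda-gke}"
ZONE="${ZONE:-europe-west1-b}"

# Print the command that would switch the autoscaling profile.
# Rejects anything other than the two profile names assumed here.
autoscaling_profile_cmd() {
  profile="$1"
  case "$profile" in
    balanced|optimize-utilization) ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  echo "gcloud container clusters update $CLUSTER_NAME --zone $ZONE --autoscaling-profile $profile"
}

autoscaling_profile_cmd optimize-utilization
```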
If you want to use local SSDs, there is a beta option to use the SSD directly for ephemeral storage, e.g. emptyDir. These node pools currently have to be set up via gcloud. This is my favorite setup:
gcloud beta container node-pools create workers-ssd-eph --ephemeral-storage local-ssd-count=1 --machine-type n2-standard-8 --disk-size=60 --enable-autoscaling --max-nodes=2 --min-nodes=1 --cluster=panda-gke --zone europe-west1-b --preemptible --tags=frontier