This Helm chart offers an easy way to deploy GPU telemetry on a Kubernetes cluster running NVIDIA's GPU Operator. A detailed design spec and installation guide for this Helm chart can be found here.
After installation, you can import several Grafana dashboards to visualize GPU metrics. The JSON files containing the dashboards are available in the repository's dashboards folder, and the dashboards are available for import here:
Standard Dashboard Grafana ID: 12239
MIG Dashboard Grafana ID: 16640
vGPU Dashboard Grafana ID: 16727
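
If you would rather script the lookup than copy IDs by hand, the following is a minimal sketch that maps the three dashboards above to download URLs. The URL pattern is an assumption about how grafana.com serves dashboard JSON and is not taken from this chart, and the `DASHBOARD_IDS` and `dashboard_url` names are hypothetical; verify the pattern against the Grafana documentation before relying on it.

```python
# Dashboard IDs exactly as listed in this README.
DASHBOARD_IDS = {
    "standard": 12239,
    "mig": 16640,
    "vgpu": 16727,
}

# Assumption: grafana.com exposes dashboard JSON at this path.
# This pattern is not defined by the chart; check it before use.
DOWNLOAD_URL = "https://grafana.com/api/dashboards/{id}/revisions/latest/download"


def dashboard_url(name: str) -> str:
    """Return the assumed download URL for one of the dashboards listed above."""
    return DOWNLOAD_URL.format(id=DASHBOARD_IDS[name.lower()])


if __name__ == "__main__":
    # Print one line per dashboard so the URLs can be copied or piped elsewhere.
    for name in DASHBOARD_IDS:
        print(f"{name}: {dashboard_url(name)}")
```

The same IDs can also be entered directly in Grafana's dashboard import dialog, which avoids depending on the URL pattern at all.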