When I try to run the GPU container example, nvidia-smi does not work.
dmesg shows the following output:
[ 254.247462] nvidia: loading out-of-tree module taints kernel.
[ 254.247469] nvidia: module license 'NVIDIA' taints kernel.
[ 254.247470] Disabling lock debugging due to kernel taint
[ 254.260134] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 254.260671] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 254.302121] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.54 Tue Feb 8 04:42:21 UTC 2022
[ 254.312290] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 510.54 Tue Feb 8 04:34:06 UTC 2022
[ 254.313570] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[ 254.314776] nvidia-uvm: Loaded the UVM driver, major device number 511.
[ 255.529701] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1463)
[ 255.529730] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 255.529763] BUG: unable to handle page fault for address: 0000000000002a04
[ 255.529765] #PF: supervisor read access in kernel mode
[ 255.529781] #PF: error_code(0x0000) - not-present page
[ 255.529781] PGD 0 P4D 0
[ 255.529784] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 255.529787] CPU: 9 PID: 123060 Comm: nv_queue Tainted: P OE 5.10.43-yocto-standard #1
[ 255.529787] Hardware name: OnLogic K700/RXM-181, BIOS Z01-0001A037 10/13/2021
[ 255.529947] RIP: 0010:_nv009917rm+0x38/0xc0 [nvidia]
[ 255.529948] Code: 9b f0 01 48 8b bb 68 01 00 00 e8 33 59 4d 00 85 c0 74 0f 48 83 c4 08 5b 41 5c c3 0f 1f 80 00 00 00 00 44 89 e7 e8 f8 08 be ff <8b> 90 04 2a 00 00 83 fa 01 74 2f 80 b8 0c 05 00 00 00 74 12 80 b8
[ 255.529949] RSP: 0018:ffffa2dcc064fdd0 EFLAGS: 00010246
[ 255.529950] RAX: 0000000000000000 RBX: ffff933965021c08 RCX: 0000000000000000
[ 255.529951] RDX: ffffa2dcc064fdfc RSI: 0000000000000000 RDI: 0000000000000000
[ 255.529952] RBP: ffff933ab319e000 R08: 0000000000003000 R09: ffffa2dcc064fe00
[ 255.529952] R10: ffff933ab3242900 R11: 0000000000000001 R12: 0000000000000000
[ 255.529953] R13: ffffa2dcc064fec0 R14: ffff93394e806808 R15: ffff933ab3242900
[ 255.529954] FS: 0000000000000000(0000) GS:ffff93408bc40000(0000) knlGS:0000000000000000
[ 255.529955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 255.529955] CR2: 0000000000002a04 CR3: 000000052200c006 CR4: 00000000003706e0
[ 255.529956] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 255.529957] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 255.529957] Call Trace:
[ 255.530020] ? rm_execute_work_item+0x108/0x120 [nvidia]
[ 255.530071] ? os_execute_work_item+0x4c/0x70 [nvidia]
[ 255.530122] ? _main_loop+0x8c/0x140 [nvidia]
[ 255.530173] ? nvidia_modeset_resume+0x30/0x30 [nvidia]
[ 255.530176] ? kthread+0x129/0x170
[ 255.530177] ? kthread_park+0x90/0x90
[ 255.530178] ? ret_from_fork+0x1f/0x30
[ 255.530179] Modules linked in: nvidia_uvm(POE) nvidia_modeset(POE) nvidia(POE) ip6t_REJECT(E) nf_reject_ipv6(E) ip6table_filter(E) xt_state(E) ipt_REJECT(E) nf_reject_ipv4(E) ip6_tables(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) nfnetlink(E) xfrm_user(E) xt_owner(E) snd_soc_skl(E) snd_soc_hdac_hda(E) intel_rapl_msr(E) snd_hda_ext_core(E) intel_rapl_common(E) snd_soc_sst_ipc(E) snd_soc_sst_dsp(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) snd_soc_acpi_intel_match(E) snd_soc_acpi(E) coretemp(E) snd_soc_core(E) snd_compress(E) kvm_intel(E) ac97_bus(E) snd_hda_codec_hdmi(E) snd_pcm_dmaengine(E) kvm(E) snd_hda_intel(E) snd_intel_dspcfg(E) irqbypass(E) snd_hda_codec(E) crct10dif_pclmul(E) crc32_pclmul(E) 8250_dw(E) mei_wdt(E) intel_wmi_thunderbolt(E) wmi_bmof(E) ghash_clmulni_intel(E) snd_hda_core(E) mxm_wmi(E) snd_hwdep(E) ttm(E) aesni_intel(E) snd_pcm(E) crypto_simd(E) nvidiafb(E) igb(E) intel_lpss_pci(E) iTCO_wdt(E) mei_me(E) cryptd(E) vgastate(E) intel_pmc_bxt(E) intel_lpss(E)
[ 255.530203] glue_helper(E) efi_pstore(E) pcspkr(E) ee1004(E) iTCO_vendor_support(E) snd_timer(E) fb_ddc(E) cdc_acm(E) dca(E) mei(E) intel_pch_thermal(E) idma64(E) wmi(E) pinctrl_cannonlake(E) evbug(E) video(E) mac_hid(E) acpi_pad(E) acpi_tad(E) sch_fq_codel(E) [last unloaded: nouveau]
[ 255.530213] CR2: 0000000000002a04
[ 255.530214] ---[ end trace 0a22e754d9968912 ]---
[ 255.905919] RIP: 0010:_nv009917rm+0x38/0xc0 [nvidia]
[ 255.905921] Code: 9b f0 01 48 8b bb 68 01 00 00 e8 33 59 4d 00 85 c0 74 0f 48 83 c4 08 5b 41 5c c3 0f 1f 80 00 00 00 00 44 89 e7 e8 f8 08 be ff <8b> 90 04 2a 00 00 83 fa 01 74 2f 80 b8 0c 05 00 00 00 74 12 80 b8
[ 255.905922] RSP: 0018:ffffa2dcc064fdd0 EFLAGS: 00010246
[ 255.905924] RAX: 0000000000000000 RBX: ffff933965021c08 RCX: 0000000000000000
[ 255.905925] RDX: ffffa2dcc064fdfc RSI: 0000000000000000 RDI: 0000000000000000
[ 255.905926] RBP: ffff933ab319e000 R08: 0000000000003000 R09: ffffa2dcc064fe00
[ 255.905926] R10: ffff933ab3242900 R11: 0000000000000001 R12: 0000000000000000
[ 255.905927] R13: ffffa2dcc064fec0 R14: ffff93394e806808 R15: ffff933ab3242900
[ 255.905928] FS: 0000000000000000(0000) GS:ffff93408bc40000(0000) knlGS:0000000000000000
[ 255.905928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 255.905929] CR2: 0000000000002a04 CR3: 000000010dcb8005 CR4: 00000000003706e0
[ 255.905930] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 255.905930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 916.128870] kauditd_printk_skb: 12 callbacks suppressed
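For anyone reading along, the relevant lines in a dump like the one above can be pulled out with a small filter. This is only a rough sketch: the function name `key_lines` and the pattern list are my own assumptions, picked to match the messages in this particular log, and are not part of any NVIDIA or balena tooling.

```python
import re

# Patterns chosen by hand for this log (an assumption, not an exhaustive list).
PATTERNS = [
    re.compile(r"NVRM: .*failed"),   # driver init failures such as RmInitAdapter
    re.compile(r"BUG: "),            # kernel BUG reports
    re.compile(r"Oops: "),           # kernel oops header
]

# Matches the leading "[ 255.529701] " timestamp that dmesg prints.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def key_lines(dmesg_text):
    """Return the dmesg lines that match any failure pattern, in original order."""
    hits = []
    for line in dmesg_text.splitlines():
        # Drop the timestamp so the patterns only see the message itself.
        message = TIMESTAMP.sub("", line)
        if any(p.search(message) for p in PATTERNS):
            hits.append(line)
    return hits
```

Run against the output above, this should leave the two `NVRM` failure lines plus the `BUG:` and `Oops:` lines, which is where the driver initialization fails and the kernel page fault follows.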
My Dockerfile and entry.sh are exactly the same as in the example, except that I specified driver version 510.54.
I am not sure whether this is a problem with the NVIDIA driver or with balena. I hope you can help me fix this.