Fix for CUDA init failure #227

Closed
wants to merge 2 commits into from

@cjgarson commented Sep 19, 2024

The CUDA libraries installed in the container appear to conflict with most host CUDA libraries/NVIDIA drivers: NodeODM output complains that CUDA cannot be initialized even though nvidia-smi is detected. Removing these packages from the image appears to resolve the conflict without causing any other issues. Installing newer versions of the same packages does not appear to be necessary.

See this thread for more info: https://community.opendronemap.org/t/opendronemap-nodeodm-gpu-nvidia-smi-detected-cannot-initialize-cuda

Also changed maintainer style to resolve deprecation warnings.

Submitting PR on request from Saijin.

@pierotofy (Member) commented Sep 19, 2024

Thanks for taking the time to make a PR.

I've no doubt that this fixes the issue on a particular machine, but I'm a bit concerned this might not work for others.

With this change we would be compiling the binaries against CUDA 11.2 and then removing those libraries. This works because --gpus all instructs Docker to mount the CUDA libraries that match the host version into the container. Maybe 11.2 and this machine's particular host version don't differ in any way that matters, so everything works, but what about other host versions?

Before merging this I would like to know why CUDA fails to initialize (is it a library conflict? Is it not picking the correct version?). Does the DensifyPointCloud executable work regardless of the warning? Is it just a problem in https://github.com/OpenDroneMap/ODM/blob/master/opendm/gpu.py#L35 ?

A fix should also probably be implemented in https://github.com/OpenDroneMap/ODM/blob/master/gpu.Dockerfile rather than here (NodeODM builds upon ODM).
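
One way to look into the initialization question raised above is to call the CUDA driver API directly from inside the container and inspect the result. The following is a minimal sketch, not the check performed in ODM's gpu.py: the function names `probe_cuda_init` and `decode_cuda_version` are hypothetical, and the library names `libcuda.so.1` and `libcuda.so` assume a Linux container. It relies only on `cuInit` and `cuDriverGetVersion` from the CUDA driver API, where the version is reported as 1000 * major + 10 * minor.

```python
import ctypes


def decode_cuda_version(raw):
    """Split the integer reported by cuDriverGetVersion into (major, minor)."""
    return raw // 1000, (raw % 1000) // 10


def probe_cuda_init(lib_names=("libcuda.so.1", "libcuda.so")):
    """Load the CUDA driver library and call cuInit(0).

    Returns (ok, detail); ok is True only if cuInit succeeded.
    The default library names are an assumption for Linux containers.
    """
    lib, last_error = None, None
    for name in lib_names:
        try:
            lib = ctypes.CDLL(name)
            break
        except OSError as exc:
            last_error = exc
    if lib is None:
        return False, f"could not load driver library: {last_error}"

    # cuInit takes a flags argument (must be 0) and returns a CUresult code.
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    rc = lib.cuInit(0)
    if rc != 0:
        return False, f"cuInit failed with CUresult {rc}"

    # Report the highest CUDA version the loaded driver supports.
    version = ctypes.c_int(0)
    lib.cuDriverGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.cuDriverGetVersion.restype = ctypes.c_int
    if lib.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return True, "cuInit succeeded; driver version unavailable"
    major, minor = decode_cuda_version(version.value)
    return True, f"cuInit succeeded; driver supports CUDA {major}.{minor}"


if __name__ == "__main__":
    print(probe_cuda_init())
```

Running this in the image with and without the container's CUDA packages, under --gpus all, might help tell a library-loading failure apart from a `cuInit` error code. It says nothing about whether DensifyPointCloud itself works.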

@cjgarson closed this Sep 19, 2024