
Hello — TensorRT C++ code: after training on a dataset with 2 keypoints and exporting to ONNX, C++ inference fails with "cuda failure 700". What could be the cause? #3

Open
newforrestgump001 opened this issue Feb 23, 2023 · 14 comments

Comments

@newforrestgump001

Hello, the TensorRT part of the code runs fine when the number of keypoints is 4, but after training on a dataset with 2 keypoints and exporting to ONNX, C++ inference fails with "cuda failure 700". What could be the cause? Thank you very much!

@we0091234
Owner

> Hello, the TensorRT part of the code runs fine when the number of keypoints is 4, but after training on a dataset with 2 keypoints and exporting to ONNX, C++ inference fails with "cuda failure 700". What could be the cause? Thank you very much!

Could you send me the ONNX file so I can try it? Join the QQ group and send it to me.

@newforrestgump001
Author

What is the group number? Thank you very much!

@we0091234
Owner

> What is the group number? Thank you very much!

QQ group: 871797331

@newforrestgump001
Author

Link: https://pan.baidu.com/s/1pfnonWjCKXmCsMfbUl88Ww  Password: g8aj

@newforrestgump001
Author

Is it OK to share it directly like this? I'm still on Linux; I'll join the group shortly. Thanks a lot!

@we0091234
Owner

> Is it OK to share it directly like this? I'm still on Linux; I'll join the group shortly. Thanks a lot!

Have you tried ONNX inference? Are the results correct?

@newforrestgump001
Author

newforrestgump001 commented Feb 23, 2023

Inference fails; the error occurs at:

```
[02/23/2023-19:18:43] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.6.3 but loaded cuBLAS/cuBLAS LT 11.1.0
[02/23/2023-19:18:43] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.6.3 but loaded cuBLAS/cuBLAS LT 11.1.0
[convolutionRunner.cpp::executeConv::511] Error Code 1: Cudnn (CUDNN_STATUS_EXECUTION_FAILED)
```

```cpp
void doInference_cu(IExecutionContext &context, cudaStream_t &stream, void **buffers, float *output, int batchSize, int OUTPUT_SIZE)
{
    // Infer on the batch asynchronously, and DMA the output back to the host
    context.enqueue(batchSize, buffers, stream, nullptr);
    CHECK(cudaMemcpyAsync(output, buffers[1], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));
}
```

@we0091234
Owner

> Inference fails; the error occurs at: `[convolutionRunner.cpp::executeConv::511] Error Code 1: Cudnn (CUDNN_STATUS_EXECUTION_FAILED)` (full log and `doInference_cu` code quoted above)

I meant ONNX Runtime inference.

@newforrestgump001
Author

Not tested yet. Same training code and same inference code, both with 2 classes; one model has 4 keypoints, the other has 2. The former works, the latter fails. I'm testing onnxruntime now.

@newforrestgump001
Author

The 4-keypoint model's parameters are:

```
[info][simple_yolo.cu:2281]:Input shape is 1 x 3 x 1024 x 1024
[info][simple_yolo.cu:2283]:Set max workspace size = 1024.00 MB
[info][simple_yolo.cu:2286]:Network has 1 inputs:
[info][simple_yolo.cu:2292]: 0.[images] shape is 1 x 3 x 1024 x 1024
[info][simple_yolo.cu:2298]:Network has 1 outputs:
[info][simple_yolo.cu:2303]: 0.[output] shape is 1 x 64512 x 19
[info][simple_yolo.cu:2307]:Network has 1326 layers
```

@newforrestgump001
Author

The 2-keypoint model's parameters are:

```
[info][simple_yolo.cu:2281]:Input shape is 1 x 3 x 1024 x 1024
[info][simple_yolo.cu:2283]:Set max workspace size = 1024.00 MB
[info][simple_yolo.cu:2286]:Network has 1 inputs:
[info][simple_yolo.cu:2292]: 0.[images] shape is 1 x 3 x 1024 x 1024
[info][simple_yolo.cu:2298]:Network has 1 outputs:
[info][simple_yolo.cu:2303]: 0.[output] shape is 1 x 64512 x 13
[info][simple_yolo.cu:2307]:Network has 1326 layers
```

@newforrestgump001
Author

The only difference is the third dimension of the output: 19 vs. 13, yet `doInference_cu` behaves differently between the two models.

@we0091234
Owner

> The only difference is the third dimension of the output: 19 vs. 13, yet `doInference_cu` behaves differently between the two models.

Add me later and we'll talk. Send me a test image so I can run it and take a look.

@newforrestgump001
Author

OK, thank you very much!
