Profiler result is not consistent with each run #794

Open

RavikumarLav opened this issue Dec 17, 2024 · 0 comments

RavikumarLav commented Dec 17, 2024

Hello,

I am using the code below to capture the runtime of model inference:

// Import the TensorFlow Lite model.
armnnTfLiteParser::ITfLiteParserPtr parser = armnnTfLiteParser::ITfLiteParser::Create();

armnn::INetworkPtr network = parser->CreateNetworkFromBinaryFile("model_latest.tflite");

// Find the binding points for the input and output nodes  
armnnTfLiteParser::BindingPointInfo inputBindingInfo = parser->GetNetworkInputBindingInfo(0, "conv2d_input");
armnnTfLiteParser::BindingPointInfo outputBindingInfo = parser->GetNetworkOutputBindingInfo(0, "Identity");

// Create ArmNN runtime
armnn::IRuntime::CreationOptions options; // default options
armnn::IRuntimePtr runtime = armnn::IRuntime::Create(options);

armnn::Compute device = armnn::Compute::CpuAcc;
//armnn::Compute device = armnn::Compute::CpuRef;
armnn::IOptimizedNetworkPtr optNet = armnn::Optimize(*network, {device}, runtime->GetDeviceSpec());
// Load the optimized network onto the runtime device
armnn::NetworkId networkIdentifier;
runtime->LoadNetwork(networkIdentifier, std::move(optNet));

// Get the profiler registered for this network and enable profiling.
std::shared_ptr<armnn::IProfiler> profiler = runtime->GetProfiler(networkIdentifier);
profiler->EnableProfiling(true);

// Run Inference
armnn::InputTensors inputTensor = MakeInputTensors(inputBindingInfo, &input[0]);
armnn::OutputTensors outputTensor = MakeOutputTensors(outputBindingInfo, &output[0]);
armnn::Status ret = runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);

// Print output
profiler->Print(std::cout);

With this I am able to see the profiler result for each layer in JSON format.
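As a side note, to compare the per-layer output of different runs more easily, I believe the same Print call can also be pointed at a file stream (the file name below is just an example):

#include <fstream>

std::ofstream profileFile("profile_run.json");  // example file name, one file per run
profiler->Print(profileFile);                   // same JSON report, written to a file for diffing runs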

Problem: When running the .tflite model on an Arm Cortex-A78 core with CpuAcc as the backend, the reported runtime is different for each run of the same model.

For one of the models it varies from 0.8 ms to 1.2 ms.

I need to know how the runtime is measured: using the system clock or Arm hardware counter registers?
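As an extra data point, this is a rough sketch of how I cross-check the numbers outside the ArmNN profiler, reusing the runtime, networkIdentifier, inputTensor and outputTensor created above; the warm-up and iteration counts are only illustrative:

#include <chrono>
#include <iostream>

constexpr int kWarmupRuns = 5;   // illustrative: discard the first, typically slower, iterations
constexpr int kTimedRuns  = 50;  // illustrative: average over repeated runs

for (int i = 0; i < kWarmupRuns; ++i)
{
    runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);
}

double totalMs = 0.0;
for (int i = 0; i < kTimedRuns; ++i)
{
    auto start = std::chrono::steady_clock::now();
    runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);
    auto stop = std::chrono::steady_clock::now();
    totalMs += std::chrono::duration<double, std::milli>(stop - start).count();
}

std::cout << "Average inference time: " << totalMs / kTimedRuns << " ms" << std::endl;

Even with this external measurement the individual iterations vary, so I would still like to know what clock source the profiler itself uses.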
