Group GPU child zones by name. #262
What is a unique source location in your use case? Is this duplication an artifact of optimization/inlining? |
These are GPU locations that use this approach: https://github.com/wolfpld/tracy/blob/master/TracyVulkan.hpp#L401 |
(more to your question: we are running dynamic content that can't be compiled into the application and use the static source locations that are uniqued in the application using the static SourceLocationData) |
This should already be deduplicated. The It would be interesting to find out why and where this breaks. |
Awesome! I'll debug through that! |
Debugging through loading the file, I see TracyWorker.cpp 744 loading 1303 entries into sourceLocationPayload with 81 deduplicated entries in sourceLocationPayloadMap. So it looks like it is deduplicating there correctly on load, but the fact that there are 1303 payloads makes me think it's not deduplicating on capture correctly.

I believe the issue causing the behavior I'm seeing above in the zone view is that though that deduplication happens on load the GpuEvents cpuStart_srcloc is still pointing at the raw sourceLocationPayload index from the file - this means that the view code that's strictly using that index for its cmap will still be acting in the unduplicated domain.

The source locations I'm seeing are negative, probably coming from m_pendingSourceLocationPayload on the capture side. Debugging into that will be trickier (ugh, Android), but maybe the above triggers something in your mind? In particular, I'm wondering if namehash may be causing issues with the AddSourceLocationPayload deduplicating.

One particularly interesting thing I've observed is that this trace I'm debugging through was captured using the capture tool against a process running on an Android device, but if I capture a trace by directly attaching to a process on my Windows machine things work fine! Debugging through the same sourceLocationPayload loading code in a saved trace I see that both sourceLocationPayload and sourceLocationPayloadMap have the same number of entries (31 in my particular run) - indicating that the worker did deduplicate the payloads correctly.

Any thoughts as to why Android + capture tool would not deduplicate while Windows + profiler UI would? (I'm not sure where best to look at next)

Attached is the trace showing the unduplicated sourceLocationPayloads in case it's helpful: (and thanks for the help!) |
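To make the hypothesis in the comment above concrete, here is a minimal sketch of how a payload array with 1303 entries can coexist with a content-keyed map of only 81 entries, and why anything that keys on the raw array index still sees duplicates. `PayloadStore`, `AddRaw` and `Canonical` are hypothetical names for illustration only; they are not Tracy's types and do not claim to reflect how Tracy stores payloads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a payload store; NOT Tracy's actual data structure.
struct PayloadStore
{
    std::vector<std::string> payloads;                   // one entry per stored payload
    std::unordered_map<std::string, int16_t> byContent;  // content -> first index seen

    // Appends unconditionally and returns the raw index (no deduplication).
    int16_t AddRaw( const std::string& content )
    {
        payloads.push_back( content );
        const auto idx = static_cast<int16_t>( payloads.size() - 1 );
        byContent.emplace( content, idx );  // no-op if the content is already known
        return idx;
    }

    // Maps a raw index to the first index that holds identical content.
    int16_t Canonical( int16_t rawIdx ) const
    {
        assert( rawIdx >= 0 && static_cast<std::size_t>( rawIdx ) < payloads.size() );
        return byContent.at( payloads[rawIdx] );
    }
};
```

In this sketch, two events that stored the same content at raw indices 0 and 2 only compare equal after both go through `Canonical`; code that groups by the raw index treats them as different locations.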
There is no deduplication at load time. Everything is done at capture time.
This is the intended behavior.
This is because there are two ways of providing source location data, as you can see in the source location retrieval code:

    const SourceLocation& Worker::GetSourceLocation( int16_t srcloc ) const
    {
        if( srcloc < 0 )
        {
            return *m_data.sourceLocationPayload[-srcloc-1];
        }
        else
        {
            return m_data.sourceLocation.find( m_data.sourceLocationExpand[srcloc] )->second;
        }
    }

Positive Negative You can see that the loading code just populates the
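The index arithmetic in the lookup quoted above can be isolated as a small sketch: a negative `srcloc` selects entry `-srcloc-1` of the payload array, so payload index 0 corresponds to `srcloc == -1`. The helper names below are hypothetical and exist only to illustrate that mapping.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; they only mirror the `-srcloc-1` arithmetic quoted above.

// True when the id refers to the payload array rather than the static table.
inline bool IsPayloadSrcloc( int16_t srcloc )
{
    return srcloc < 0;
}

// Payload index i is encoded as the negative id -(i + 1).
inline int16_t PayloadIndexToSrcloc( int16_t payloadIdx )
{
    assert( payloadIdx >= 0 );
    return static_cast<int16_t>( -payloadIdx - 1 );
}

// Inverse of the above, the same expression the lookup uses.
inline int16_t SrclocToPayloadIndex( int16_t srcloc )
{
    assert( srcloc < 0 );
    return static_cast<int16_t>( -srcloc - 1 );
}
```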
I don't think Android is a factor here, the allocated source location transfer format is rather simple, and not platform specific:

    // Allocated source location data layout:
    //  2b  payload size
    //  4b  color
    //  4b  source line
    //  fsz function name
    //  1b  null terminator
    //  ssz source file name
    //  1b  null terminator
    //  nsz zone name (optional)

Furthermore, both Profiler UI and the |
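As a rough illustration of the layout listed above, the following sketch packs and unpacks a buffer in that field order. It assumes that the 2-byte size counts the whole buffer including the size field itself, that integers are stored in host byte order, and that the input to `Decode` is well formed; none of these details are confirmed by the thread. `AllocatedSrcLoc`, `Encode` and `Decode` are hypothetical names, not Tracy API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical in-memory form of one allocated source location.
struct AllocatedSrcLoc
{
    uint32_t color;
    uint32_t line;
    std::string function;
    std::string file;
    std::string name;  // optional, may be empty
};

// Packs the fields in the order given by the layout comment.
// Assumption: the size field covers the entire buffer, itself included.
std::vector<char> Encode( const AllocatedSrcLoc& s )
{
    const std::size_t total =
        2 + 4 + 4 + s.function.size() + 1 + s.file.size() + 1 + s.name.size();
    assert( total <= UINT16_MAX );
    std::vector<char> buf( total );
    const auto sz = static_cast<uint16_t>( total );
    char* p = buf.data();
    std::memcpy( p, &sz, 2 );      p += 2;
    std::memcpy( p, &s.color, 4 ); p += 4;
    std::memcpy( p, &s.line, 4 );  p += 4;
    std::memcpy( p, s.function.data(), s.function.size() ); p += s.function.size();
    *p++ = '\0';
    std::memcpy( p, s.file.data(), s.file.size() ); p += s.file.size();
    *p++ = '\0';
    std::memcpy( p, s.name.data(), s.name.size() );
    return buf;
}

// Unpacks a buffer produced by Encode. Trusts the input: a missing
// null terminator would read out of bounds, so this is not hardened.
AllocatedSrcLoc Decode( const std::vector<char>& buf )
{
    assert( buf.size() >= 12 );  // size + color + line + two terminators
    uint16_t sz;
    std::memcpy( &sz, buf.data(), 2 );
    assert( sz == buf.size() );
    AllocatedSrcLoc s;
    const char* p = buf.data() + 2;
    const char* end = buf.data() + sz;
    std::memcpy( &s.color, p, 4 ); p += 4;
    std::memcpy( &s.line, p, 4 );  p += 4;
    s.function = p;                // reads up to the null terminator
    p += s.function.size() + 1;
    s.file = p;
    p += s.file.size() + 1;
    s.name.assign( p, end );       // whatever bytes remain, possibly none
    return s;
}
```

Because nothing in this layout depends on the platform beyond byte order, two zones with the same color, line, function, file and name produce byte-identical buffers in this sketch, which is what a content-keyed deduplication step would compare.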
Ok cool - that all makes sense. I apologize for being a bit dense, but if they have duplicate data and everything is working as intended, why are they not deduplicated, and why do they show up as unique entries in the GPU zone listing? (I don't know where to look next :) |
I have no idea. The best course of action would be to build a repro case and then debug the server part to find out what exactly happens in the deduplication code and why it is failing. |
This gives a more intuitive grouping in this UI pane than the uniqued source locations which aren't fully displayed - people were thinking the UI pane was broken as identically named zones weren't grouping when the checkbox was ticked.
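The difference between the two grouping keys can be sketched as follows. This is an illustration of the idea only, not the code in this change: `GpuZone`, `GroupStats`, `GroupBySrcloc` and `GroupByName` are hypothetical names, and the real view code works on Tracy's own event and source location types.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical flattened view of a GPU child zone.
struct GpuZone
{
    int16_t srcloc;    // source location id, unique per allocated payload
    std::string name;  // zone name as displayed in the UI
    int64_t timeNs;    // zone duration
};

struct GroupStats
{
    int64_t totalNs = 0;
    std::size_t count = 0;
};

// Grouping keyed on the source location id: zones with the same name
// but different ids end up in separate rows.
std::map<int16_t, GroupStats> GroupBySrcloc( const std::vector<GpuZone>& zones )
{
    std::map<int16_t, GroupStats> groups;
    for( const auto& z : zones )
    {
        auto& g = groups[z.srcloc];
        g.totalNs += z.timeNs;
        g.count++;
    }
    return groups;
}

// Grouping keyed on the displayed name: identically named zones
// collapse into one row regardless of their source location id.
std::map<std::string, GroupStats> GroupByName( const std::vector<GpuZone>& zones )
{
    std::map<std::string, GroupStats> groups;
    for( const auto& z : zones )
    {
        auto& g = groups[z.name];
        g.totalNs += z.timeNs;
        g.count++;
    }
    return groups;
}
```

With three zones named "draw", "draw" and "blit" that carry three distinct ids, the first function yields three rows and the second yields two, which matches what users expected to see when ticking the grouping checkbox.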
Previous:
Now: