Profiler
It is possible to profile HL bytecode applications in order to get accurate CPU measurements.
The HashLink profiler is a sampling profiler. It runs as a separate thread that captures the call stack of the other VM threads N times per second. With a large N (10000 for example), this gives you a precision of 0.1 ms for a given measurement. If you profile a running program over a long period of time, for example by capturing game frames, selecting several frames and looking at the average measurements will further improve the precision.
The best part about a sampling profiler is that it does not instrument the running code but only samples it from another thread, so the original code runs at the same speed as it usually does, with no need to compile a slower Debug build. This also means the profiler measures your application's speed the way it actually runs in normal use.
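The relationship between N and the precision quoted above is plain arithmetic: the sampling period is one second divided by N. The sketch below only illustrates that calculation; the class and function names are hypothetical and are not part of HashLink.

```haxe
// Illustrative helper, not part of the HashLink API.
class SamplingPrecision {
	// Time between two consecutive samples, in milliseconds,
	// when the profiler takes samplesPerSecond samples each second.
	public static function periodMs(samplesPerSecond:Int):Float {
		return 1000.0 / samplesPerSecond;
	}

	// Approximate number of samples that land in a piece of code
	// running for durationMs milliseconds.
	public static function expectedSamples(durationMs:Float, samplesPerSecond:Int):Float {
		return durationMs / periodMs(samplesPerSecond);
	}

	static function main() {
		for (n in [1000, 10000]) {
			trace('N=$n: one sample every ${periodMs(n)} ms');
		}
		// A function that takes 2 ms per frame, sampled at N=10000.
		trace('samples per frame: ${expectedSamples(2.0, 10000)}');
	}
}
```

A function that only collects a handful of samples per frame gives a noisy measurement, which is why averaging over several frames helps.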
In order to start a profiling session, you can use the --profile command line parameter:
hl --profile 10000 myapp.hl <app args>
The application will start profiling immediately. Once it terminates, a hlprofile.dump binary file is created that contains all the profiling information for the session.
If you are using the Visual Studio Code HashLink Debugger, you can also set "profileSamples" : 10000 in your launch.json options (requires HL Debugger 0.9.0+).
The resulting hlprofile.dump cannot be used directly; it first needs to be analysed and converted into JSON format. To do this, compile the ProfileGen Haxe project and run it with hl profiler.hl </path/to/hlprofile.dump>. It transforms the binary dump file into a JSON file that can be displayed. By default it overwrites hlprofile.dump, but you can add the -out profile.json parameter to write the result to another file.
Once you have converted hlprofile.dump to JSON, you can open it using the Chrome Profiler:
- open Google Chrome
- open the developer tools
- navigate to the Performance tab
- click on the up arrow (Load Profile) and select your converted hlprofile.dump file
This loads the profile into the Performance tab. From here, you can analyse the results, select a specific frame range, and see which functions run and where the bottlenecks are for optimization purposes.
Compile with -D hl-profile to allow the Heaps.io framework to display per-frame profiling. This uses the Profiler API detailed below.
You can use the hl.Profile.event API to send a profiler event from your application at runtime. These events are inserted into the binary dump and can be processed by ProfileGen later, allowing you to create fully customized reports.
For example:
hl.Profile.event(44); // insert event 44
hl.Profile.event(55,"My label"); // insert event with string data
hl.Profile.eventBytes(66, myBytesData); // insert event with bytes data
While the profiler supports profiling several threads, the hl.Profile.event API is not thread safe, so use it with care.
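One possible way to act on this warning is to funnel every event through a single lock, so that two of your own threads never call hl.Profile.event at the same time. The sketch below is an assumption-laden illustration: SafeProfile is a hypothetical wrapper, it relies on the standard sys.thread.Mutex class, and it assumes the optional String argument shown in the example above. It only serializes calls made through the wrapper; whether that is sufficient with respect to the profiler's own sampling thread is not stated in this page.

```haxe
import sys.thread.Mutex;

// Hypothetical wrapper, not part of the HashLink API.
class SafeProfile {
	// Single lock shared by every thread that emits profiler events.
	static var lock = new Mutex();

	// Same shape as the hl.Profile.event calls shown above
	// (assumed: an event code and an optional string payload).
	public static function event(code:Int, ?data:String):Void {
		lock.acquire();
		hl.Profile.event(code, data);
		lock.release();
	}
}
```

Application threads would then call SafeProfile.event(55, "My label") instead of calling hl.Profile.event directly.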
Also, the following special event codes are already implemented:
- -1 : pause the profiler for the current thread
- -2 : resume the profiler for the current thread
- -3 : clear all accumulated profile data
- -4 : pause the profiler for all threads
- -5 : resume the profiler for all threads
- -6 : force generation of hlprofile.dump, don't wait for program exit
- -7 : allows to start the profiler setup, you can pass optional samplesCount (default: 1000)
- 0 : insert end-of-frame, for games and other realtime applications
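As an illustration of how the special codes listed above might be used, the sketch below gives them names and uses them to skip a loading phase and mark frame boundaries. The constant names, loadAssets and update are all hypothetical; only the numeric codes and hl.Profile.event come from this page. The program is meant to be run with hl --profile as described earlier.

```haxe
// Hypothetical names for the special event codes listed above.
class ProfileEvents {
	public static inline var END_OF_FRAME = 0;
	public static inline var PAUSE_THREAD = -1;
	public static inline var RESUME_THREAD = -2;
	public static inline var CLEAR = -3;
	public static inline var PAUSE_ALL = -4;
	public static inline var RESUME_ALL = -5;
	public static inline var DUMP = -6;
	public static inline var SETUP = -7;
}

class Main {
	// Placeholder for application start-up work we do not want to profile.
	static function loadAssets() {}

	// Placeholder for one frame of application work.
	static function update() {}

	static function main() {
		// Keep the loading phase out of the profile.
		hl.Profile.event(ProfileEvents.PAUSE_ALL);
		loadAssets();
		hl.Profile.event(ProfileEvents.RESUME_ALL);

		for (i in 0...100) {
			update();
			// Mark the frame boundary so frames can be told apart later.
			hl.Profile.event(ProfileEvents.END_OF_FRAME);
		}
	}
}
```

Naming the codes keeps call sites readable and avoids scattering negative literals through the application.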
The profiler was introduced in HL 1.11 and is only implemented for Windows and, since HL 1.12, Linux.
See src/profile.c to contribute support for other platforms.