Improve OSM PBF read performance. #97
Comments
Thanks for testing this out. Performance hasn't been a major focus for OsmSharp for a while, but with all the new features now available in .NET Core there is a lot that can be improved. I have limited time to spend on OsmSharp and for now I'm happy performance-wise, but I have a list in my head of things we can do:
@xivk Do you have ideas on how to implement those? Especially regarding the streaming model. I plan on using OsmSharp in a big project.
I was thinking that, instead of using Current() to return an OsmGeo object, we could expose the data (ID, changeset, user, nodes, members) through properties on the OsmStreamSource class, OR (and this is risky) we mutate the object we return in Current(), forcing consumers to clone it if they want to keep it around.
Another option is to use an internal property that exposes an internal mutable object. It would also be a good idea to refactor the nodes and members arrays into something we can reuse. Right now this is an array in both cases and we have to rebuild the array whenever the number of nodes or members changes. It's also probably a good idea to define lightweight objects without metadata, a node being just a lat/lon when that's all that is needed. The tags collections could also use an update and a better approach with regard to performance. Just to say, there is a lot to be done and considered here. I can help you get started, but I believe we should start with some experimentation before making design decisions, just to confirm we are fixing the right things.
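To make the "reuse instead of rebuild" idea a bit more concrete, here is a minimal sketch of a way object whose node buffer is recycled between reads. `ReusableWay` and all of its members are hypothetical names for illustration only; nothing like this exists in OsmSharp today, and a real design would need the experimentation mentioned above.

```csharp
using System;

// Hypothetical type, not part of OsmSharp: a way whose node buffer is
// reused by the reader for every element instead of being reallocated.
public sealed class ReusableWay
{
    private long[] _nodes = new long[16];

    public long Id { get; private set; }
    public int NodeCount { get; private set; }

    // A view over the valid part of the buffer; only valid until the next Reset.
    public ReadOnlySpan<long> Nodes => new ReadOnlySpan<long>(_nodes, 0, NodeCount);

    // Called by the reader before filling in the next way.
    public void Reset(long id)
    {
        Id = id;
        NodeCount = 0;
    }

    public void AddNode(long nodeId)
    {
        // Grow only when the current way has more nodes than any way seen so far.
        if (NodeCount == _nodes.Length)
        {
            Array.Resize(ref _nodes, _nodes.Length * 2);
        }
        _nodes[NodeCount++] = nodeId;
    }

    // Consumers that want to keep the data around must copy it explicitly.
    public long[] CopyNodes() => Nodes.ToArray();
}
```

The trade-off is the risky part mentioned above: anything read from `Nodes` is overwritten as soon as the reader moves on, so a consumer that forgets to call `CopyNodes()` ends up with corrupted data.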
How does the mutable object play together with multi-core support?
Multi-core support? I hadn't considered that yet, but it could be an option to read PBF data into a cache and use that to enumerate the OSM objects.
Some more info on this: I looked into it, and the main effort is in reading the PBF blocks (obviously). From what I can see about how protobuf-net works, our only option to improve this is to write our own PBF reader/writer specifically for OSM data, customized and optimized.
I guess this requires a lot of effort. I am not that familiar with protobuf-net. So the "easiest" part is to reduce allocations and other operations slowing down the process?
The OSM PBF format is a bit strange. It seems to use two steps: the raw data is encoded in a protobuf message, and that payload is itself again a protobuf message. I think the biggest cost is in decoding the Blob message: it allocates a new byte array for every message and then optionally decompresses it into yet another buffer. I think the byte array is just there in the stream. It would need some figuring out to see what protobuf-net does with a message like that. More info here: https://wiki.openstreetmap.org/wiki/PBF_Format#Low_level_encoding The relevant code is here: https://github.com/OsmSharp/core/blob/develop/src/OsmSharp/IO/PBF/PBFReader.cs#L100 The second thing is the PrimitiveBlock message. I think we can probably decode that data without first dumping it into a list/array/stringtable. We could also try to get in touch with @mgravell, the author of the protobuf-net library, but I guess he has better things to do than help us read PBF OSM as fast as possible! ;-)
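As a rough illustration of decoding without first dumping everything into a list: the PBF wiki page describes the DenseNodes IDs as a packed field of delta-coded sint64 values (zigzag varints). Below is a minimal sketch that walks such a field directly over a span, assuming the packed bytes have already been located in the decompressed block. `DeltaVarintReader` is a hypothetical helper, not something from OsmSharp or protobuf-net.

```csharp
using System;

// Hypothetical helper: reads packed, delta-coded sint64 values straight from a
// buffer, one at a time, without allocating an intermediate list or array.
public ref struct DeltaVarintReader
{
    private ReadOnlySpan<byte> _remaining;
    private long _current;

    public DeltaVarintReader(ReadOnlySpan<byte> packed)
    {
        _remaining = packed;
        _current = 0;
    }

    public bool TryReadNext(out long value)
    {
        if (_remaining.IsEmpty)
        {
            value = 0;
            return false;
        }

        // Decode one base-128 varint (least significant group first).
        ulong raw = 0;
        int shift = 0;
        int i = 0;
        while (true)
        {
            if (i >= _remaining.Length || shift > 63)
            {
                throw new FormatException("Truncated or overlong varint.");
            }
            byte b = _remaining[i++];
            raw |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
            {
                break;
            }
            shift += 7;
        }
        _remaining = _remaining.Slice(i);

        // Undo zigzag encoding, then add the delta to the running value.
        long delta = (long)(raw >> 1) ^ -(long)(raw & 1);
        _current += delta;
        value = _current;
        return true;
    }
}
```

Usage would be a simple `while (reader.TryReadNext(out long id)) { ... }` loop over the bytes of the field. Whether this is actually faster than what protobuf-net does would have to be measured.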
Hi! Thought this might be useful for diving into the internals: https://github.com/mapbox/osmpbf-tutorial
Well the basic idea for a low-allocation reader would be:
The actual tricky part is getting a
Another weird part is that we usually don't see many examples of compressing or decompressing from one buffer to another without allocating extra garbage each time we do so. Assuming you're using Ionic.Zlib, it's basically the following (warning: I haven't compiled or tested this exact version of this snippet, so it might be slightly broken):

```csharp
using System;
using System.Buffers;
using Ionic.Zlib;

// Extension methods have to live in a static class.
static class ZlibCodecExtensions
{
    // Inflates `compressed` into `decompressed`, which must be sized to the raw data.
    public static void InflateAssumingExactSize(this ZlibCodec inflater, ArraySegment<byte> compressed, ArraySegment<byte> decompressed)
    {
        inflater.InitializeInflate();
        inflater.InputBuffer = compressed.Array;
        inflater.NextIn = compressed.Offset;
        inflater.AvailableBytesIn = compressed.Count;
        inflater.OutputBuffer = decompressed.Array;
        inflater.NextOut = decompressed.Offset;
        inflater.AvailableBytesOut = decompressed.Count;
        inflater.Inflate(FlushType.Finish);
    }
}
```

```csharp
// you only need one instance of this:
ZlibCodec inflater = new ZlibCodec(CompressionMode.Decompress);

// for each data block, usage is something like:
ArraySegment<byte> compressed = /* you should have this already. */;
int rawDataSize = /* you should have this already. */;
byte[] rentedDecompressedBuffer = ArrayPool<byte>.Shared.Rent(rawDataSize);
try
{
    // Rent may return a larger array, so only expose the part we need.
    ArraySegment<byte> decompressed = new ArraySegment<byte>(rentedDecompressedBuffer, 0, rawDataSize);
    inflater.InflateAssumingExactSize(compressed, decompressed);
    DoStuffWith(decompressed);
}
finally
{
    ArrayPool<byte>.Shared.Return(rentedDecompressedBuffer);
}
```
There's also something I came across when getting complete relations (and thus reading the same PBF multiple times): it's hard to know where the Ways and Relations start. When getting relations in my first pass (not specific to OsmSharp), I have to read and decompress every blob to know what the PrimitiveBlock is holding.
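One possible way around that, sketched here only as an idea: remember during the first pass at which file offset each blob starts and what kinds of elements it contained, so later passes can seek straight to the blobs they need. All types below (`BlobContent`, `BlobIndexEntry`, `BlobIndex`) are hypothetical and not part of OsmSharp; how the offsets and contents are discovered is left to the reader doing the first pass.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: which element kinds were seen inside one blob.
[Flags]
public enum BlobContent
{
    None = 0,
    Nodes = 1,
    Ways = 2,
    Relations = 4
}

// Hypothetical: one record per blob, filled in during the first pass.
public readonly struct BlobIndexEntry
{
    public BlobIndexEntry(long offset, BlobContent content)
    {
        Offset = offset;
        Content = content;
    }

    public long Offset { get; }
    public BlobContent Content { get; }
}

public sealed class BlobIndex
{
    private readonly List<BlobIndexEntry> _entries = new List<BlobIndexEntry>();

    // Record where a blob starts in the file and what it turned out to hold.
    public void Add(long offset, BlobContent content)
    {
        _entries.Add(new BlobIndexEntry(offset, content));
    }

    // Offsets of all blobs holding at least one of the wanted element kinds.
    public IEnumerable<long> OffsetsContaining(BlobContent wanted)
    {
        foreach (var entry in _entries)
        {
            if ((entry.Content & wanted) != 0)
            {
                yield return entry.Offset;
            }
        }
    }
}
```

A second pass for relations would then only seek to the offsets returned by `OffsetsContaining(BlobContent.Relations)` and skip decompressing everything else. The first pass still has to pay the full cost once.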
I will add https://github.com/dotnet/BenchmarkDotNet here just in case. One idea could be to first add this benchmarking library, which the .NET repos also use, and then benchmark the code that is to be refactored. That way there are quantifiable results and benchmark coverage can grow gradually.
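A first benchmark could simply wrap the element-counting loop from the README. This is a minimal sketch, assuming a project that references both the BenchmarkDotNet and OsmSharp packages; the class name and the file path are placeholders.

```csharp
using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using OsmSharp.Streams;

[MemoryDiagnoser] // also report allocations, which is what we want to drive down
public class PbfReadBenchmarks
{
    // Placeholder: point this at a small local extract. An absolute path is
    // safest, since the benchmark may not run from your project directory.
    private const string PbfPath = "/path/to/sample.osm.pbf";

    [Benchmark]
    public long CountElements()
    {
        long count = 0;
        using (var fileStream = File.OpenRead(PbfPath))
        {
            var source = new PBFOsmStreamSource(fileStream);
            foreach (var element in source)
            {
                count++;
            }
        }
        return count;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<PbfReadBenchmarks>();
}
```

Run it in Release mode (`dotnet run -c Release`). A small extract is probably enough here, because BenchmarkDotNet repeats the method many times.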
Opening osm.pbf files and counting the elements in a decently large file is very slow compared to OSM libraries/tools in other languages.
For example, running the example snippet in the README (without the console prints) on a ~9GB osm.pbf file takes over 10 minutes. Running the osmium tags-filter tool with a complex filter command (which reads the file multiple times) over the same file on the same machine takes less than 3 minutes. Something funky is going on in the OSM PBF stream reading logic or in the example provided in the README.
System and dotnet version (although I ran into the same thing on my Ubuntu 18 machine).
dotTrace output of all "OsmSharp" functions. Time values are in milliseconds. Total time was 853211 ms (~14 minutes).
Steps to reproduce dotTrace output:
I ran the profiling tool again, this time also looking at protobuf and the entire system namespace.
OsmSharpDotTraceResults.zip