-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Request: Parse numbers as strings #872
Comments
possible in theory, if we add it to the parse options. they will come in as strings. However, we can consider adding a flag to |
We technically already support this for parsing. Just use The more complicated part of the eqation is serialisation. We don't have a customisable serialiser. On the other hand, custom serialisation is very easy to implement with iostreams. So, no special bit for "this is actually a number" is required. BTW, I am sceptical that such change would not affect performance, even if only in a minor way. @cryptochassis do you only need this special handling for parsing? Is using basic_parser with a custom handler enough for you? |
Here's an example of what I meant: https://godbolt.org/z/KE7YK7h97 |
@grisumbras Very sorry for the late reply. I completely missed your previous messages. Yes, we only need this special handling for parsing. Using basic_parser with a custom handler seems to be sufficient. Thanks a lot for providing a concrete example. One question: for the example, when the parser encounters a number, say, a double, will it still call std::stod behind the scene? Because we are a high-frequency-trading code provider, performance is of utmost importance to us. Without calling std::stod, I'd guess it'd save lots of CPU time. |
The number will still be parsed. But our parser doesn't call Also, this made me think we might want a parser option that disables number parsing outright. |
We parse about millions of json messages per second and therefore skipping string to number conversion would probably have visible impact on our system's performance. We'd appreciate if there could be provided a parser option that disables number parsing. Many thanks! |
if you want the highest performance why don't you use simdjson? Do you need the ability to modify the JSON values? |
We don't need the ability to modify the JSON values. At the time that we first started our library development in 2019 and published its first version, simdjson wasn't available. Based on the best judgement at that time, we picked rapidjson. We ourselves is a library rather than an end-user application. The reason that we are now aiming at migrating to boost json instead of simdjson is because a sizable part of our current users (or those who are thinking about using our library) comes from a Python background and therefore are intermediate to beginner levels in C++. They need a simple way of getting started to build their applications using our library. The simplest way is to only rely on the header-only components of boost but nothing else. And we are getting closer to that: currently we only depend on boost, websocketpp, and rapidjson. We are almost there of moving away from websocketpp by using your beast websocket. So now the only thing to trim is rapidjson after which our only dependency are the header-only components of boost. To sum up, the reason is to achieve a good balance between performance and usability aiming at a wide array of audience having vastly different C++ proficiencies. |
Wow... that rationale is actually rather perfect :) |
Let me know whether we can have such a parser option. Thanks a lot. |
An option to disable number parsing outright? I have a PR for that (#901). IIRC, my benchmarking shows that for number-heavy inputs the speed of parsing increases by 80% (but don't quote me on that). @vinniefalco should I pursue it? |
To be clear, it still sort of does number validation (we need it to know when the number ends and the parser should start parsing another value), it just doesn't convert the characters into a number. |
its an interesting mode |
Perfect. Looking forward to the finalization. Thanks a lot. |
#901 has been merged into develop |
Local benchmarking results:
canada.json is +81% on GCC and +48% on clang, numbers.json is +122% on GCC and +104% on clang. |
When working with various counterparties dealing with monetary systems, we found that, quite often than not, we'd recieve json strings like [1.2345] instead of ["1.2345"]. If we parse that as a double, then we might loose precisions in some cases. In order to preserve precision, we have to parse that number as a string. rapidjson offers a solution by providing kParseNumbersAsStringsFlag: https://rapidjson.org/namespacerapidjson.html#a81379eb4e94a0386d71d15fda882ebc9a13981c0b803803f59d7a01aef3dfc987. Interesting enough, Python standard json library also offers the capability to parse numbers as strings:
https://docs.python.org/3/library/json.html#json.load (see parse_float and parse_int parameters).
We are looking into migrating to boost json library. Parsing numbers as strings is a key thing for us to preserve monetary precision. Thank you.
The text was updated successfully, but these errors were encountered: