Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write interface for new serialization system #8868

Merged
merged 1 commit into from
Aug 17, 2023

Conversation

vtnerd
Copy link
Contributor

@vtnerd vtnerd commented May 21, 2023

This PR contains the write serialization interface, and some glue code from #8867 . This does not replace any existing serialization; the goal is to start a discussion on the design of the write system.

The write system - for epee binary - is roughly 56% faster than the existing design. This is primarily due to the removal of the intermediate DOM building stage - everything is written directly from objects to the "sink". I will work on JSON output numbers later.

The code for epee binary writing and json writing is not included, that can be reviewed in a future PR.

Copy link
Contributor

@jeffro256 jeffro256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting very close to approval, just a couple of things left

Comment on lines 33 to 40
#define WIRE_DECLARE_BLOB(type) \
template<> \
struct is_blob<type> \
: std::true_type \
{}

#define WIRE_DECLARE_BLOB_NS(type) \
namespace wire { WIRE_DECLARE_BLOB(type); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big issue, but I think the names of these should be swapped. If you look at Boost struct specialization macros, they automatically include the namespace ... {} component. By default, I'd imagine by principle of least surprise, you would want users to declare traits by including this macro at the root namespace. Then if they want to put it in a specific namespace, they could use the _NS version.

{ return source.write_bytes(dest); }
}

namespace wire_write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have the namespace wire_write vs wire::write?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a write_bytes is added to the ::wire:: namespace then this prevents ADL lookup. There are many such overloads, because any type that is not "owned" by our repo has to go into the ::wire:: namespace.

So you need a "clean" namespace (including global) of any write_bytes calls.

Comment on lines +78 to +94
struct custom_tag{};
void read_bytes(wire::reader&, boost::fusion::pair<custom_tag, std::string&>)
{ ... }
void write_bytes(wire::writer&, boost::fusion::pair<custom_tag, const std::string&>)
{ ... }

template<typename F, typename T>
void object_map(F& format, T& self)
{
wire::object(format,
wire::field("foo", boost::fusion::make_pair<custom_tag>(std::ref(self.foo)))
);
}
```

Basically each input/output format needs a unique type so that the compiler
knows how to "dispatch" the read/write calls. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't actually tested this, but I feel that personally specializations for std types like integers, bools, strings, etc would be more readable than creating a wrapper type and could potentially make smaller binary sizes, especially for debug builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here. You think specializations should be even easier than creating a wrapper type? I've done this in numerous places, and it didn't seem that bad in practice. Typically, I use a custom struct instead of this boost::fusion::pair object.

The other alternative is skipping usage of the wire::object(...) and instead call start_object() and end_object(), etc., directly. Its just that typically you'll need to handle out-of-order fields, and wire::object(...) does this for you.

Comment on lines +93 to +128
template<typename W>
inline void write_arithmetic(W& dest, const bool source)
{ dest.boolean(source); }

template<typename W>
inline void write_arithmetic(W& dest, const int source)
{ dest.integer(source); }

template<typename W>
inline void write_arithmetic(W& dest, const long source)
{ dest.integer(std::intmax_t(source)); }

template<typename W>
inline void write_arithmetic(W& dest, const long long source)
{ dest.integer(std::intmax_t(source)); }

template<typename W>
inline void write_arithmetic(W& dest, const unsigned source)
{ dest.unsigned_integer(source); }

template<typename W>
inline void write_arithmetic(W& dest, const unsigned long source)
{ dest.unsigned_integer(std::uintmax_t(source)); }

template<typename W>
inline void write_arithmetic(W& dest, const unsigned long long source)
{ dest.unsigned_integer(std::uintmax_t(source));}

template<typename W>
inline void write_arithmetic(W& dest, const double source)
{ dest.real(source); }

// Template both arguments to allow derived writer specializations
template<typename W, typename T>
inline std::enable_if_t<std::is_arithmetic<T>::value> write_bytes(W& dest, const T source)
{ write_arithmetic(dest, source); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
template<typename W>
inline void write_arithmetic(W& dest, const bool source)
{ dest.boolean(source); }
template<typename W>
inline void write_arithmetic(W& dest, const int source)
{ dest.integer(source); }
template<typename W>
inline void write_arithmetic(W& dest, const long source)
{ dest.integer(std::intmax_t(source)); }
template<typename W>
inline void write_arithmetic(W& dest, const long long source)
{ dest.integer(std::intmax_t(source)); }
template<typename W>
inline void write_arithmetic(W& dest, const unsigned source)
{ dest.unsigned_integer(source); }
template<typename W>
inline void write_arithmetic(W& dest, const unsigned long source)
{ dest.unsigned_integer(std::uintmax_t(source)); }
template<typename W>
inline void write_arithmetic(W& dest, const unsigned long long source)
{ dest.unsigned_integer(std::uintmax_t(source));}
template<typename W>
inline void write_arithmetic(W& dest, const double source)
{ dest.real(source); }
// Template both arguments to allow derived writer specializations
template<typename W, typename T>
inline std::enable_if_t<std::is_arithmetic<T>::value> write_bytes(W& dest, const T source)
{ write_arithmetic(dest, source); }
template<typename W>
inline void write_bytes(W& dest, const bool source)
{ dest.boolean(source); }
template<typename W, typename T>
inline std::enable_if_t<std::is_integral<T>() && std::is_signed<T>()> write_bytes(W& dest, const T source)
{ dest.integer(source); }
template<typename W, typename T>
inline std::enable_if_t<std::is_integral<T>() && std::is_unsigned<T>()> write_bytes(W& dest, const T source)
{ dest.unsigned_integer(source); }
template<typename W>
inline void write_bytes(W& dest, const double source)
{ dest.real(source); }

To me, this is less code to parse thru while being more direct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is bool gets accepted for one of those types. There's probably a SFINAE method to fix that too, but I didn't bother coming up with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misspoke, I did this in the read interface, not sure why I didn't do it here. So this will be updated, along with the namespace recommendation and some squashing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, still getting the ambiguity errors, despite it being nearly identical to the read code. I'm not wasting further time on this, maybe in a future PR this can get cleaned up a bit.

}

template<typename W, typename... T>
inline std::enable_if_t<std::is_base_of<writer, W>::value> object(W& dest, T... fields)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the enable_if_t needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an overload in read.h for the wire::reader case. I could omit this until the reader gets merged, but I think it's worth keeping in.

The overload allows for easy cpp usage - the write_bytes and read_bytes can forward to a single templated "map" function that calls wire::object(...) once for both the read and write case. i.e.

struct foo { int x; int y; }
void write_bytes(wire::writer&, const foo&);
void read_bytes(wire::reader&, foo&);

// .. cpp
template<typename F, typename T>
static void foo_map(F& format, T& self)
{ wire::object(format, WIRE_FIELD(x), WIRE_FIELD(y)); }

void write_bytes(wire::writer& dest, const foo& source) { foo_map(dest, source); }
void read_bytes(wire::reader& source, foo& dest) { foo_map(source, dest); } 

The boilerplate in the header and cpp can also further by macro'ed to make this a bit easier. This retains the "specify" once of the current method, while allowing for cpp files to reduce header bloat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And ! I forgot to mention why not wire::object(reader& .. ) - templating allows forwarding of the derived type for performance reasons. Many of the calls within our codebase only actually use one type (p2p only uses epee, etc.), so specifying the derived makes more sense.

@vtnerd vtnerd force-pushed the new_serialization_write branch from 3dc62a7 to a6ce2dc Compare August 11, 2023 20:34
@vtnerd
Copy link
Contributor Author

vtnerd commented Aug 11, 2023

I just force pushed a squashing commit, hopefully this won't be too hard to re-review.

@luigi1111 luigi1111 merged commit b9fd761 into monero-project:master Aug 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants