Write interface for new serialization system #8868

vtnerd · 2023-05-21T17:07:01Z

This PR contains the write serialization interface, and some glue code from #8867 . This does not replace any existing serialization; the goal is to start a discussion on the design of the write system.

The write system - for epee binary - is roughly 56% faster than the existing design. This is primarily due to the removal of the intermediate DOM building stage - everything is written directly from objects to the "sink". I will work on JSON output numbers later.

The code for epee binary writing and json writing is not included, that can be reviewed in a future PR.

contrib/epee/include/serialization/wire/write.h

contrib/epee/include/serialization/wire/field.h

jeffro256

I'm getting very close to approval, just a couple of things left

jeffro256 · 2023-07-18T21:50:53Z

contrib/epee/include/serialization/wire/traits.h

+#define WIRE_DECLARE_BLOB(type)                 \
+  template<>                                    \
+  struct is_blob<type>                          \
+    : std::true_type                            \
+  {}
+
+#define WIRE_DECLARE_BLOB_NS(type)              \
+  namespace wire { WIRE_DECLARE_BLOB(type); }


Not a big issue, but I think the names of these should be swapped. If you look at Boost struct specialization macros, they automatically include the namespace ... {} component. By default, I'd imagine by principle of least surprise, you would want users to declare traits by including this macro at the root namespace. Then if they want to put it in a specific namespace, they could use the _NS version.

jeffro256 · 2023-07-31T13:58:25Z

contrib/epee/include/serialization/wire/write.h

+  { return source.write_bytes(dest); }
+}
+
+namespace wire_write


Why have the namespace wire_write vs wire::write?

If a write_bytes is added to the ::wire:: namespace then this prevents ADL lookup. There are many such overloads, because any type that is not "owned" by our repo has to go into the ::wire:: namespace.

So you need a "clean" namespace (including global) of any write_bytes calls.

jeffro256 · 2023-07-31T14:14:54Z

contrib/epee/include/serialization/wire/field.h

+    struct custom_tag{};
+    void read_bytes(wire::reader&, boost::fusion::pair<custom_tag, std::string&>)
+    { ... }
+    void write_bytes(wire::writer&, boost::fusion::pair<custom_tag, const std::string&>)
+    { ... }
+
+    template<typename F, typename T>
+    void object_map(F& format, T& self)
+    {
+      wire::object(format,
+        wire::field("foo", boost::fusion::make_pair<custom_tag>(std::ref(self.foo)))
+      );
+    }
+    ```
+
+    Basically each input/output format needs a unique type so that the compiler
+    knows how to "dispatch" the read/write calls. */


I haven't actually tested this, but I feel that personally specializations for std types like integers, bools, strings, etc would be more readable than creating a wrapper type and could potentially make smaller binary sizes, especially for debug builds.

I'm not sure what you mean here. You think specializations should be even easier than creating a wrapper type? I've done this in numerous places, and it didn't seem that bad in practice. Typically, I use a custom struct instead of this boost::fusion::pair object.

The other alternative is skipping usage of the wire::object(...) and instead call start_object() and end_object(), etc., directly. Its just that typically you'll need to handle out-of-order fields, and wire::object(...) does this for you.

jeffro256 · 2023-07-31T14:22:00Z

contrib/epee/include/serialization/wire/write.h

+  template<typename W>
+  inline void write_arithmetic(W& dest, const bool source)
+  { dest.boolean(source); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const int source)
+  { dest.integer(source); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const long source)
+  { dest.integer(std::intmax_t(source)); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const long long source)
+  { dest.integer(std::intmax_t(source)); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const unsigned source)
+  { dest.unsigned_integer(source); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const unsigned long source)
+  { dest.unsigned_integer(std::uintmax_t(source)); }
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const unsigned long long source)
+  { dest.unsigned_integer(std::uintmax_t(source));}
+
+  template<typename W>
+  inline void write_arithmetic(W& dest, const double source)
+  { dest.real(source); }
+
+  // Template both arguments to allow derived writer specializations
+  template<typename W, typename T>
+  inline std::enable_if_t<std::is_arithmetic<T>::value> write_bytes(W& dest, const T source)
+  { write_arithmetic(dest, source); }


Suggested change

template<typename W>

inline void write_arithmetic(W& dest, const bool source)

{ dest.boolean(source); }

template<typename W>

inline void write_arithmetic(W& dest, const int source)

{ dest.integer(source); }

template<typename W>

inline void write_arithmetic(W& dest, const long source)

{ dest.integer(std::intmax_t(source)); }

template<typename W>

inline void write_arithmetic(W& dest, const long long source)

{ dest.integer(std::intmax_t(source)); }

template<typename W>

inline void write_arithmetic(W& dest, const unsigned source)

{ dest.unsigned_integer(source); }

template<typename W>

inline void write_arithmetic(W& dest, const unsigned long source)

{ dest.unsigned_integer(std::uintmax_t(source)); }

template<typename W>

inline void write_arithmetic(W& dest, const unsigned long long source)

{ dest.unsigned_integer(std::uintmax_t(source));}

template<typename W>

inline void write_arithmetic(W& dest, const double source)

{ dest.real(source); }

// Template both arguments to allow derived writer specializations

template<typename W, typename T>

inline std::enable_if_t<std::is_arithmetic<T>::value> write_bytes(W& dest, const T source)

{ write_arithmetic(dest, source); }

template<typename W>

inline void write_bytes(W& dest, const bool source)

{ dest.boolean(source); }

template<typename W, typename T>

inline std::enable_if_t<std::is_integral<T>() && std::is_signed<T>()> write_bytes(W& dest, const T source)

{ dest.integer(source); }

template<typename W, typename T>

inline std::enable_if_t<std::is_integral<T>() && std::is_unsigned<T>()> write_bytes(W& dest, const T source)

{ dest.unsigned_integer(source); }

template<typename W>

inline void write_bytes(W& dest, const double source)

{ dest.real(source); }

To me, this is less code to parse thru while being more direct

The problem is bool gets accepted for one of those types. There's probably a SFINAE method to fix that too, but I didn't bother coming up with it.

I misspoke, I did this in the read interface, not sure why I didn't do it here. So this will be updated, along with the namespace recommendation and some squashing.

Nope, still getting the ambiguity errors, despite it being nearly identical to the read code. I'm not wasting further time on this, maybe in a future PR this can get cleaned up a bit.

jeffro256 · 2023-08-01T19:42:37Z

contrib/epee/include/serialization/wire/write.h

+  }
+
+  template<typename W, typename... T>
+  inline std::enable_if_t<std::is_base_of<writer, W>::value> object(W& dest, T... fields)


Why is the enable_if_t needed here?

There's an overload in read.h for the wire::reader case. I could omit this until the reader gets merged, but I think it's worth keeping in.

The overload allows for easy cpp usage - the write_bytes and read_bytes can forward to a single templated "map" function that calls wire::object(...) once for both the read and write case. i.e.

struct foo { int x; int y; } void write_bytes(wire::writer&, const foo&); void read_bytes(wire::reader&, foo&); // .. cpp template<typename F, typename T> static void foo_map(F& format, T& self) { wire::object(format, WIRE_FIELD(x), WIRE_FIELD(y)); } void write_bytes(wire::writer& dest, const foo& source) { foo_map(dest, source); } void read_bytes(wire::reader& source, foo& dest) { foo_map(source, dest); }

The boilerplate in the header and cpp can also further by macro'ed to make this a bit easier. This retains the "specify" once of the current method, while allowing for cpp files to reduce header bloat.

And ! I forgot to mention why not wire::object(reader& .. ) - templating allows forwarding of the derived type for performance reasons. Many of the calls within our codebase only actually use one type (p2p only uses epee, etc.), so specifying the derived makes more sense.

vtnerd · 2023-08-11T20:36:55Z

I just force pushed a squashing commit, hopefully this won't be too hard to re-review.

jeffro256 reviewed Jul 1, 2023

View reviewed changes

contrib/epee/include/serialization/wire/write.h Outdated Show resolved Hide resolved

jeffro256 reviewed Jul 11, 2023

View reviewed changes

contrib/epee/include/serialization/wire/field.h Show resolved Hide resolved

jeffro256 reviewed Aug 1, 2023

View reviewed changes

jeffro256 approved these changes Aug 7, 2023

View reviewed changes

Write interface for new serialization system

a6ce2dc

vtnerd force-pushed the new_serialization_write branch from 3dc62a7 to a6ce2dc Compare August 11, 2023 20:34

selsta approved these changes Aug 11, 2023

View reviewed changes

luigi1111 merged commit b9fd761 into monero-project:master Aug 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Write interface for new serialization system #8868

Write interface for new serialization system #8868

vtnerd commented May 21, 2023

jeffro256 left a comment

jeffro256 Jul 18, 2023

jeffro256 Jul 31, 2023

vtnerd Aug 1, 2023

jeffro256 Jul 31, 2023

vtnerd Aug 1, 2023

jeffro256 Jul 31, 2023

vtnerd Aug 1, 2023

vtnerd Aug 11, 2023

vtnerd Aug 11, 2023

jeffro256 Aug 1, 2023

vtnerd Aug 1, 2023

vtnerd Aug 1, 2023

vtnerd commented Aug 11, 2023

Write interface for new serialization system #8868

Write interface for new serialization system #8868

Conversation

vtnerd commented May 21, 2023

jeffro256 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vtnerd commented Aug 11, 2023