From c5925bbafdcce6ca073e34b22b9969133807fc61 Mon Sep 17 00:00:00 2001 From: Clement Rey Date: Thu, 23 Jan 2025 09:29:58 +0100 Subject: [PATCH] Forbid compaction of tagged+untagged data for a single component (#8782) This implements the second solution described in https://github.com/rerun-io/rerun/issues/8760#issuecomment-2607749175: > I knew I was forgetting yet another subtle complication: maintaining an untagged index is not enough, you can still end up in a situation where a single Chunk has both untagged and tagged data for a component, and no index is gonna save you there. > This is what's happening here (see attached screenshot). if i had to guess, this is because a runtime blueprint write ends up compacted in a pre-existing, tagged blueprint chunk, and now the resulting chunk is both tagged and untagged for that component. > > Once we're done with all the API updates on the SDK side, it shouldn't ever be possible for a user to end up in that situation *when working in new recording*, so that end is covered. > That leaves A) runtime blueprint writes and B) user writes to a pre-existing, legacy recording. Obviously the correct fix for blueprint writes is to port all of them to tagged APIs, but A) that will not happen for 0.22 and B) that doesn't take care of the other problem. > > I see two possible avenues here: > * Modify the compaction logic so that when tagged data is compacted with untagged data of the same component, we merge them together and keep the tags going forward. This should be well-specified with the current data model, and helps with propagating tags going forward. > * Modify the compaction logic so that tagged and untagged data of the same component is never compacted together. I can confirm this fixes the "changing view origin crashes in debug" issue. * Part of #8760 --- crates/store/re_chunk/src/merge.rs | 34 +++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/crates/store/re_chunk/src/merge.rs b/crates/store/re_chunk/src/merge.rs index d54990e836a1..a7141243a187 100644 --- a/crates/store/re_chunk/src/merge.rs +++ b/crates/store/re_chunk/src/merge.rs @@ -250,6 +250,35 @@ impl Chunk { }) } + /// Returns `true` if both chunks share the same descriptors for the components that + /// _they have in common_. + #[inline] + pub fn same_descriptors(&self, rhs: &Self) -> bool { + self.components.values().all(|lhs_per_desc| { + if lhs_per_desc.len() > 1 { + // If it's already in UB land, we might as well give up immediately. + return false; + } + + lhs_per_desc.keys().all(|lhs_desc| { + let Some(rhs_per_desc) = rhs.components.get(&lhs_desc.component_name) else { + return true; + }; + + if rhs_per_desc.len() > 1 { + // If it's already in UB land, we might as well give up immediately. + return false; + } + + if let Some(rhs_desc) = rhs_per_desc.keys().next() { + lhs_desc == rhs_desc + } else { + true + } + }) + }) + } + /// Returns true if two chunks are concatenable. /// /// To be concatenable, two chunks must: @@ -258,7 +287,10 @@ impl Chunk { /// * Use the same datatypes for the components they have in common. #[inline] pub fn concatenable(&self, rhs: &Self) -> bool { - self.same_entity_paths(rhs) && self.same_timelines(rhs) && self.same_datatypes(rhs) + self.same_entity_paths(rhs) + && self.same_timelines(rhs) + && self.same_datatypes(rhs) + && self.same_descriptors(rhs) } }