Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug fix] Append() in Nullable columns causes nulls_ array to be incorrectly appended to if nested column Append() fails #376

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion clickhouse/columns/nullable.h
Original file line number Diff line number Diff line change
Expand Up @@ -95,12 +95,13 @@ class ColumnNullableT : public ColumnNullable {
}

inline void Append(ValueType value) {
ColumnNullable::Append(!value.has_value());
if (value.has_value()) {
typed_nested_data_->Append(std::move(*value));
} else {
typed_nested_data_->Append(typename ValueType::value_type{});
}
// Must be called after Append to nested column because it might fail
ColumnNullable::Append(!value.has_value());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may also fail, so perhaps we can make ColumnNullable::Append(!value.has_value()); the first operation, wrap second part in try-catch (or with some RAII wrapper), and rollback last value in nulls_ in case of error. (perhaps by introducing protected method ColumnNullable::PopBackNull())

Copy link
Contributor Author

@moaazassali moaazassali Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the wanted behavior for Append() to fail if the nested Append() fails too? For example, if we append a 10-character string to ColumnFixedString(5), it will throw a validation error:

string.cpp, line 38:

    if (str.size() > string_size_) {
        throw ValidationError("Expected string of length not greater than "
                                 + std::to_string(string_size_) + " bytes, received "
                                 + std::to_string(str.size()) + " bytes.");
    }

So, my assumption was that Nullable(ColumnFixedString(5)) should behave the same way as the nested column and also throw an error instead of catching the error and failing silently. By moving ColumnNullable::Append(!value.has_value()); to the end, we ensure that we throw an error early, matching the behavior of the nested column without causing unintended side effects.

However, I may have misunderstood the intended behavior. Could you please clarify further?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it needs to fail, but also the behavior must be consistent. I.e. nulls_->Size() must be the same as typed_nested_data_->Size(), which is not true with current implementation (and your proposed change).

So I just propose the way to rollback changes of nulls_, since it is the easiest one to do, if there is any exception.

Copy link
Contributor Author

@moaazassali moaazassali Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please provide a specific example of how it leads to inconsistencies? I can't seem to think of one. This is my understanding:

We have 3 cases:
1- value is not null and nested Append() does not fail:

  • we call typed_nested_data_->Append(std::move(*value)); which does not fail
  • typed_nested_data_ size = N+1, nulls_ size = N
  • we call ColumnNullable::Append(!value.has_value());
  • typed_nested_data_ size = N+1, nulls_ size = N+1
  • we have consistent sizes

2- value is not null and nested Append fails:

  • we call typed_nested_data_->Append(std::move(*value)); which fails
  • an error is thrown so we exit the function
  • typed_nested_data_ size = N, nulls_ size = N
  • we have consistent sizes

3- value is null:

  • we call typed_nested_data_->Append(typename ValueType::value_type{}); which cannot fail
  • we call ColumnNullable::Append(!value.has_value());
  • typed_nested_data_ size = N+1, nulls_ size = N+1
  • we have consistent sizes

I have also tested those 3 cases and it passed them:

const auto col = new ColumnNullableT<ColumnFixedString>(5);
col->Append("hello");
CHECK(col->Size() == 1);
CHECK(col->typed_nested_data_->Size() == 1);

CHECK_THROWS(col->Append("world!"));
CHECK(col->Size() == 1);
CHECK(col->typed_nested_data_->Size() == 1);

col->Append(std::nullopt);
CHECK(col->Size() == 2);
CHECK(col->typed_nested_data_->Size() == 2);

Copy link
Collaborator

@Enmk Enmk Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but what if ColumnNullable::Append(!value.has_value()); fails?

In that case we have typed_nested_data_->Size() == ColumnNullable::Size() + 1 and hence the ColumnNullableT instance is in inconsistent state.

}

/** Create a ColumnNullableT from a ColumnNullable, without copying data and offsets, but by
Expand Down
Loading