Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unpartitioned tables in Kudu #24674

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chenjian2664
Copy link
Contributor

@chenjian2664 chenjian2664 commented Jan 10, 2025

Description

Fix #24661

The current Kudu connector does not support unpartitioned tables, as it assumes the existence of partition keys. However, Kudu itself allows the creation of unpartitioned tables. Interacting with such tables may result in issues like the error partitioning information is missing, as mentioned in #24661.

Refer IMPALA-5546, unpartitioned tables can be handled by using range partitions without specifying any columns.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## kudu
* Support unpartitioned table. ({issue}`24661`)

@cla-bot cla-bot bot added the cla-signed label Jan 10, 2025
@chenjian2664 chenjian2664 force-pushed the kudu_unpartitioned branch 2 times, most recently from 2c165e3 to 68aee9d Compare January 10, 2025 03:44
@chenjian2664 chenjian2664 force-pushed the kudu_unpartitioned branch 2 times, most recently from 643fdcd to cd0ac2f Compare January 10, 2025 05:51
Copy link
Member

@Praveen2112 Praveen2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are adding support for unpartitioned tables, can we simplify the test code here ? TestKuduConnectorTest has some customization since we don't support it - now the technical debt can be cleared right ?

@chenjian2664
Copy link
Contributor Author

@Praveen2112 createTable not handle the primary key automatically yet, if add that logic I think maybe we can remove the customization

Comment on lines 111 to 115
String sqlStatement = (String) computeScalar("SHOW CREATE TABLE " + tableName);
String tableProperties = sqlStatement.split("\\)\\s*WITH\\s*\\(")[1];
// empty partition keys
assertTableProperty(tableProperties, "partition_by_hash_columns", Pattern.quote("ARRAY[]"));
assertTableProperty(tableProperties, "partition_by_range_columns", Pattern.quote("ARRAY[]"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check the output of SHOW CREATE .. and assert that String ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow-up we can also introduce a SystemTable for Kudu and simplify these further instead of parsing the String.

@chenjian2664 chenjian2664 force-pushed the kudu_unpartitioned branch 6 times, most recently from 44b966c to ab9b3ae Compare January 10, 2025 14:57
Copy link
Member

@Praveen2112 Praveen2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this.

@@ -153,11 +156,16 @@ else if (!hashColumns2.isEmpty()) {
}

@SuppressWarnings("unchecked")
List<String> rangeColumns = (List<String>) tableProperties.get(PARTITION_BY_RANGE_COLUMNS);
List<String> rangeColumns = (List<String>) tableProperties.getOrDefault(PARTITION_BY_RANGE_COLUMNS, ImmutableList.of());
requireNonNull(rangeColumns, "rangeColumns is null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rnn check would be redundant in this case.

@@ -133,7 +134,9 @@ public static PartitionDesign getPartitionDesign(Map<String, Object> tableProper
requireNonNull(tableProperties);

@SuppressWarnings("unchecked")
List<String> hashColumns = (List<String>) tableProperties.get(PARTITION_BY_HASH_COLUMNS);
List<String> hashColumns = (List<String>) tableProperties.getOrDefault(PARTITION_BY_HASH_COLUMNS, ImmutableList.of());
requireNonNull(hashColumns, "hashColumns is null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rnn check would be redundant in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

DELETE on Kudu table created from Impala fails when partitioning information is missing
3 participants