Schema Types
When creating a Parquet file using parquet-java, you can specify the schema types for your data fields. The available schema types are:
- requiredGroup
- repeatedGroup
- required
- repeated
- optionalGroup
- optional
Below is a brief explanation of each schema type.
A requiredGroup is a group of fields (a nested schema) that must be present in every record and cannot be null. The fields within this group can have their own repetition levels (required, optional, or repeated).
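To illustrate how a group carries child fields with their own repetition levels, here is a minimal stand-alone Java sketch. It does not call parquet-java: the `SchemaType` enum, the `SchemaField` class, and the field names (`address`, `street`, `unit`, `tags`) are hypothetical names introduced only for this example, and the rendered text merely imitates Parquet's textual schema notation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the six schema types; not part of parquet-java.
enum SchemaType {
    REQUIRED("required", false),
    OPTIONAL("optional", false),
    REPEATED("repeated", false),
    REQUIRED_GROUP("required", true),
    OPTIONAL_GROUP("optional", true),
    REPEATED_GROUP("repeated", true);

    final String repetition; // repetition keyword used in the schema text
    final boolean group;     // true when the type holds nested fields

    SchemaType(String repetition, boolean group) {
        this.repetition = repetition;
        this.group = group;
    }
}

// Hypothetical schema node: either a primitive field or a group of fields.
final class SchemaField {
    private final SchemaType type;
    private final String name;
    private final String physicalType;        // e.g. "int32"; null for groups
    private final List<SchemaField> children; // empty for primitive fields

    private SchemaField(SchemaType type, String name, String physicalType,
                        List<SchemaField> children) {
        this.type = type;
        this.name = name;
        this.physicalType = physicalType;
        this.children = children;
    }

    static SchemaField primitive(SchemaType type, String name, String physicalType) {
        if (type.group) {
            throw new IllegalArgumentException(type + " requires child fields");
        }
        return new SchemaField(type, name, physicalType,
                Collections.<SchemaField>emptyList());
    }

    static SchemaField group(SchemaType type, String name, SchemaField... children) {
        if (!type.group) {
            throw new IllegalArgumentException(type + " cannot hold child fields");
        }
        return new SchemaField(type, name, null, Arrays.asList(children));
    }

    // Renders this field, and any nested fields, one per line.
    String render(String indent) {
        if (!type.group) {
            return indent + type.repetition + " " + physicalType + " " + name + ";\n";
        }
        StringBuilder sb = new StringBuilder();
        sb.append(indent).append(type.repetition).append(" group ")
          .append(name).append(" {\n");
        for (SchemaField child : children) {
            sb.append(child.render(indent + "  "));
        }
        sb.append(indent).append("}\n");
        return sb.toString();
    }
}

final class SchemaTypeSketch {
    // A required group whose children use all three repetition levels.
    static SchemaField address() {
        return SchemaField.group(SchemaType.REQUIRED_GROUP, "address",
                SchemaField.primitive(SchemaType.REQUIRED, "street", "binary"),
                SchemaField.primitive(SchemaType.OPTIONAL, "unit", "int32"),
                SchemaField.primitive(SchemaType.REPEATED, "tags", "binary"));
    }

    public static void main(String[] args) {
        System.out.print(address().render(""));
    }
}
```

Running `main` prints a `required group address { ... }` block whose three child lines start with `required`, `optional`, and `repeated`, in that order. Modeling the group flag separately from the repetition keyword reflects that each of the six schema types is a combination of one repetition level with either a primitive or a nested field.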
A repeatedGroup represents a group that can occur zero or more times, effectively modeling a list or array of nested records.
A required field must be present in every record and cannot be null. This ensures that the field always contains a value.
A repeated field can have zero or more values, modeling a list or array of values of the same type.
An optionalGroup is a group of fields that may or may not be present in a record. The entire group can be null.
An optional field may or may not be present in a record. If the field is not present, it is considered null.
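The rules above come down to how many values a record may carry for a field or group: exactly one for required, zero or one for optional, and any number for repeated. The following stand-alone Java sketch encodes that reading. The `Repetition` enum and its `accepts` method are hypothetical names used for illustration and are not part of parquet-java.

```java
// Hypothetical helper: the three repetition levels shared by fields and groups.
enum Repetition {
    REQUIRED, OPTIONAL, REPEATED;

    // True when a record may carry this many values for the field or group.
    boolean accepts(int valueCount) {
        if (valueCount < 0) {
            throw new IllegalArgumentException("valueCount must not be negative");
        }
        switch (this) {
            case REQUIRED:
                return valueCount == 1; // always present, never null
            case OPTIONAL:
                return valueCount <= 1; // zero values means null
            default:
                return true;            // REPEATED: zero or more values
        }
    }
}

final class RepetitionSketch {
    public static void main(String[] args) {
        int[] counts = {0, 1, 3};
        for (Repetition r : Repetition.values()) {
            StringBuilder line = new StringBuilder(r.toString());
            for (int count : counts) {
                line.append(' ').append(count).append('=').append(r.accepts(count));
            }
            System.out.println(line);
        }
    }
}
```

The same check applies to the group variants, since requiredGroup, optionalGroup, and repeatedGroup differ from their field counterparts only in holding nested fields instead of a single value.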
Developed and maintained by the Altinity team.