Update docs/hub/datasets-libraries.md
Co-authored-by: Julien Chaumond <[email protected]>
davanstrien and julien-c authored Jan 17, 2025
1 parent ecf8c7c commit 937ea9e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/hub/datasets-libraries.md
@@ -40,7 +40,7 @@ You can find more information about loading data from the Hub [here](https://hug

The Hub's dataset viewer and Parquet conversion system provide a standardized way to integrate with datasets, regardless of their original format. This infrastructure is a reliable integration layer between the Hub and external libraries.

- The Hub automatically converts the first 5GB of every dataset to Parquet format (unless already in Parquet) to power the dataset viewer and provide consistent access patterns. This standardization offers several benefits for library integrations:
+ If the dataset is not already in Parquet, the Hub automatically converts the first 5GB of every dataset to Parquet format to power the dataset viewer and provide consistent access patterns. This standardization offers several benefits for library integrations:

- Consistent data access patterns regardless of original format
- Built-in dataset preview and exploration through the Hub's dataset viewer. The dataset viewer can also be embedded as an iframe in your applications, making it easy to provide rich dataset previews. For more information about embedding the viewer, see the [dataset viewer embedding documentation](https://huggingface.co/docs/hub/en/datasets-viewer-embed).
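
For a library integration, the converted Parquet files can be discovered programmatically. The sketch below is illustrative only and is not part of this commit: it assumes the dataset viewer API's `/parquet` endpoint at `https://datasets-server.huggingface.co` and a response containing a `parquet_files` list whose entries carry `config`, `split`, and `url` fields. The helper names (`parquet_listing_url`, `parquet_urls_by_split`, `fetch_parquet_urls`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the dataset viewer API's Parquet listing endpoint.
VIEWER_API = "https://datasets-server.huggingface.co/parquet"


def parquet_listing_url(dataset: str) -> str:
    """Build the listing URL for a dataset id such as 'org/name'."""
    return f"{VIEWER_API}?{urlencode({'dataset': dataset})}"


def parquet_urls_by_split(payload: dict) -> dict:
    """Group Parquet file URLs by (config, split).

    Assumes the payload has a 'parquet_files' list whose entries
    contain 'config', 'split', and 'url' keys.
    """
    grouped = {}
    for entry in payload.get("parquet_files", []):
        key = (entry["config"], entry["split"])
        grouped.setdefault(key, []).append(entry["url"])
    return grouped


def fetch_parquet_urls(dataset: str, timeout: float = 30.0) -> dict:
    """Fetch the listing over HTTP and group it (requires network access)."""
    with urlopen(parquet_listing_url(dataset), timeout=timeout) as resp:
        return parquet_urls_by_split(json.load(resp))
```

Keeping the parsing step separate from the HTTP call lets an integration test it against a saved payload. The resulting URLs can then be handed to any Parquet reader, regardless of the dataset's original format.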