Proposal for very large resolution images

Dirk Farin edited this page Jul 25, 2024 · 22 revisions

The HEIF grid image item is limited to fewer than 256x256 tiles because the number of tiles per row and per column is stored in an 8-bit integer and because the number of references in iref is limited to 65535. Moreover, it has significant overhead because each tile image needs its own entries in the iinf, ipma, iref, and iloc metadata, which add up to more than 3.3 MB for an image with 256x255 tiles. This overhead matters because the metadata has to be loaded completely before decoding of the image can start.
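
The limits quoted above can be checked with a little arithmetic. The following Python sketch is only an illustration: the function name grid_fits is made up for this example, and the per-tile figure is derived by dividing the 3.3 MB quoted above by the tile count, not measured from a real file.

```python
# Arithmetic behind the grid limits quoted above (illustrative only).
MAX_PER_AXIS = 2 ** 8         # an 8-bit field allows at most 256 tiles per row/column
MAX_IREF_REFERENCES = 65535   # reference limit quoted in the text


def grid_fits(columns, rows):
    """Return True if a columns x rows grid respects both limits."""
    return (columns <= MAX_PER_AXIS and rows <= MAX_PER_AXIS
            and columns * rows <= MAX_IREF_REFERENCES)


print(grid_fits(256, 256))  # False: 65536 tiles exceed the reference limit
print(grid_fits(256, 255))  # True: 65280 tiles

# Rough per-tile metadata cost implied by the 3.3 MB figure (assumption:
# 1 MB = 10**6 bytes). This comes out at roughly 50 bytes per tile.
tiles = 256 * 255
bytes_per_tile = 3.3e6 / tiles
```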

In order to support larger images, I propose to introduce a new image item_type, e.g. 'tild', as an alternative to grid.

Design Considerations

These features have been taken into account when designing the tild syntax:

  • support for very large resolutions,
  • much less overhead than grid,
  • enable streaming the image content over the internet with small initial setup delays,
  • support tiled images in which some tiles are not covered with image data,
  • saving tiles in arbitrary order to allow gradually growing files,
  • ability to order the tile storage locations such that spatially neighboring tiles are stored close together,
  • interleaved storage of multiple tiled images, e.g. for multi-resolution pyramids where storage of the lower-resolution layer is interleaved with the higher-resolution layers,
  • ability to build multi-resolution pyramids with a mixture of grid, tild, and unci images in order to have partial compatibility with software without tild support.

tild Image Item Syntax

class TiledImage {
  unsigned int(8) version = 0;
  unsigned int(8) flags;

  DimensionFieldLength = (flags & 1) ? 64 : 32;
  unsigned int(DimensionFieldLength) output_width;
  unsigned int(DimensionFieldLength) output_height;

  unsigned int(32) tile_width;
  unsigned int(32) tile_height;

  unsigned int(32) tile_compression_type;

  TileColumns = (output_width + tile_width -1)/tile_width;
  TileRows    = (output_height + tile_height -1)/tile_height;

  OffsetFieldLength = (flags & 2) ? 64 : 32;

  for (int i=0; i<TileColumns*TileRows ; i++) {
    unsigned int(OffsetFieldLength) tile_start_offset[i];
  }

  SequentialOrder = (flags & 4);

  // ... followed by compressed tile data ...
}

Semantics

  • output_width, output_height is the total image size. This does not have to be an integer multiple of tile_width, tile_height.
  • tile_width, tile_height is the size of a single tile. All tiles have the same size.
  • tile_compression_type is the four-character code that would have been used as the tile item type in a grid image, e.g. hvc1 for H.265 (HEVC) compression or j2k1 for JPEG 2000.
  • tile_start_offset points to the start of the compressed data of the tile. The position is given relative to the start of the tild data. If a tile is not coded, the tile_start_offset[i] shall be 0. Note that this is not a file offset, but an offset into the tild data that can potentially span several iloc extents.
  • SequentialOrder is a hint to the decoder whether the compressed tile data is stored in sequential order.
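
Because tile_start_offset is an offset into the logical tild data and not a file offset, a reader has to translate it through the iloc extents of the item. The following Python sketch shows one way this could be done. The function name map_to_file_ranges and the representation of the extents as (file_offset, length) pairs in logical order are assumptions made for this illustration.

```python
def map_to_file_ranges(extents, start, length):
    """Translate a byte range of the logical tild data into file byte ranges.

    extents: list of (file_offset, extent_length) pairs in logical order
             (hypothetical representation of the item's iloc extents).
    start, length: byte range relative to the start of the tild data.
    Returns a list of (file_offset, length) pairs to read, in order.
    """
    ranges = []
    logical_pos = 0            # logical position where the current extent begins
    end = start + length
    for file_offset, extent_length in extents:
        extent_end = logical_pos + extent_length
        lo = max(start, logical_pos)
        hi = min(end, extent_end)
        if lo < hi:            # the requested range overlaps this extent
            ranges.append((file_offset + (lo - logical_pos), hi - lo))
        logical_pos = extent_end
    if sum(n for _, n in ranges) != length:
        raise ValueError("requested range lies outside the item data")
    return ranges


# Example: two extents of 100 and 200 bytes stored at file offsets 1000 and 5000.
# A 50-byte tile starting at logical offset 80 straddles both extents.
print(map_to_file_ranges([(1000, 100), (5000, 200)], 80, 50))
# [(1080, 20), (5000, 30)]
```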

Notes

  • The tild item shall have associated properties that are implicitly assigned to each tile. E.g. a tild image with tile_compression_type=hvc1 shall have an associated hvcC box that describes the coded stream of each tile.

  • The ispe property associated with the tild item defines the size of the whole tild image, not the size of a tile.

  • The tild data shall be stored in an mdat box. This makes it possible to read the starting positions of the tiles on demand instead of having to read them all at startup, as would be required if each tile were referenced in an iloc.

  • Even though the compressed tile data logically follows contiguously after the metadata, we can still write the data interleaved into the file (e.g. intermixed with other tild resolution layers) by employing iloc extents.

  • Compressed data for the tiles can be stored in the file in any order.

  • The compressed tile size is not stored because the length of each tile can be computed from the start positions of the tiles. This is even possible if the tiles are not stored in sequential order. In that case, it is necessary to sort the tile_start_offset[] array. Whether the sorting step can be skipped is indicated by the SequentialOrder flag.
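
The size computation described in the last note could look like the following Python sketch. It rests on two assumptions that the proposal does not spell out: the tile data is stored without gaps, and the end of the last tile coincides with the end of the tild data, whose total length is known to the reader (for example from the iloc extents). The function name tile_sizes is hypothetical.

```python
def tile_sizes(offsets, data_length, sequential_order=False):
    """Return the compressed size of every tile (None for tiles that are not coded).

    offsets: the tile_start_offset[] values; 0 marks a tile that is not coded.
    data_length: total length of the tild data (assumed to be known).
    sequential_order: value of the SequentialOrder hint; if set, sorting is skipped.
    """
    # Keep only coded tiles and remember which tile each offset belongs to.
    coded = [(offset, index) for index, offset in enumerate(offsets) if offset != 0]
    if not sequential_order:
        coded.sort()           # order the tiles by their position in the data

    sizes = [None] * len(offsets)
    for k, (offset, index) in enumerate(coded):
        # A tile ends where the next stored tile starts, or at the end of the data.
        next_offset = coded[k + 1][0] if k + 1 < len(coded) else data_length
        sizes[index] = next_offset - offset
    return sizes


# Tiles stored out of order: tile 3 is stored before tile 2, tile 1 is not coded.
print(tile_sizes([38, 0, 250, 100], data_length=400))
# [62, None, 150, 150]
```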

File structure

Simple file with single tild image

[Figure: file1, layout of a file with a single tild image]

File with two interleaved tild images

[Figure: file2, layout of a file with two interleaved tild images]

tild, grid, and unci coexistence

When building a multi-resolution pymd pyramid, different image types can be used for each layer. For example, grid images could be used for the lower-resolution layers so that these can be read by software that does not understand the tild image type. Software support for tild is then only needed for the high-resolution layers.

[Figure: pyramid, multi-resolution pyramid with layers of different image types]