Releases: OCR-D/core

v3.0.0a2

22 Aug 09:12
@kba kba
Pre-release

Changed:

  • 🔥 OcrdPage as proxy of PcGtsType instead of alias; also contains etree and mapping now
  • 🔥 Processor.zip_input_files can now raise ocrd.NonUniqueInputFile and ocrd.MissingInputFile
    (the latter only if OCRD_MISSING_INPUT=ABORT)
  • 🔥 Processor.zip_input_files no longer uses require_first by default
    (so the first file in any input file tuple per page can be None as well)
  • 🔥 Workspace.overwrite_mode removed; overwriting is now delegated to OCRD_EXISTING_OUTPUT=OVERWRITE
  • 🎨 improve the generated docs for ocrd_utils.config
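
The changed zip_input_files behaviour can be illustrated with a small standalone sketch. It does not import ocrd: the two exception classes are local stand-ins that only borrow the names from the note above, and zip_files_per_page with its `groups` and `missing_input` parameters is a hypothetical helper, not the library's signature.

```python
from typing import Dict, List, Optional, Tuple


# Local stand-ins named after the exceptions in the release note;
# they are NOT imported from ocrd.
class NonUniqueInputFile(ValueError):
    pass


class MissingInputFile(ValueError):
    pass


def zip_files_per_page(
    groups: Dict[str, List[Tuple[str, str]]],
    missing_input: str = "SKIP",
) -> List[Tuple[Optional[str], ...]]:
    """Build one tuple of file names per page, with one slot per file group.

    `groups` maps a file group name to (page_id, file_name) pairs.
    """
    # Collect page IDs in first-seen order across all groups.
    page_ids: List[str] = []
    for files in groups.values():
        for page_id, _ in files:
            if page_id not in page_ids:
                page_ids.append(page_id)

    tuples: List[Tuple[Optional[str], ...]] = []
    for page_id in page_ids:
        row: List[Optional[str]] = []
        for name, files in groups.items():
            matches = [f for pid, f in files if pid == page_id]
            if len(matches) > 1:
                raise NonUniqueInputFile(f"{name}/{page_id}: {matches}")
            if not matches:
                if missing_input == "ABORT":
                    raise MissingInputFile(f"{name}/{page_id}")
                # Any slot, including the first one, may be None.
                row.append(None)
            else:
                row.append(matches[0])
        tuples.append(tuple(row))
    return tuples
```

Callers that previously relied on the first entry of each tuple being present would, under this reading, need to check every slot for None before using it.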

Added:

  • 👉 OCRD_DOWNLOAD_INPUT for whether input files should be downloaded before processing
  • 👉 OCRD_MISSING_INPUT for how to handle missing input files (SKIP or ABORT)
  • 👉 OCRD_MISSING_OUTPUT for how to handle processing failures (SKIP or ABORT or COPY);
    COPY behaves like ocrd-dummy for the failed page(s)
  • 👉 OCRD_EXISTING_OUTPUT for how to handle existing output files (SKIP or ABORT or OVERWRITE)
  • new CLI option --debug as short-hand for ABORT choices above
  • Processor.logger set up by constructor already (for re-use by processor implementors)
  • default-expand and validate ocrd_tool.json in Processor constructor, log invalidities
  • handle JSON deprecation in ocrd_tool.json by reporting warnings
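
As a rough illustration of how the new policy variables and --debug relate, here is a minimal sketch. The variable names and allowed values come from the list above; the helper read_policies, its `debug` flag, and the choice of SKIP as the default are assumptions made for this example and not the ocrd_utils.config implementation.

```python
import os
from typing import Dict, Mapping, Optional

# Allowed values are taken from the release note.
# Using the first entry (SKIP) as the default is an assumption.
POLICIES = {
    "OCRD_MISSING_INPUT": ("SKIP", "ABORT"),
    "OCRD_MISSING_OUTPUT": ("SKIP", "ABORT", "COPY"),
    "OCRD_EXISTING_OUTPUT": ("SKIP", "ABORT", "OVERWRITE"),
}


def read_policies(
    env: Optional[Mapping[str, str]] = None,
    debug: bool = False,
) -> Dict[str, str]:
    """Read the three policy variables, validating each value.

    With debug=True every policy is forced to ABORT, mimicking the
    "--debug as short-hand for ABORT choices" entry.
    """
    env = os.environ if env is None else env
    result: Dict[str, str] = {}
    for name, allowed in POLICIES.items():
        value = "ABORT" if debug else env.get(name, allowed[0])
        if value not in allowed:
            raise ValueError(f"{name}={value!r}, expected one of {allowed}")
        result[name] = value
    return result
```

For example, `read_policies({"OCRD_MISSING_OUTPUT": "COPY"})` keeps the other two policies at the assumed default, while `read_policies(debug=True)` sets all three to ABORT.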

v3.0.0a1

15 Aug 17:17
@kba kba
Pre-release

See #1240 for details.

Changed:

  • 🔥 Deprecate Processor.process
  • update spec to v3.25.0, which requires annotating fileGrp cardinality in ocrd-tool.json
  • 🔥 Remove passing non-processing kwargs to Processor constructor, add as members instead
    (i.e. show_help, dump_json, dump_module_dir, list_resources, show_resource, resolve_resource)
  • 🔥 Deprecate passing processing args / kwargs to Processor constructor
    (i.e. workspace, page_id, input_file_grp, output_file_grp; now all set by run_processor)
  • 🔥 Deprecate passing ocrd-tool.json metadata to Processor constructor
  • ocrd.processor: Handle loading of bundled ocrd-tool.json generically

Added:

  • Processor.process_workspace: process a complete workspace, with default implementation
  • Processor.process_page_file: process an OcrdFile, with default implementation
  • Processor.process_page_pcgts: process a single OcrdPage, produce a single OcrdPage, required to implement
  • Processor.verify: handle fileGrp cardinality verification, with default implementation
  • Processor.setup: to set up processor before processing, optional
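
The division of labour between the new methods can be sketched with a toy class. This is only a stand-in: SketchProcessor is hypothetical, pages are plain strings rather than OcrdPage objects, the signatures are simplified, and the call order shown is an assumption rather than what run_processor actually does.

```python
from typing import Dict, List, Optional


class SketchProcessor:
    """Hypothetical stand-in that mirrors the method names listed above."""

    def setup(self) -> None:
        """Optional one-time preparation, e.g. loading a model."""

    def verify(self, input_groups: List[str], expected: int = 1) -> None:
        """Simplified fileGrp cardinality check (default implementation)."""
        if len(input_groups) != expected:
            raise ValueError(
                f"expected {expected} input fileGrp(s), got {len(input_groups)}"
            )

    def process_workspace(self, pages: Dict[str, str]) -> Dict[str, str]:
        """Default implementation: iterate over all pages."""
        self.setup()
        return {
            page_id: self.process_page_file(page_id, content)
            for page_id, content in pages.items()
        }

    def process_page_file(self, page_id: str, content: str) -> str:
        """Default implementation: hand the page content to process_page_pcgts."""
        return self.process_page_pcgts(content, page_id=page_id)

    def process_page_pcgts(self, page: str, page_id: Optional[str] = None) -> str:
        """The only method a processor implementor is required to provide."""
        raise NotImplementedError


class Uppercase(SketchProcessor):
    """Toy processor: one page in, one page out."""

    def process_page_pcgts(self, page: str, page_id: Optional[str] = None) -> str:
        return page.upper()
```

In this sketch a subclass overrides only process_page_pcgts and inherits the page iteration, which matches the "with default implementation" wording in the list above.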

v2.67.1

17 Jul 10:18
@kba kba

Fixed:

  • Build and tests fixed, no functional changes from #1258

v2.67.0

16 Jul 17:42
@kba kba

Changed:

  • Additional docker base images with preinstalled tensorflow 1 (core-cuda-tf1), tensorflow 2 (core-cuda-tf2) and torch (core-cuda-torch), #1239
  • Resource Manager: Skip the download instead of raising an exception if the target file already exists (unless --overwrite), #1246
  • Resource Manager: Try to use bundled ocrd-all-tool.json if available, #1250, OCR-D/all#444
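
A minimal sketch of the skip-if-exists behaviour, using only the standard library. The function fetch_resource and its `fetch` callback are hypothetical names for this example, not the Resource Manager's API. Writing to a temporary file first is an additional design choice here, so that a failed download leaves no partial or empty target file behind.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def fetch_resource(
    target: Path,
    fetch: Callable[[], bytes],
    overwrite: bool = False,
) -> bool:
    """Return True if the file was written, False if the download was skipped."""
    if target.exists() and not overwrite:
        # Skip quietly instead of raising an exception.
        return False

    data = fetch()  # may raise; nothing has been written at this point

    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent))
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Move the finished file into place in one step.
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```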

Added:

  • ocrd process now supports -U/--mets-server, #1243

Fixed:

  • ocrd process-derived tasks are not run in a temporary directory when not called from within workspace, #1243
  • regression from #1238 where processors with required parameters failed, #1255, #1256
  • METS Server: Unlink UDS socket file if it exists before startup, #1244
  • Resource Manager: Do not create zero-size files for failing downloads, #1201, #1246
  • Workspace.add_file: Allow multiple processors to create file group folders simultaneously, #1203, #1253
  • Resource Manager: Do not try to run --dump-json for known non-processors ocrd-{cis-data,import,make}, #1218, #1249
  • Resource Manager: Properly handle copying of directories, #1237, #1248
  • bashlib: regression in parsing JSON from introducing parameter preset files, #1258
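
The METS Server fix in this list (unlinking a leftover Unix domain socket file before startup) follows a common pattern, sketched below with the standard library only. The function name bind_unix_socket is hypothetical and this is not the METS Server code; it assumes a platform with AF_UNIX support.

```python
import os
import socket


def bind_unix_socket(path: str) -> socket.socket:
    """Bind a listening Unix domain socket, removing a stale socket file first."""
    # A socket file left over from an earlier run would make bind() fail
    # with "Address already in use", so remove it before binding.
    if os.path.exists(path):
        os.unlink(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server
```

Closing the returned socket does not remove the file on disk, which is why a later startup has to clean it up again.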

Removed:

  • Defaults for -I/--input-file-grp/-O/--output-file-grp, #1256, #274

v2.66.1

01 Jul 16:11
@kba kba

Fixed:

  • GHA Docker: build docker.io first, then tag ghcr.io

v2.66.0

07 Jun 14:05
@kba kba

Fixed:

  • OcrdFile.url can now be removed properly, #1226, #1227
  • ocrd workspace find --undo-download: Only remove file refs if it's an actual download, #1150, #1235
  • ocrd workspace find --undo-download: When --keep-files is not set, remove file from disk, #1150, #1235
  • OCRD_LOGGING_DEBUG: Normalize/lowercase boolean values, #1230, #1231
  • Workspace.download_file: Use OcrdFile.local_filename if set but not already present in the FS, #1149, #1228
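
The OCRD_LOGGING_DEBUG fix above concerns case-insensitive parsing of boolean environment values. A minimal sketch of that kind of normalization follows; the helper name parse_bool and the exact set of accepted spellings are assumptions, not the values ocrd_utils accepts.

```python
def parse_bool(value: str) -> bool:
    """Interpret an environment variable value as a boolean, ignoring case."""
    # The set of truthy spellings is an assumption made for this sketch.
    return value.strip().lower() in ("true", "1", "yes", "on")
```

With this, `parse_bool("TRUE")` and `parse_bool("true")` give the same result, while an empty string or an unrecognized value counts as false.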

Added:

  • Separate docker versions for tensorflow v1, tensorflow v2 and torch, #1186
  • Processing server can serve as a proxy for METS Server TCP requests, forwarding to UDS, #1220
  • ocrd workspace clean to remove "untracked", i.e. not METS-referenced, files, #1150, #1236
  • -p now supports parameter preset resources in addition to raw JSON and absolute/relative paths to JSON files, #930, #969, #1238
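
The three kinds of -p value mentioned in the last entry (raw JSON, a path to a JSON file, a preset resource) could be told apart roughly as follows. This is a sketch under assumptions: resolve_parameter and `preset_dir` are hypothetical, and both the lookup order and the location of preset files in the real implementation may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict


def resolve_parameter(value: str, preset_dir: Path) -> Dict[str, Any]:
    """Turn a -p style argument into a parameter dictionary."""
    text = value.strip()
    # 1. Raw JSON given directly on the command line.
    if text.startswith("{"):
        return json.loads(text)
    # 2. Absolute or relative path to a JSON file.
    path = Path(text)
    if not path.is_file():
        # 3. Name of a preset file, looked up in a hypothetical preset directory.
        path = preset_dir / text
    if not path.is_file():
        raise FileNotFoundError(f"no JSON file or preset named {text!r}")
    return json.loads(path.read_text(encoding="utf-8"))
```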

v2.65.0

03 May 16:57
@kba kba

Fixed:

  • bashlib processors will download on-demand, like pythonic processors do, #1216, #1217

Changed:

  • Replace distutils with equivalents from shutil for compatibility with python 3.12+, #1219
  • CI: Updated GitHub actions, #1206
  • CI: Fixed scrutinizer, #1217
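
For readers doing the same migration, two typical distutils calls and their shutil counterparts are sketched below. Which distutils functions #1219 actually replaced is not stated here, so the choice of these two is an assumption; the wrapper names simply mirror the old distutils names.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Stand-in for distutils.spawn.find_executable, based on shutil.which."""
    return shutil.which(name)


def copy_tree(src: str, dst: str) -> str:
    """Stand-in for distutils.dir_util.copy_tree.

    dirs_exist_ok=True (Python 3.8+) allows copying into an existing
    directory, as copy_tree did. Unlike copy_tree, which returned the list
    of copied files, this returns the destination path.
    """
    return shutil.copytree(src, dst, dirs_exist_ok=True)
```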

Added:

  • Integration tests for ocrd_network, #1184

v2.64.1

22 Apr 12:43
@kba kba

Fixed:

  • Broken PyPI release

v2.64.0

22 Apr 12:14
@kba kba

Removed:

  • Support for Python <= 3.7, #1207

v2.63.3

13 Mar 13:02
@kba kba

Added:

  • make uninstall-workaround as companion to make install-workaround, #1198

Fixed:

  • OcrdMets.add_file: fix finding existing el_pagediv, #1199