Releases: OCR-D/core
v3.0.0a2
Changed:
- 🔥 OcrdPage as proxy of PcGtsType instead of alias; also contains etree and mapping now
- 🔥 Processor.zip_input_files now can throw ocrd.NonUniqueInputFile and ocrd.MissingInputFile (the latter only if OCRD_MISSING_INPUT=ABORT)
- 🔥 Processor.zip_input_files does not by default use require_first anymore (so the first file in any input file tuple per page can be None as well)
- 🔥 no more Workspace.overwrite_mode, merely delegate to OCRD_EXISTING_OUTPUT=OVERWRITE
- 🎨 improve on docs result for ocrd_utils.config
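
To illustrate the zip_input_files changes, here is a toy model in plain Python. It is not the ocrd implementation: the function `zip_by_page`, its `abort_on_missing` flag (standing in for OCRD_MISSING_INPUT=ABORT) and the two exception classes are hypothetical stand-ins that only borrow names from the notes above.

```python
class NonUniqueInputFile(ValueError):
    """Toy stand-in: more than one file for a page within one fileGrp."""


class MissingInputFile(ValueError):
    """Toy stand-in: no file for a page within one fileGrp."""


def zip_by_page(file_grps, abort_on_missing=False):
    """Group files by page ID across fileGrps (hypothetical helper).

    file_grps: one list per fileGrp, each holding (page_id, filename) pairs.
    Returns {page_id: tuple of filename-or-None, in fileGrp order}.
    """
    page_ids = sorted({page_id for grp in file_grps for page_id, _ in grp})
    result = {}
    for page_id in page_ids:
        row = []
        for grp in file_grps:
            matches = [name for pid, name in grp if pid == page_id]
            if len(matches) > 1:
                raise NonUniqueInputFile(f"{page_id}: {matches}")
            if not matches and abort_on_missing:
                raise MissingInputFile(page_id)
            # Without require_first, a gap is allowed in any position,
            # including the first fileGrp.
            row.append(matches[0] if matches else None)
        result[page_id] = tuple(row)
    return result


grp_a = [("P1", "a1.xml")]
grp_b = [("P1", "b1.xml"), ("P2", "b2.xml")]
tuples = zip_by_page([grp_a, grp_b])
print(tuples)  # {'P1': ('a1.xml', 'b1.xml'), 'P2': (None, 'b2.xml')}
```

Page P2 has no file in the first group, so its tuple starts with None. Processors that index into the tuple should be prepared for that.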
Added:
- 👉 OCRD_DOWNLOAD_INPUT for whether input files should be downloaded before processing
- 👉 OCRD_MISSING_INPUT for how to handle missing input files (SKIP or ABORT)
- 👉 OCRD_MISSING_OUTPUT for how to handle processing failures (SKIP or ABORT or COPY); the latter behaves like ocrd-dummy for the failed page(s)
- 👉 OCRD_EXISTING_OUTPUT for how to handle existing output files (SKIP or ABORT or OVERWRITE)
- new CLI option --debug as short-hand for ABORT choices above
- Processor.logger set up by constructor already (for re-use by processor implementors)
- default-expand and validate ocrd_tool.json in Processor constructor, log invalidities
- handle JSON deprecation in ocrd_tool.json by reporting warnings
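
The policy variables above can be modelled with a few lines of plain Python. This is a sketch only, not how ocrd reads its configuration: `read_policy` and `apply_debug` are hypothetical helpers, and the fallback value is passed in by the caller because the notes above do not state the defaults.

```python
import os

# Allowed values per variable, as listed in the release notes above.
POLICY_CHOICES = {
    "OCRD_MISSING_INPUT": ("SKIP", "ABORT"),
    "OCRD_MISSING_OUTPUT": ("SKIP", "ABORT", "COPY"),
    "OCRD_EXISTING_OUTPUT": ("SKIP", "ABORT", "OVERWRITE"),
}


def read_policy(name, fallback, environ=None):
    """Hypothetical helper: read one policy variable and validate its value."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, fallback)
    if value not in POLICY_CHOICES[name]:
        raise ValueError(f"{name}={value!r} is not one of {POLICY_CHOICES[name]}")
    return value


def apply_debug(environ):
    """Hypothetical model of --debug: set every policy variable to ABORT."""
    for name in POLICY_CHOICES:
        environ[name] = "ABORT"
    return environ


env = {"OCRD_MISSING_OUTPUT": "COPY"}
print(read_policy("OCRD_MISSING_OUTPUT", "SKIP", env))  # COPY
print(read_policy("OCRD_MISSING_INPUT", "SKIP", env))   # SKIP (caller's fallback)
print(read_policy("OCRD_EXISTING_OUTPUT", "SKIP", apply_debug(dict(env))))  # ABORT
```

In a shell, the same choices would be made by exporting the variables before calling a processor, for example `OCRD_MISSING_OUTPUT=COPY`.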
v3.0.0a1
See #1240 for details.
Changed:
- 🔥 Deprecate Processor.process
- update spec to v3.25.0, which requires annotating fileGrp cardinality in ocrd-tool.json
- 🔥 Remove passing non-processing kwargs to Processor constructor, add as members (i.e. show_help, dump_json, dump_module_dir, list_resources, show_resource, resolve_resource)
- 🔥 Deprecate passing processing arg / kwargs to Processor constructor (i.e. workspace, page_id, input_file_grp, output_file_grp; now all set by run_processor)
- 🔥 Deprecate passing ocrd-tool.json metadata to Processor constructor
- ocrd.processor: Handle loading of bundled ocrd-tool.json generically
Added:
- Processor.process_workspace: process a complete workspace, with default implementation
- Processor.process_page_file: process an OcrdFile, with default implementation
- Processor.process_page_pcgts: process a single OcrdPage, produce a single OcrdPage, required to implement
- Processor.verify: handle fileGrp cardinality verification, with default implementation
- Processor.setup: to set up processor before processing, optional
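
The layering of these methods follows a template-method pattern, which the toy classes below sketch in plain Python. `ToyProcessor` and `Uppercase` are hypothetical and do not use the ocrd API: pages are plain strings here, the real signatures differ, and verify is left out for brevity.

```python
class ToyProcessor:
    """Toy stand-in for a processor base class; not the ocrd API."""

    def setup(self):
        # Optional hook: load models or other resources once before processing.
        pass

    def process_workspace(self, pages):
        # Default implementation: set up once, then handle each page on its own.
        self.setup()
        return {page_id: self.process_page_file(raw) for page_id, raw in pages.items()}

    def process_page_file(self, raw):
        # Default implementation: "parse" the file, delegate to the
        # page-level method, then "serialize" the result.
        page = raw.split()
        return " ".join(self.process_page_pcgts(page))

    def process_page_pcgts(self, page):
        # The only method a subclass is required to implement.
        raise NotImplementedError


class Uppercase(ToyProcessor):
    def process_page_pcgts(self, page):
        return [word.upper() for word in page]


result = Uppercase().process_workspace({"PHYS_0001": "hello world", "PHYS_0002": "ocr d"})
print(result)  # {'PHYS_0001': 'HELLO WORLD', 'PHYS_0002': 'OCR D'}
```

The subclass only supplies the page-level step; iteration over pages and file handling stay in the base class defaults and can be overridden when needed.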
v2.67.1
v2.67.0
Changed:
- Additional docker base images with preinstalled tensorflow 1 (core-cuda-tf1), tensorflow 2 (core-cuda-tf2) and torch (core-cuda-torch), #1239
- Resource Manager: Skip the download instead of raising an exception if the target file already exists (unless --overwrite), #1246
- Resource Manager: Try to use bundled ocrd-all-tool.json if available, #1250, OCR-D/all#444
Added:
- ocrd process does support -U/--mets-server, #1243
Fixed:
- ocrd process-derived tasks are not run in a temporary directory when not called from within workspace, #1243
- regression from #1238 where processors failed that had required parameters, #1255, #1256
- METS Server: Unlink UDS socket file if it exists before startup, #1244
- Resource Manager: Do not create zero-size files for failing downloads, #1201, #1246
- Workspace.add_file: Allow multiple processors to create file group folders simultaneously, #1203, #1253
- Resource Manager: Do not try to run --dump-json for known non-processors ocrd-{cis-data,import,make}, #1218, #1249
- Resource Manager: Properly handle copying of directories, #1237, #1248
- bashlib: regression in parsing JSON from introducing parameter preset files, #1258
Removed:
v2.66.1
v2.66.0
Fixed:
- OcrdFile.url can now be removed properly, #1226, #1227
- ocrd workspace find --undo-download: Only remove file refs if it's an actual download, #1150, #1235
- ocrd workspace find --undo-download: When --keep-files is not set, remove file from disk, #1150, #1235
- OCRD_LOGGING_DEBUG: Normalize/lowercase boolean values, #1230, #1231
- Workspace.download_file: Use Ocrd.local_filename if set but not already present in the FS, #1149, #1228
Changed:
- Install ocrd with pip --editable inside Docker, #1225, OCR-D/ocrd_all#416
- Reduce log spam in ocrd_network, #1222
- CI: Stop testing for 3.7, #1207, #1221
Added:
- Separate docker versions for tensorflow v1, tensorflow v2 and torch, #1186
- Processing server can serve as a proxy for METS Server TCP requests, forwarding to UDS, #1220
- ocrd workspace clean to remove "untracked", i.e. not METS-referenced, files, #1150, #1236
- -p now supports parameter preset resources in addition to raw JSON and absolute/relative paths to JSON files, #930, #969, #1238
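
The three accepted forms of a -p argument can be pictured with the following sketch. It is an assumption-laden illustration rather than ocrd's resolver: `load_parameters`, the `preset_dirs` argument, the lookup order and the `dpi` parameter are all hypothetical.

```python
import json
import os
import tempfile


def load_parameters(value, preset_dirs=()):
    """Hypothetical resolver for a -p style argument.

    Tries, in order: raw JSON text, a path to a JSON file, then a preset
    file name looked up in preset_dirs. The order is an assumption.
    """
    stripped = value.strip()
    if stripped.startswith("{"):
        return json.loads(stripped)
    candidates = [value] + [os.path.join(d, value) for d in preset_dirs]
    for path in candidates:
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                return json.load(handle)
    raise FileNotFoundError(f"cannot resolve parameter argument {value!r}")


with tempfile.TemporaryDirectory() as presets:
    # Create a made-up preset file to look up by name.
    with open(os.path.join(presets, "fast.json"), "w", encoding="utf-8") as handle:
        json.dump({"dpi": 300}, handle)
    print(load_parameters('{"dpi": 150}'))                      # raw JSON
    print(load_parameters("fast.json", preset_dirs=[presets]))  # preset lookup
```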