From 4ac70906d82a07144672697bd87c19064b3626a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jose=20Garc=C3=ADa?=
Date: Tue, 2 Jan 2024 11:25:41 +0100
Subject: [PATCH] Improve GeoNetwork harvesters documentation pages

---
 .../harvesting/harvesting-geonetwork-20.md    | 27 +++++++++++++
 .../harvesting/harvesting-geonetwork-3x.md    | 39 +++++++++++++++++++
 .../harvesting/harvesting-geonetwork-4x.md    | 39 +++++++++++++++++++
 .../harvesting/harvesting-geonetwork.md       |  3 --
 .../docs/user-guide/harvesting/index.md       |  4 +-
 docs/manual/mkdocs.yml                        |  4 +-
 6 files changed, 111 insertions(+), 5 deletions(-)
 create mode 100644 docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-20.md
 create mode 100644 docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-3x.md
 create mode 100644 docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-4x.md
 delete mode 100644 docs/manual/docs/user-guide/harvesting/harvesting-geonetwork.md

diff --git a/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-20.md b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-20.md
new file mode 100644
index 00000000000..280d91e77b6
--- /dev/null
+++ b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-20.md
@@ -0,0 +1,27 @@
+# GeoNetwork 2.0 Harvester {#gn2_harvester}
+
+GeoNetwork 2.1 introduced a new powerful harvesting engine which is not compatible with GeoNetwork version 2.0 based catalogues. Old 2.0 servers can still harvest from 2.1 servers but harvesting metadata from a v2.0 server requires this harvesting type. Due to the fact that GeoNetwork 2.0 was released more than 5 years ago, this harvesting type is deprecated.
+
+## Adding a GeoNetwork 2.0 Harvester
+
+Configuration options:
+
+- **Identification** - Options describing the remote site.
+    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
+    - *Group* - Group that owns the harvested metadata.
+    - *User* - User that owns the harvested metadata.
+- **Schedule** - Schedule configuration to execute the harvester.
+- **Configure connection to GeoNetwork**:
+    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
+    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
+        - *Full text*
+        - *Title*
+        - *Abstract*
+        - *Keyword*
+        - *Site id* - Identifier of the source to filter the metadata to harvest.
+
+- **Configure response processing**
+    - *Remote authentication*
+    - *Validate records before import*
+
+- **Privileges** - Assign privileges to harvested metadata.
diff --git a/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-3x.md b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-3x.md
new file mode 100644
index 00000000000..bafc1375217
--- /dev/null
+++ b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-3x.md
@@ -0,0 +1,39 @@
+# GeoNetwork 2.1-3.x Harvester {#gn3_harvester}
+
+GeoNetwork 2.1 introduced a new powerful harvesting engine which is not compatible with GeoNetwork version 2.0
+based catalogues. Harvesting GeoNetwork servers based on versions 2.1 or 3.x requires this harvesting type.
+
+## Adding a GeoNetwork 2.1-3.x Harvester
+
+Configuration options:
+
+- **Identification** - Options describing the remote site.
+    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
+    - *Group* - Group that owns the harvested metadata.
+    - *User* - User that owns the harvested metadata.
+- **Schedule** - Schedule configuration to execute the harvester.
+- **Configure connection to GeoNetwork**:
+    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
+    - *Node name* - GeoNetwork node name to harvest, by default `srv`.
+    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
+        - *Full text*
+        - *Title*
+        - *Abstract*
+        - *Keyword*
+        - *Custom criteria* - Lets you define any criteria supported by the remote node that are not available in the predefined filters (e.g. `similarity` set to `1` for non-fuzzy search). You may specify multiple criteria separated by `;` (e.g. `_schema;siteId` with values `iso19139;7fc45be3-9aba-4198-920c-b8737112d522`).
+        - *Catalog* - Lets you select a source to filter the metadata to harvest.
+
+- **Configure response processing**
+    - *Action on UUID collision* - Lets you configure the action taken when a harvester finds the same UUID on a record collected by another method (another harvester, importer, dashboard editor, ...).
+        - skipped (default)
+        - overridden
+        - generate a new UUID
+    - *Remote authentication* - User credentials to retrieve non-public metadata.
+    - *Use full MEF format*
+    - *Use change date for comparison*
+    - *Set category if it exists locally*
+    - *Category for harvested records*
+    - *XSL filter name to apply*
+    - *Validate records before import*
+
+- **Privileges** - Assign privileges to harvested metadata.
diff --git a/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-4x.md b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-4x.md
new file mode 100644
index 00000000000..0e2b737d9b7
--- /dev/null
+++ b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-4x.md
@@ -0,0 +1,39 @@
+# GeoNetwork 4.x Harvester {#gn4_harvester}
+
+GeoNetwork 4.x changed the search engine to Elasticsearch, which is not compatible with previous versions. Harvesting
+a catalogue based on GeoNetwork 4.x requires this harvesting type.
+
+## Adding a GeoNetwork 4.x Harvester
+
+Configuration options:
+
+- **Identification** - Options describing the remote site.
+    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
+    - *Group* - Group that owns the harvested metadata.
+    - *User* - User that owns the harvested metadata.
+- **Schedule** - Schedule configuration to execute the harvester.
+- **Configure connection to GeoNetwork**:
+    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
+    - *Node name* - GeoNetwork node name to harvest, by default `srv`.
+    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
+        - *Full text*
+        - *Title*
+        - *Abstract*
+        - *Keyword*
+        - *Catalog* - Lets you select a source to filter the metadata to harvest.
+
+- **Configure response processing**
+    - *Action on UUID collision* - Lets you configure the action taken when a harvester finds the same UUID on a record collected by another method (another harvester, importer, dashboard editor, ...).
+        - skipped (default)
+        - overridden
+        - generate a new UUID
+    - *Remote authentication*
+    - *Use full MEF format*
+    - *Use change date for comparison*
+    - *Set category if it exists locally*
+    - *Category for harvested records*
+    - *XSL filter name to apply*
+    - *Validate records before import*
+
+- **Privileges** - Assign privileges to harvested metadata.
+
diff --git a/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork.md b/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork.md
deleted file mode 100644
index b3bbff7fb44..00000000000
--- a/docs/manual/docs/user-guide/harvesting/harvesting-geonetwork.md
+++ /dev/null
@@ -1,3 +0,0 @@
-# GeoNetwork 2.0 Harvester {#gn2_harvester}
-
-GeoNetwork 2.1 introduced a new powerful harvesting engine which is not compatible with GeoNetwork version 2.0 based catalogues. Old 2.0 servers can still harvest from 2.1 servers but harvesting metadata from a v2.0 server requires this harvesting type. Due to the fact that GeoNetwork 2.0 was released more than 5 years ago, this harvesting type is deprecated.
diff --git a/docs/manual/docs/user-guide/harvesting/index.md b/docs/manual/docs/user-guide/harvesting/index.md
index 46f52f782c5..01936a2e1b1 100644
--- a/docs/manual/docs/user-guide/harvesting/index.md
+++ b/docs/manual/docs/user-guide/harvesting/index.md
@@ -6,7 +6,9 @@ Harvesting is the process of ingesting metadata from remote sources and storing
 
 The following sources can be harvested:
 
-- [GeoNetwork 2.0 Harvester](harvesting-geonetwork.md)
+- [GeoNetwork 4.x Harvester](harvesting-geonetwork-4x.md)
+- [GeoNetwork 2.1-3.x Harvester](harvesting-geonetwork-3x.md)
+- [GeoNetwork 2.0 Harvester](harvesting-geonetwork-20.md)
 - [Harvesting CSW services](harvesting-csw.md)
 - [Harvesting OGC Services](harvesting-ogcwxs.md)
 - [Simple URL harvesting (opendata)](harvesting-simpleurl.md)
diff --git a/docs/manual/mkdocs.yml b/docs/manual/mkdocs.yml
index f418b887e91..eccd1e160f4 100644
--- a/docs/manual/mkdocs.yml
+++ b/docs/manual/mkdocs.yml
@@ -271,7 +271,9 @@ nav:
  - user-guide/harvesting/index.md
  - user-guide/harvesting/harvesting-csw.md
  - user-guide/harvesting/harvesting-filesystem.md
- - user-guide/harvesting/harvesting-geonetwork.md
+ - user-guide/harvesting/harvesting-geonetwork-4x.md
+ - user-guide/harvesting/harvesting-geonetwork-3x.md
+ - user-guide/harvesting/harvesting-geonetwork-20.md
  - user-guide/harvesting/harvesting-geoportal.md
  - user-guide/harvesting/harvesting-oaipmh.md
  - user-guide/harvesting/harvesting-ogcwxs.md