Improve GeoNetwork harvesters documentation pages
josegar74 committed Jan 2, 2024
1 parent 31dc0c4 commit 4ac7090
Showing 6 changed files with 111 additions and 5 deletions.
27 changes: 27 additions & 0 deletions docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-20.md
@@ -0,0 +1,27 @@
# GeoNetwork 2.0 Harvester {#gn2_harvester}

GeoNetwork 2.1 introduced a powerful new harvesting engine that is not compatible with catalogues based on GeoNetwork 2.0. Old 2.0 servers can still harvest from 2.1 servers, but harvesting metadata from a 2.0 server requires this harvesting type. Because GeoNetwork 2.0 was released more than 5 years ago, this harvesting type is deprecated.

## Adding a GeoNetwork 2.0 Harvester

Configuration options:

- **Identification** - Options describing the remote site.
    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
    - *Group* - Group that owns the harvested metadata.
    - *User* - User that owns the harvested metadata.
- **Schedule** - Schedule configuration to execute the harvester.
- **Configure connection to GeoNetwork**:
    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
        - *Full text*
        - *Title*
        - *Abstract*
        - *Keyword*
        - *Site id* - Identifier of the source to filter the metadata to harvest.

- **Configure response processing**
    - *Remote authentication*
    - *Validate records before import*

- **Privileges** - Assign privileges to harvested metadata.
39 changes: 39 additions & 0 deletions docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-3x.md
@@ -0,0 +1,39 @@
# GeoNetwork 2.1-3.x Harvester {#gn3_harvester}

GeoNetwork 2.1 introduced a powerful new harvesting engine that is not compatible with catalogues based on GeoNetwork
version 2.0. Harvesting GeoNetwork servers based on versions 2.1 or 3.x requires this harvesting type.

## Adding a GeoNetwork 2.1-3.x Harvester

Configuration options:

- **Identification** - Options describing the remote site.
    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
    - *Group* - Group that owns the harvested metadata.
    - *User* - User that owns the harvested metadata.
- **Schedule** - Schedule configuration to execute the harvester.
- **Configure connection to GeoNetwork**:
    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
    - *Node name* - GeoNetwork node name to harvest, by default `srv`.
    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
        - *Full text*
        - *Title*
        - *Abstract*
        - *Keyword*
        - *Custom criteria* - Lets you define any criteria supported by the remote node that are not available in the predefined filters (e.g. `similarity` set to `1` for non-fuzzy search). You may specify multiple criteria separated by `;` (e.g. `_schema;siteId` with values `iso19139;7fc45be3-9aba-4198-920c-b8737112d522`).
        - *Catalog* - Lets you select a source to filter the metadata to harvest.

- **Configure response processing**
    - *Action on UUID collision* - Configures the action taken when the harvester finds the same UUID on a record collected by another method (another harvester, importer, dashboard editor, ...).
        - skipped (default)
        - overridden
        - generate a new UUID
    - *Remote authentication* - User credentials to retrieve non-public metadata.
    - *Use full MEF format*
    - *Use change date for comparison*
    - *Set category if it exists locally*
    - *Category for harvested records*
    - *XSL filter name to apply*
    - *Validate records before import*

- **Privileges** - Assign privileges to harvested metadata.
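
The *Custom criteria* option pairs a `;`-separated list of criteria names with a `;`-separated list of values. The following Python sketch illustrates that pairing only; it assumes the names and values are matched by position, and `pair_custom_criteria` is a hypothetical helper, not part of GeoNetwork.

```python
def pair_custom_criteria(names: str, values: str, sep: str = ";") -> dict:
    """Pair separator-delimited criteria names with their values.

    Assumption: the n-th name corresponds to the n-th value.
    """
    name_list = [n.strip() for n in names.split(sep)]
    value_list = [v.strip() for v in values.split(sep)]
    if len(name_list) != len(value_list):
        raise ValueError(
            f"{len(name_list)} criteria names but {len(value_list)} values"
        )
    return dict(zip(name_list, value_list))


# Example values taken from the option description above.
criteria = pair_custom_criteria(
    "_schema;siteId",
    "iso19139;7fc45be3-9aba-4198-920c-b8737112d522",
)
print(criteria)
```

Under this positional reading, a mismatch between the number of names and values is treated as an error rather than silently dropping a criterion.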
39 changes: 39 additions & 0 deletions docs/manual/docs/user-guide/harvesting/harvesting-geonetwork-4x.md
@@ -0,0 +1,39 @@
# GeoNetwork 4.x Harvester {#gn4_harvester}

GeoNetwork 4.x changed the search engine to Elasticsearch, which is not compatible with previous versions. Harvesting
a catalogue based on GeoNetwork 4.x requires this harvesting type.

## Adding a GeoNetwork 4.x Harvester

Configuration options:

- **Identification** - Options describing the remote site.
    - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
    - *Group* - Group that owns the harvested metadata.
    - *User* - User that owns the harvested metadata.
- **Schedule** - Schedule configuration to execute the harvester.
- **Configure connection to GeoNetwork**:
    - *Catalog URL* - The URL of the GeoNetwork server from which metadata will be harvested.
    - *Node name* - GeoNetwork node name to harvest, by default `srv`.
    - *Search filter* - This allows you to select metadata records for harvest based on certain criteria:
        - *Full text*
        - *Title*
        - *Abstract*
        - *Keyword*
        - *Catalog* - Lets you select a source to filter the metadata to harvest.

- **Configure response processing**
    - *Action on UUID collision* - Configures the action taken when the harvester finds the same UUID on a record collected by another method (another harvester, importer, dashboard editor, ...).
        - skipped (default)
        - overridden
        - generate a new UUID
    - *Remote authentication*
    - *Use full MEF format*
    - *Use change date for comparison*
    - *Set category if it exists locally*
    - *Category for harvested records*
    - *XSL filter name to apply*
    - *Validate records before import*

- **Privileges** - Assign privileges to harvested metadata.
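
The three *Action on UUID collision* choices can be read as a small decision rule. The Python sketch below is an illustration under assumptions, not the harvester's actual implementation: `CollisionAction`, `resolve_uuid`, and the `local_uuids` set (standing in for records collected by another method) are hypothetical names.

```python
import uuid
from enum import Enum
from typing import Optional, Set


class CollisionAction(Enum):
    SKIP = "skipped"              # default: leave the existing record alone
    OVERRIDE = "overridden"       # replace the existing record
    NEW_UUID = "generate a new UUID"


def resolve_uuid(
    record_uuid: str,
    local_uuids: Set[str],
    action: CollisionAction = CollisionAction.SKIP,
) -> Optional[str]:
    """Return the UUID to store the harvested record under, or None to skip it."""
    if record_uuid not in local_uuids:
        return record_uuid        # no collision, import as-is
    if action is CollisionAction.SKIP:
        return None               # record is not imported
    if action is CollisionAction.OVERRIDE:
        return record_uuid        # existing record is replaced
    return str(uuid.uuid4())      # import as a separate record with a fresh UUID


existing = {"7fc45be3-9aba-4198-920c-b8737112d522"}
print(resolve_uuid("7fc45be3-9aba-4198-920c-b8737112d522", existing))
```

With the default action the colliding record is skipped, which is why the example prints `None`.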

This file was deleted.

4 changes: 3 additions & 1 deletion docs/manual/docs/user-guide/harvesting/index.md
@@ -6,7 +6,9 @@ Harvesting is the process of ingesting metadata from remote sources and storing

The following sources can be harvested:

-- [GeoNetwork 2.0 Harvester](harvesting-geonetwork.md)
+- [GeoNetwork 4.x Harvester](harvesting-geonetwork-4x.md)
+- [GeoNetwork 2.1-3.x Harvester](harvesting-geonetwork-3x.md)
+- [GeoNetwork 2.0 Harvester](harvesting-geonetwork-20.md)
- [Harvesting CSW services](harvesting-csw.md)
- [Harvesting OGC Services](harvesting-ogcwxs.md)
- [Simple URL harvesting (opendata)](harvesting-simpleurl.md)
4 changes: 3 additions & 1 deletion docs/manual/mkdocs.yml
@@ -271,7 +271,9 @@ nav:
- user-guide/harvesting/index.md
- user-guide/harvesting/harvesting-csw.md
- user-guide/harvesting/harvesting-filesystem.md
-- user-guide/harvesting/harvesting-geonetwork.md
+- user-guide/harvesting/harvesting-geonetwork-4x.md
+- user-guide/harvesting/harvesting-geonetwork-3x.md
+- user-guide/harvesting/harvesting-geonetwork-20.md
- user-guide/harvesting/harvesting-geoportal.md
- user-guide/harvesting/harvesting-oaipmh.md
- user-guide/harvesting/harvesting-ogcwxs.md