
Install

AtesComp edited this page Nov 27, 2024 · 21 revisions

Prerequisites

Java and OpenRefine must be installed.

  • Java 11 to 21 (see notes below and Java JDKs...)
  • OpenRefine 3.6.x to 3.9-SNAPSHOT (as of 11/26/2024)

NOTE: The author has tested the current RDF Transform using OpenJDK 17.
NOTE: For Java Standard Edition releases after Java 8, you cannot install the JRE separately from the JDK unless you use a site like JustJ and their JRE Downloads. RDF Transform has not been tested with JustJ installs, and they are beyond the scope of this project.
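As a rough way to check an installed Java against the 11 to 21 range listed above, the sketch below parses a version string into its major number. The function names `java_major` and `java_supported` are illustrative only and are not part of RDF Transform or OpenRefine.

```shell
# Extract the major Java version from a version string such as
# "17.0.9" (gives 17) or the legacy form "1.8.0_392" (gives 8).
java_major() (
    version=$1
    case $version in
        1.*) version=${version#1.} ;;   # legacy scheme: 1.8 means Java 8
    esac
    # Drop everything from the first non-digit character onward.
    printf '%s\n' "${version%%[!0-9]*}"
)

# Succeed only when the major version falls in the 11 to 21 range.
java_supported() (
    major=$(java_major "$1")
    [ -n "$major" ] && [ "$major" -ge 11 ] && [ "$major" -le 21 ]
)

# Example (assumes java is on the PATH):
# installed=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
# java_supported "$installed" && echo "Java $installed is in range"
```

The range check mirrors the prerequisites on this page only; it says nothing about which versions a particular OpenRefine release accepts.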

Additionally, if you need to compile, you will need Maven.

  • Java JDK 11 to 21
  • Apache Maven 3.6 or better
  • OpenRefine 3.x-SNAPSHOT Source (optional)
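Before starting a build, it can help to confirm that the command-line tools used by the build commands on this page are present. This is a minimal sketch; `require_tools` is a hypothetical helper name, and the tool list in the example is an assumption drawn from those commands (`git` and `unzip` are used there but not listed as prerequisites).

```shell
# Report every named command that is missing from the PATH.
# Succeeds only when all of them are found.
require_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$tool" >&2
            status=1
        fi
    done
    [ "$status" -eq 0 ]
)

# Example:
# require_tools java javac git mvn unzip || echo "install the missing tools first"
```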

From Compiled Release

The compiled release file is the "Easy Button" to get RDF Transform installed as an extension to OpenRefine. Follow these instructions to get it running.

  1. If it does not exist, create a folder named extensions under your user workspace directory for OpenRefine. The workspace should be located in the following places depending on your operating system (see the OpenRefine FAQ for more details):
    • Linux ~/.local/share/openrefine
    • Windows C:/Documents and Settings/<user>/Application Data/OpenRefine OR C:/Documents and Settings/<user>/Local Settings/Application Data/OpenRefine
    • Mac OSX ~/Library/Application Support/OpenRefine
    As an alternative (but not recommended), use the OpenRefine application's extensions directory instead.
  2. Unzip the downloaded release (ensuring it is an rdf-transform-x.x.x.zip and not a source code .zip or .tar.gz) in the extensions folder (within the directory of step 1). This will create an rdf-transform directory containing the extension.
  3. Start (or restart) OpenRefine (see the OpenRefine User Documentation)
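The steps above could be scripted roughly as follows on Linux or macOS. This is a sketch under assumptions: the function names are hypothetical, Windows is left out, and the Linux path uses the lowercase `openrefine` form that the build commands on this page use. Adjust the path if your workspace lives elsewhere.

```shell
# Print the OpenRefine workspace directory for an OS name as reported by
# `uname -s` (defaults to the current system). Windows is not handled here.
openrefine_workspace() (
    os=${1:-$(uname -s)}
    case $os in
        Linux)  printf '%s\n' "$HOME/.local/share/openrefine" ;;
        Darwin) printf '%s\n' "$HOME/Library/Application Support/OpenRefine" ;;
        *)      printf 'unsupported OS: %s\n' "$os" >&2; exit 1 ;;
    esac
)

# Unzip a release into <workspace>/extensions, replacing any earlier copy.
install_extension() (
    zip_file=$1
    workspace=$(openrefine_workspace) || exit 1
    extensions_dir=$workspace/extensions
    mkdir -p "$extensions_dir" &&
        rm -rf "$extensions_dir/rdf-transform" &&
        unzip -q "$zip_file" -d "$extensions_dir"
)

# Example (the zip name keeps the x.x.x placeholder; substitute the real version):
# install_extension ~/Downloads/rdf-transform-x.x.x.zip
```

OpenRefine still needs to be started or restarted afterwards for the extension to load.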

NOTE: It is recommended that you have an active Internet connection when using the extension as it can download ontologies from specified namespaces (such as rdf, rdfs, owl and foaf) to supplement property selection. You can (re)add namespaces and specify whether to download the ontology (or not) from the namespace declaration URL. If you must run OpenRefine from an offline location, you can copy the ontologies to files in your offline space and use the "from file" feature to load the ontologies.

From Source - Build

Source code...for those of you who want more depth...to ply the inner workings of OpenRefine. You still need to install the built extension to test and debug any modifications, so here are the complete instructions.

NOTE: If you have previously installed the extension, you will need to replace it in the extensions directory with the newly built version, e.g., delete the rdf-transform directory in the extensions directory and unzip the new file there.

TL;DR:

In general, the Long version will be needed for new installs.

In all cases, a JDK version must be installed and used as the default Java development version. Examples:

sudo apt install openjdk-17-jdk openjdk-17-jdk-headless openjdk-17-dbg openjdk-17-jre openjdk-17-jre-headless openjdk-17-doc
sudo apt install openjdk-21-jdk openjdk-21-jdk-headless openjdk-21-dbg openjdk-21-jre openjdk-21-jre-headless openjdk-21-doc

update-java-alternatives --list
sudo update-java-alternatives --set java-1.17.0-openjdk-amd64

Short:

git clone https://github.com/AtesComp/rdf-transform
cd rdf-transform
mvn clean
mvn dependency:resolve -U
mvn compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-x.x.x.zip -d ~/.local/share/openrefine/extensions
~/path/to/openrefine/refine
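The `x.x.x` in the zip name above is a placeholder for the built version. One way to avoid typing it by hand is to pick the newest matching zip under `target`, as in this sketch. `find_release_zip` is a hypothetical helper, and it assumes the assembly step writes `rdf-transform-*.zip` into `target` as the commands above imply.

```shell
# Print the newest target/rdf-transform-*.zip under a build directory.
# Fails when no such zip exists (for example, before mvn assembly:single).
find_release_zip() (
    build_dir=${1:-.}
    newest=
    for candidate in "$build_dir"/target/rdf-transform-*.zip; do
        [ -e "$candidate" ] || continue   # the glob matched nothing
        if [ -z "$newest" ] || [ "$candidate" -nt "$newest" ]; then
            newest=$candidate
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Example, run from the rdf-transform directory:
# zip_file=$(find_release_zip .) && unzip "$zip_file" -d ~/.local/share/openrefine/extensions
```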

Long:

mkdir OpenRefineExtensions
cd OpenRefineExtensions
git clone https://github.com/AtesComp/rdf-transform
cd ..
git clone https://github.com/OpenRefine/OpenRefine
cd OpenRefine
./refine clean
./refine build
mvn package -DskipTests=true

cd ../OpenRefineExtensions/rdf-transform
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/main/webapp/WEB-INF/lib/openrefine-main.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/core/target/openrefine-core.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/grel/target/openrefine-grel.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn clean
mvn dependency:resolve -U
mvn compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-x.x.x.zip -d ~/.local/share/openrefine/extensions
cd ../../OpenRefine
./refine
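The three `install-file` commands in the Long version differ only in the jar path, so they can be generated in a loop. The sketch below prints the commands rather than running them, so they can be reviewed first. `print_install_commands` is a hypothetical name, and the jar locations are copied from the commands above, not verified against any particular OpenRefine version.

```shell
# Print one install-file command per OpenRefine jar.
# Intended to be run from the rdf-transform directory.
# Assumes the paths contain no spaces.
print_install_commands() (
    openrefine_dir=${1:-../../OpenRefine}
    for jar in \
        main/webapp/WEB-INF/lib/openrefine-main.jar \
        modules/core/target/openrefine-core.jar \
        modules/grel/target/openrefine-grel.jar
    do
        printf '%s\n' "mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=$openrefine_dir/$jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/"
    done
)

# Example: review the output, then execute it.
# print_install_commands
# print_install_commands | sh
```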

Short Steps

A local project repository (see the "project-repository" directory) contains an OpenRefine jar file ready for use by the Maven compile process. If you want or need to compile OpenRefine, see the Long Steps below to create the OpenRefine jar file.

  1. From some top level development directory, create a local repository for this RDF Transform extension:
    • Clone the extension at the top level development directory where you want the /rdf-transform sub-directory:
      • git clone https://github.com/AtesComp/rdf-transform
  2. Compile the RDF Transform extension:
    • Change directories to the RDF Transform extension:
      • cd rdf-transform
    • Clean the extension:
      • mvn clean
    • Update extension's dependencies:
      • mvn dependency:resolve -U
    • Compile the extension's dev environment:
      • mvn compile
    • Assemble the extension:
      • mvn assembly:single
    • Copy and unzip the target/rdf-transform-x.x.x.zip file in the extensions directory as documented in From Compiled Release above

Long Steps

Sometimes you just have to do everything yourself. If you want or need to compile OpenRefine, then you'll probably want to create the jar file for RDF Transform to match. Compared with the Short Steps, these instructions insert several additional steps.

  1. From some top level development directory, create a local repository for the RDF Transform extension:
    • At the top level development directory, make the OpenRefine Extensions directory--where you want the /rdf-transform sub-directory:
      • mkdir OpenRefineExtensions
    • Change directories to the OpenRefine Extensions directory:
      • cd OpenRefineExtensions
    • To create a new clone of the RDF Transform extension:
      • Clone the extension in the OpenRefine Extensions directory--where you want the /rdf-transform sub-directory:
        • git clone https://github.com/AtesComp/rdf-transform
      • Change directories to the top level development directory:
        • cd ..
    • Alternatively, to update an existing clone, in the /rdf-transform directory:
      • Change directories to the RDF Transform development directory:
        • cd rdf-transform
      • Update the code:
        • git pull (or git fetch --all; git reset --hard; git pull for a forced refresh)
      • Change directories to the top level development directory:
        • cd ../..
  2. Prepare the OpenRefine jar file:
    • Clone OpenRefine from the same top level development directory to create a local repository:
      • git clone https://github.com/OpenRefine/OpenRefine
    • Create the OpenRefine jars:
      • Change directories to OpenRefine:
        • cd OpenRefine
      • Clean OpenRefine's dev environment:
        • ./refine clean
      • Build OpenRefine:
        • ./refine build
      • Build the OpenRefine jar:
        • mvn package -DskipTests=true (builds the current version)
        • Among many other things, this builds the needed jar files:
          • openrefine-main.jar
          • openrefine-core.jar
          • openrefine-grel.jar
      • Change directories up one level:
        • cd ..
  3. Process the OpenRefine jar files for the RDF Transform extension:
    • Change directories to the RDF Transform extension:
      • cd OpenRefineExtensions/rdf-transform
    • Adjust the pom.xml file to use the proper OpenRefine version ID
    • Install the OpenRefine jars in the Maven library for RDF Transform:
      • mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/main/webapp/WEB-INF/lib/openrefine-main.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
      • mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/core/target/openrefine-core.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
      • mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/grel/target/openrefine-grel.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
  4. Compile the RDF Transform extension:
    • Clean the extension:
      • mvn clean
    • Update the extension's dependencies:
      • mvn dependency:resolve -U
    • Compile the extension:
      • mvn compile
    • Assemble the extension:
      • mvn assembly:single
    • Copy and unzip the target/rdf-transform-x.x.x.zip file in the extensions directory as documented in From Compiled Release above
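Step 1 above branches between cloning a fresh copy and updating an existing one. That choice can be sketched as a small helper that prints the appropriate git command. `sync_command` is a hypothetical name, and the helper only prints the command so you can inspect it before running it.

```shell
# Print the git command that brings a working copy up to date:
# `git pull` for an existing clone, `git clone` otherwise.
# Assumes the URL and directory contain no spaces.
sync_command() (
    url=$1
    dir=$2
    if [ -d "$dir/.git" ]; then
        printf 'git -C %s pull\n' "$dir"
    else
        printf 'git clone %s %s\n' "$url" "$dir"
    fi
)

# Example, run from the top level development directory:
# sync_command https://github.com/AtesComp/rdf-transform OpenRefineExtensions/rdf-transform
# sync_command https://github.com/OpenRefine/OpenRefine OpenRefine
```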

Java JDKs, JREs, and JVMs! Oh, My!

Java does not supply simple JRE installs for versions after 8 (1.8), so you might want to create your own. However, many Linux distributions provide a JRE package install that is fairly inclusive and may work with OpenRefine...YMMV.

You can create your own JRE from a recent JDK install (9+) that includes all the required modules by running the following command:

jlink --compress=2 --strip-debug --add-modules=java.base,java.compiler,java.datatransfer,java.logging,java.desktop,java.instrument,java.management,java.management.rmi,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.se,java.security.jgss,java.security.sasl,java.smartcardio,java.sql,java.sql.rowset,java.transaction.xa,java.xml,java.xml.crypto --output ~/JRE

Change the output ~/JRE to whatever directory you like (it will create it if it doesn't exist).

The --add-modules parameter gets its modules from:

java --list-modules

using whatever Java version you have currently selected. The jlink command is just using the listed "java" modules and ignoring the "jdk" modules.
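The selection described above (keep the "java" modules, ignore the "jdk" modules) can be sketched as a filter over the `java --list-modules` output. `java_modules_csv` is a hypothetical helper name. It assumes each output line has the form `name@version`, which is what current JDKs print.

```shell
# Read `java --list-modules` output on stdin and print the java.* module
# names as one comma-separated line, suitable for --add-modules.
java_modules_csv() {
    sed -n 's/^\(java\.[A-Za-z0-9.]*\).*/\1/p' | paste -s -d, -
}

# Example (assumes java and jlink from the same JDK are on the PATH):
# modules=$(java --list-modules | java_modules_csv)
# jlink --compress=2 --strip-debug --add-modules="$modules" --output ~/JRE
```

Because the list is regenerated from the selected JDK each time, it follows whatever modules that JDK version actually ships.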

You can run OpenRefine using this newly created JRE directory by setting the JAVA_HOME environment variable to it and running the OpenRefine script file. A one-liner for Linux, while in the OpenRefine directory, is:

JAVA_HOME=~/JRE ./refine
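A quick sanity check before pointing JAVA_HOME at the new directory is to confirm that it contains an executable `bin/java`. This is a minimal sketch; `jre_usable` is a hypothetical helper name.

```shell
# Succeed when <dir>/bin/java exists and is executable.
jre_usable() (
    jre_dir=$1
    [ -x "$jre_dir/bin/java" ]
)

# Example, from the OpenRefine directory:
# jre_usable "$HOME/JRE" && JAVA_HOME="$HOME/JRE" ./refine
```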

To recreate the JRE, remove the JRE directory, adjust the jlink command, and re-execute it. For Linux, to remove the JRE directory, do:

rm -rf ~/JRE