Install
Java and OpenRefine must be installed.
- Java 11 to 21 (see notes below and Java JDKs...)
- OpenRefine 3.6.x to 3.9-SNAPSHOT (as of 11/26/2024)
NOTE: The author has tested the current RDF Transform using OpenJDK 17.
NOTE: For Java Standard Editions after Java 8, you cannot install the JRE separately from the JDK unless you use a site like JustJ and their JRE Downloads. RDF Transform has not been tested with JustJ installs, and supporting them is beyond the scope of this project.
Additionally, if you need to compile, you will need Maven.
- Java JDK 11 to 21
- Apache Maven 3.6 or better
- OpenRefine 3.x-SNAPSHOT Source (optional)
The compiled release file is the "Easy Button" to get RDF Transform installed as an extension to OpenRefine. Follow these instructions to get it running.
- If it does not exist, create a folder named extensions under your user workspace directory for OpenRefine. The workspace should be located in the following places depending on your operating system (see the OpenRefine FAQ for more details):
- Linux
~/.local/share/openrefine
- Windows
C:/Documents and Settings/<user>/Application Data/OpenRefine
OR C:/Documents and Settings/<user>/Local Settings/Application Data/OpenRefine
- Mac OSX
~/Library/Application Support/OpenRefine
- Unzip the downloaded release (ensuring it is an rdf-transform-x.x.x.zip and not a source code .zip or .tar.gz) in the extensions folder (within the directory of step 1). This will create an rdf-transform directory containing the extension.
- Start (or restart) OpenRefine (see the OpenRefine User Documentation)
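The workspace lookup in step 1 can be sketched as a small shell function. This is only an illustration: the function name `workspace_dir` is made up here, the Linux path uses the lowercase spelling that the build commands in this document use, and Windows is left out because the paths above are not POSIX paths.

```shell
#!/bin/sh
# Hypothetical helper: print the OpenRefine workspace directory for an
# OS name as reported by `uname -s`. Returns non-zero for anything else.
workspace_dir() {
  case "$1" in
    Linux)  printf '%s\n' "$HOME/.local/share/openrefine" ;;
    Darwin) printf '%s\n' "$HOME/Library/Application Support/OpenRefine" ;;
    *)      return 1 ;;
  esac
}

# Show where the extensions folder would live on this machine.
if ws=$(workspace_dir "$(uname -s)"); then
  echo "extensions folder: $ws/extensions"
  # To actually install a downloaded release, one could then run:
  #   mkdir -p "$ws/extensions"
  #   unzip rdf-transform-x.x.x.zip -d "$ws/extensions"
else
  echo "unrecognized OS; locate the workspace by hand" >&2
fi
```

The `mkdir` and `unzip` lines are left as comments so that running the sketch does not change anything on disk.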
NOTE: It is recommended that you have an active Internet connection when using the extension as it can download ontologies from specified namespaces (such as rdf, rdfs, owl and foaf) to supplement property selection. You can (re)add namespaces and specify whether to download the ontology (or not) from the namespace declaration URL. If you must run OpenRefine from an offline location, you can copy the ontologies to files in your offline space and use the "from file" feature to load the ontologies.
Source code...for those of you who want more depth...to ply the inner workings of OpenRefine. You still need to install it to test and debug any modifications, so here are those complete instructions.
NOTE: If you have previously installed the extension, you will need to replace it in the extensions directory with the newly built version, e.g., delete rdf-transform directory in the extensions directory and unzip the new file there.
In general, the Long version will be needed for new installs.
In all cases, a JDK version must be installed and used as the default Java development version. Examples:
sudo apt install openjdk-17-jdk openjdk-17-jdk-headless openjdk-17-dbg openjdk-17-jre openjdk-17-jre-headless openjdk-17-doc
sudo apt install openjdk-21-jdk openjdk-21-jdk-headless openjdk-21-dbg openjdk-21-jre openjdk-21-jre-headless openjdk-21-doc
update-java-alternatives --list
sudo update-java-alternatives --set java-1.17.0-openjdk-amd64
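Before building, it can help to confirm that the default Java falls in the 11 to 21 range stated above. The following is a minimal sketch, not part of the project: the function names `java_major` and `java_supported` are invented for illustration, and it assumes the usual `java -version` banner that contains `version "..."`.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from `java -version` output.
# Handles the legacy scheme (1.8.0_392 -> 8) and the modern one (17.0.9 -> 17).
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;
    *)   printf '%s\n' "${ver%%[.+-]*}" ;;
  esac
}

# Hypothetical helper: succeed when the major version is within 11..21.
java_supported() {
  [ "$1" -ge 11 ] 2>/dev/null && [ "$1" -le 21 ]
}

# Report on the Java that is currently first on the PATH, if any.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java -version 2>&1)")
  if java_supported "$major"; then
    echo "Java $major is within the supported range"
  else
    echo "Java '$major' is outside the 11 to 21 range" >&2
  fi
fi
```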
Short:
git clone https://github.com/AtesComp/rdf-transform
cd rdf-transform
mvn clean
mvn dependency:resolve -U
mvn compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-x.x.x.zip -d ~/.local/share/openrefine/extensions
~/path/to/openrefine/refine
Long:
mkdir OpenRefineExtensions
cd OpenRefineExtensions
git clone https://github.com/AtesComp/rdf-transform
cd ..
git clone https://github.com/OpenRefine/OpenRefine
cd OpenRefine
./refine clean
./refine build
mvn package -DskipTests=true
cd ../OpenRefineExtensions/rdf-transform
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/main/webapp/WEB-INF/lib/openrefine-main.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/core/target/openrefine-core.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/grel/target/openrefine-grel.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn clean
mvn dependency:resolve -U
mvn compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-x.x.x.zip -d ~/.local/share/openrefine/extensions
cd ../../OpenRefine
./refine
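The three install-file commands in the Long listing differ only in the jar path, so they can be generated by a loop. The sketch below only prints the commands; the function name `install_jar_cmds` is hypothetical, and the jar paths are copied from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one install-file command per OpenRefine jar.
# $1 is the path to the OpenRefine checkout, relative to the rdf-transform directory.
install_jar_cmds() {
  for jar in main/webapp/WEB-INF/lib/openrefine-main.jar \
             modules/core/target/openrefine-core.jar \
             modules/grel/target/openrefine-grel.jar; do
    printf 'mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=%s/%s -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/\n' "$1" "$jar"
  done
}

# Dry run: show the commands for the directory layout used in the Long listing.
install_jar_cmds ../../OpenRefine
```

Piping the output to `sh` from inside the rdf-transform directory would execute the commands; printing first makes it easy to review the paths.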
A local project repository (see the "project-repository" directory) contains an OpenRefine jar file ready for use by the maven compile process. If you want or need to compile OpenRefine, see the Long Steps below to create the OpenRefine jar file.
- From some top level development directory, create a local repository for this RDF Transform extension:
- Clone the extension at the top level development directory where you want the /rdf-transform sub-directory:
git clone https://github.com/AtesComp/rdf-transform
- Compile the RDF Transform extension:
- Change directories to the RDF Transform extension:
cd rdf-transform
- Clean the extension:
mvn clean
- Update the extension's dependencies:
mvn dependency:resolve -U
- Compile the extension's dev environment:
mvn compile
- Assemble the extension:
mvn assembly:single
- Copy and unzip the target/rdf-transform-x.x.x.zip file in the extensions directory as documented in From Compiled Release above
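The x.x.x in the zip name depends on the version that was built. One possible way to resolve it without typing the version is sketched here; the function name `built_zip` is hypothetical, and the sketch assumes the assembly step leaves exactly one rdf-transform zip in the target directory.

```shell
#!/bin/sh
# Hypothetical helper: print the single rdf-transform-*.zip in the given
# directory. Fails if there is none or if the match is ambiguous.
built_zip() {
  set -- "$1"/rdf-transform-*.zip   # the glob expands into this function's arguments
  [ "$#" -eq 1 ] && [ -f "$1" ] && printf '%s\n' "$1"
}

# Example use after `mvn assembly:single`, from the rdf-transform directory:
#   zip=$(built_zip target) && unzip "$zip" -d ~/.local/share/openrefine/extensions
```

Failing on more than one match is a deliberate choice: a stale zip from an earlier build would otherwise be picked up silently.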
Sometimes you just have to do everything yourself. If you want or need to compile OpenRefine, then you'll probably want to create the jar file for RDF Transform to match. From the Short Steps, you'll notice these instructions have several inserted steps.
- From some top level development directory, create a local repository for the RDF Transform extension:
- At the top level development directory, make the OpenRefine Extensions directory--where you want the /rdf-transform sub-directory:
mkdir OpenRefineExtensions
- Change directories to the OpenRefine Extensions directory:
cd OpenRefineExtensions
- To create a new clone of the RDF Transform extension:
- Clone the extension in the OpenRefine Extensions directory--where you want the /rdf-transform sub-directory:
git clone https://github.com/AtesComp/rdf-transform
- Change directories to the top level development directory:
cd ..
- Alternatively, to update an existing clone, in the /rdf-transform directory:
- Change directories to the RDF Transform development directory:
cd rdf-transform
- Update the code:
git pull
(or git fetch --all; git reset --hard; git pull for a forced refresh)
- Change directories to the top level development directory:
cd ../..
- Prepare the OpenRefine jar file:
- Clone OpenRefine from the same top level development directory to create a local repository:
git clone https://github.com/OpenRefine/OpenRefine
- Create the OpenRefine jars:
- Change directories to OpenRefine:
cd OpenRefine
- Clean OpenRefine's dev environment:
./refine clean
- Build OpenRefine:
./refine build
- Build the OpenRefine jar:
mvn package -DskipTests=true
(builds the current version)
- Among many other things, this builds the needed jar files:
- openrefine-main.jar
- openrefine-core.jar
- openrefine-grel.jar
- Change directories up one level:
cd ..
- Process the OpenRefine jar files for the RDF Transform extension:
- Change directories to the RDF Transform extension:
cd OpenRefineExtensions/rdf-transform
- Adjust the pom.xml file to use the proper OpenRefine version ID
- Install the OpenRefine jars in the Maven library for RDF Transform:
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/main/webapp/WEB-INF/lib/openrefine-main.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/core/target/openrefine-core.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=../../OpenRefine/modules/grel/target/openrefine-grel.jar -DcreateChecksum=true -DlocalRepositoryPath=./project-repository/
- Compile the RDF Transform extension:
- Clean the extension:
mvn clean
- Update the extension's dependencies:
mvn dependency:resolve -U
- Compile the extension:
mvn compile
- Assemble the extension:
mvn assembly:single
- Copy and unzip the target/rdf-transform-x.x.x.zip file in the extensions directory as documented in From Compiled Release above
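If the install-file step in the Long Steps fails, a likely cause is that one of the OpenRefine jars was not produced by the build. A minimal check is sketched below; the function name `missing_jars` is hypothetical, and the jar locations are the ones used by the install commands above.

```shell
#!/bin/sh
# Hypothetical helper: list each expected OpenRefine jar that is absent
# under the given OpenRefine checkout. Prints nothing when all are present.
missing_jars() {
  for jar in main/webapp/WEB-INF/lib/openrefine-main.jar \
             modules/core/target/openrefine-core.jar \
             modules/grel/target/openrefine-grel.jar; do
    [ -f "$1/$jar" ] || printf '%s\n' "$1/$jar"
  done
}

# Example use from the rdf-transform directory, before the install-file commands:
#   missing_jars ../../OpenRefine
```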
Java does not supply simple JRE installs for versions after 8 (1.8), so you might want to create your own. However, many Linux distributions provide a JRE package that is fairly inclusive and may work with OpenRefine...YMMV.
You can create your own JRE from a recent JDK install (9+) that includes all the required modules by running the following command:
jlink --compress=2 --strip-debug --add-modules=java.base,java.compiler,java.datatransfer,java.logging,java.desktop,java.instrument,java.management,java.management.rmi,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.se,java.security.jgss,java.security.sasl,java.smartcardio,java.sql,java.sql.rowset,java.transaction.xa,java.xml,java.xml.crypto --output ~/JRE
Change the output ~/JRE to whatever directory you like (it will create it if it doesn't exist).
The --add-modules parameter gets its modules from:
java --list-modules
using whatever Java version you have currently selected. The jlink command is just using the listed "java" modules and ignoring the "jdk" modules.
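The module list for --add-modules can be derived from the `java --list-modules` output rather than typed by hand. The following is a sketch under the assumption that each output line looks like `java.base@17.0.9`; the function name `java_modules_csv` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `java --list-modules` output on stdin, keep the
# "java." modules, strip the @version suffix, and join them with commas.
java_modules_csv() {
  grep '^java\.' | sed 's/@.*//' | paste -sd, -
}

# Example use with the currently selected JDK:
#   jlink --compress=2 --strip-debug \
#         --add-modules="$(java --list-modules | java_modules_csv)" --output ~/JRE
```

Because the list comes from the selected JDK, the resulting JRE follows whatever Java version is currently the default.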
You can run OpenRefine using this newly created JRE directory by setting the JAVA_HOME environment variable to it and running the OpenRefine script file. A one-liner for Linux, while in the OpenRefine directory, is:
JAVA_HOME=~/JRE ./refine
To recreate the JRE, remove the JRE directory, adjust the jlink command, and re-execute it. For Linux, to remove the JRE directory, do:
rm -rf ~/JRE
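Since rm -rf is unforgiving, a cautious variant can first check that the directory really looks like a jlink-built runtime. This is only a sketch: the function names are hypothetical, and it assumes a jlink image contains a `release` file and an executable `bin/java`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory looks like a jlink image.
# Assumption: such an image has a `release` file and an executable bin/java.
is_jre_image() {
  [ -f "$1/release" ] && [ -x "$1/bin/java" ]
}

# Hypothetical helper: remove the directory only when it passes the check.
remove_jre() {
  if is_jre_image "$1"; then
    rm -rf -- "$1"
  else
    echo "refusing to remove $1: not a jlink runtime image" >&2
    return 1
  fi
}

# Example use:
#   remove_jre ~/JRE
```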