Generic connection to the SRU interfaces of ALMA libraries/consortia #169

Open
wants to merge 16 commits into
base: master
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@
/bnb/BNBDaten/*
/dist/phpcs.phar
*~
*.bak
8 changes: 6 additions & 2 deletions index.html
@@ -19,8 +19,12 @@ <h1>malibu - Mannheim library utilities</h1>
<ul>
<li><b>Suche nach einer ISBN</b> in verschiedenen Verbunddatenbanken und weitere Informationen etwa von Buchhändlern: <a href="isbn/suche.html">isbn/suche.html</a><br/></li>
<li>Recherchewerkzeug für die <b>BNB weekly</b>: <a href="bnb/recherche.php">bnb/recherche.php</a><br/></li>
-<li><b>Bestandsabgleich</b> anhand einer Liste von ISBNs: <a href="tools/bestandsabgleich.html">tools/bestandsabgleich.html</a> speziell für den Bestand der UB Mannheim oder <a href="tools/bestandsabgleichA.html">tools/bestandsabgleichA.html</a> allgemein.</li>
-<li><b>PPN-Liste anreichern</b> mit Werten eines Feldes (z.B. ISBN, Bestand, RVK, Schlagwörter) aus dem gleichen Verbund: <a href="tools/ppnListe.html">tools/ppnListe.html</a></li>
+<li><b>Bestandsabgleich</b> anhand einer Liste von ISBNs:<ul>
+<li>für den Bestand der UB Mannheim: <a href="tools/bestandsabgleich.html">tools/bestandsabgleich.html</a></li>
+<li>für die Bestandsdaten in einem Verbund: <a href="tools/bestandsabgleichA.html">tools/bestandsabgleichA.html</a></li>
+<li>für Bibliotheken mit einer ALMA-SRU-Schnittstelle: <a href="tools/bestandsabgleichB.html">tools/bestandsabgleichB.html</a></li></ul>
+</li>
+<li><b>PPN/ISBN-Liste anreichern</b> mit Werten eines Feldes (z.B. ISBN, Bestand, RVK, Schlagwörter) aus dem gleichen Verbund: <a href="tools/ppnListe.html">tools/ppnListe.html</a></li>
</ul>
<p>malibu ist Open Source Software. Den Quellcode findet man auf GitHub zusammen mit der Möglichkeit eigene Issues oder Pull Requests zu stellen: <a href="https://github.com/UB-Mannheim/malibu/">https://github.com/UB-Mannheim/malibu</a></p>
<p>Ansprechperson: <a href="https://www.bib.uni-mannheim.de/ihre-ub/ansprechpersonen/dr-philipp-zumstein/">Philipp Zumstein</a></p>
285 changes: 285 additions & 0 deletions isbn/alma-sru.php
@@ -0,0 +1,285 @@
<?php
/*
* Source: https://github.com/UB-Mannheim/malibu/
*
* Copyright (C) 2024 Universitätsbibliothek Mannheim
*
* Author:
* Philipp Zumstein <[email protected]>
*
* This is free software licensed under the terms of the GNU GPL,
* version 3, or (at your option) any later version.
* See <http://www.gnu.org/licenses/> for more details.
*
* Call from a web browser:
* alma-sru?bibliothek=BIB&isbn=ISBN
* BIB is a key from srulibraries.json (required)
* ISBN is a 10- or 13-digit ISBN with or without hyphens/spaces
* ISBN may also be a comma-separated list of ISBNs
* alma-sru?bibliothek=BIB&isbn=ISBN&format=json
* alma-sru?bibliothek=BIB&isbn=ISBN&format=holdings
* alma-sru?bibliothek=BIB&isbn=ISBN&format=holdings&with=collections
*
* Searches for the given ISBN or PPN via the SRU interface of an Alma library
* and returns at most 10 results as MARCXML or JSON, or as a formatted
* holdings statement (one short line plus the details in a table).
*/

include 'conf.php';
include 'lib.php';

$file = file_get_contents('./srulibraries.json');
$json = json_decode($file, true);
if (isset($_GET['bibliothek']) and isset($json[$_GET['bibliothek']])) {
$urlBase = $json[$_GET['bibliothek']]['sru'];
} else {
echo "Bibliothek nicht gefunden in der Liste der bekannten Alma-SRU-Schnittstellen.\n";
exit;
}

$urlBase = $urlBase . '?version=1.2&operation=searchRetrieve&recordSchema=marcxml&query=';

if (!isset($_GET['ppn']) and !isset($_GET['isbn'])) {
echo "Weder isbn noch ppn Parameter für eine Suche angegeben.\n";
exit;
}

$suchString = '';

if (isset($_GET['ppn'])) {
$n = trim($_GET['ppn']);
$searchObject = "ppn";
}
if (isset($_GET['isbn'])) {
$n = trim($_GET['isbn']);
$searchObject = "isbn";
}
$nArray = preg_split("/\s*(or|,|;)\s*/i", $n, -1, PREG_SPLIT_NO_EMPTY);
$suchString = 'alma.all=' . implode('+OR+alma.all=', $nArray);
$filteredSuchString = 'alma.mms_tagSuppressed=false' . '+AND+(' . $suchString . ')';

# work around ExLibris server configuration issue
# and increase timeout (i.e. waiting time)
$contextOptions = [
'ssl' => [
'verify_peer' => true,
'ciphers' => 'DEFAULT@SECLEVEL=1',
],
'http' => [
'timeout' => 10,
],
];
$context = stream_context_create($contextOptions);
$result = file_get_contents($urlBase . $filteredSuchString, false, $context);

if ($result === false) {
header('HTTP/1.1 400 Bad Request');
echo "Verbindung zu SRU-Schnittstelle fehlgeschlagen\n";
var_dump($urlBase . $filteredSuchString);
exit;
}

// Delete namespaces such that we don't need to specify them
// in every xpath query.
$result = str_replace(' xmlns:xs="http://www.w3.org/2001/XMLSchema"', '', $result);
$result = str_replace(' xmlns="http://www.loc.gov/MARC21/slim"', '', $result);

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
@$doc->loadHTML($result);
$xpath = new DOMXPath($doc);

$records = $xpath->query("//records/record/recorddata/record"); // note: all lowercase, not camelCase, because loadHTML lowercases element names

$outputString = "<?xml version=\"1.0\"?>\n";
$outputString .= "<collection>\n";
$outputArray = [];

foreach ($records as $record) {
// Filter out any other results which contain the ISBN but not in the 020 or 776 field
// or the PPN in the 001 or 035 field(s).
$pattern = [
"isbn" => './/datafield[@tag="020" or @tag="776"]/subfield',
"ppn" => './/controlfield[@tag="001"]|.//datafield[@tag="035"]/subfield'
];
$foundMatch = false;
$nodes = $xpath->query($pattern[$searchObject], $record);
foreach ($nodes as $foundNode) {
$foundValue = $foundNode->nodeValue;
foreach ($nArray as $queryValue) {
$testString = $queryValue;
if ($searchObject == "isbn") {
$testString = preg_replace('/[^0-9xX]/', '', $testString);
$foundValue = preg_replace('/[^0-9xX]/', '', $foundValue);
if (strlen($testString) == 13) {
// Delete the 978-prefix and the check value at the end for ISBN13
$testString = substr($testString, 3, -1);
} elseif (strlen($testString) == 10) {
// Delete check value at the end for ISBN10
$testString = substr($testString, 0, -1);
}
// for isbn, check that the test string is part of the found value
if (strpos($foundValue, $testString) !== false) {
$foundMatch = true;
}
} else {
// for ppn (or other ids), skip the possible prefix in parentheses; the remaining values must then match exactly
$foundValue = preg_replace('/^\(.*\)/', '', $foundValue);
if ($foundValue == $testString) {
$foundMatch = true;
}
}
}
}
if ($foundMatch) {
$outputString .= $doc->saveXML($record);
array_push($outputArray, $doc->saveXML($record));
}
}
$outputString .= "</collection>";


$map = STANDARD_MARC_MAP;
$map['bestand'] = '//datafield[@tag="AVA"]/subfield[@code="b"]';
$map['sammlung'] = '//datafield[@tag="AVE"]/subfield[@code="m"]';
// TODO Check whether subject headings could be handled generically as follows
// (field 689 is used by HBZ and SWISS, field 650 by SWISS;
// only SWISS fills subfield 2 with "gnd", but all of them use subfield
// 0 to link to the GND, starting with "(DE-588)". It is unclear, however,
// how this looks e.g. for form subject headings without a link.)
$map['sw'] = array(
'mainPart' => '//datafield[starts-with(@tag,"6") and (starts-with(subfield[@code="0"],"(DE-588)") or subfield[@code="2"]="gnd")]',
'value' => './subfield[@code="a"]',
'subvalues' => './subfield[@code="b" or @code="t"]',
'additional' => './subfield[@code="g" or @code="z"]',
'key' => './subfield[@code="0" and contains(text(), "(DE-588)")]'
);

if (!isset($_GET['format'])) {
header('Content-type: text/xml');
echo $outputString;
} elseif ($_GET['format'] == 'json') {
$outputXml = simplexml_load_string($outputString);

$outputMap = performMapping($map, $outputXml);
$outputIndividualMap = [];
for ($j = 0; $j < count($outputArray); $j++) {
$outputXml = simplexml_load_string($outputArray[$j]);
$outputSingleMap = performMapping($map, $outputXml);
array_push($outputIndividualMap, $outputSingleMap);
}
$outputMap["einzelaufnahmen"] = $outputIndividualMap;


header('Content-type: application/json');
echo json_encode($outputMap, JSON_PRETTY_PRINT);
} elseif ($_GET['format'] == 'holdings') {
echo "<html>\n<head>\n <title>Bestand Alma-SRU zu ISBN-Suche</title>\n <meta http-equiv='content-type' content='text/html; charset=UTF-8' />\n <style type='text/css'>body { font-family: Arial, Verdana, sans-serif; }</style>\n</head>\n<body>\n";
$outputXml = simplexml_load_string($outputString);
$avaNodes = $outputXml->xpath('//datafield[@tag="AVA"]');
$aveNodes = $outputXml->xpath('//datafield[@tag="AVE"]');
$size = strlen($outputString);
if ($avaNodes) {
echo "<table>\n";
$bestand = [];
foreach ($avaNodes as $node) {
echo "<tr>\n";
$subfields = $node->xpath('./subfield');
foreach ($subfields as $subfield) {
$code = $subfield[0]["code"];
$value = getValues($subfield[0]);
echo " <td>" . $value . "</td>";
}
echo "\n</tr>\n";

$location = getValues($node->xpath('./subfield[@code="b"]')[0]);
$sublocation = getValues($node->xpath('./subfield[@code="c"]')[0]);

$node_f = $node->xpath('./subfield[@code="f"]');
$number = count($node_f) ? getValues($node_f[0]) : 0;
if (array_key_exists($location, $bestand)) {
$bestand[$location] += $number;
} else {
$bestand[$location] = $number;
}
}
echo "</table>\n";
echo "<hr/>\n";
if ($aveNodes) {
$collections = [];
echo "<table>\n";
foreach ($aveNodes as $node) {
echo "<tr>\n";
$subfields = $node->xpath('./subfield');
foreach ($subfields as $subfield) {
$code = $subfield[0]["code"];
$value = getValues($subfield[0]);
echo " <td>" . $value . "</td>";
}
echo "\n</tr>\n";
$collection = $node->xpath('./subfield[@code="m"]');
if ($collection) {
$collectionValue = getValues($collection[0]);
$availability = $node->xpath('./subfield[@code="e"]');
if ($availability and getValues($availability[0]) != "Available") {
$collectionValue .= " [" . getValues($availability[0]) . "]";
}
$collections[] = $collectionValue;
}
}
echo "</table>\n";
echo "<hr/>\n";
}


echo '<div>Bestand Alma-SRU: ';
foreach ($bestand as $loc => $n) {
echo $n . "x" . $loc . ", ";
}
if ($aveNodes) {
echo "E";
if (!empty($_GET['with'])) {
echo ' (' . implode(" | ", $collections) . ')';
}
}
echo '</div>';
} elseif ($aveNodes and !$avaNodes) {
echo "<table>\n";
$collections = [];
foreach ($aveNodes as $node) {
echo "<tr>\n";
$subfields = $node->xpath('./subfield');
foreach ($subfields as $subfield) {
$code = $subfield[0]["code"];
$value = getValues($subfield[0]);
echo " <td>" . $value . "</td>";
}
echo "\n</tr>\n";
$collection = $node->xpath('./subfield[@code="m"]');
if ($collection) {
$collectionValue = getValues($collection[0]);
$availability = $node->xpath('./subfield[@code="e"]');
if ($availability and getValues($availability[0]) != "Available") {
$collectionValue .= " [" . getValues($availability[0]) . "]";
}
$collections[] = $collectionValue;
}
}
echo "</table>\n";
echo "<hr/>\n";
echo '<div>Bestand Alma-SRU: E';
if (!empty($_GET['with'])) {
echo ' (' . implode(" | ", $collections) . ')';
}
echo '</div>';
} elseif ($size > 100) {
//if the isbn/ppn is not found, then the $outputString is a minimal xml document
//of size 48, for larger size something might be found...
$sruUrl = str_replace('format=holdings', '', $_SERVER['REQUEST_URI']);
echo '<div>Bestand Alma-SRU: eventuell da (' . $size . ")</div>\n";
echo '<table><tr><td><a href="' . $sruUrl . '" target="_blank">See SRU Result</a></td></tr></table>';
} else {
echo 'Es wurde nichts gefunden';
}
echo "\n</body>\n</html>";
}
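
Review note on the query construction in isbn/alma-sru.php: the search terms from `isbn`/`ppn` are concatenated into the SRU URL as received. The following is a minimal sketch, not part of this PR, of how each term could be percent-encoded before it is placed in the query. The helper name `buildAlmaSruQuery` is hypothetical; the index names `alma.all` and `alma.mms_tagSuppressed` and the separator pattern are copied from the diff above.

```php
<?php
// Hypothetical helper (assumption, not in the PR): builds the same CQL
// query string as alma-sru.php, but percent-encodes every search term.
function buildAlmaSruQuery(string $input): string
{
    // Same separators as in the PR: "or", comma, semicolon.
    $terms = preg_split('/\s*(or|,|;)\s*/i', trim($input), -1, PREG_SPLIT_NO_EMPTY);

    $clauses = [];
    foreach ($terms as $term) {
        // rawurlencode() leaves digits, letters and hyphens untouched,
        // so plain ISBNs and PPNs stay readable in the URL.
        $clauses[] = 'alma.all=' . rawurlencode($term);
    }

    // Exclude suppressed records, then OR the individual terms.
    return 'alma.mms_tagSuppressed=false+AND+(' . implode('+OR+', $clauses) . ')';
}

echo buildAlmaSruQuery('978-3-16-148410-0, 3161484100'), "\n";
```

For plain ISBN or PPN input the resulting string is identical to what the PR builds today; only terms containing spaces or reserved characters come out differently.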