Releases: Birch-san/mecab
Meanings, dictionary lookup
Agglutination of mecab tokens into words
Uses Kimtaro's Ve (ported to JS) to agglutinate MeCab tokens into words, so instead of 見 た we get 見た.
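To illustrate the idea, here is a minimal sketch of agglutination. It is not Ve's actual rule set, which covers many more cases; it applies a single simplified rule (an auxiliary verb, 助動詞, attaches to a preceding verb or adjective). The token shape `{ surface, pos }` and the function name `agglutinate` are assumptions made for this example, not the project's API.

```javascript
// Simplified sketch: merge auxiliary verbs (助動詞) into the preceding
// verb (動詞) or adjective (形容詞). Ve itself applies many more rules.
// The token shape { surface, pos } is hypothetical.
function agglutinate(tokens) {
  const words = [];
  for (const token of tokens) {
    const prev = words[words.length - 1];
    const attaches =
      token.pos === '助動詞' &&
      prev !== undefined &&
      (prev.pos === '動詞' || prev.pos === '形容詞');
    if (attaches) {
      // Extend the previous word and remember which tokens it is made of.
      prev.surface += token.surface;
      prev.tokens.push(token);
    } else {
      words.push({ surface: token.surface, pos: token.pos, tokens: [token] });
    }
  }
  return words;
}

const exampleWords = agglutinate([
  { surface: '映画', pos: '名詞' },
  { surface: 'を', pos: '助詞' },
  { surface: '見', pos: '動詞' },
  { surface: 'た', pos: '助動詞' },
]);
console.log(exampleWords.map((w) => w.surface)); // → [ '映画', 'を', '見た' ]
```

Each word keeps its constituent tokens, so per-token information (such as readings) stays available after merging.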
Progressive web app + refactored into ES6 classes
App now has a webmanifest and service worker, so it can be saved as a PWA and used offline (see https://birchlabs.co.uk/mecab-web/)
Decoupled the useful parts into classes; this brings it closer to being a library
Embedded dictionary, furigana fitting, Preact
Deployed: https://birchlabs.co.uk/mecab-web/
The view is now managed by Preact, htm and Unistore. The project now uses ES modules.
You can click on words to look up their definitions (thanks to EDICT2 and ENAMDICT).
MeCab's de-conjugation of words is now exposed (and used for dictionary lookup).
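As a rough sketch of how a de-conjugated form can drive dictionary lookup: parse EDICT2-style lines into entries, index them by headword and reading, then query with the token's dictionary form instead of its surface. The function names, the token shape `{ surface, lemma }`, and the abridged sample line (including its made-up entry ID) are assumptions for illustration; this is not the project's actual implementation.

```javascript
// Parse one EDICT2-style line: "HEAD1;HEAD2 [READ1;READ2] /gloss/gloss/EntL.../"
// Returns null for lines that do not fit that shape (e.g. the file header).
function parseEdict2Line(line) {
  const match = /^(\S+)(?: \[([^\]]+)\])? \/(.*)\/$/.exec(line.trim());
  if (!match) return null;
  // Drop parenthesised annotations such as "(P)" from headwords and readings.
  const stripTags = (s) => s.replace(/\([^)]*\)/g, '');
  const headwords = match[1].split(';').map(stripTags);
  const readings = match[2] ? match[2].split(';').map(stripTags) : [];
  // Keep glosses; drop the priority marker and the trailing entry ID.
  const glosses = match[3]
    .split('/')
    .filter((g) => g !== '' && g !== '(P)' && !/^EntL\d+X?$/.test(g));
  return { headwords, readings, glosses };
}

// Index entries by every headword and reading.
function buildIndex(lines) {
  const index = new Map();
  for (const line of lines) {
    const entry = parseEdict2Line(line);
    if (!entry) continue;
    for (const key of [...entry.headwords, ...entry.readings]) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(entry);
    }
  }
  return index;
}

// Abridged, hand-written sample line; the entry ID is made up.
const sampleIndex = buildIndex([
  '見る(P);観る [みる] /(v1,vt) to see/to look/(P)/EntL0000000X/',
]);

// Hypothetical token: the surface is conjugated, the lemma is the dictionary form.
const sampleToken = { surface: '見', lemma: '見る' };
const sampleHits = sampleIndex.get(sampleToken.lemma) || [];
console.log(sampleHits.map((e) => e.glosses));
```

Looking up the surface 見 alone would miss the verb entry; the dictionary form 見る finds it.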
Better furigana (thanks to KANJIDIC). New algorithm for fitting furigana given knowledge of kanjis' possible readings.
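One way such fitting could work, sketched under simplifying assumptions: walk the word character by character and, for each kanji, try each of its known readings against the remaining part of the word's reading, backtracking when a choice leads to a dead end. The function name `fitFurigana` and the hand-written readings table are hypothetical; real KANJIDIC data needs preprocessing (okurigana markers, katakana on-readings, sound changes) that this sketch ignores, and the project's actual algorithm may differ.

```javascript
// Backtracking sketch: assign a reading to each kanji so that the
// concatenation matches the whole-word reading. Returns an array of
// { text, furigana } segments, or null when no assignment fits.
// kanjiReadings maps a kanji to candidate readings in hiragana (hypothetical data).
function fitFurigana(surface, reading, kanjiReadings) {
  const chars = Array.from(surface);

  function fit(si, ri) {
    if (si === chars.length) {
      // Succeed only if the reading has been consumed exactly.
      return ri === reading.length ? [] : null;
    }
    const ch = chars[si];
    const candidates = kanjiReadings[ch];
    if (!candidates) {
      // Not a known kanji: treat as kana, which must match the reading verbatim.
      if (reading[ri] !== ch) return null;
      const rest = fit(si + 1, ri + 1);
      return rest ? [{ text: ch, furigana: null }, ...rest] : null;
    }
    for (const candidate of candidates) {
      if (reading.startsWith(candidate, ri)) {
        const rest = fit(si + 1, ri + candidate.length);
        if (rest) return [{ text: ch, furigana: candidate }, ...rest];
      }
    }
    return null; // No candidate reading fits at this position.
  }

  return fit(0, 0);
}

const sampleReadings = {
  食: ['しょく', 'た'],
  物: ['ぶつ', 'もつ', 'もの'],
};
console.log(fitFurigana('食べ物', 'たべもの', sampleReadings));
```

When the result is null (for example, a word whose reading cannot be split per kanji), a caller could fall back to placing the whole reading over the whole word.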
Archive contains:
.htaccess
index.html
src/**.*
style.css
LICENSE
licenses.html
mecab.data
mecab.js
mecab.wasm
edict2.utf8.txt
enamdict.utf8.txt
kanjidic2-lf.utf8.txt
package.json
package-lock.json
web_modules/**.*
You can run the app just by opening index.html.
It's even possible to serve it directly from the filesystem (no web server).
You can npm install if you want to benefit from the source maps in web_modules.
You can use the .htaccess and serve from a web server if you want to optimize the file transfer.
If you want good performance serving these files, I recommend making and uploading gzipped distributions of the files alongside the originals. The .htaccess will tell Apache to use your pre-compressed files instead of compressing them anew every time.
gzip -kf mecab.data mecab.wasm mecab.js edict2.utf8.txt enamdict.utf8.txt kanjidic2-lf.utf8.txt
Initial release including WebAssembly bytecode
Includes a build of MeCab compiled to WebAssembly, plus the NAIST-jdic dictionary. Also includes mecab-web, and the WanaKana library which supports mecab-web.
index.html // my helper page with a form to invoke functionality from MeCab + WanaKana
lib/wanakana.min.js // WanaKana (transliteration, additional tokenization, classification)
mecab.js // bootstraps WASM, exports functionality, handles lifecycle, preloads assets
mecab.wasm // the compiled MeCab CLI executable (incl. libmecab)
mecab.data // preloaded assets (mostly the NAIST-jdic dictionary)