This repository has been archived by the owner on Apr 10, 2019. It is now read-only.

maccha

maccha is a project that calculates sentence similarity using Word Mover's Distance.

So far, it supports only Japanese.
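To give a feel for the distance, here is a minimal sketch in pure Python. It is not maccha's implementation: it computes the relaxed Word Mover's Distance, a cheaper lower bound in which each word sends all of its weight to the nearest word of the other sentence, rather than solving the full transport problem. The toy vectors, the function names, and the choice to drop out-of-vocabulary tokens are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy 2-dimensional word vectors, invented for illustration only.
# Real use would load trained embeddings instead.
TOY_VECTORS = {
    "local": (1.0, 0.0),
    "environment": (0.9, 0.1),
    "build": (0.0, 1.0),
    "install": (0.1, 0.9),
}


def nbow(tokens, vectors):
    """Normalized bag-of-words weights over in-vocabulary tokens.

    Assumption: tokens without a vector are silently dropped.
    """
    counts = Counter(t for t in tokens if t in vectors)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided(src, dst, vectors):
    """Cost of moving every source word to its nearest destination word."""
    return sum(
        weight * min(euclidean(vectors[w], vectors[x]) for x in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokens_a, vectors), nbow(tokens_b, vectors)
    if not a or not b:
        raise ValueError("no in-vocabulary tokens in one of the sentences")
    return max(one_sided(a, b, vectors), one_sided(b, a, vectors))
```

Calling `relaxed_wmd(["local", "build"], ["build", "local"], TOY_VECTORS)` compares two token lists with identical in-vocabulary words, so every word has a zero-distance match on the other side. For Japanese text the token lists would come from a morphological analyzer rather than being written by hand.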

Install

To install the required modules, run:

$ pip install -r requirements.txt

maccha also requires NEologd. Please install it separately.

Setup

First, download the word vectors and the vocabulary dictionary and store them in the data directory.

The files are available as qiita_vectors.zip.

Once the download finishes, unzip the file into maccha/data.
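If you prefer to unpack the archive from Python, something like the following should work. It is only a sketch: the helper name `unzip_vectors` is hypothetical, and the default paths assume that qiita_vectors.zip sits in the repository root and that the script is run from there.

```python
import zipfile
from pathlib import Path


def unzip_vectors(zip_path="qiita_vectors.zip", data_dir="maccha/data"):
    """Extract the downloaded archive into the data directory.

    Both default paths are assumptions; adjust them to where you
    saved the archive and where the repository is checked out.
    """
    target = Path(data_dir)
    # Create the data directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Return the extracted top-level names so the result is easy to check.
    return sorted(p.name for p in target.iterdir())
```

Calling `unzip_vectors()` with no arguments uses the default paths and returns the names now present in the data directory.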

Execution

Run the test to check that everything works correctly:

$ python -m unittest tests.word_mover

If the following messages are displayed, everything is fine!

Distance between "JavaScript" and "JavaScript 2014" is 2.087188959121704.
Distance between "DexIndexOverflowExceptionと戦った話" and "AWS×Imagick×facedetectで困った話" is 2.034774008499384.
Distance between "ゆるっとローカル環境を作る" and "ローカル環境を作る。" is 0.0.
Distance between "PHP5.6のインストール" and "PHP5.4をインストール" is 0.0.

License

MIT

Contact

[email protected]