On November 21-23, 2022, the "Thunder Hack" hackathon was held in our city, and we, five students from HSE University, formed a team to take part. We were tasked with developing a search engine mechanism for a supplier portal.
We were given two Excel tables: the first file is 49.6 MB and the second is 94.8 MB (the largest table had 400k rows and 10 columns 😰).
- Solution: we used the meilisearch search engine to handle the search itself.
  - We have:
    - frontend in HTML, CSS, JavaScript -> ReactJS
    - backend in Java -> Spring Framework
    - converter from .xlsx to json in Python
  - Solution scheme: [figure: solution scheme]
- Features:
  - real-time search
  - case-insensitive search (any mix of upper and lower case)
  - typo-tolerant search (queries with multiple spelling errors)
- Result (demonstration of multiple requests):
  - Request -> бекант письменный стол (lower case)
  - Request -> jump мяч волейбольный (mixed languages)
  - Request -> ЭлЕкТроГИтара (random case)
  - Request -> ЕФРОСИНА ЛИТЕРАТУРА (upper case, with a misspelled last name)
  - Request -> шлогбауум (several spelling errors)
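The case and typo handling listed above is provided by meilisearch itself. As a rough illustration of the idea only (this is not meilisearch's internal algorithm), the sketch below folds both strings to the same case and measures the Levenshtein edit distance between them. The function names, the `max_typos` default and the reference spelling `шлагбаум` are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def matches(query: str, word: str, max_typos: int = 2) -> bool:
    """Case-insensitive comparison that tolerates up to max_typos edits."""
    return edit_distance(query.casefold(), word.casefold()) <= max_typos


if __name__ == "__main__":
    # Demo queries from the list above; the reference spellings are assumed.
    print(edit_distance("шлогбауум", "шлагбаум"))
    print(matches("ЭлЕкТроГИтара", "электрогитара"))
```

A real engine avoids comparing the query against every indexed word one by one; the sketch only shows why a query with a couple of wrong letters can still be considered a match.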
To install and run this mechanism:
- clone the repository
- pull the meilisearch Docker image
- run the Docker container
- add documents
- build and run the program
$ git clone https://github.com/TheTeamOfCrowsFromHSE/search-engine.git
$ docker pull getmeili/meilisearch:v0.27.0
$ docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v0.27.0 \
meilisearch --env="development"
To make data searchable, add it to meilisearch.
❗ The data must be a json file.
❗ The file name is movies.json.
$ curl \
-X POST 'http://localhost:7700/indexes/movies/documents' \
-H 'Content-Type: application/json' \
--data-binary @movies.json
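The command above expects `movies.json` to contain a JSON array of objects. The project's own converter is not reproduced here; the following is only a minimal sketch of how spreadsheet rows could be turned into that shape. It assumes that the first row holds the column names, adds a hypothetical numeric `id` field so that meilisearch has a primary key candidate, and takes the rows as plain Python sequences (reading the `.xlsx` file itself, for example with `openpyxl`, is left out).

```python
import json
from typing import Any, Iterable, Sequence


def rows_to_documents(rows: Iterable[Sequence[Any]]) -> list[dict[str, Any]]:
    """Turn a header row plus data rows into a list of JSON-ready dicts."""
    iterator = iter(rows)
    header_row = next(iterator, None)
    if header_row is None:  # no rows at all
        return []
    header = [str(name).strip() for name in header_row]
    documents = []
    for index, row in enumerate(iterator, start=1):
        document: dict[str, Any] = {"id": index}  # hypothetical key, not from the source data
        for name, value in zip(header, row):
            if value is not None:  # skip empty spreadsheet cells
                document[name] = value
        documents.append(document)
    return documents


def write_documents(documents: list[dict[str, Any]], path: str = "movies.json") -> None:
    """Write the documents as a UTF-8 JSON array, keeping Cyrillic text readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(documents, handle, ensure_ascii=False)


if __name__ == "__main__":
    # Made-up sample rows, only to show the resulting shape.
    sample = [["name", "category"], ["Writing desk", "Furniture"], ["Volleyball", None]]
    print(json.dumps(rows_to_documents(sample)))
```

Passing `ensure_ascii=False` keeps Russian product names as readable text in the file instead of `\uXXXX` escapes; both forms are valid JSON.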
P.S. If the upload fails with the error message
{"message":"JSON payload (16178973 bytes)is larger than allowed (limit:2097152 bytes).","code":"internal","type":"internal","link":"https://docs.meilisearch.com/errors#internal"}
add the flag
--http-payload-size-limit=300000000
to the meilisearch command when running the Docker container.
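As an alternative to raising the payload limit, the documents could be sent in several smaller requests. The sketch below illustrates that idea under stated assumptions: the batch size, the helper names and the use of Python's standard `urllib` are illustrative choices, while the URL is the one from the curl command above.

```python
import json
import urllib.request
from typing import Any, Iterator

# Same endpoint as in the curl command.
MEILI_URL = "http://localhost:7700/indexes/movies/documents"


def batches(documents: list[Any], size: int) -> Iterator[list[Any]]:
    """Yield consecutive slices of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def build_request(batch: list[Any], url: str = MEILI_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one batch as JSON."""
    body = json.dumps(batch, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload(path: str = "movies.json", size: int = 1000) -> None:
    """Send the file in batches; the meilisearch container must be running."""
    with open(path, encoding="utf-8") as handle:
        documents = json.load(handle)
    for batch in batches(documents, size):
        with urllib.request.urlopen(build_request(batch)) as response:
            response.read()  # meilisearch replies with a task description
```

The batch size of 1000 is a guess; it has to be small enough that each request body stays under the configured limit.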
After completing the steps above, build and run the program.
Once the program is running, open http://localhost:8080/ in a browser.
If everything went well, you will see the cards.
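To check the index independently of the web page, meilisearch can also be queried directly over HTTP. The following is a minimal sketch, assuming the container from the steps above is listening on port 7700 and the documents were added to the `movies` index; the helper names and the `limit` default are illustrative.

```python
import json
import urllib.request

# Search route of the `movies` index used in the curl command.
SEARCH_URL = "http://localhost:7700/indexes/movies/search"


def build_search_request(query: str, limit: int = 5) -> urllib.request.Request:
    """Build (but do not send) a search request for the given query text."""
    payload = json.dumps({"q": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, limit: int = 5) -> list:
    """Send the query and return the matching documents ("hits")."""
    with urllib.request.urlopen(build_search_request(query, limit)) as response:
        return json.loads(response.read().decode("utf-8"))["hits"]
```

With the container running, calling `search("шлогбауум")` returns whatever documents meilisearch considers a match for that misspelled query.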
P.S. To run this mechanism again later, repeat only these steps: run the Docker container, then build and run the program.
search-engine is distributed under the MIT License, on behalf of TheTeamOfCrowsFromHSE.