High memory usage of ES on an index with 180 million entities #49

Open
adibaba opened this issue Jan 23, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@adibaba
Member

adibaba commented Jan 23, 2023

Config: New index dbpedia_wikidata_full with 179,706,494 entities
Issue: The server throws an error.

@adibaba
Member Author

adibaba commented Jan 23, 2023

Check: The ES process (java, PID 1398) has high memory usage: 28.1g resident (89.7% of memory), and swap is almost fully used.

top - 11:57:56 up 20 days, 23:08,  1 user,  load average: 0.28, 0.57, 0.34
Tasks:   9 total,   1 running,   8 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.1 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  32112.4 total,    242.7 free,  16084.2 used,  15785.6 buff/cache
MiB Swap:   2048.0 total,      0.1 free,   2047.9 used.  15504.4 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                       
   1398 wilke     20   0  122.7g  28.1g  13.1g S   0.0  89.7   4678:49 java                                                                                                                                          
   1426 wilke     20   0  108168   1320   1320 S   0.0   0.0   0:00.00 controller                                                                                                                                    
  96322 wilke     20   0   15412   6368   4944 S   0.0   0.0   0:16.81 systemd                                                                                                                                       
  96350 wilke     20   0   16972   1684   1532 S   0.0   0.0   0:04.23 screen                                                                                                                                        
  96351 wilke     20   0   18228   3104   3100 S   0.0   0.0   0:00.02 bash                                                                                                                                          
  96358 wilke     20   0    7296   2824   2820 S   0.0   0.0   0:00.00 run-webservice-                                                                                                                               
  96376 wilke     20   0  253504  78148   5432 S   0.0   0.2 646:15.25 flask                                                                                                                                         
1668341 wilke     20   0   18336   5748   3844 S   0.0   0.0   0:00.05 bash                                                                                                                                          
1668400 wilke     20   0   19668   3816   3304 R   0.0   0.0   0:00.03 top  
ps -Flww -p 1398
F S UID          PID    PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
0 S wilke       1398       1 15  80   0 - 32157535 -   29490108 0 Jan02 ?       3-05:58:49 /data/elasticsearch-8.3.1/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -Djava.security.manager=allow -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j2.formatMsgNoLookups=true -Djava.locale.providers=SPI,COMPAT --add-opens=java.base/java.io=ALL-UNNAMED -XX:+UseG1GC -Djava.io.tmpdir=/tmp/elasticsearch-14826311203129578427 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Xms16056m -Xmx16056m -XX:MaxDirectMemorySize=8417968128 -XX:InitiatingHeapOccupancyPercent=30 -XX:G1ReservePercent=25 -Des.distribution.type=tar --module-path /data/elasticsearch-8.3.1/lib -m org.elasticsearch.server/org.elasticsearch.bootstrap.Elasticsearch
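
Note on the flags above: the heap is capped at 16056 MiB (-Xmx16056m) and direct memory at 8417968128 bytes, which together is less than the resident size reported by ps. A minimal sketch, using only the standard library, that pulls both limits out of a command line and compares them with the RSS value. The helper name `jvm_memory_limits_mib` is hypothetical. That the remainder is largely memory-mapped index files (SHR is 13.1g in top) is an assumption, not something this check confirms.

```python
import re

# Multipliers for JVM size suffixes (no suffix means bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def _to_mib(number: str, suffix: str) -> float:
    """Convert a JVM size such as ('16056', 'm') to MiB."""
    return int(number) * _UNITS[suffix.lower()] / 1024**2


def jvm_memory_limits_mib(cmdline: str) -> dict:
    """Extract the max heap and max direct memory limits (in MiB) from a JVM command line."""
    limits = {}
    heap = re.search(r"-Xmx(\d+)([kKmMgG]?)", cmdline)
    direct = re.search(r"-XX:MaxDirectMemorySize=(\d+)([kKmMgG]?)", cmdline)
    if heap:
        limits["heap_max"] = _to_mib(*heap.groups())
    if direct:
        limits["direct_max"] = _to_mib(*direct.groups())
    return limits


# Only the memory-related flags from the ps output above.
cmd = "java -Xms16056m -Xmx16056m -XX:MaxDirectMemorySize=8417968128"
limits = jvm_memory_limits_mib(cmd)

rss_mib = 29490108 / 1024  # the RSS column of ps is in KiB
unaccounted_mib = rss_mib - sum(limits.values())
print(limits, round(unaccounted_mib))
```
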

@adibaba
Member Author

adibaba commented Jan 23, 2023

Reduced "num_candidates" from 1000 to 100.

Code (before the change):

"num_candidates": 1000

Edit:
cd webservice_public/ ; cp es.py es.py.backup ; nano es.py

Restart:

kill -15 1398
/data/elasticsearch-8.3.1/bin/elasticsearch -d -p /data/elasticsearch-8.3.1/pid.txt
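
The value changed above could be a parameter instead of a literal. A minimal sketch of a kNN request body for Elasticsearch 8.x. This is not the actual content of es.py: the function name `build_knn_body`, the field name `embeddings`, and the defaults are assumptions for illustration. num_candidates is the number of nearest-neighbor candidates considered per shard, so a lower value means less work per query at some cost in recall.

```python
def build_knn_body(query_vector, k=10, num_candidates=100, field="embeddings"):
    """Build the request body for a kNN search.

    'field' is a hypothetical name for the dense_vector field of the index.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates < k:
        # Fewer candidates than requested hits makes no sense.
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }


body = build_knn_body([0.1, 0.2, 0.3], k=10)
print(body["knn"]["num_candidates"])
```
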

@adibaba
Member Author

adibaba commented Jan 23, 2023

@adibaba
Member Author

adibaba commented Jan 23, 2023

/data/elasticsearch-8.3.1/logs/gc.log
web: https://pastebin.com/UE0mRu7P
raw: https://pastebin.com/raw/UE0mRu7P

@adibaba
Member Author

adibaba commented Jan 23, 2023

path.logs: /data/es8-logs

Configured here:
https://github.com/dice-group/embeddings.cc/blob/5908fd66eec2b0737872c9f4c7284e0bb2b4d8a5/docs/vm.md#elasticsearch-831

Files without the ".gz" extension:
embcc_audit.json
embcc_deprecation.json
embcc_index_indexing_slowlog.json
embcc_index_search_slowlog.json
embcc.log
embcc_server.json
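
A list like the one above can be produced with a few lines of Python. A minimal sketch, assuming only the standard library; `uncompressed_logs` is a hypothetical helper name, and the demo runs against a temporary directory with made-up file names rather than /data/es8-logs.

```python
import tempfile
from pathlib import Path


def uncompressed_logs(log_dir):
    """Return the sorted names of regular files in log_dir that do not end in .gz."""
    return sorted(
        p.name
        for p in Path(log_dir).iterdir()
        if p.is_file() and p.suffix != ".gz"
    )


# Demo on a throwaway directory; the file names are examples only.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("example.log", "example_server.json", "example-2023-01-22.log.gz"):
        (Path(tmp) / name).touch()
    current = uncompressed_logs(tmp)

print(current)
```
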

@adibaba
Member Author

adibaba commented Jan 23, 2023

State: It works for some URIs (see above). Memory usage stays at nearly 87% and seems to increase slightly.
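
To track the percentage over time without watching top, the resident size can be read from /proc. A minimal sketch for Linux, assuming only the standard library; the function names are hypothetical. The example numbers are the RSS value from the ps output and the total memory from the top output earlier in this issue, not a new measurement.

```python
def rss_kib_from_status(status_text: str) -> int:
    """Extract VmRSS (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def read_status(pid: int) -> str:
    """Read /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return fh.read()


def mem_percent(rss_kib: float, total_mib: float) -> float:
    """Resident memory as a percentage of total memory."""
    return 100.0 * rss_kib / (total_mib * 1024)


# Values from the earlier check: RSS 29490108 KiB, 32112.4 MiB total.
sample_status = "Name:\tjava\nVmRSS:\t29490108 kB\n"
percent = mem_percent(rss_kib_from_status(sample_status), 32112.4)
print(round(percent, 1))
```

On the machine itself, `mem_percent(rss_kib_from_status(read_status(pid)), total_mib)` with the current PID of the ES process would give the live value.
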

@adibaba
Member Author

adibaba commented Jan 27, 2023

This can maybe be solved with Faiss, see issue #50.

@adibaba adibaba added the bug Something isn't working label Jan 27, 2023