This repository has been archived by the owner on Jan 31, 2022. It is now read-only.

Python tools killed on the CTP7 with the address table for 12 OH's #41

Open
1 of 2 tasks
lpetre-ulb opened this issue Nov 23, 2018 · 6 comments

Comments

@lpetre-ulb
Contributor

When trying to launch the gbt.py tool on the CTP7 with the address table for 12 OH's in order to perform a phase scan, the process is killed.

Brief summary of issue

During the tests of the new CTP7 release (version 3.7.0) with 12 OH's, the Python tools on the CTP7 stopped working. Some of the tools can be used from the DAQ machine, but others, such as gbt.py, must currently be called from the CTP7.

Each tool using the address table is killed because of an out-of-memory issue during the loading of the pickle file:

```python
import gc
import pickle

fname = ADDRESS_TABLE_TOP[:-3] + "pickle"
try:
    # Disable the GC while unpickling the large node tree to speed up loading
    gc.disable()
    with open(fname, 'rb') as f:
        global nodes
        nodes = pickle.load(f)
finally:
    gc.enable()
```
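Disabling the GC is a common trick to speed up unpickling of large object graphs, since it avoids garbage-collector bookkeeping while millions of small objects are created. A minimal, self-contained sketch of how one could measure the effect (the fake register table below is an illustration, not the real address table; timings vary by interpreter version):

```python
import gc
import pickle
import time

# Fake register table, loosely shaped like an address table (illustrative only).
data = {"REG_%d" % i: ("rw", 0x64000000 + i) for i in range(100000)}
blob = pickle.dumps(data, protocol=2)

def timed_load(disable_gc):
    """Unpickle the blob and return the elapsed time in seconds."""
    if disable_gc:
        gc.disable()
    t0 = time.time()
    try:
        pickle.loads(blob)
    finally:
        gc.enable()  # always re-enable, even if loading fails
    return time.time() - t0

print("gc on: %.3fs, gc off: %.3fs" % (timed_load(False), timed_load(True)))
```

Note that this only affects speed; it does not reduce the peak memory needed to hold the unpickled tree, which is what kills the process here.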

The precise error is the following:

eagle63:~$ gbt.py 0 0 v3b-phase-scan /mnt/persistent/gemdaq/gbt/OHv3b/20180314/GBTX_OHv3b_GBT_0__2018-03-14_FINAL.txt
Open pickled address table if available  /mnt/persistent/gemdaq/xml/gem_amc_top.pickle...
Killed

Types of issue

  • [x] Bug report (report an issue with the code)
  • [ ] Feature request (request for change which adds functionality)

Expected Behavior

I expected the gbt.py tool to perform a phase scan without any error.

Current Behavior

The gbt.py tool is currently killed:

eagle63:~$ gbt.py 0 0 v3b-phase-scan /mnt/persistent/gemdaq/gbt/OHv3b/20180314/GBTX_OHv3b_GBT_0__2018-03-14_FINAL.txt
Open pickled address table if available  /mnt/persistent/gemdaq/xml/gem_amc_top.pickle...
Killed

Steps to Reproduce (for bugs)

  1. Connect to a CTP7 with an address table for 12 OH's, e.g. ssh gemuser@eagle63
  2. Launch the phase scan command: gbt.py 0 0 v3b-phase-scan /mnt/persistent/gemdaq/gbt/OHv3b/20180314/GBTX_OHv3b_GBT_0__2018-03-14_FINAL.txt
  3. The process is killed.

Possible Solution (for bugs)

Enabling the GC did not help. The gbt.py tool could be refactored to run from the DAQ machine.
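To see how much headroom the card actually has before the kernel kills the loader, one can compare the available memory against the size of the unpickled tree. A minimal standard-library sketch, assuming a Linux /proc filesystem (as on the CTP7's embedded Linux):

```python
# Report total and free memory from /proc/meminfo (values are in kB).
def meminfo_kb():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, _, rest = line.partition(":")
            info[key.strip()] = int(rest.split()[0])  # drop the "kB" unit
    return info

if __name__ == "__main__":
    mem = meminfo_kb()
    print("MemTotal: %d kB, MemFree: %d kB" % (mem["MemTotal"], mem["MemFree"]))
```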

Your Environment

  • CTP7 build ID: CTP7-GENERIC-20180529T153916-0500-4935611
  • CTP7 firmware version: 3.7.0
  • Pickle file for 12 OH's
  • Version used: no package appears in the rpm -qa output and no version is mentioned in the /mnt/persistent/gemdaq/python/reg_interface/ files. The parseXML() function is the same as in the current repository.
  • Shell used: /bin/sh

Default environment:

TERM=xterm-256color
SHELL=/bin/sh
USER=gemuser
LD_LIBRARY_PATH=:/mnt/persistent/gemdaq/lib:/mnt/persistent/rpcmodules
PATH=/mnt/persistent/gemuser/bin:/mnt/persistent/gemdaq/python/reg_interface:/usr/local/bin:/usr/bin:/bin:/mnt/persistent/gemdaq/scripts:/mnt/persistent/gemdaq/bin
PWD=/mnt/persistent/gemuser
EDITOR=vi
LANG=en_US.UTF-8
TZ=UTC
PS1=\h:\w\$
SHLVL=1
HOME=/mnt/persistent/gemuser
LANGUAGE=en_US.UTF-8
GREP_OPTIONS=--color=auto
LS_OPTIONS=--color=auto
LOGNAME=gemuser
GEM_PATH=/mnt/persistent/gemdaq
_=/usr/bin/env
@bdorney
Contributor

bdorney commented Nov 23, 2018

Can you try the following:

  • Revert the address table and pickle file to the 4 OH case,
  • Update the LMDB to the 4 OH case,
  • Program and configure the front-end to establish communication,
  • Change the address table and pickle file back to the 12 OH case,
  • Update the LMDB to the 12 OH case.

Then, from the DAQ machine, try to call confChamber.py with --run and --vt1=X for some X not equal to 100. This will use the LMDB. Does the configuration succeed? I.e. does the out-of-memory error also occur when trying to read the LMDB?

@bdorney
Contributor

bdorney commented Nov 23, 2018

If there's no issue when using the LMDB then this means we need to either:

  1. Understand the memory limits and whether it is possible to reduce the pickle file size, or
  2. Migrate the Python tools on the CTP7 to dedicated rpcmodules.

We cannot increase the memory of the card.

@lpetre-ulb
Contributor Author

So, I tried reverting the address table, pickle file, and LMDB to the 4 OH's case. In that case, everything works as expected: the Python tools on the CTP7 are not killed and the chamber can be configured from the DAQ machine with the confChamber.py script.

When coming back to the 12 OH's case, the issue reappears. The Python tools on the CTP7 are killed, but confChamber.py succeeds. The LMDB does not cause any out-of-memory issue.

I'll investigate the memory limits more carefully, but here is some information:

  • The data.mdb file size is:
    • 4 OH's: ~12.6 MB
    • 12 OH's: ~37.6 MB
  • The pickle file size is:
    • 4 OH's: ~20.2 MB
    • 12 OH's: ~60.5 MB
  • The heap of gem_reg.py on the DAQ machine:
    • 4 OH's: ~190 MB
    • 12 OH's: ~550 MB

I think it is possible to reduce the size of the pickle file which is sent to the CTP7, but that would require creating a lightweight Node Python class. That might not be the best solution... Migrating to dedicated rpcmodules looks more future-proof.

Anyway, I'll try to understand the limit on the number of nodes/pickle file size on the CTP7.
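To gauge how much a lightweight representation could save, here is a small self-contained sketch comparing the pickled size of full objects against plain tuples. The Node class below is a hypothetical stand-in for the real address-table class, and the choice of fields kept in the slim form is an assumption:

```python
import pickle

class Node(object):
    """Hypothetical stand-in for the full address-table Node class."""
    def __init__(self, name, address, permission, description):
        self.name = name
        self.address = address
        self.permission = permission
        self.description = description
        self.children = []

# Build a fake address table with 10k registers.
full = {}
for i in range(10000):
    name = "GEM_AMC.REG_%d" % i
    full[name] = Node(name, 0x64000000 + 4 * i, "rw", "register %d" % i)

# Keep only the fields the CTP7-side tools would need (assumed: address, permission).
slim = {name: (n.address, n.permission) for name, n in full.items()}

full_bytes = pickle.dumps(full, protocol=2)
slim_bytes = pickle.dumps(slim, protocol=2)
print("full: %d bytes, slim: %d bytes" % (len(full_bytes), len(slim_bytes)))
```

Pickling plain tuples avoids storing the per-instance class reference and attribute dictionary, so the slim table is substantially smaller, at the cost of maintaining a second representation.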

@lpetre-ulb
Contributor Author

The size of the OrderedDict nodes, as reported by pympler's asizeof module, is ~429 MiB, too much to fit on the CTP7.

Reducing the size of the `Node` class seems a waste of time and would lead to the maintenance of two address tables. It would be better to move to RPC modules. See this issue for follow-up.
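For reference, a measurement like the one above can be approximated without pympler using only the standard library. A rough recursive sys.getsizeof walk (an approximation, not a substitute for asizeof; the example dict is illustrative):

```python
import sys

def deep_size(obj, seen=None):
    """Rough recursive size estimate; shared/interned objects are counted once."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k, seen) + deep_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(x, seen) for x in obj)
    elif hasattr(obj, "__dict__"):
        size += deep_size(obj.__dict__, seen)
    return size

# Example: a dict shaped loosely like the nodes table.
nodes = {"REG_%d" % i: ("desc_%d" % i, 0x64000000 + i) for i in range(1000)}
print("shallow: %d, deep: %d" % (sys.getsizeof(nodes), deep_size(nodes)))
```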

@mexanick
Contributor

I guess we want this eventually: https://lmdb.readthedocs.io/en/release/

@lpetre-ulb
Contributor Author

Instead of packaging an external Python package for the CTP7 and redeveloping the register parsing code, I would rather write a small Python wrapper (boost::python or pybind11) around the few useful functions in our code (readReg/writeReg).
