memory error #22

anarucu · 2015-07-28T21:13:44Z

Hi everyone,
I use the copy-feats binary from Kaldi to convert my ASCII features to .ark and .scp files.
Then I merged all the individual .scp files into a single one, which I called SmallSet0.scp:

SESS0003BLOCKA_06 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_06.ark:18
SESS0003BLOCKA_07 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_07.ark:18
SESS0003BLOCKA_08 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_08.ark:18
SESS0003BLOCKA_09 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_09.ark:18
SESS0003BLOCKA_10 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_10.ark:18
SESS0003BLOCKA_11 /home/ana/DB/SmallSet0/feat/SESS0003BLOCKA_11.ark:18
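Each entry above has the form `utt_id path:offset`, and the offset 18 equals the length of the 17-character utterance ID plus one space, so it should point at the start of the matrix header inside the .ark file. As a rough diagnostic sketch (the helper names `parse_scp_line` and `peek_header` are made up for illustration and are not part of PDNN or Kaldi), one could look at the bytes stored at that offset before handing the file to the trainer:

```python
def parse_scp_line(line):
    """Split a 'utt_id path:offset' entry into (utt_id, path, offset).

    Assumes the entry carries a byte offset after the last ':' as in the
    listing above; entries without an offset are not handled here.
    """
    utt_id, location = line.strip().split(None, 1)
    path, _, offset = location.rpartition(":")
    return utt_id, path, int(offset)


def peek_header(path, offset, nbytes=5):
    """Return the first few raw bytes stored at the given offset."""
    with open(path, "rb") as ark:
        ark.seek(offset)
        return ark.read(nbytes)
```

For an uncompressed single-precision matrix written in binary mode, the peeked bytes would be expected to start with a NUL byte followed by `BFM ` and a space-terminated token; anything else (text-mode output, or a compressed matrix whose token starts with `CM`) suggests the reader will misinterpret the bytes that follow.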

Then I tried to train 4 stacked RBMs using run_RBM.py and got the following memory error:

ana@ana-HP-EliteBook-Folio-9470m:~/PDNN/pdnn$ python /home/ana/PDNN/pdnn/cmds/run_RBM.py --train-data "/home/ana/DB/SmallSet0/feat/SmallSet0.scp,partition=600m,stream=true,random=true" --nnet-spec "215:1024:1024:43:1024" --wdir ./ --ptr-layer-number 4 --epoch-number 10 --batch-size 128 --learning-rate 0.08 --gbrbm-learning-rate 0.005 --momentum 0.5:0.9:5 --first_layer_type gb --param-output-file /home/ana/PDNN/Working_dir/rbm.mdl
[2015-07-28 23:06:57.528732] > ... initializing the model
Traceback (most recent call last):
  File "/home/ana/PDNN/pdnn/cmds/run_RBM.py", line 62, in <module>
    cfg.init_data_reading(train_data_spec)
  File "/home/ana/PDNN/pdnn/utils/rbm_config.py", line 65, in init_data_reading
    self.train_sets, self.train_xy, self.train_x, self.train_y = read_dataset(train_dataset, train_dataset_args)
  File "/home/ana/PDNN/pdnn/io_func/data_io.py", line 92, in read_dataset
    data_reader.initialize_read(first_time_reading = True)
  File "/home/ana/PDNN/pdnn/io_func/kaldi_io.py", line 102, in initialize_read
    utt_id, utt_mat = self.read_next_utt()
  File "/home/ana/PDNN/pdnn/io_func/kaldi_io.py", line 89, in read_next_utt
    tmp_mat = numpy.frombuffer(ark_read_buffer.read(rows * cols * 4), dtype=numpy.float32)
MemoryError

What did I do wrong?
Best regards,
ana

vipular · 2015-10-12T13:43:12Z

Hi,
I am getting the same memory error.
A minimal Python example that reproduces it:

from io_func.kaldi_feat import KaldiReadIn
in_scp_file = '/data/raw_mfcc_test.1.scp'
kaldiread = KaldiReadIn(in_scp_file)
utt_number = 0
while True:
    uttid, in_matrix = kaldiread.read_next_utt()
    if uttid == '':
        break

On debugging, I found that in kaldi_feat.py, the following lines:

m, rows = struct.unpack('<bi', ark_read_buffer.read(5))
n, cols = struct.unpack('<bi', ark_read_buffer.read(5))

return extremely large values for rows and cols.
The next line uses rows * cols to size a numpy array, which is what raises the error.
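To make this failure easier to diagnose, the header could be validated before anything is allocated. The following is only a sketch, not PDNN's actual reader: `read_float_matrix` and `max_elems` are hypothetical names, and it assumes the uncompressed Kaldi binary layout (a NUL byte and `B`, the token `FM `, then each dimension stored as a size byte of 4 followed by a little-endian int32). One possible cause of the huge values, not confirmed in this thread, is that the features were written compressed or in text mode, in which case the bytes read as rows and cols are not dimensions at all.

```python
import struct

import numpy as np


def read_float_matrix(buf, max_elems=10**8):
    """Read one uncompressed float32 matrix from a binary stream.

    Raises ValueError with a descriptive message instead of attempting a
    huge allocation when the header does not look as expected.
    """
    if buf.read(2) != b"\0B":
        raise ValueError("missing \\0B marker: data is not in binary mode")
    token = buf.read(3)
    if token != b"FM ":
        # Compressed matrices use tokens starting with 'CM', doubles use 'DM'.
        raise ValueError("unsupported matrix type %r, expected b'FM '" % token)
    size_r, rows = struct.unpack("<bi", buf.read(5))
    size_c, cols = struct.unpack("<bi", buf.read(5))
    if size_r != 4 or size_c != 4:
        raise ValueError("unexpected dimension size bytes: %d, %d" % (size_r, size_c))
    if rows < 0 or cols < 0 or rows * cols > max_elems:
        raise ValueError("implausible matrix shape %d x %d" % (rows, cols))
    data = buf.read(rows * cols * 4)
    if len(data) != rows * cols * 4:
        raise ValueError("truncated matrix data")
    return np.frombuffer(data, dtype="<f4").reshape(rows, cols)
```

With a check like this, a file in an unexpected format fails with a message naming the token it found, rather than with a bare MemoryError. If the token turns out to indicate compressed features, rewriting them uncompressed with Kaldi's copy-feats before training may be worth trying.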

I am using Mac OS X Yosemite 10.10.3.

Sincerely,
-Vipul

a00achild1 · 2016-10-31T07:19:58Z

Hi,
I ran into the same problem while trying to train a simple digit speech recognizer using a DNN.
After extracting the MFCC features with Kaldi in .scp format, I ran the command below:

run_DNN.py --train-data "./mfcc/raw_mfcc_train.1.scp,partition=600m,random=true" \
           --valid-data "./mfcc/raw_mfcc_test.1.scp,partition=600m,random=true" \
           --nnet-spec "250:1024:1024:1024:1024:1024:10" --wdir ./ \
           --output-format kaldi \
           --lrate "D:0.08:0.5:0.05,0.05:15" \
           --output-file dnn.nnet >& dnn.training.log

But I got this error in the log file:

Traceback (most recent call last):
  File "/home/cssp/pdnn-master/cmds/run_DNN.py", line 56, in <module>
    cfg.init_data_reading(train_data_spec, valid_data_spec)
  File "/home/cssp/pdnn-master/utils/network_config.py", line 94, in init_data_reading
    self.train_sets, self.train_xy, self.train_x, self.train_y = read_dataset(train_dataset, train_dataset_args)
  File "/home/cssp/pdnn-master/io_func/data_io.py", line 92, in read_dataset
    data_reader.initialize_read(first_time_reading = True)
  File "/home/cssp/pdnn-master/io_func/kaldi_io.py", line 102, in initialize_read
    utt_id, utt_mat = self.read_next_utt()
  File "/home/cssp/pdnn-master/io_func/kaldi_io.py", line 89, in read_next_utt
    tmp_mat = numpy.frombuffer(ark_read_buffer.read(rows * cols * 4), dtype=numpy.float32)
MemoryError

Does anyone have a solution for this?
Thanks,
-a00a
