Hi, I have met errors during preprocessing. #1

Open
diebridge opened this issue Jul 20, 2019 · 9 comments

Comments

@diebridge

The error looks like this:
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/anaconda3/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "preprocess.py", line 23, in init_tokenizer
TOK = CoreNLPTokenizer(annotators=annotators)
File "/Users/fzy/question-answering/reader/data/tokenizer.py", line 50, in init
self._launch()
File "/Users/fzy/question-answering/reader/data/tokenizer.py", line 73, in _launch
self.corenlp.expect_exact('NLP>', searchwindowsize=100)
File "/anaconda3/lib/python3.6/site-packages/pexpect/spawnbase.py", line 390, in expect_exact
return exp.expect_loop(timeout)
File "/anaconda3/lib/python3.6/site-packages/pexpect/expect.py", line 107, in expect_loop
return self.timeout(e)
File "/anaconda3/lib/python3.6/site-packages/pexpect/expect.py", line 70, in timeout
raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.

How could I debug them? Thanks in advance!

@wzq016

wzq016 commented Feb 6, 2020

Did you solve this problem? I am hitting the same error.

@wzq016

wzq016 commented Feb 6, 2020

Oh, I solved this problem by downgrading Java from version 11 to 8 :).
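
Before changing Java versions, it can help to confirm which major version is actually on the PATH. The following is a minimal sketch, not part of this repository: `parse_java_major` and `installed_java_major` are hypothetical helper names, and the parsing assumes the usual `java -version` banner (Java 8 reports itself as `1.8.0_...`, newer releases as `11.0.x` and so on).

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier use the legacy "1.x" scheme, e.g. "1.8.0_241".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None."""
    if shutil.which("java") is None:
        return None
    # `java -version` prints its banner to stderr, so capture both streams.
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    print("Java major version on PATH:", installed_java_major())
```

Running this inside the same environment used for preprocess.py shows whether the downgrade actually took effect there.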

@mady143

mady143 commented Mar 2, 2020

Hi @wzq016,
Downgrading Java from version 11 to 8 did not resolve this for me; I am still getting the same error:

(my_qa3) launchship@launchship-ML:~/Downloads/question-answering$ python preprocess.py --data data/squad --embed-path wordvec/glove/glove.840B.300d.txt --restrict-vocab --num-characters 300
[2020-03-02 12:43:19] COMMAND: preprocess.py --data data/squad --embed-path wordvec/glove/glove.840B.300d.txt --restrict-vocab --num-characters 300
[2020-03-02 12:43:19] Arguments: {'data': 'data/squad', 'dest_dir': 'data', 'tokenizer': 'corenlp', 'num_workers': None, 'threshold': 0, 'num_words': -1, 'num_characters': 300, 'embed_path': 'wordvec/glove/glove.840B.300d.txt', 'restrict_vocab': True}
[2020-03-02 12:43:19] Loaded dataset data/squad/train-v1.1.json (87599 questions, 18896 contexts)
Process ForkPoolWorker-2:
Process ForkPoolWorker-4:
Traceback (most recent call last):
Traceback (most recent call last):
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 99, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 99, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
raise TIMEOUT('Timeout exceeded.')
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.
pexpect.exceptions.TIMEOUT: Timeout exceeded.

During handling of the above exception, another exception occurred:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "preprocess.py", line 23, in init_tokenizer
TOK = CoreNLPTokenizer(annotators=annotators)
File "preprocess.py", line 23, in init_tokenizer
TOK = CoreNLPTokenizer(annotators=annotators)
File "/home/launchship/Downloads/question-answering/reader/data/tokenizer.py", line 50, in init
self._launch()
File "/home/launchship/Downloads/question-answering/reader/data/tokenizer.py", line 50, in init
self._launch()
File "/home/launchship/Downloads/question-answering/reader/data/tokenizer.py", line 73, in _launch
self.corenlp.expect_exact('NLP>', searchwindowsize=100)
File "/home/launchship/Downloads/question-answering/reader/data/tokenizer.py", line 73, in _launch
self.corenlp.expect_exact('NLP>', searchwindowsize=100)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/spawnbase.py", line 390, in expect_exact
return exp.expect_loop(timeout)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/spawnbase.py", line 390, in expect_exact
return exp.expect_loop(timeout)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 107, in expect_loop
return self.timeout(e)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 107, in expect_loop
return self.timeout(e)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 70, in timeout
raise TIMEOUT(msg)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 70, in timeout
raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f0d52790630>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'estion-answering\x07\x1b[01;32mlaunchship@launchship-ML\x1b[00m:\x1b[01;34m~/Downloads/question-answering\x1b[00m$ '
before (last 100 chars): b'estion-answering\x07\x1b[01;32mlaunchship@launchship-ML\x1b[00m:\x1b[01;34m~/Downloads/question-answering\x1b[00m$ '
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 19267
child_fd: 12
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
0: "b'NLP>'"
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f0d52790400>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'estion-answering\x07\x1b[01;32mlaunchship@launchship-ML\x1b[00m:\x1b[01;34m~/Downloads/question-answering\x1b[00m$ '
before (last 100 chars): b'estion-answering\x07\x1b[01;32mlaunchship@launchship-ML\x1b[00m:\x1b[01;34m~/Downloads/question-answering\x1b[00m$ '
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 19260
child_fd: 10
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
0: "b'NLP>'"
Process ForkPoolWorker-1:
Traceback (most recent call last):
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/expect.py", line 99, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/home/launchship/Downloads/my_qa3/lib/python3.6/site-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.

@wzq016

wzq016 commented Mar 2, 2020

Yes, I find that this problem can still occur under Java 8 sometimes.
I finally solved it by using the spaCy tokenizer instead of CoreNLP, i.e. try this command:
python preprocess.py --data data/squad --embed-path wordvec/glove/glove.840B.300d.txt --restrict-vocab --num-characters 300 --tokenizer spacy

@mady143

mady143 commented Mar 2, 2020

Hi @wzq016,
Thank you, it worked for me: preprocessing completed successfully without any errors. But during training I get an error saying that logs/drqa.log is not a directory, even though I ran the command given in the README.md file.

NotADirectoryError: [Errno 20] Not a directory: '/home/launchship/Downloads/question-answering/logs/drqa.log'

@wzq016

wzq016 commented Mar 2, 2020

@mady143 this is because you need to create the directory first, i.e. run mkdir logs under the question-answering folder.
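
For anyone hitting the same NotADirectoryError: on Linux this error usually means that a component of the path (here, probably `logs`) exists but is a regular file rather than a directory; a missing directory would normally raise FileNotFoundError instead. Below is a minimal sketch of creating the parent directory before opening the log file. `open_log_file` is a hypothetical helper for illustration and is not part of this repository.

```python
import logging
import os


def open_log_file(log_path):
    """Create the parent directory of log_path if needed, then open a handler."""
    log_dir = os.path.dirname(log_path)
    if log_dir:
        # Raises FileExistsError if log_dir exists but is a regular file.
        os.makedirs(log_dir, exist_ok=True)
    # FileHandler opens (and creates) the file immediately by default.
    return logging.FileHandler(log_path)


if __name__ == "__main__":
    handler = open_log_file(os.path.join("logs", "drqa.log"))
    logging.getLogger().addHandler(handler)
    logging.getLogger().warning("log file is writable")
    handler.close()
```

If `logs` already exists as a regular file, `os.makedirs` raises FileExistsError, which points directly at the file that needs to be removed or renamed.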

@mady143

mady143 commented Mar 2, 2020

Yes, I already had a logs directory. I ran the command like this:
python train.py --arch drqa --embed-path wordvec/glove/glove.840B.300d.txt --checkpoint-dir checkpoints/drqa --log-file logs

and I am getting this error:

word, count = line.rstrip().rsplit(' ', 1)
ValueError: not enough values to unpack (expected 2, got 1)

@wzq016

wzq016 commented Mar 2, 2020

I have not run into this problem, so I'm not sure what is going wrong here; I have only run this code with bidaf on SQuAD.

It looks like a data problem. Maybe you could try printing the lines at that point to find the one that causes the error.
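
One way to do the printing suggested above is to scan the file with the same split that fails in the traceback and report every line that does not yield two fields. This is a minimal sketch; `find_malformed_lines` is a hypothetical helper, and which file to scan is an assumption you need to fill in from your own setup.

```python
def find_malformed_lines(lines):
    """Return (line_number, raw_line) pairs that do not split into 'word count'."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Same split as in the traceback: the last space separates word and count.
        parts = line.rstrip().rsplit(' ', 1)
        if len(parts) != 2:
            bad.append((number, line))
    return bad


if __name__ == "__main__":
    sample = ["the 10\n", "orphan\n", "new york 3\n", "\n"]
    for number, line in find_malformed_lines(sample):
        print("line %d is malformed: %r" % (number, line))
```

Blank lines and lines without any space are the typical offenders; passing an open file object instead of `sample` scans a real file line by line.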

@SarangSanjayGujar-lilly

I have the same issues with both bidaf and drqa. Would you please check?
@mady143 did you find a solution or workaround?
