I've forked a repo (at https://github.com/marcwww/pytorch-ntm) from your work, mainly to test the model on longer sequences (for example, training on sequences of length 1 to 10 and testing on sequences of length 11 to 20).
Hi, I'm not sure what you are trying to achieve... the notebooks show how the model can generalize to sequence lengths longer than 20, such as 80, while the models were trained on sequences of length 1 to 20.
My central question is that I found that whether testing is run in the middle of the training process influences the final result, which, of course, should not happen.
Finally, when both runs are finished, the results are different: 'train_test_end.py' reaches an accuracy of 1.00, while 'train_test_mid.py' only reaches 0.91. However, the two files differ solely in the 'train_model' method (one runs testing at the end of training, the other every several batches):
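For reference, the difference between the two loops can be sketched roughly as follows. This is a hypothetical simplification: `train_batch()` and `evaluate()` below are stand-ins for the actual functions in the forked repo, not its real API.

```python
# Hypothetical sketch of the two training-loop variants.
# train_batch() and evaluate() are placeholders, not the repo's real functions.

def train_batch(model, batch):
    model["steps"] += 1  # placeholder for one optimizer step on a batch

def evaluate(model):
    return model["steps"]  # placeholder for computing test accuracy

def train_model_end(model, batches):
    # 'train_test_end.py' style: evaluate only once, after all training batches.
    for batch in batches:
        train_batch(model, batch)
    return evaluate(model)

def train_model_mid(model, batches, eval_every=2):
    # 'train_test_mid.py' style: additionally evaluate every few batches.
    for i, batch in enumerate(batches, 1):
        train_batch(model, batch)
        if i % eval_every == 0:
            evaluate(model)  # extra evaluations interleaved with training
    return evaluate(model)
```

In principle the extra `evaluate()` calls should not change the final result; they can only matter if evaluation mutates state shared with training, such as the global random generator or the model's train/eval mode.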
Is it clearer this time? I would really appreciate it if you could help me with this.
I'm sorry, I don't have enough capacity to check your results, but I'm interested to know whether you managed to resolve this issue and whether it is still relevant.
Dear author,
The question is that the final test result after training without any testing during the training process differs from the result when testing is run periodically during training. The repo is set up the latter way (at https://github.com/marcwww/pytorch-ntm/blob/1d0595e165a6790219df76e0b7f13b48e406b4d9/train_test.py#L236).
In the forked repo, test batches are sampled in the same way as training batches (at https://github.com/marcwww/pytorch-ntm/blob/1d0595e165a6790219df76e0b7f13b48e406b4d9/tasks/copytask_test.py#L16). I have also checked whether the discrepancy comes from the intertwined sampling of training and testing by loading a pre-generated test set, and it did not help.
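One mechanism that can produce exactly this kind of discrepancy is shared random state: if test batches are drawn from the same global generator as training batches, every mid-training evaluation consumes random numbers and shifts all subsequent training batches. A minimal sketch of the effect, using Python's `random` module in place of the repo's torch sampling:

```python
import random

def draw_training_stream(n, interleave_eval):
    """Draw n 'training batches' from the global RNG; optionally draw an
    'evaluation batch' in between, as mid-training testing would."""
    random.seed(0)
    out = []
    for _ in range(n):
        out.append(random.random())  # training sample
        if interleave_eval:
            random.random()          # evaluation sample consumes RNG state
    return out

no_eval = draw_training_stream(3, interleave_eval=False)
with_eval = draw_training_stream(3, interleave_eval=True)
# The first training batch matches, but the streams diverge afterwards.
```

Since you report that a pre-generated test set did not help, other shared state may be worth checking: for example, whether evaluation leaves the model in eval mode (a forgotten `model.train()` afterwards disables any dropout for the rest of training), or nondeterministic CUDA kernels. Giving evaluation its own `torch.Generator` also isolates its sampling from training.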
Could you please help me with this? Thanks a lot.