generate_data() is called every 10th iteration and - single entry validation set #2

Open
alisiahkoohi opened this issue Jul 17, 2018 · 38 comments

Comments

@alisiahkoohi

Why do you generate new training data pairs just every 10th iteration? For instance here.

Also, this line suggests that the validation error is evaluated on only a single data pair, so technically your validation set contains a single data pair.

@adler-j
Owner

adler-j commented Jul 17, 2018

Both observations are true.

I did try generating data in every iteration, but it made no/very little difference at some performance cost. Now that you mention it I'm not sure if this made it into the article. Regardless, you can try generating at each iteration and see what happens.

Regarding the second point, the numerical values were indeed only evaluated for the image displayed in the article (the Shepp-Logan phantom). Notably, this is not a "random sample" from the prior but a rather special case. I picked it to show that the method can generalize quite well. This should be covered in the article.
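
For anyone who wants to try both points, here is a minimal sketch of a training loop with a configurable regeneration interval and a validation set of several pairs. It is not the code in this repository: `generate_data`, `train_step` and `evaluate` below are hypothetical stand-ins, and the default sizes are arbitrary.

```python
import random


def generate_data(rng, n_samples=1):
    """Hypothetical stand-in: return a list of (input, target) pairs."""
    return [(rng.random(), rng.random()) for _ in range(n_samples)]


def train(n_iterations, regenerate_every, train_step, evaluate,
          n_validation=16, validate_every=100, seed=0):
    """Run a toy loop; return (number of data draws, validation history)."""
    rng = random.Random(seed)
    # Fixed validation set with several pairs, drawn once up front.
    validation_set = generate_data(rng, n_validation)
    batch = None
    n_generated = 0
    history = []
    for i in range(n_iterations):
        # regenerate_every=10 mirrors the behaviour discussed in this issue;
        # regenerate_every=1 draws fresh data in every iteration.
        if i % regenerate_every == 0:
            batch = generate_data(rng)
            n_generated += 1
        train_step(batch)
        if i % validate_every == 0:
            # Average the loss over the whole validation set.
            losses = [evaluate(pair) for pair in validation_set]
            history.append(sum(losses) / len(losses))
    return n_generated, history


# Usage with dummy callbacks: 1000 iterations, new data every 10th one.
n_generated, history = train(
    1000, 10,
    train_step=lambda batch: None,
    evaluate=lambda pair: abs(pair[0] - pair[1]),
)
```

Switching `regenerate_every` between 10 and 1 changes only how often new data is drawn, which makes the two settings easy to compare.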

@alisiahkoohi
Author

alisiahkoohi commented Jul 18, 2018

I ran learned_primal_dual.py for ellipses with generate_data() called in every iteration. I double-checked that I hadn't changed anything else, but the training loss blew up after ~5k steps.
I will try the original code.

screenshot from 2018-07-18 11-33-02

@adler-j
Owner

adler-j commented Jul 21, 2018

Very interesting. It has been a long time since I ran this code. Does it work well if you call it every 10th iteration?

@adler-j
Owner

adler-j commented Jul 31, 2018

Did you ever get around to running the original code? Did it work?

@alisiahkoohi
Author

Sorry for the late response. I should mention that I was trying to run the code on CPU, so maybe that is why I see this issue. If so, it may also explain why the original code didn't work for me.

I haven't had a chance to install the ODL extensions to run it on GPU yet.

@ChengV0

ChengV0 commented Nov 28, 2018

> Did you ever get around to running the original code? Did it work?

Running the source code does not reproduce the results in the article. What is the cause? I hope you can give me some advice.

@ChengV0

ChengV0 commented Nov 28, 2018

> Did you ever come through to running the original code? Did it work?

image

I'm sorry, I had some problems uploading pictures. This is my result. I look forward to your reply. Thank you.

@adler-j
Owner

adler-j commented Nov 28, 2018

Which file are you running to get those results? I've tried the current master on my machine and I get reasonable results. I also need to know e.g. how many iterations you ran.

Also, what version of ODL are you using?

@ChengV0

ChengV0 commented Nov 28, 2018

> Which file are you running to get those results? I've tried the current master on my machine and I get reasonable results. I also need to know e.g. how many iterations you ran.

Thank you for your reply. I am running learned_primal_dual.py and learned_chambolle_pock.py. The screenshot is the result of running the latter; I stopped training because it collapsed at about 3000 steps. I'm sorry I didn't save the screenshots of the loss and PSNR.

@adler-j
Owner

adler-j commented Nov 28, 2018

Sadly a tensorflow bug that hasn't been fixed in half a year (tensorflow/tensorflow#16864) is causing this code to run extremely slowly on my machine, so it's hard for me to debug.

Are you running on master?

What version of ODL are you using?

@ChengV0

ChengV0 commented Nov 28, 2018

> Which file are you running to get those results? I've tried the current master on my machine and I get reasonable results. I also need to know e.g. how many iterations you ran.
>
> Also, what version of ODL are you using?

image
This is my result for learned_primal_dual.py. Judging from the chart, the result falls short of expectations. Is this due to insufficient training?

@ChengV0

ChengV0 commented Nov 28, 2018

> Sadly a tensorflow bug that hasn't been fixed in half a year (tensorflow/tensorflow#16864) is causing this code to run extremely slowly on my machine, so it's hard for me to debug.
>
> Are you running on master?
>
> What version of ODL are you using?

Yes, odl-1.0.0.dev0.

@adler-j
Owner

adler-j commented Nov 28, 2018

Are you getting the same problem with learned_primal_dual.py? Can you show a loss curve, for example?

@ChengV0

ChengV0 commented Nov 28, 2018

> Are you getting the same problem with learned_primal_dual.py? E.g. can you show a loss curve

image
I think so. There is no tendency for the results to improve.

@adler-j
Owner

adler-j commented Nov 28, 2018

What implementation of the raytransform are you using?

E.g.

>>> operator.impl
'astra_cuda'

@ChengV0

ChengV0 commented Nov 28, 2018

image
I just use the original code.

@adler-j
Owner

adler-j commented Nov 28, 2018

I would very strongly recommend that you install ASTRA. Try:

conda install -c astra-toolbox astra-toolbox
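
Before training, it can help to check which projector packages are importable at all. The sketch below uses only the standard library; `available_backends` and `backend_advice` are hypothetical helper names, and the link between an importable package and the backend that ODL reports in `operator.impl` is an assumption based on this thread, not something the snippet verifies.

```python
import importlib.util


def available_backends():
    """Report which projector packages can be imported in this environment."""
    return {name: importlib.util.find_spec(name) is not None
            for name in ("astra", "skimage")}


def backend_advice(backends):
    """Turn the availability report into a short hint (hypothetical helper)."""
    if backends["astra"]:
        return "astra is importable; confirm the backend with operator.impl"
    if backends["skimage"]:
        return ("only skimage is importable; consider: "
                "conda install -c astra-toolbox astra-toolbox")
    return "neither astra nor skimage is importable"


print(backend_advice(available_backends()))
```

This only tells you whether the Python packages are present; whether the CUDA backend actually works still has to be confirmed on the operator itself.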

@adler-j
Owner

adler-j commented Nov 28, 2018

My learning curves look like this:

image

I.e., it seems to be improving just fine.

The above is for "learned_chambolle_pock.py"

@ChengV0

ChengV0 commented Nov 28, 2018

> I would very strongly recommend you install astra, try
>
> conda install -c astra-toolbox astra-toolbox

image
Our results are different.

@ChengV0

ChengV0 commented Nov 28, 2018

> My learning curves look like this:
>
> image
>
> E.g. seems to be improving just fine.
>
> The above is for "learned_chambolle_pock.py"

Maybe you can send me the code you're running now so that I can have a look. I ran into problems installing astra-toolbox on Linux.

@adler-j
Owner

adler-j commented Nov 28, 2018

What problem did you encounter? I'm literally just running the code in this repo.

@ChengV0

ChengV0 commented Nov 28, 2018

> What problem did you encounter? I'm literally just running the code in this repo.

It frustrates me that I can't reproduce your results using source code.

@ChengV0

ChengV0 commented Nov 28, 2018

> My learning curves look like this:
>
> image
>
> E.g. seems to be improving just fine.
>
> The above is for "learned_chambolle_pock.py"

Why are there two curves on each graph, but mine has only one?

@adler-j
Owner

adler-j commented Nov 28, 2018

training and testing losses (blue is train, orange is test)

@ChengV0

ChengV0 commented Nov 28, 2018

> training and testing losses (blue is train, orange is test)

Oh, so the code you use is different from mine?

@adler-j
Owner

adler-j commented Nov 28, 2018

The "learned_chambolle_pock.py" file includes both training and testing losses. See e.g.

test_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/test',
                                            sess.graph)
train_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/train')

@ChengV0

ChengV0 commented Nov 28, 2018

> training and testing losses (blue is train, orange is test)

image
image
I got terrible results and I don't know why (primal_dual.py).

@ChengV0

ChengV0 commented Nov 28, 2018

> The "learned_chambolle_pock" file includes both training and testing losses. See e.g.
>
> learned_primal_dual/ellipses/learned_chambolle_pock.py
>
> Lines 152 to 154 in 64901e8
>
> test_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/test',
>                                             sess.graph)
> train_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/train')

Well, thank you.

@adler-j
Owner

adler-j commented Nov 28, 2018

It's very hard for me to debug remotely, but my best guess right now is that you need astra.

Apart from that, make sure that you have downloaded the latest version of this repo and ODL (e.g. re-install them).

Finally, what TF version do you run?

@ChengV0

ChengV0 commented Nov 28, 2018

OK, thank you very much for your patient answer. I will try again. My TF version is 1.8.0.

@adler-j
Owner

adler-j commented Nov 28, 2018

It's great to get feedback. I try to make sure the code is runnable by everyone. Please report any progress.

@ChengV0

ChengV0 commented Nov 29, 2018

> It's great to get feedback. I try to make sure the code is runnable by everyone. please report any progress.

image
image
The result is so bad (primal_dual.py). If it's convenient for you, I'd like to see your training summaries/images. Thank you.

@adler-j
Owner

adler-j commented Nov 29, 2018

Did you re-install this library and ODL as advised? Did you install ASTRA? If you do not follow my advice it's hard to help.

My GPU is currently quite busy, but my training curve is a rather smooth convergence towards ~37 PSNR, nothing like what you are seeing.

@ChengV0

ChengV0 commented Nov 29, 2018

> Did you re-install this library and ODL as advised? Did you install ASTRA? If you do not follow my advice it's hard to help.
>
> My GPU is currently quite busy, but my training curve is a rather smooth convergence towards ~37 PSNR, nothing like what you are seeing.

image
Thank you for your reply. I ran into difficulties installing astra-toolbox because the server has no conda, only pip3.
May I have a look at your training images (summaries/images)?
During training, is your GPU-Util similar to mine? What is the reason for this? Thank you again.

@adler-j
Owner

adler-j commented Nov 29, 2018

My curves look like this:

image

image

image

My GPU is way busier than that, i.e. not 100% but far higher than 1%. I guess mine is busy because I have installed ASTRA; without it, all the time is spent in scikit-image.

@ChengV0

ChengV0 commented Nov 29, 2018

> My curves look like this:
>
> image
>
> image
>
> image
>
> My GPU is way more busy than that, e.g. not 100% but far higher than 1%. I guess mine is busy because i have installed ASTRA, without it all the time is spent in scikit-image.

Thank you very much. You have helped me a lot.

@AceCoooool

AceCoooool commented Dec 10, 2018

> I would very strongly recommend you install astra, try
>
> conda install -c astra-toolbox astra-toolbox

I also get curves similar to @ChengV0's when running ellipses/learned_primal.py and ellipses/learned_primal_dual.py.

I thought astra-toolbox would be faster than skimage without influencing the results much. (Edit: it does influence the results, although I do not know why. With ASTRA I get learning curves like @adler-j's; with skimage I get learning curves like @ChengV0's.) For example, a plain FBP comparison gives nearly identical PSNR:

import astra
import numpy as np
import scipy.io
from skimage import measure
from skimage.transform import radon, iradon

# Load the phantom; phantom.mat must contain a 'phantom256' array
P = scipy.io.loadmat('phantom.mat')['phantom256']

# astra: parallel-beam geometry, 384 detector cells of width 1.0, 180 angles
vol_geom = astra.create_vol_geom(256, 256)
proj_geom = astra.create_proj_geom('parallel', 1.0, 384, np.linspace(0, np.pi, 180, False))

# Forward-project on the GPU, then reconstruct with FBP (Ram-Lak filter)
proj_id = astra.create_projector('cuda', proj_geom, vol_geom)
sinogram_id, sinogram = astra.create_sino(P, proj_id)
rec_id = astra.data2d.create('-vol', vol_geom)
cfg = astra.astra_dict('FBP_CUDA')
cfg['ReconstructionDataId'] = rec_id
cfg['ProjectionDataId'] = sinogram_id
cfg['option'] = {'FilterType': 'Ram-Lak'}
alg_id = astra.algorithm.create(cfg)
astra.algorithm.run(alg_id)
rec = astra.data2d.get(rec_id)

# Note: compare_psnr was renamed to skimage.metrics.peak_signal_noise_ratio
# in scikit-image 0.16
print("psnr: ", measure.compare_psnr(P.astype(np.float32), rec))

# Free the ASTRA objects
astra.algorithm.delete(alg_id)
astra.data2d.delete(rec_id)
astra.data2d.delete(sinogram_id)
astra.projector.delete(proj_id)

# skimage: same experiment with radon/iradon, angles given in degrees
img = P.astype(np.float64)
theta = np.linspace(0., 180., 180, endpoint=False)
sinogram = radon(img, theta=theta, circle=True)
reconstruction_fbp = iradon(sinogram, theta=theta, circle=True)

print("psnr: ", measure.compare_psnr(img, reconstruction_fbp))

psnr of astra: 34.125
psnr of skimage: 34.116
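
For reference, PSNR is 10 * log10(R^2 / MSE), where R is the data range and MSE is the mean squared error. The following is a minimal pure-Python sketch of that formula, not the scikit-image implementation; the function name `psnr` and the flat-list inputs are assumptions made for illustration, and `data_range` has to be passed explicitly here.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB for two equally long sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    # Mean squared error between the reference and the estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(data_range ** 2 / mse)


# A constant error of 0.1 on data with range 1 gives an MSE of about 0.01,
# i.e. a PSNR of about 20 dB.
reference = [0.0, 0.25, 0.5, 1.0]
estimate = [value + 0.1 for value in reference]
value_db = psnr(reference, estimate, data_range=1.0)
```

On this scale, the 0.009 dB gap between the two numbers above is negligible.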

@AceCoooool

@adler-j I have some questions (for convenience, I am not opening a new issue):

  1. The default geometry (odl.tomo.parallel_beam_geometry) gives a proj_space whose detector cell size (geometry.det_partition.cell_sides) is not equal to one. Is this better? Many implementations (e.g. the skimage radon function) use 1.
  2. The derivative (adjoint) of the projection is the back-projection. However, many libraries (e.g. astra-toolbox and skimage radon) implement the two in ways that do not match exactly.

Thank you @adler-j !
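
One common way to check the second question numerically is the dot-product test: for an exact adjoint pair, <A x, y> equals <x, A* y> for any x and y. The sketch below uses a small dense matrix as a stand-in for the projector; `forward`, `exact_adjoint` and `adjoint_mismatch` are hypothetical illustration names, not functions from ODL, ASTRA or skimage, and the factor 0.5 is an arbitrary example of a scaling mismatch.

```python
import random


def forward(matrix, x):
    """Apply the stand-in projector A (a dense matrix) to x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]


def exact_adjoint(matrix, y):
    """Apply the transpose A^T, which is the exact adjoint of forward()."""
    n_cols = len(matrix[0])
    return [sum(row[j] * yi for row, yi in zip(matrix, y)) for j in range(n_cols)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def adjoint_mismatch(apply_forward, apply_adjoint, n_in, n_out, rng):
    """Relative gap between <A x, y> and <x, A* y> for random x and y."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]
    y = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    lhs = dot(apply_forward(x), y)
    rhs = dot(x, apply_adjoint(y))
    return abs(lhs - rhs) / max(abs(lhs), abs(rhs), 1e-30)


rng = random.Random(0)
A = [[rng.uniform(0.0, 1.0) for _ in range(4)] for _ in range(6)]

# Matched pair: the gap is at the level of floating-point rounding.
matched = adjoint_mismatch(lambda x: forward(A, x),
                           lambda y: exact_adjoint(A, y), 4, 6, rng)

# Mismatched pair: a back-projection that is off by a constant factor
# fails the test, with a relative gap of about 0.5.
mismatched = adjoint_mismatch(lambda x: forward(A, x),
                              lambda y: [0.5 * v for v in exact_adjoint(A, y)],
                              4, 6, rng)
```

The same test can in principle be run against a real projector and back-projector by passing their call wrappers instead of the lambdas.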
