Excessive memory usage when using source grouping #1808
Here's the report for 8, 16, and 32 GB:

[memory profiler reports]

and 64 GB is still going.

64 GB eventually failed:

[memory profiler report]

That last failure may indicate a problem with the source parameter validation in IterativePSFPhotometry.

The question is why a source landed on a completely masked region.
One OOM explanation is the compound models. This (example suggested by Larry on Slack):

```python
from tqdm.auto import tqdm

# dao_psf_model is assumed to be an existing PSF model instance
models = []
x = 1000000
for i in tqdm(range(x)):
    models.append(dao_psf_model.copy())
    if i == 0:
        psf_model = dao_psf_model
    else:
        psf_model += dao_psf_model
```

gives me:

[memory profiler output]
Confirmed that grouping causes the problem; without the grouper, 8 GB works:

[memory profiler report]

The issue is due to excessive memory used by compound Astropy models: astropy/astropy#16701
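The growth can be reproduced with plain Astropy models, independent of photutils. Here is a minimal sketch (my own illustration, using only astropy and the standard library; the Gaussian2D model and the count of 100 sources are placeholders for whatever PSF model and group size you have):

```python
import tracemalloc

from astropy.modeling.models import Gaussian2D

tracemalloc.start()

# build one compound model by summing many single-source models,
# mimicking what source grouping does for a single large group
compound = Gaussian2D(amplitude=1.0, x_mean=0.0, y_mean=0.0)
for _ in range(99):
    compound = compound + Gaussian2D(amplitude=1.0, x_mean=0.0, y_mean=0.0)

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
tracemalloc.stop()
```

The reported memory grows rapidly with the number of sources in the group, which is the behavior tracked in astropy/astropy#16701.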
I was looking into this issue (since it has been an issue for me while running PSF photometry on a 250x250 cutout with 100-400 sources: often requiring more than 16 GB, sometimes 32 GB on a cluster) and noticed that the memory allocation (heap size) keeps increasing during a loop in which I'm newly initializing and calling IterativePSFPhotometry (as a wrapped function call). Maybe I'm not understanding how Python memory and garbage collection works, but I was expecting to see a rise and drop in memory usage for each iteration. Is this a system-specific behavior, the upstream Astropy issue, or because of my object structure? Not an MWE, but the structure is something like this:

```python
class MyClass:
    def __init__(self, data, error):
        self.data = data
        self.error = error

    def do_photometry(self, **kwargs):
        # . . .
        # initialize SourceGrouper, LocalBackground, and DAOStarFinder here...
        self.do_some_stuff()
        self.do_more_stuff()
        # . . .
        # newly initialize the IterativePSFPhotometry object
        psf_iter = IterativePSFPhotometry(**some_kwargs)
        phot_result = psf_iter(self.data, error=self.error)
        # don't models and other large memory-eating objects get deleted upon return?
        return phot_result


if __name__ == "__main__":
    my_class = MyClass(data, error)
    # iterating over some kwargs to test photometry with different settings
    for kwargs in list_of_kwargs[:10]:
        my_class.do_photometry(**kwargs)
        # memory heap size keeps increasing during the whole loop
        # resident size drops once or twice, but generally keeps increasing too
```
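One way to check whether the per-iteration growth comes from objects that are still reachable (as opposed to the allocator simply not returning freed pages to the OS) is to force a collection and compare allocation snapshots between iterations. A minimal sketch using only the standard library, dropped into the loop from the structure above:

```python
import gc
import tracemalloc

tracemalloc.start()
previous = tracemalloc.take_snapshot()

for kwargs in list_of_kwargs[:10]:
    my_class.do_photometry(**kwargs)
    gc.collect()  # collect any unreachable reference cycles

    snapshot = tracemalloc.take_snapshot()
    # print the allocation sites that grew the most since the previous iteration
    for stat in snapshot.compare_to(previous, 'lineno')[:5]:
        print(stat)
    previous = snapshot
```

If the top entries keep growing even after gc.collect(), something is still holding references to the fitted models; if the growth disappears, the rising heap size is allocator behavior rather than a leak.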
@SterlingYM The excessive memory issue is triggered when using source grouping. Source grouping creates a compound Astropy model. Every source in the group contributes part of the compound model so that the group can be fit simultaneously. If the number of sources in the group gets large, the compound Astropy model requires a huge amount of memory. When Astropy issue astropy/astropy#16701 is fixed, this will no longer be a problem. In the meantime, you can try limiting your group sizes with a larger separation if you are running into this issue.
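For anyone else hitting this, a sketch of what that suggestion looks like in code (not from the thread; psf_model, finder, data, and error are placeholders, and the min_separation value is arbitrary). SourceGrouper's min_separation sets the distance below which two sources are placed in the same group, so it directly controls the group sizes and therefore the size of the compound model that gets built:

```python
from photutils.psf import IterativePSFPhotometry, SourceGrouper

# sources closer than min_separation (pixels) end up in the same group;
# tune this to keep the groups, and the compound model, small
grouper = SourceGrouper(min_separation=10)

psfphot = IterativePSFPhotometry(psf_model, fit_shape=(7, 7), finder=finder,
                                 grouper=grouper, aperture_radius=5)
phot_result = psfphot(data, error=error)
```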
I've changed the title of the issue to indicate that this is specifically triggered by source grouping.

Adjusting group size helped with memory usage. Thank you for the suggestion!
I have an MWE that fails reliably now. It's not all that minimal, but minimal enough, I hope.
This runs on a 400x400 image. It runs out of memory on an 8 GB node. I'm running it with the memory profiler on 16 GB, 32 GB, and 64 GB to see what the peak usage is and to see if I can get it to complete.
I think peak memory usage is happening somewhere in the nth iteration.
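As a quick way to capture that kind of peak number without an external profiler, the standard library can report the peak resident set size after the run (a sketch; run_mwe is a placeholder for the MWE's photometry call, and the resource module is Unix-only):

```python
import resource

phot_result = run_mwe()  # placeholder for the MWE's IterativePSFPhotometry call

# peak resident set size of this process (kilobytes on Linux)
peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS: {peak_rss_kb / 1e6:.2f} GB")
```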