Added batching in transductive setting #128

Open
wants to merge 29 commits into base: main



Coerulatus
Collaborator

Hello everyone,

I have added support for batching the data in the transductive setting.
When working with large graphs, selecting a subset of the graph while keeping the model's performance unchanged for the desired nodes can drastically reduce memory requirements during training and inference.
In torch_geometric, NeighborLoader achieves this through neighbor sampling. This works because, in the standard message-passing framework, information propagates only as far as the number of message-passing steps performed.
The newly added NeighborCellsLoader works similarly, but it also selects the relevant higher-order cells by sequentially reducing all the incidences.
The loader also lets you specify the rank to consider, so you can batch over nodes, edges, or any higher-order cell.
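
As a rough illustration of the selection step described above, here is a minimal sketch in plain Python. It does not use the NeighborCellsLoader API. The helpers `k_hop_nodes` and `reduce_incidence` are hypothetical names, and the rule "keep a cell only if its whole boundary was kept" is an assumption about what reducing the incidences amounts to, not a description of the actual implementation.

```python
from collections import deque


def k_hop_nodes(adj, seeds, k):
    """Return all nodes within k graph hops of any seed (breadth-first search)."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def reduce_incidence(cells, kept_lower):
    """Keep the cells whose boundary lies entirely in kept_lower (assumed rule)."""
    return {name: b for name, b in cells.items() if set(b) <= kept_lower}


# Toy complex: path 0-1-2-3-4 plus the edge 0-2, with the triangle (0, 1, 2) filled in.
edges = {"e01": (0, 1), "e12": (1, 2), "e02": (0, 2), "e23": (2, 3), "e34": (3, 4)}
faces = {"f012": ("e01", "e12", "e02")}

# Build the node adjacency from the edge list.
adj = {n: set() for n in range(5)}
for u, v in edges.values():
    adj[u].add(v)
    adj[v].add(u)

# Sample around seed node 0, then reduce rank by rank: nodes, edges, faces.
nodes = k_hop_nodes(adj, seeds=[0], k=1)
kept_edges = reduce_incidence(edges, nodes)
kept_faces = reduce_incidence(faces, set(kept_edges))
```

In this toy case the batch around node 0 keeps the triangle and its three edges, while the tail 2-3-4 is dropped. Batching over a different rank would start the same procedure from edges or faces instead of nodes.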

I have also added a tutorial that shows the basic functionality of NeighborCellsLoader. It also checks that the approach works as expected by comparing the model's outputs on the full graph with its outputs on the batched one. Interestingly, in higher-order networks the number of hops needed is not necessarily equal to the number of layers: in these models, information can in general travel further than the 1-neighborhood at each layer.
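
To see why hops and layers can differ, consider a small sketch in plain Python. It assumes a layer that passes messages from nodes up to the cells containing them and back down in a single step; this is an assumption made for illustration, not a statement about any specific model in the repository. The helpers `graph_distance` and `one_layer_reach` are hypothetical.

```python
from collections import deque


def graph_distance(adj, src, dst):
    """Shortest-path length between two nodes in the underlying graph (BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # dst is not reachable from src


def one_layer_reach(node, cells):
    """Nodes sharing at least one cell with `node`: up to a cell, then back down."""
    reach = set()
    for members in cells:
        if node in members:
            reach |= set(members)
    return reach - {node}


# A hexagon: six nodes on a cycle, with the interior filled by a single 2-cell.
n = 6
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
edge_cells = [(i, (i + 1) % n) for i in range(n)]
face_cells = [tuple(range(n))]

hops_0_to_3 = graph_distance(adj, 0, 3)
reach_via_edges = one_layer_reach(0, edge_cells)
reach_via_faces = one_layer_reach(0, edge_cells + face_cells)
```

Nodes 0 and 3 are three hops apart in the graph, yet they share the 2-cell, so under the stated assumption a single layer already connects them. A loader that only counted one hop per layer would miss node 3 here.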


Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 92.72727% with 16 lines in your changes missing coverage. Please review.

Project coverage is 90.24%. Comparing base (5151fe0) to head (ad86bca).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| topobenchmark/data/batching/cell_loader.py | 85.18% | 8 Missing ⚠️ |
| topobenchmark/data/batching/utils.py | 94.95% | 6 Missing ⚠️ |
| ...pobenchmark/data/batching/neighbor_cells_loader.py | 96.15% | 1 Missing ⚠️ |
| topobenchmark/nn/readouts/propagate_signal_down.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #128      +/-   ##
==========================================
+ Coverage   89.61%   90.24%   +0.62%     
==========================================
  Files         129      133       +4     
  Lines        3670     3884     +214     
==========================================
+ Hits         3289     3505     +216     
+ Misses        381      379       -2     

