[NDTensors] Testing Dagger.jl integration #1535

Closed
wants to merge 7 commits

Conversation

kmp5VT
Collaborator

kmp5VT commented Oct 7, 2024

Description

Dagger.jl is a tool for distributed tensor operations. If we assume that Dagger.jl is in charge of all of the data blocking and distribution and, for the moment, ignore blockwise sparsity, we should be able to use a DArray (Dagger array) as the underlying data storage in a Tensor. Right now I think Dagger can only specify a single uniform block extent per mode (for example, blocks of [2,2,2,2,2] for a mode of size 10; it does not appear possible to use non-uniform blocks like [2,3,5]). I also do not know whether Dagger.jl supports blockwise sparsity or assumes all blocks are present.

So far, with these changes one is able to do naive operations like tensor addition and tensor `*`. Operations like `contract` and linear algebra factorizations are not yet supported. I also do not know whether this is actually distributing the data and the work.
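
Below is a minimal sketch of the kind of Dagger.jl usage described above, based on Dagger's documented `DArray` API (`Blocks` and the `rand(Blocks(...), dims...)` constructor); the NDTensors `Tensor` wrapping from this PR is not shown, and whether each operation is actually supported on `DArray` is an assumption to be tested.

```julia
using Dagger

# Uniform block partitioning: every block along a mode has the same extent,
# e.g. a 10×10 array split into 2×2 blocks ([2,2,2,2,2] along each mode).
# A non-uniform partition like [2,3,5] does not appear to be expressible.
A = rand(Blocks(2, 2), 10, 10)   # DArray with 2×2 blocks
B = rand(Blocks(2, 2), 10, 10)

C = A + B      # naive addition on DArrays (assumed to be defined for DArray)
D = 2 * A      # scaling via the generic number-times-array fallback

collect(C)     # gather the distributed result back into an ordinary Array
```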

Checklist:

  • ITensors can be constructed with the DArray type
  • All canonical tensor operations work with DArray
  • DArray can perform distributed dense tensor operations

kmp5VT requested a review from mtfishman October 7, 2024 21:09
kmp5VT marked this pull request as draft October 7, 2024 21:09
kmp5VT requested a review from emstoudenmire October 7, 2024 21:09
@mtfishman
Member

Very cool, I'm excited to see this started!

@emstoudenmire
Collaborator

Very interested to see how well this might work.

@mtfishman
Member

mtfishman commented Oct 8, 2024

Some comments on your original post:

  1. Interesting that you can't make a DArray with non-uniform block sizes within a dimension/mode. I also looked and couldn't find it in the docs, and tried to delve into the source code, but there isn't even a simple private API for it. They convert the Blocks struct into range information for each block and then use that range information to construct the DArray, but that logic is hidden pretty deep in the code. I imagine it wouldn't be hard to add that feature if we need it. I wish DArray used the block array interface from BlockArrays.jl, since all of that is standardized there and better integrated into the Julia AbstractArray interface through the use of blocked axes.
  2. I was picturing that for block sparse tensors, we would just have each block of the block sparse tensor be a DArray of the size of that block, with its own partitioning (see the first sketch after this list). It would then be up to Dagger to decide on which process each distributed piece of each block goes, where operations occur, etc. (or we can try to direct it if it isn't distributing things in a good way). As you know, that isn't so easy with the current design of the BlockSparse tensor storage type, since it points to offsets in a single underlying piece of data, but it would be easier with the new BlockSparseArrays design. So for now we can just focus on the Dense case for exploratory work, and hopefully BlockSparseArrays will be ready by the time we are interested in trying out distributed block sparse operations.
  3. I see a roadmap for distributed array operations in Dagger here: https://github.com/JuliaParallel/Dagger.jl/blob/master/FEATURES_ROADMAP.md#darrays-1. So LU is implemented, QR is in progress, and SVD is planned but I don't see any work on it yet. As alternatives, we can try out techniques like https://arxiv.org/abs/2204.05693 or https://arxiv.org/abs/2212.09782, which propose ways of performing the tensor factorizations involved in algorithms like DMRG and TEBD that are more amenable to distributed or GPU environments.
  4. For dense tensor contraction, as you probably know, if DArray implements matrix multiplication and tensor transpose/dimension permutation, then ideally we can stick a Tensor with Dense storage wrapping DArray data into the generic TTGT dense contraction code, and in that way dense distributed tensor contraction should "just work" (though of course the devil is in the details and the performance may not be ideal); see the second sketch after this list.
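
To illustrate the storage layout suggested in point 2, here is a small, hypothetical sketch in which each stored (structurally nonzero) block of a block sparse tensor is its own independently partitioned DArray; the `blocks` dictionary and block indexing below are purely illustrative and are not the BlockSparseArrays or NDTensors API.

```julia
using Dagger

# Hypothetical layout: only stored blocks are kept, keyed by their block index,
# and each stored block is its own DArray with its own partitioning. Dagger is
# then free to decide where the pieces of each block live and where work runs.
blocks = Dict{Tuple{Int,Int},Dagger.DArray}()

# Two stored diagonal blocks of a block sparse matrix with block sizes 4 and 6:
blocks[(1, 1)] = rand(Blocks(2, 2), 4, 4)
blocks[(2, 2)] = rand(Blocks(3, 3), 6, 6)

# A blockwise operation only touches stored blocks; for example, scale each block
# (assuming scaling is supported on DArray):
scaled = Dict(b => 2 * x for (b, x) in blocks)
```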

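To make point 4 concrete, here is a minimal sketch of the TTGT (transpose-transpose-GEMM-transpose) pattern for a single pairwise contraction. The `ttgt_contract` function and its index layout are illustrative, not NDTensors' actual contraction code; the hope is that the same `permutedims`/`reshape`/`*` calls would dispatch to Dagger's distributed implementations when the inputs are DArrays, if those methods are defined.

```julia
# Hypothetical TTGT sketch: contract A[i,k,j] with B[k,l] over k,
# giving C[i,j,l] = Σ_k A[i,k,j] * B[k,l].
function ttgt_contract(A::AbstractArray{<:Any,3}, B::AbstractMatrix)
    i, k, j = size(A)
    @assert k == size(B, 1)
    l = size(B, 2)
    # "Transpose" step: permute so the contracted index k is last,
    # then flatten the uncontracted indices into one matrix dimension.
    Aperm = permutedims(A, (1, 3, 2))   # A[i,j,k]
    Amat  = reshape(Aperm, i * j, k)    # (i*j) × k
    # "GEMM" step: an ordinary matrix multiplication; for DArray inputs this
    # would need to dispatch to Dagger's distributed matmul.
    Cmat = Amat * B                     # (i*j) × l
    # Final "transpose" step: unflatten back into the tensor result.
    return reshape(Cmat, i, j, l)
end

# Works for ordinary Arrays; running the same code on DArrays depends on
# Dagger defining permutedims, reshape, and * for DArray.
A = randn(4, 3, 5)
B = randn(3, 2)
C = ttgt_contract(A, B)   # size(C) == (4, 5, 2)
```
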
@mtfishman
Member

I'll close this since we will test this out with the new NamedDimsArray design instead.

mtfishman closed this Nov 19, 2024