[ENH] Coiteration with hashed level #25

adam2392 · 2023-06-11T16:00:11Z

Supersedes: #24 and #19

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-06-12T14:41:16Z

My current thinking is the following:

advance_iter, this advances the iterator for each level using the ++ operator. We should leave this as is?
in operator++(), we want to advance_iter only for the levels that are ordered since otw, we can just locate into it. I would want something along these lines, but this wouldn't work.

auto levelMask = m_coiterHelper.m_coiterate.ordered_level_mask();
                    std::size_t i = 0;
                    std::apply([&](auto&... args) {
                        ((std::get<i++>(levelMask) ? advance_iter(args) : void()), ...);
                    }, iterators);

dereferencing (i.e. get_PKs): The current function gets the second element of the iterator currently if the iterator is currently at min_ik. In a hashed level, how would we locate into the element after min_ik? Then, we can modify the locate function to either: i) get the second element if the level is ordered, or ii) locate into the next index if the level is unordered. I'm unsure how to best tackle this.
calc_min_ik(): I'm going back and forth on whether this should also get modified. Since it computes the current minimum index, then does it just ignore hashed level in this computation?

hameerabbasi · 2023-06-13T08:27:14Z

In all but the third point, you would do the calculation exactly as-is, just only with the ordered levels. Just the deference would take into account the locate-capable unordered levels.

The reason is that we ensure a conjunctive merge with unordered levels so the minimum/advance (the last of which is the same as operator++) etc will work exactly as with just the ordered levels.

adam2392 · 2023-06-14T21:10:17Z

06/14/23 Notes

Today, @bharath2438 and I met briefly. Here are a few notes that I am currently implementing:

Coiterate::coiteration_helper::get_PKs() should dereference a level iterator that does not have locate, but if the level's iterator has locate, then use locate(min_ik) to return the PK (or nullopt). We check for the locate function using the compile-time function has_locate_v that is implemented in level_capabilities/locate.hpp
Rename Coiterate::coiteration_helper::locate(...) to deref_PKs because naming it "locate" makes it a bit confusing for the get_PKs implementation where we are calling locate.

Once this is done, co-iteration with arbitrary tuple of levels is supported. Now, we have been discussing adding optimizations to the co-iteration when a conjunctive merge takes place. However, Bharath and I couldn't come to a consensus here...

When some levels are unordered, and the conjunctive merge criterion is met, then the current proposal was to iterate entirely through the ordered levels. However, it seems like it could also be highly inefficient(?) because we do not know if an unordered level has reached its end. For example,

say we have A \intersect B and A has 1000 non-zero elements and B has 1 non-zero element. If A is ordered and B is not, we have to iterate over the entirety of A, when in reality, we should be able to exit early.

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-06-14T21:56:00Z

A few issues on commit 78d5cbc

has_locate_v does not work as expected because it is trying to apply args.locate when args is a dense level.

In file included from /Users/adam2392/Documents/xsparse/test/source/coiteration_test.cpp:17:
/Users/adam2392/Documents/xsparse/include/xsparse/level_capabilities/co_iteration.hpp:191:64: error: no member named 'locate' in 'xsparse::level_capabilities::coordinate_value_iterate<xsparse::levels::dense, std::tuple<>, unsigned long, unsigned long, xsparse::level_properties<true, true, true, false, true>>::iteration_helper::iterator'
                                                        ? args.locate(m_coiterHelper.m_pkm1, min_ik)
                                                          ~~~~ ^

Investigating further, if I add static_assert(has_locate_v<decltype(h)>); to the TEST_CASE("Hashed-BaseCase") test, then this fails, even though h is a hashed level with locate().

/Users/adam2392/Documents/xsparse/test/source/hashed_test.cpp:45:5: error: static_assert failed due to requirement 'has_locate_v<xsparse::levels::hashed<std::tuple<>, unsigned long, unsigned long, xsparse::util::container_traits<std::vector, std::set, std::unordered_map>, xsparse::level_properties<false, false, false, false, false>>>'
    static_assert(has_locate_v<decltype(h)>);
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

It's possible has_locate_v is not implemented correctly, so I'm taking a look at that now.

hameerabbasi · 2023-06-15T07:21:01Z

say we have A \intersect B and A has 1000 non-zero elements and B has 1 non-zero element. If A is ordered and B is not, we have to iterate over the entirety of A, when in reality, we should be able to exit early.

I would be happy to hear about generalised solutions to this issue but sadly I do not immediately see any with unordered levels.

One thing I just thought of is that in advance_iter we'd need to locate into all the locate levels and ensure that they have a value at that location; otherwise advance again (since the merge is conjunctive).

adam2392 · 2023-06-15T18:54:45Z

I would be happy to hear about generalised solutions to this issue but sadly I do not immediately see any with unordered levels.

Would it be too brute-force to have the MergeLattice convert all unordered levels to ordered?

Alternatively, is there a cheap way to inspect the "size" of each level that is co-iterated? So we can then either co-iterate by advancing all iterators, or doing this advance-and-locate strategy?

One thing I just thought of is that in advance_iter we'd need to locate into all the locate levels and ensure that they have a value at that location; otherwise advance again (since the merge is conjunctive).

Why? If one is null opt, we're good to move on? Is this assuming we have a conjunction over only unordered levels?

has_locate_v

I think the error arose from the fact that it checked for member, not a member function with possible arguments. I changed the signature of the has_locate check and the static_assert(has_locate_v<decltype(h)>) now works as expected in hashed_test.hpp.

However, w/ this change, I get the following errors in 5fc8476:

/Users/adam2392/Documents/xsparse/include/xsparse/level_capabilities/co_iteration.hpp:191:64: error: no member named 'locate' in 'xsparse::level_capabilities::coordinate_value_iterate<xsparse::levels::dense, std::tuple<>, unsigned long, unsigned long, xsparse::level_properties<true, true, true, false, true>>::iteration_helper::iterator'
                                                        ? args.locate(m_coiterHelper.m_pkm1, min_ik)
                                                          ~~~~ ^

It's problematic because I think it is looking for locate inside dense... but has_locate_v(args) should evaluate to false when it is dense. There are a few issues:

locate is actually defined as part of the hashed level, not part of the hashed::iterator, so I don't think has_locate_v<decltype(args)> would evaluate to True when args corresponds to the hashed level.
for some reason... args.locate(...) is being called on dense, even tho they do not have a locate function.

Signed-off-by: Adam Li <[email protected]>

hameerabbasi · 2023-06-15T20:04:47Z

Would it be too brute-force to have the MergeLattice convert all unordered levels to ordered?

Yeah, that would be asymptotically worse.

Alternatively, is there a cheap way to inspect the "size" of each level that is co-iterated? So we can then either co-iterate by advancing all iterators, or doing this advance-and-locate strategy?

The merging strategy only works if iterators are ordered, therefore, advancing all iterators won't work. We have to locate into unordered levels.

Why? If one is null opt, we're good to move on? Is this assuming we have a conjunction over only unordered levels?

Ah, no. Advance iterator would advance only the ordered levels, but use the function F over all levels. That should already do what's required.

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-06-16T20:49:15Z

NOTES 6/16/23:

Dense level should have locate function
change decltype level properties to static_assert

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-06-21T02:52:30Z

Few comments:

snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?
Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

The code correctly co-iterates when there is a valid value at indices 0, 1, and 2, but as soon as it reaches index 3, there is a segfault. My guess is that there is a segfault when doing get_PKs_level for the hashed level, where locate returns null opt.

hameerabbasi · 2023-06-21T17:46:08Z

snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?

I'd do that for all types and template parameters.

Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

std::optional<T> can be implicitly cast to T. I'd try to figure out if we're doing this somewhere.

adam2392 · 2023-06-22T01:07:11Z

snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?

I'd do that for all types and template parameters.

Okay will do that in a separate PR.

Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

std::optional<T> can be implicitly cast to T. I'd try to figure out if we're doing this somewhere.

So after some digging, it turns out this is because the min_ik is getting set to 0 after the hashed level has finished iterating. Inside calc_min_ik, it turns out the min_ik goes from: 0, 1, 2, to 0 again, when it should ignore the hashed level. This causes the segfault.

Update: On further investigation, I think it could be because we are advancing the iterator regardless of hashed, or ordered every time. The iterator advances in an unordered fashion for the hashed level, so it results in some weird behavior.

I suppose the fix would be to not advance iterators corresponding to hashed level. I will add this since this is required anyways for the advance-and-locate iteration scheme (we just eat the cost of certain conjunctive merges e.g. in #25 (comment))

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-06-23T18:37:14Z

Leaving notes for the call on Sunday. My blocker for this is some compiler error that I'm trying to understand: b5d0960.

Basically, I am trying to apply a function to a tuple of iterators given a bool condition specified by mask. The mask is the tuple of same length as the tuple of iterators. I thought I got it working on a sample function that just prints out something, but now there's an error, so I need to re-analyze if this is the right approach.

By doing advance_ordered_iter instead of advance_iter, one would only need to advance the iterators corresponding to ordered levels.

[Update]: I guess I am unsure how to best tackle this. I want to basically use template meta-programming to apply advance_iter recursively on all the iterators, but only if the corresponding mask element is true.

Signed-off-by: Adam Li <[email protected]>

bharath2438 · 2023-06-28T14:50:30Z

Hey @adam2392, in get_PKs_level(), passing std::get<I>(iterators) to get_PK_iter fails because it is not possible to bind the temporary iterator with iter&.
So, I would change get_PKs_level() as follows:

                template <std::size_t I>
                inline auto get_PKs_level() const noexcept
                {
                    using iter_type = std::tuple_element_t<I, decltype(iterators)>;
                    iter_type it(std::get<I>(iterators));
                    return get_PK_iter<iter_type, I>(it);
                }

This should compile.

I guess static_assert(false) always results in a static assertion failed? Does removing it results in successful compilation?

adam2392 · 2023-06-28T15:59:07Z

This should compile.

Thanks! This + a few minor fixes made it work and I am back to the runtime segfault, but all other tests pass, so we're good. I will refactor the calc min ik now.

I guess static_assert(false) always results in a static assertion failed? Does removing it results in successful compilation?

Yes this is true. I changed it to throw std::invalid_argument("levels should either be ordered or have locate function.");.

adam2392 · 2023-06-30T00:25:24Z

@hameerabbasi and @bharath2438 the main issue now is that the min_ik is progressing as we might expect with dense and hashed: 0, 1, 2, 3, 4, 4, 4, 4, .... I validated this with some print statements and the debugger. It's just infinite going unless I put a break in there.

The issue is that it continues iterating. My guess is that the coiterator does not realized it's "finished". How should we address this?

bharath2438 · 2023-06-30T10:39:31Z

Just to clarify - since this is a conjunctive merge, it should stop after printing 0, 1, 2. Correct?

adam2392 · 2023-06-30T14:12:41Z

Yes that is a good point that I missed.

It seems that the current coiteration code supports disjunction over ordered levels only? So then I guess there needs to be some if/else on whether or not the coiteration ended depending on the function F?

bharath2438 · 2023-06-30T15:18:12Z

It seems that the current coiteration code supports disjunction over ordered levels only? So then I guess there needs to be some if/else on whether or not the coiteration ended depending on the function F?

The immediate problem I see here is, we are never really advancing hashed, since it is unordered. Hence, hashed stays in the beginning and the whole have hashed and dense both reached their end part fails forever.

bharath2438 · 2023-06-30T15:36:32Z

One solution that I can think of is -

Maintain a max_ik for each of the unordered levels.
In advance_iter, set the iterator of unordered level to its end if (min_ik == max_ik_of_that_level)

hameerabbasi · 2023-06-30T15:44:14Z

Or... Just don't check the unordered levels at all when checking for the end condition.

adam2392 · 2023-07-05T02:05:13Z

Or... Just don't check the unordered levels at all when checking for the end condition.

Then this would imply us altering the end() method of the coiteration iterator? But shouldn't we also then alter begin()?

hameerabbasi · 2023-07-05T04:54:28Z

Or... Just don't check the unordered levels at all when checking for the end condition.

Then this would imply us altering the end() method of the coiteration iterator? But shouldn't we also then alter begin()?

It would imply altering the compare function. Otherwise everything else would break. 😉

hameerabbasi · 2023-07-05T10:15:47Z

Maybe this will help: We can do the filtering just before the comparison.

https://stackoverflow.com/a/50971954

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-07-05T17:50:21Z

Maybe this will help: We can do the filtering just before the comparison.

https://stackoverflow.com/a/50971954

I tried a similar solution that was more in-line w/ the rest of the code's design: dc1b236

So the comparison basically outputs:

true if the iterator is associated with an unordered level
it_current == it_end if the iterator is ordered

This tuple of booleans are then fed into the comparisonHelper function F. For example F(true, true, false, false, ...). Is this valid then for me to default to returning true whenever the iterator is unordered?

hameerabbasi · 2023-07-05T18:21:33Z

Ouch... We need to feed in true for every unordered level.

adam2392 · 2023-07-05T18:34:13Z

Ouch... We need to feed in true for every unordered level.

Heh yeah it's kinda janky, but assuming the Coiterator is used with levels that have at least one ordered always and no disjunction containing unordered, this is fine?

I can clean up the PR if so, and make the unit-test pass now. I think everything works as it should.

hameerabbasi · 2023-07-05T18:57:34Z

SGTM

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-07-06T02:23:16Z

Kay this is good to review @hameerabbasi @bharath2438

adam2392 · 2023-07-06T02:37:07Z

I plan on handling the compiler-time check for valid conjunctive merge using the comparisonHelper in a follow-on PR.

Signed-off-by: Adam Li <[email protected]>

bharath2438 · 2023-07-06T04:15:57Z

@adam2392, in the test case, why are we checking if (it1 == end1) == true (checking if dense has reached its end). As this is a conjunctive merge, shouldn't we check if hashed has reached its end instead? (since it is smaller than dense)

adam2392 · 2023-07-06T05:33:31Z

Since we don't advance unordered iterators, this is not the case.

Since there's no universally optimal way of advancing the ordered and unordered simultaneously, I just support the advance-and-locate strategy

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-07-06T15:31:45Z

Okay all the CIs pass now

hameerabbasi

This looks excellent to me, thanks @adam2392.

adam2392 added 4 commits June 11, 2023 11:59

Working simple changes

276773c

Signed-off-by: Adam Li <[email protected]>

Add coiteration ordered level mask

84e1471

Signed-off-by: Adam Li <[email protected]>

Fixed coiterations

7a852ae

Signed-off-by: Adam Li <[email protected]>

Wip

3633a67

Signed-off-by: Adam Li <[email protected]>

adam2392 mentioned this pull request Jun 12, 2023

[ENH] Level capabilities refactor for hashed level #24

Closed

WIP after bharath meeting

78d5cbc

Signed-off-by: Adam Li <[email protected]>

Fixed has_locate_v

5fc8476

Signed-off-by: Adam Li <[email protected]>

WIP

2508e94

Signed-off-by: Adam Li <[email protected]>

adam2392 added 4 commits June 17, 2023 11:28

Works -s

f1e6b62

Segfault when accessing data

8269a99

Signed-off-by: Adam Li <[email protected]>

Clean up comments

83997fd

Signed-off-by: Adam Li <[email protected]>

Dissect the test

117baff

Signed-off-by: Adam Li <[email protected]>

Trying to advance only ordered iterators

b5d0960

Signed-off-by: Adam Li <[email protected]>

adam2392 added 3 commits June 24, 2023 17:18

WIP

0214423

Signed-off-by: Adam Li <[email protected]>

Try refactor get_PKs

0473b64

Signed-off-by: Adam Li <[email protected]>

WIP with Bharath

5f56d71

Signed-off-by: Adam Li <[email protected]>

This seems to work?

dc1b236

Signed-off-by: Adam Li <[email protected]>

adam2392 added 3 commits July 5, 2023 16:04

Fix naming of template parameters and make unit-test pass

625aec4

Signed-off-by: Adam Li <[email protected]>

Update PR to make tests pass

69febeb

Signed-off-by: Adam Li <[email protected]>

Fix unit-tests

e8c7bdb

Signed-off-by: Adam Li <[email protected]>

Fix diff

7794317

Signed-off-by: Adam Li <[email protected]>

Try again

e7a1ba4

Signed-off-by: Adam Li <[email protected]>

adam2392 mentioned this pull request Jul 7, 2023

[ENH] Check validity of coiteration #26

Merged

hameerabbasi approved these changes Jul 8, 2023

View reviewed changes

hameerabbasi merged commit b27b680 into hameerabbasi:main Jul 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Coiteration with hashed level #25

[ENH] Coiteration with hashed level #25

adam2392 commented Jun 11, 2023

adam2392 commented Jun 12, 2023

hameerabbasi commented Jun 13, 2023

adam2392 commented Jun 14, 2023

adam2392 commented Jun 14, 2023

hameerabbasi commented Jun 15, 2023

adam2392 commented Jun 15, 2023 •

edited

Loading

hameerabbasi commented Jun 15, 2023

adam2392 commented Jun 16, 2023

adam2392 commented Jun 21, 2023

hameerabbasi commented Jun 21, 2023

adam2392 commented Jun 22, 2023 •

edited

Loading

adam2392 commented Jun 23, 2023 •

edited

Loading

bharath2438 commented Jun 28, 2023

adam2392 commented Jun 28, 2023

adam2392 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023 •

edited

Loading

adam2392 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023

hameerabbasi commented Jun 30, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

bharath2438 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

hameerabbasi left a comment

[ENH] Coiteration with hashed level #25

[ENH] Coiteration with hashed level #25

Conversation

adam2392 commented Jun 11, 2023

adam2392 commented Jun 12, 2023

hameerabbasi commented Jun 13, 2023

adam2392 commented Jun 14, 2023

06/14/23 Notes

adam2392 commented Jun 14, 2023

hameerabbasi commented Jun 15, 2023

adam2392 commented Jun 15, 2023 • edited Loading

hameerabbasi commented Jun 15, 2023

adam2392 commented Jun 16, 2023

adam2392 commented Jun 21, 2023

hameerabbasi commented Jun 21, 2023

adam2392 commented Jun 22, 2023 • edited Loading

adam2392 commented Jun 23, 2023 • edited Loading

bharath2438 commented Jun 28, 2023

adam2392 commented Jun 28, 2023

adam2392 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023 • edited Loading

adam2392 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023

bharath2438 commented Jun 30, 2023

hameerabbasi commented Jun 30, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 5, 2023

hameerabbasi commented Jul 5, 2023

adam2392 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

bharath2438 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

adam2392 commented Jul 6, 2023

hameerabbasi left a comment

Choose a reason for hiding this comment

adam2392 commented Jun 15, 2023 •

edited

Loading

adam2392 commented Jun 22, 2023 •

edited

Loading

adam2392 commented Jun 23, 2023 •

edited

Loading

bharath2438 commented Jun 30, 2023 •

edited

Loading