Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Coiteration with hashed level #25

Merged
merged 23 commits into from
Jul 8, 2023
Merged

Conversation

adam2392
Copy link
Contributor

Supersedes: #24 and #19

adam2392 added 4 commits June 11, 2023 11:59
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Contributor Author

My current thinking is the following:

  • advance_iter, this advances the iterator for each level using the ++ operator. We should leave this as is?
  • in operator++(), we want to advance_iter only for the levels that are ordered since otw, we can just locate into it. I would want something along these lines, but this wouldn't work.
auto levelMask = m_coiterHelper.m_coiterate.ordered_level_mask();
                    std::size_t i = 0;
                    std::apply([&](auto&... args) {
                        ((std::get<i++>(levelMask) ? advance_iter(args) : void()), ...);
                    }, iterators);
  • dereferencing (i.e. get_PKs): The current function gets the second element of the iterator currently if the iterator is currently at min_ik. In a hashed level, how would we locate into the element after min_ik? Then, we can modify the locate function to either: i) get the second element if the level is ordered, or ii) locate into the next index if the level is unordered. I'm unsure how to best tackle this.
  • calc_min_ik(): I'm going back and forth on whether this should also get modified. Since it computes the current minimum index, then does it just ignore hashed level in this computation?

@hameerabbasi
Copy link
Owner

In all but the third point, you would do the calculation exactly as-is, just only with the ordered levels. Just the deference would take into account the locate-capable unordered levels.

The reason is that we ensure a conjunctive merge with unordered levels so the minimum/advance (the last of which is the same as operator++) etc will work exactly as with just the ordered levels.

@adam2392
Copy link
Contributor Author

06/14/23 Notes

Today, @bharath2438 and I met briefly. Here are a few notes that I am currently implementing:

  1. Coiterate::coiteration_helper::get_PKs() should dereference a level iterator that does not have locate, but if the level's iterator has locate, then use locate(min_ik) to return the PK (or nullopt). We check for the locate function using the compile-time function has_locate_v that is implemented in level_capabilities/locate.hpp
  2. Rename Coiterate::coiteration_helper::locate(...) to deref_PKs because naming it "locate" makes it a bit confusing for the get_PKs implementation where we are calling locate.

Once this is done, co-iteration with arbitrary tuple of levels is supported. Now, we have been discussing adding optimizations to the co-iteration when a conjunctive merge takes place. However, Bharath and I couldn't come to a consensus here...

When some levels are unordered, and the conjunctive merge criterion is met, then the current proposal was to iterate entirely through the ordered levels. However, it seems like it could also be highly inefficient(?) because we do not know if an unordered level has reached its end. For example,

say we have A \intersect B and A has 1000 non-zero elements and B has 1 non-zero element. If A is ordered and B is not, we have to iterate over the entirety of A, when in reality, we should be able to exit early.

@adam2392
Copy link
Contributor Author

A few issues on commit 78d5cbc

  1. has_locate_v does not work as expected because it is trying to apply args.locate when args is a dense level.
In file included from /Users/adam2392/Documents/xsparse/test/source/coiteration_test.cpp:17:
/Users/adam2392/Documents/xsparse/include/xsparse/level_capabilities/co_iteration.hpp:191:64: error: no member named 'locate' in 'xsparse::level_capabilities::coordinate_value_iterate<xsparse::levels::dense, std::tuple<>, unsigned long, unsigned long, xsparse::level_properties<true, true, true, false, true>>::iteration_helper::iterator'
                                                        ? args.locate(m_coiterHelper.m_pkm1, min_ik)
                                                          ~~~~ ^
  1. Investigating further, if I add static_assert(has_locate_v<decltype(h)>); to the TEST_CASE("Hashed-BaseCase") test, then this fails, even though h is a hashed level with locate().
/Users/adam2392/Documents/xsparse/test/source/hashed_test.cpp:45:5: error: static_assert failed due to requirement 'has_locate_v<xsparse::levels::hashed<std::tuple<>, unsigned long, unsigned long, xsparse::util::container_traits<std::vector, std::set, std::unordered_map>, xsparse::level_properties<false, false, false, false, false>>>'
    static_assert(has_locate_v<decltype(h)>);
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

It's possible has_locate_v is not implemented correctly, so I'm taking a look at that now.

@hameerabbasi
Copy link
Owner

say we have A \intersect B and A has 1000 non-zero elements and B has 1 non-zero element. If A is ordered and B is not, we have to iterate over the entirety of A, when in reality, we should be able to exit early.

I would be happy to hear about generalised solutions to this issue but sadly I do not immediately see any with unordered levels.

One thing I just thought of is that in advance_iter we'd need to locate into all the locate levels and ensure that they have a value at that location; otherwise advance again (since the merge is conjunctive).

@adam2392
Copy link
Contributor Author

adam2392 commented Jun 15, 2023

I would be happy to hear about generalised solutions to this issue but sadly I do not immediately see any with unordered levels.

Would it be too brute-force to have the MergeLattice convert all unordered levels to ordered?

Alternatively, is there a cheap way to inspect the "size" of each level that is co-iterated? So we can then either co-iterate by advancing all iterators, or doing this advance-and-locate strategy?

One thing I just thought of is that in advance_iter we'd need to locate into all the locate levels and ensure that they have a value at that location; otherwise advance again (since the merge is conjunctive).

Why? If one is null opt, we're good to move on? Is this assuming we have a conjunction over only unordered levels?

has_locate_v

I think the error arose from the fact that it checked for member, not a member function with possible arguments. I changed the signature of the has_locate check and the static_assert(has_locate_v<decltype(h)>) now works as expected in hashed_test.hpp.

However, w/ this change, I get the following errors in 5fc8476:

/Users/adam2392/Documents/xsparse/include/xsparse/level_capabilities/co_iteration.hpp:191:64: error: no member named 'locate' in 'xsparse::level_capabilities::coordinate_value_iterate<xsparse::levels::dense, std::tuple<>, unsigned long, unsigned long, xsparse::level_properties<true, true, true, false, true>>::iteration_helper::iterator'
                                                        ? args.locate(m_coiterHelper.m_pkm1, min_ik)
                                                          ~~~~ ^

It's problematic because I think it is looking for locate inside dense... but has_locate_v(args) should evaluate to false when it is dense. There are a few issues:

  1. locate is actually defined as part of the hashed level, not part of the hashed::iterator, so I don't think has_locate_v<decltype(args)> would evaluate to True when args corresponds to the hashed level.
  2. for some reason... args.locate(...) is being called on dense, even tho they do not have a locate function.

Signed-off-by: Adam Li <[email protected]>
@hameerabbasi
Copy link
Owner

Would it be too brute-force to have the MergeLattice convert all unordered levels to ordered?

Yeah, that would be asymptotically worse.

Alternatively, is there a cheap way to inspect the "size" of each level that is co-iterated? So we can then either co-iterate by advancing all iterators, or doing this advance-and-locate strategy?

The merging strategy only works if iterators are ordered, therefore, advancing all iterators won't work. We have to locate into unordered levels.

Why? If one is null opt, we're good to move on? Is this assuming we have a conjunction over only unordered levels?

Ah, no. Advance iterator would advance only the ordered levels, but use the function F over all levels. That should already do what's required.

Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Contributor Author

NOTES 6/16/23:

  1. Dense level should have locate function
  2. change decltype level properties to static_assert

@adam2392
Copy link
Contributor Author

Few comments:

  1. snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?
  2. Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

The code correctly co-iterates when there is a valid value at indices 0, 1, and 2, but as soon as it reaches index 3, there is a segfault. My guess is that there is a segfault when doing get_PKs_level for the hashed level, where locate returns null opt.

@hameerabbasi
Copy link
Owner

  1. snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?

I'd do that for all types and template parameters.

  1. Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

std::optional<T> can be implicitly cast to T. I'd try to figure out if we're doing this somewhere.

@adam2392
Copy link
Contributor Author

adam2392 commented Jun 22, 2023

  1. snake_case vs CamelCase: the other private members in the level iterators are all snake_case, so I figured parent_type can be snake_case too?

I'd do that for all types and template parameters.

Okay will do that in a separate PR.

  1. Now there is a segfault when I run the co-iteration test. See details below and in the latest commit: 117baff

std::optional<T> can be implicitly cast to T. I'd try to figure out if we're doing this somewhere.

So after some digging, it turns out this is because the min_ik is getting set to 0 after the hashed level has finished iterating. Inside calc_min_ik, it turns out the min_ik goes from: 0, 1, 2, to 0 again, when it should ignore the hashed level. This causes the segfault.

Update: On further investigation, I think it could be because we are advancing the iterator regardless of hashed, or ordered every time. The iterator advances in an unordered fashion for the hashed level, so it results in some weird behavior.

I suppose the fix would be to not advance iterators corresponding to hashed level. I will add this since this is required anyways for the advance-and-locate iteration scheme (we just eat the cost of certain conjunctive merges e.g. in #25 (comment))

@adam2392
Copy link
Contributor Author

adam2392 commented Jun 23, 2023

Leaving notes for the call on Sunday. My blocker for this is some compiler error that I'm trying to understand: b5d0960.

Basically, I am trying to apply a function to a tuple of iterators given a bool condition specified by mask. The mask is the tuple of same length as the tuple of iterators. I thought I got it working on a sample function that just prints out something, but now there's an error, so I need to re-analyze if this is the right approach.

By doing advance_ordered_iter instead of advance_iter, one would only need to advance the iterators corresponding to ordered levels.

[Update]: I guess I am unsure how to best tackle this. I want to basically use template meta-programming to apply advance_iter recursively on all the iterators, but only if the corresponding mask element is true.

adam2392 added 3 commits June 24, 2023 17:18
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@bharath2438
Copy link
Collaborator

Hey @adam2392, in get_PKs_level(), passing std::get<I>(iterators) to get_PK_iter fails because it is not possible to bind the temporary iterator with iter&.
So, I would change get_PKs_level() as follows:

                template <std::size_t I>
                inline auto get_PKs_level() const noexcept
                {
                    using iter_type = std::tuple_element_t<I, decltype(iterators)>;
                    iter_type it(std::get<I>(iterators));
                    return get_PK_iter<iter_type, I>(it);
                }

This should compile.

I guess static_assert(false) always results in a static assertion failed? Does removing it results in successful compilation?

@adam2392
Copy link
Contributor Author

This should compile.

Thanks! This + a few minor fixes made it work and I am back to the runtime segfault, but all other tests pass, so we're good. I will refactor the calc min ik now.

I guess static_assert(false) always results in a static assertion failed? Does removing it results in successful compilation?

Yes this is true. I changed it to throw std::invalid_argument("levels should either be ordered or have locate function.");.

@adam2392
Copy link
Contributor Author

@hameerabbasi and @bharath2438 the main issue now is that the min_ik is progressing as we might expect with dense and hashed: 0, 1, 2, 3, 4, 4, 4, 4, .... I validated this with some print statements and the debugger. It's just infinite going unless I put a break in there.

The issue is that it continues iterating. My guess is that the coiterator does not realized it's "finished". How should we address this?

@bharath2438
Copy link
Collaborator

bharath2438 commented Jun 30, 2023

Just to clarify - since this is a conjunctive merge, it should stop after printing 0, 1, 2. Correct?

@adam2392
Copy link
Contributor Author

Yes that is a good point that I missed.

It seems that the current coiteration code supports disjunction over ordered levels only? So then I guess there needs to be some if/else on whether or not the coiteration ended depending on the function F?

@bharath2438
Copy link
Collaborator

It seems that the current coiteration code supports disjunction over ordered levels only? So then I guess there needs to be some if/else on whether or not the coiteration ended depending on the function F?

The immediate problem I see here is, we are never really advancing hashed, since it is unordered. Hence, hashed stays in the beginning and the whole have hashed and dense both reached their end part fails forever.

@bharath2438
Copy link
Collaborator

One solution that I can think of is -

  1. Maintain a max_ik for each of the unordered levels.
  2. In advance_iter, set the iterator of unordered level to its end if (min_ik == max_ik_of_that_level)

@hameerabbasi
Copy link
Owner

Or... Just don't check the unordered levels at all when checking for the end condition.

@adam2392
Copy link
Contributor Author

adam2392 commented Jul 5, 2023

Or... Just don't check the unordered levels at all when checking for the end condition.

Then this would imply us altering the end() method of the coiteration iterator? But shouldn't we also then alter begin()?

@hameerabbasi
Copy link
Owner

Or... Just don't check the unordered levels at all when checking for the end condition.

Then this would imply us altering the end() method of the coiteration iterator? But shouldn't we also then alter begin()?

It would imply altering the compare function. Otherwise everything else would break. 😉

@hameerabbasi
Copy link
Owner

Maybe this will help: We can do the filtering just before the comparison.

https://stackoverflow.com/a/50971954

Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Contributor Author

adam2392 commented Jul 5, 2023

Maybe this will help: We can do the filtering just before the comparison.

https://stackoverflow.com/a/50971954

I tried a similar solution that was more in-line w/ the rest of the code's design: dc1b236

So the comparison basically outputs:

  • true if the iterator is associated with an unordered level
  • it_current == it_end if the iterator is ordered

This tuple of booleans are then fed into the comparisonHelper function F. For example F(true, true, false, false, ...). Is this valid then for me to default to returning true whenever the iterator is unordered?

@hameerabbasi
Copy link
Owner

Ouch... We need to feed in true for every unordered level.

@adam2392
Copy link
Contributor Author

adam2392 commented Jul 5, 2023

Ouch... We need to feed in true for every unordered level.

Heh yeah it's kinda janky, but assuming the Coiterator is used with levels that have at least one ordered always and no disjunction containing unordered, this is fine?

I can clean up the PR if so, and make the unit-test pass now. I think everything works as it should.

@hameerabbasi
Copy link
Owner

SGTM

@adam2392
Copy link
Contributor Author

adam2392 commented Jul 6, 2023

Kay this is good to review @hameerabbasi @bharath2438

@adam2392
Copy link
Contributor Author

adam2392 commented Jul 6, 2023

I plan on handling the compiler-time check for valid conjunctive merge using the comparisonHelper in a follow-on PR.

Signed-off-by: Adam Li <[email protected]>
@bharath2438
Copy link
Collaborator

@adam2392, in the test case, why are we checking if (it1 == end1) == true (checking if dense has reached its end). As this is a conjunctive merge, shouldn't we check if hashed has reached its end instead? (since it is smaller than dense)

@adam2392
Copy link
Contributor Author

adam2392 commented Jul 6, 2023

Since we don't advance unordered iterators, this is not the case.

Since there's no universally optimal way of advancing the ordered and unordered simultaneously, I just support the advance-and-locate strategy

Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Contributor Author

adam2392 commented Jul 6, 2023

Okay all the CIs pass now

Copy link
Owner

@hameerabbasi hameerabbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks excellent to me, thanks @adam2392.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants