Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should queues track size? #25

Open
treeowl opened this issue Nov 16, 2020 · 6 comments
Open

Should queues track size? #25

treeowl opened this issue Nov 16, 2020 · 6 comments

Comments

@treeowl
Copy link
Collaborator

treeowl commented Nov 16, 2020

It's possible to calculate the size in amortized O(log n) time. Once #24 is finished, that will improve to worst-case O(log n) time (except in the presence of pathological nesting of mapMonotonic calls). Is it really worth tracking the size to get that to O(1)? It's not terribly expensive, but it's one more word of allocation on insert.

@konsumlamm
Copy link
Collaborator

Imo size should stay O(1). That's what people expect and relaxing it to O(log n) doesn't improve any other complexities (and I doubt the performance gains in other functions would be significant), so I don't think it's worth it.

@konsumlamm
Copy link
Collaborator

Superseded by #33.

@konsumlamm konsumlamm closed this as not planned Won't fix, can't repro, duplicate, stale Jul 25, 2022
@treeowl
Copy link
Collaborator Author

treeowl commented Apr 2, 2023

I think we should open this idea back up. I disagree that people generally expect size to be $O(1)$. It isn't for IntMap or HashMap, for example. I would actually be surprised if a substantial number of algorithms needed the size at all.

@treeowl treeowl reopened this Apr 2, 2023
@konsumlamm
Copy link
Collaborator

I think we should open this idea back up. I disagree that people generally expect size to be O(1). It isn't for IntMap or HashMap, for example. I would actually be surprised if a substantial number of algorithms needed the size at all.

Well, I do generally expect size to be O(1). I'd also expect IntMap and HashMap to have O(1) size, fwiw (and I'm not the only one, see haskell/containers#763 and haskell-unordered-containers/unordered-containers#138). I also don't know any other language where size for standard collections isn't O(1) (save for lists in FP languages).

For IntMap and HashMap, you could argue that it makes operations like union, difference & intersection more complicated, but I don't see that being the case here (e.g. union can just add the sizes, since duplicate keys are allowed).

A few examples where you want fast size, that I can think of off the top of my head:

  • wanting to check that the queue doesn't exceed a maximum size
  • checking if an operation like take/drop will "succeed (i.e. if the operation actually takes/drops the desired amount of elements)
  • choosing different strategies depending on the number of elements (e.g. in a scheduler)
  • generally keeping track of how many items you have (e.g. for debugging, UI)

@treeowl
Copy link
Collaborator Author

treeowl commented Apr 25, 2023

FWIW, I've dug into all the packages on Hackage that show a dependency on pqueue. Here are the results:

Real dependencies

exference: USES SIZE. However, to my (admittedly untrained) eye, it looks like it's being used in contexts where the other things going on are going to be considerably more expensive than a log-time size calculation.

reactive-banana: does not use size
chr-core: does not use size
beam-migrate: does not use size
LPPaver: does not use size
patch-image: does not use size

False dependencies

uhc-util — false dependency; opened PR to remove
time-warp — abandoned project; archived Git repo
second-transfer — Github master no longer depends on pqueue. The project also looks abandoned.
magic-wormhole — false dependency; opened PR to remove


If, as this limited survey suggests, using size is rare in practice, it would seem potentially reasonable to make "sized queues" that store their size separately. This will be less efficient for applications that use those (except where they can unpack the queues), but more efficient for other applications.

@konsumlamm
Copy link
Collaborator

TBH, I think this primarily shows that not many packages depend on pqueue (at least on Hackage). I'm not convinced that removing the size field is worth it. Not having O(1) size just feels wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants