Limit TimeConstrained to one per eval. #1202

rocky · 2024-12-02T16:14:31Z

Allowing a TimeConstrained evaluation to contain another TimeConstrained evaluation is tricky. Starting a second TimeConstrained while one is already pending would force us not to blindly set SIGALRM, but instead to figure out which alarm comes earlier and, when that one fires, handle its TimeConstrained and then queue another SIGALRM for the time left on the other evaluation.

So for now, punt and just don't allow a second TimeConstrained to run but instead return failure.
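To make the bookkeeping described above concrete, here is a minimal sketch. It is not the code in this PR, and `DeadlineQueue` with its methods are hypothetical names. It only tracks pending deadlines and computes the delay that a single timer (for example one armed with `signal.setitimer(signal.ITIMER_REAL, delay)`) would need next; it installs no signal handler itself.

```python
import heapq
from typing import List, Optional, Tuple


class DeadlineQueue:
    """Hypothetical helper: lets one timer serve several nested time limits."""

    def __init__(self) -> None:
        # Min-heap of (absolute deadline, identifier) pairs.
        self._heap: List[Tuple[float, int]] = []
        self._next_id = 0

    def push(self, now: float, timeout: float) -> int:
        """Register a new time limit starting at `now`; return its identifier."""
        ident = self._next_id
        self._next_id += 1
        heapq.heappush(self._heap, (now + timeout, ident))
        return ident

    def remove(self, ident: int) -> None:
        """Drop a limit, either because it fired or because its body finished."""
        self._heap = [entry for entry in self._heap if entry[1] != ident]
        heapq.heapify(self._heap)

    def next_delay(self, now: float) -> Optional[float]:
        """Delay to re-arm the single timer with, or None if nothing is pending."""
        if not self._heap:
            return None
        return max(self._heap[0][0] - now, 0.0)


queue = DeadlineQueue()
outer = queue.push(now=0.0, timeout=10.0)  # outer TimeConstrained[..., 10]
inner = queue.push(now=1.0, timeout=2.0)   # inner TimeConstrained[..., 2]
first_delay = queue.next_delay(now=1.0)    # the inner deadline comes first
queue.remove(inner)                        # inner limit fired or finished
second_delay = queue.next_delay(now=3.0)   # re-arm for the outer limit
```

The point of the sketch is only that every start, finish, or expiry of a limit requires recomputing the earliest deadline and re-arming the timer, which is the complexity this PR avoids for now.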

mmatera · 2024-12-02T21:04:44Z

mathics/builtin/datentime.py

                done = False
+                if self.is_running_TimeConstrained:


A way to check if we have already set a time constraint is to check the length of evaluation.timeout_queue. I remember that with the previous implementation, this kind of nested constraint used to work.

mmatera · 2024-12-02T21:16:55Z

At least locally, with the version in the master branch,

TimeConstrained[TimeConstrained[Integrate[Sin[x] ^ 1000, {x,0,Pi}],.1],.1],
TimeConstrained[TimeConstrained[Integrate[Sin[x] ^ 1000, {x,0,Pi}],10],.1],

TimeConstrained[TimeConstrained[Integrate[Sin[x] ^ 1000, {x,0,Pi}],0.1],10]
and
TimeConstrained[TimeConstrained[Integrate[Sin[x] ^ 1000, {x,0,Pi}],10],10]

work as expected. The issue seems to happen when both time limits are similar and coincide with the time that the evaluation actually takes. For example, on my system I see that

In[2]:= TimeConstrained[TimeConstrained[Integrate[Sin[x] ^ 1000, {x,0,Pi}],.1],.1]
Exception ignored in: <function WeakSet.__init__.<locals>._remove at 0x760efa204dc0>
Traceback (most recent call last):
  File "/home/mauricio/.conda/envs/pystonmathics/lib/python3.8-pyston2.3/_weakrefset.py", line 38, in _remove
stopit.utils.TimeoutException: 
Out[2]= 4223253764772446398681479587905863679627375131977316984490513673541022323523784279665820061320349533833790720078461657887534527344541873711717097182804976178494773191621120833690038501245904489755696471267492150071422577902441998133040640534678543005733060259424013478163819189207764154121872206505 Pi / 167423219872854268898191413915625282900219501828989626163085998182867351738271269139562246689952477832436667643367679191435491450889424069312259024604665231311477621481628609147204290704099549091843034096141351171618467832303105743111961624157454108040174944963852221369694216119572256044331338563584

produces the exception one-third of the time. Probably, the way to handle this is just by capturing that exception.

rocky · 2024-12-02T21:40:08Z

> At least locally, with the version in the master branch,

"At least locally" isn't good enough. Recall that we have also been seeing failures in CI tests with TimeConstrained.

Right now, my goal is to have something that unblocks @aravindh-krishnamoorthy. For this, there does not have to be a long-term solution. If this branch works, it can stay a branch and not get merged.

If checking evaluation.timeout_queue instead of the variable is_running_TimeConstrained works better, great! Let's use that then. It is probably more reliable.

> produces the exception one-third of the time. Probably, the way to handle this is just by capturing that exception.

Looking for specific exceptions is hacky and fragile. A solution like this is likely to break on different Python versions, Python implementations, and operating systems. This kind of thinking (it seems to work here, I just need to hack some special cases) leads us down a rabbit hole that I don't think we'll ever be able to climb back out of.

What I'd like to see is something general and simple that can work with Rubi. If we have to sacrifice some power and Rubi search quality, that's okay. Once we have something working, we can try to improve things.

mmatera · 2024-12-02T23:00:53Z

@rocky, what I was trying to understand is what the actual problem with nested TimeConstrained expressions is. Is it an issue with the stopit module, or is it about how we use it? Is it related to these random failures in the tests?

rocky · 2024-12-02T23:25:50Z

> @rocky, what I was trying to understand is what the actual problem with nested TimeConstrained expressions is. Is it an issue with the stopit module, or is it about how we use it? Is it related to these random failures in the tests?

glenfant/stopit#17 suggests that others have had a problem when nesting in the same thread.

aravindh-krishnamoorthy · 2024-12-03T13:36:56Z

Hello @rocky. Firstly, thank you very much for identifying the underlying cause and unblocking me. Indeed, with this fix, Rubi 1 Algebraic functions tests run within a reasonable timeframe. However, unfortunately, there's still a problem.

This fix breaks the function ValidAntiderivative in Test.m, which validates "suboptimal" results that have a higher leafcount. (Presently, we generate many "suboptimal" antiderivatives and I see optimising them as Step #2. So a fix will take some time). These "suboptimal" results are currently marked as invalid.

I'll try to get ValidAntiderivative working based on your fix in this PR. Once I find a solution, I'll push my changes to this PR for review.

rocky · 2024-12-03T14:55:45Z

> This fix breaks the function ValidAntiderivative in Test.m, which validates "suboptimal" results that have a higher leafcount. (Presently, we generate many "suboptimal" antiderivatives and I see optimising them as Step #2. So a fix will take some time). These "suboptimal" results are currently marked as invalid.

I suspect we can get a better TimeConstrained function by having it create a new thread (up to some limit) for each expression that is to be time constrained. However, please let's not do this right now but, as you suggest, leave this for later as a second step.
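As a rough illustration of the thread-per-constraint idea, here is a minimal sketch under stated assumptions. `time_constrained` and its parameters are hypothetical names and this is not mathics-core code. One important caveat: Python cannot forcibly stop a thread, so on timeout the worker keeps running in the background and only the caller gives up waiting. The sketch also omits the cap on the number of threads mentioned above.

```python
import threading
from typing import Any, Callable, Dict


def time_constrained(func: Callable[[], Any], seconds: float, fail_value: Any) -> Any:
    """Run func in its own thread; return fail_value if it exceeds `seconds`."""
    outcome: Dict[str, Any] = {}

    def worker() -> None:
        try:
            outcome["value"] = func()
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # A daemon thread does not keep the interpreter alive after a timeout.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)
    if thread.is_alive():
        # Timed out: the worker is abandoned, not killed.
        return fail_value
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Because each call waits on its own thread, nesting needs no shared alarm state: an inner call simply times out inside the outer call's worker. Whether abandoned workers are acceptable for long Rubi runs is a separate question.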

We have lots of outright bugs in the code and missing functionality in certain built-in functions. If we could remove more of those first and get some sort of Rubi subset going, this would be great.

> I'll try to get ValidAntiderivative working based on your fix in this PR. Once I find a solution, I'll push my changes to this PR for review.

Thanks. But please also look at how we can break this large task into smaller, well-defined pieces. Maybe just the first two or three sections of 1 Algebraic functions. The list in Mathics3/Mathics3-Rubi#2 is already enough to start working on.

mmatera · 2024-12-03T17:38:10Z

> @rocky, what I was trying to understand is what the actual problem with nested TimeConstrained expressions is. Is it an issue with the stopit module, or is it about how we use it? Is it related to these random failures in the tests?

> glenfant/stopit#17 suggests that others have had a problem when nesting in the same thread.

Yep. The old code I wrote handled this by keeping a queue of calls and using a different thread for each call. Maybe during my holidays I can try to propose a fix for stopit, but I need to study more closely how it is currently implemented.

aravindh-krishnamoorthy · 2024-12-03T18:16:36Z

In the above commits (16de5ec, be4fe46), during a recursive call, instead of returning failexpr when a TimeConstrained call is already underway, the time constraint is ignored and expr is evaluated as usual. Hence, only the outermost time constraint is honoured.

With this change, a mini-test with $76$ cases runs to completion as expected.

I'll now run the full 1 Algebraic functions suite (it will take a day or two to finish) and continue with the rest of the items on the Rubi PR.
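The behaviour described in this comment (a nested call ignores its own limit, so only the outermost one is honoured) can be sketched as follows. This is an illustration under assumptions, not the code from commits 16de5ec and be4fe46: `Evaluator`, `time_constrained`, and `enforced_limits` are hypothetical names, and the actual timeout enforcement is left out. Only the flag name `is_running_TimeConstrained` comes from the discussion.

```python
from typing import Any, Callable, List


class Evaluator:
    """Hypothetical stand-in for the object that owns the running flag."""

    def __init__(self) -> None:
        self.is_running_TimeConstrained = False
        # Records which limits would actually be enforced (illustration only).
        self.enforced_limits: List[float] = []

    def time_constrained(self, func: Callable[[], Any], seconds: float) -> Any:
        if self.is_running_TimeConstrained:
            # Nested call: ignore the inner limit and just evaluate.
            return func()
        self.is_running_TimeConstrained = True
        try:
            # A real implementation would enforce `seconds` around this call.
            self.enforced_limits.append(seconds)
            return func()
        finally:
            # Reset on every exit path, including exceptions and timeouts.
            self.is_running_TimeConstrained = False


ev = Evaluator()
result = ev.time_constrained(lambda: ev.time_constrained(lambda: 42, 0.1), 10.0)
```

Resetting the flag in a `finally` clause is one way to make sure no execution path leaves it set, which would otherwise silently disable every later time limit.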

rocky · 2024-12-03T18:46:03Z

> Yep. The old code I wrote handled this by keeping a queue of calls and using a different thread for each call. Maybe during my holidays I can try to propose a fix for stopit, but I need to study more closely how it is currently implemented.

Getting a fix/PR into the stopit repository would be awesome. That code hasn't changed significantly in about 6 years, since around the time of Python 3.6. Python's threading has probably changed since then, and different kinds of threads in Python are on the horizon.

aravindh-krishnamoorthy · 2024-12-04T11:36:16Z

Unfortunately, there still seems to be an issue. The issue is not immediately apparent.

When running long Rubi tests, TimeConstrained randomly? stops timing out, even if it's the top level call. I am not sure if this is related to missing the exception due to the "unraisable exception" issue or due to self.is_running_TimeConstrained = False being missed in some rare branch of execution...


rocky · 2024-12-04T11:59:59Z

> Unfortunately, there still seems to be an issue. The issue is not immediately apparent.

> When running long Rubi tests, TraceEvaluation randomly? stops timing out, even if it's the top level call. I am not sure if this is related to missing the exception due to the "unraisable exception" issue or due to self.is_running_TimeConstrained = False being missed in some rare branch of execution...

I need more information here. If you have logs that show both what you invoked and what you got, that would be helpful.

Why is TraceEvaluation needed? Finally, I should mention that signal handling was added to the Mathics debugger.

Here again is how to use this:

Without the debugger, but with trepan3k installed, you can use Breakpoint[], and issue the handle command. You won't get as nice of a traceback, but it should still work.

If you want to go into the debugger and look at is_running_TimeConstrained, change no-stop to stop.

aravindh-krishnamoorthy · 2024-12-04T12:07:03Z

> Why is TraceEvaluation needed? Finally, I should mention that signal handling was added to the Mathics debugger.

Very sorry for this, @rocky. I actually meant TimeConstrained[] and not TraceEvaluation[]. I mixed them up. So, the right paragraph is...

When running long Rubi tests, TimeConstrained randomly? stops timing out, even if it's the top level call. I am not sure if this is related to missing the exception due to the "unraisable exception" issue or due to self.is_running_TimeConstrained = False being missed in some rare branch of execution...

I'm working on finding a small reproducible example. But this seems to happen randomly, which makes debugging a bit difficult.

axkr · 2024-12-04T18:55:30Z

> I'm working on finding a small reproducible example. But this seems to happen randomly, which makes debugging a bit difficult.

An idea which I haven't tested yet. Maybe you can implement TimeRemaining[] and test nested TimeConstrained like this:

https://reference.wolfram.com/language/ref/TimeRemaining.html
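To show what such a helper might compute, here is a small sketch that assumes TimeRemaining[] reports the seconds left until the earliest enclosing time limit, and infinity when there is none. `TimeLimits`, `constrained`, and `time_remaining` are hypothetical names. The sketch only records deadlines and does not interrupt anything, and the clock is injected so that it can be exercised deterministically.

```python
import math
from contextlib import contextmanager
from typing import Callable, Iterator, List


class TimeLimits:
    """Hypothetical tracker of the deadlines of enclosing time limits."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._deadlines: List[float] = []

    @contextmanager
    def constrained(self, seconds: float) -> Iterator[None]:
        """Record a deadline for the duration of the enclosed block."""
        self._deadlines.append(self._clock() + seconds)
        try:
            yield
        finally:
            self._deadlines.pop()

    def time_remaining(self) -> float:
        """Seconds until the earliest enclosing deadline, or infinity."""
        if not self._deadlines:
            return math.inf
        return min(self._deadlines) - self._clock()


now = [0.0]                       # fake clock, advanced by hand
limits = TimeLimits(lambda: now[0])
with limits.constrained(10.0):
    now[0] = 1.0
    with limits.constrained(20.0):
        # The outer limit is tighter, so it decides the remaining time.
        remaining = limits.time_remaining()
```

A test built on something like this could check that a nested limit never reports more time than its enclosing limit allows.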

rocky requested a review from mmatera December 2, 2024 16:14

rocky force-pushed the at-most-one-TimeConstrained branch from 0cd97b0 to af3fc4b Compare December 2, 2024 17:22

mmatera reviewed Dec 2, 2024

rocky marked this pull request as draft December 2, 2024 21:40

rocky added 2 commits December 2, 2024 16:47

Limit TimeConstrained to one per eval.

2eec05e

Update docstring

390f7ea

rocky force-pushed the at-most-one-TimeConstrained branch from af3fc4b to 390f7ea Compare December 2, 2024 21:47

aravindh-krishnamoorthy added 2 commits December 3, 2024 18:45

Cleanup for nested TimeConstrained[]

16de5ec

Black/iSort runs.

be4fe46

Merge branch 'master' into at-most-one-TimeConstrained

e7b9735

rocky added 2 commits December 4, 2024 06:49

Merge github.com:Mathics3/mathics-core into at-most-one-TimeConstrained

7ded9a4

Merge branch 'at-most-one-TimeConstrained' of github.com:Mathics3/mat…

01e7fca

…hics-core into at-most-one-TimeConstrained

rocky and others added 4 commits December 11, 2024 11:59

Merge branch 'master' into at-most-one-TimeConstrained

746c1b2

With black 23.12.1

7256d47

Merge branch 'master' into at-most-one-TimeConstrained

d138594

Merge branch 'master' into at-most-one-TimeConstrained

e8446a2
