RecursionError: maximum recursion depth exceeded in comparison #57

Open
xanderdunn opened this issue Mar 30, 2021 · 6 comments

Comments

@xanderdunn

I have a loss function, and I'm trying to visualize the graph of the loss computation:

class MVP_Loss(Module):
    def __init__(self, crit=None, model=None):
        self.crit = ifnone(crit, MSELossFlat())
        self.model = model
        self.mask = slice(None)

    def forward(self, preds, target):
        # print(self.mask)
        loss_forward = self.crit(preds, target)
        print("Making dot...")
        dot = make_dot(loss_forward, params=dict(self.model.named_parameters()))
        dot.format = "png"
        dot.render("mseloss_dot")
        print("Made dot.")
        return loss_forward
        # return self.crit(preds[self.mask], target[self.mask])

I see this output:

Making dot...
Traceback (most recent call last):
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/torchviz/dot.py", line 154, in add_base_tensor
    add_base_tensor(var._base, color='darkolivegreen3')
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/torchviz/dot.py", line 154, in add_base_tensor
    add_base_tensor(var._base, color='darkolivegreen3')
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/torchviz/dot.py", line 154, in add_base_tensor
    add_base_tensor(var._base, color='darkolivegreen3')
  [Previous line repeated 991 more times]
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/torchviz/dot.py", line 149, in add_base_tensor
    dot.node(str(id(var)), get_var_name(var), fillcolor=color)
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/graphviz/dot.py", line 131, in node
    attr_list = self._attr_list(label, attrs, _attributes)
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/graphviz/lang.py", line 136, in attr_list
    content = a_list(label, kwargs, attributes)
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/graphviz/lang.py", line 107, in a_list
    result = ['label=%s' % quote(label)] if label is not None else []
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/site-packages/graphviz/lang.py", line 75, in quote
    return '"%s"' % escape_unescaped_quotes(identifier)
  File "/home/xander/anaconda3/envs/my_model/lib/python3.7/re.py", line 311, in _subx
    template = _compile_repl(template, pattern)
RecursionError: maximum recursion depth exceeded in comparison

Python 3.7. torch 1.7.7. torchviz 0.0.2.

Do you have any thoughts on what might be causing this? If the cause isn't obvious, I can try to put together a minimal reproducing example.

@albanD
Contributor

albanD commented Apr 6, 2021

That's interesting...
Could you share a code sample that we can use to reproduce that (on colab for example)?

@nihirv

nihirv commented Jul 25, 2022

I have a similar use case and I'm getting this error too.

Maybe our models are too big to be visualised?

  File "multi30k_main.py", line 67, in training_step
    make_dot(loss_lm, params=dict(self.model.model.named_parameters())).render("model_lm", format="png")
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 163, in make_dot
    add_base_tensor(var)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 151, in add_base_tensor
    add_nodes(var.grad_fn)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 134, in add_nodes
    add_nodes(u[0])
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 134, in add_nodes
    add_nodes(u[0])
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 134, in add_nodes
    add_nodes(u[0])
  [Previous line repeated 951 more times]
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/torchviz/dot.py", line 127, in add_nodes
    dot.node(str(id(fn)), get_fn_name(fn, show_attrs, max_attr_chars))
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/_tools.py", line 171, in wrapper
    return func(*args, **kwargs)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/dot.py", line 196, in node
    attr_list = self._attr_list(label, kwargs=attrs, attributes=_attributes)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/_tools.py", line 171, in wrapper
    return func(*args, **kwargs)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/quoting.py", line 152, in attr_list
    content = a_list(label, kwargs=kwargs, attributes=attributes)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/_tools.py", line 171, in wrapper
    return func(*args, **kwargs)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/quoting.py", line 123, in a_list
    result = [f'label={quote(label)}'] if label is not None else []
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/_tools.py", line 171, in wrapper
    return func(*args, **kwargs)
  File "/data/nv419/anaconda3/envs/gnmt/lib/python3.8/site-packages/graphviz/quoting.py", line 82, in quote
    if is_html_string(identifier) and not isinstance(identifier, NoHtml):
RecursionError: maximum recursion depth exceeded while calling a Python object

Is this a known issue with large or complicated models? If not, I can create a reproducible example.

@albanD
Contributor

albanD commented Jul 25, 2022

Hi,

No, it is not very common for models to be big enough for this to happen.
That said, I'm sure we could refactor the current recursive algorithm into an iterative one if this is a blocker on your end.

@nihirv

nihirv commented Jul 26, 2022

Hi @albanD. It's not a blocker (though it would be nice to have!). I'm not sure it's worth the effort of doing that for what seems to be 3 people.

However, if you have any hints on why this might happen, that would be useful. The model I'm running this on has 3 full BERT-base encoders within it.

@albanD
Contributor

albanD commented Jul 26, 2022

My guess, based on the stack you shared, is simply that the graph we're trying to build is too deep. Because we recurse once for every step to the next Node, the call stack ends up being very deep. In your case it is almost 1000 frames deep, and Python doesn't like that (its default recursion limit is 1000).

That's why the proposed fix is to avoid deep recursion by using an iterative algorithm here instead.
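
For illustration, here is a minimal sketch of what an iterative traversal could look like. It is not torchviz's implementation: `FakeNode`, `collect_nodes_iterative`, and `build_chain` are hypothetical names, and `FakeNode` only imitates the `next_functions` attribute of an autograd node (a tuple of `(node, index)` pairs) so that the example runs without torch.

```python
class FakeNode:
    """Hypothetical stand-in for an autograd node; it only mimics `next_functions`."""

    def __init__(self, name, parents=()):
        self.name = name
        # Same shape as autograd's next_functions: a tuple of (node, index) pairs.
        self.next_functions = tuple((parent, 0) for parent in parents)


def collect_nodes_iterative(root):
    """Visit every node reachable from `root` using an explicit stack.

    The pending work lives in a Python list instead of the call stack,
    so the depth of the graph no longer counts against the recursion limit.
    """
    seen = set()
    order = []
    stack = [root]
    while stack:
        fn = stack.pop()
        if fn is None or id(fn) in seen:
            continue
        seen.add(id(fn))
        order.append(fn)
        for parent, _ in fn.next_functions:
            stack.append(parent)
    return order


def build_chain(depth):
    """Build a linear graph of `depth` ops on top of one leaf (assumed shape)."""
    node = FakeNode("leaf")
    for i in range(depth):
        node = FakeNode(f"op{i}", parents=(node,))
    return node


# A chain far deeper than the default recursion limit is walked without error.
chain = build_chain(5000)
visited = collect_nodes_iterative(chain)
print(len(visited))
```

The same pattern should carry over to a real graph by starting from the output tensor's `grad_fn` and emitting a graphviz node where this sketch appends to `order`, though that part is untested here.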

@tanayag

tanayag commented Oct 19, 2023

You can increase the recursion limit as follows (change the depth as needed):

import sys
sys.setrecursionlimit(1000000)

But the make_dot function then took forever to run. We had a very deep network, with tons of multi-headed attention nodes and whatnot.
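
A slightly more cautious variant of this workaround, as a sketch: raise the limit only around the deep call and restore it afterwards. `recursion_limit` and `chain_depth` are hypothetical helpers written for this example, and the value 10_000 is an arbitrary assumption. The Python documentation warns that a limit set too high can crash the interpreter, so a moderate value is probably safer than 1000000.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily set the interpreter's recursion limit, then restore the old one."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        sys.setrecursionlimit(old_limit)


def chain_depth(n):
    """Toy recursive function standing in for a deep recursive graph walk."""
    return 0 if n == 0 else 1 + chain_depth(n - 1)


# Depth 5000 exceeds the default limit of 1000 but fits under the raised one.
with recursion_limit(10_000):
    print(chain_depth(5_000))
```

In the setting of this issue, the `make_dot(...)` call would go inside the `with` block in place of `chain_depth`. This only lifts the limit; it does nothing about the long run time reported above.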
