Make Py2Puml work with Annotated (e.g. such that pydantic dataclasses work with automatic range checks) #65

Open
Quaditz opened this issue Oct 22, 2023 · 0 comments

Hello!

I am an eager py2puml user and thanks to py2puml it is possible to create complex data structures without losing the overview!

I recently switched to pydantic dataclasses instead of the "normal" dataclasses, because pydantic has a HUGE list of advantages over the standard dataclasses. The main advantages for me are:

  1. The syntax is the same as the standard dataclass
  2. You can add numeric ranges that are automatically checked.

The following example works already:

from pydantic.dataclasses import dataclass

@dataclass
class A:
    number: int

Beautiful! Now, let's add a constraint to that:

from pydantic.dataclasses import dataclass
from pydantic import Field

@dataclass
class A:
    positiveNumber: int = Field(gt=0)

That works too! Although it's a pydantic-specific feature, py2puml can still create the puml because pydantic uses the normal dataclass in the background. That is great!

But pydantic also provides constrained types directly, like PositiveInt. Let's try it:

from pydantic.dataclasses import dataclass
from pydantic import PositiveInt

@dataclass
class A:
    positiveNumber: PositiveInt

This crashes!

File "C:\src\pkvgkv\.venv\Lib\site-packages\py2puml\parsing\compoundtypesplitter.py", line 60, in __init__
    raise ValueError(f'{compound_type_annotation} seems to be an invalid type annotation')      
ValueError: typing.Annotated[int, Gt(gt=0)] seems to be an invalid type annotation

Why is this a problem, when I could just use the first method (Field)? Because that does not work in this example:

from pydantic.dataclasses import dataclass
from pydantic import PositiveInt

@dataclass
class A:
    positiveNumbers: list[PositiveInt]

Here I have to use the PositiveInt from pydantic, otherwise I can't constrain the elements of the list :(

In the background, pydantic creates these types with the following vanilla python functionality:

from annotated_types import Gt
from typing_extensions import Annotated

PositiveInt = Annotated[int, Gt(0)]
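
As a small sketch of what such an alias looks like from the outside, the standard typing helpers can take it apart. The Gt class below is a hypothetical stand-in for annotated_types.Gt so that the snippet only needs the standard library (Python 3.9+, where Annotated is also available from typing):

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin


@dataclass(frozen=True)
class Gt:
    """Hypothetical stand-in for annotated_types.Gt (stdlib-only sketch)."""
    gt: int


PositiveInt = Annotated[int, Gt(0)]

# get_origin() identifies the alias as an Annotated type
is_annotated = get_origin(PositiveInt) is Annotated

# get_args() yields the wrapped type first, followed by the metadata
base_type, *metadata = get_args(PositiveInt)

print(is_annotated, base_type, metadata)
```

So the information "this is really an int" is always the first argument of the Annotated alias, and the constraints are everything after it.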

This means that if the following example compiles (without any pydantic code), then the pydantic code would also compile:

from dataclasses import dataclass

from annotated_types import Gt
from typing_extensions import Annotated


@dataclass
class A:
    positiveNumber: Annotated[int, Gt(0)]

This means that if py2puml supported "Annotated" from typing_extensions, we could use all the nice pydantic features with robust dataclasses and automatic range checking and everything and STILL have the great py2puml overview!

For me, it is not so important that the puml also shows the range etc. As a first goal, it would be totally sufficient to still compile and show that it is an int.

So:

  1. Goal: py2puml should still compile the puml even when "Annotated" is used.
  2. Goal: py2puml could additionally show the information inside the Annotated in the puml
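
To illustrate the difference between the two goals, here is a sketch using typing.get_type_hints from the standard library. It is not how py2puml resolves types internally (the traceback above suggests it works on the string form of the annotation); it only shows that Python itself can deliver both views. Gt is again a hypothetical stand-in for annotated_types.Gt, and the snippet assumes Python 3.9+:

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_type_hints


@dataclass(frozen=True)
class Gt:
    """Hypothetical stand-in for annotated_types.Gt (stdlib-only sketch)."""
    gt: int


PositiveInt = Annotated[int, Gt(0)]


@dataclass
class A:
    positiveNumbers: list[PositiveInt]


# Goal 1: by default, get_type_hints() strips Annotated, even when nested
plain_hints = get_type_hints(A)

# Goal 2: include_extras=True keeps the Annotated wrapper and its metadata
rich_hints = get_type_hints(A, include_extras=True)
element_type = get_args(rich_hints['positiveNumbers'])[0]
constraints = element_type.__metadata__

print(plain_hints)
print(constraints)
```

With the plain hints, the attribute would show up as a list of int; with the rich hints, the Gt constraint is still reachable and could be rendered in the puml later.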

That would be amazing!

Update 1: Quick and dirty solution:

I implemented a quick and dirty solution locally. In compoundtypesplitter.py I imported

from re import sub as re_sub

and then in line 58 I added:

def __init__(self, compound_type_annotation: str, module_name: str):
    ## NEW CODE:
    # Iterate over the annotation, replace all occurrences of typing.Annotated[type, extraInfo]
    # with type (drop the extra info for now to achieve goal 1)
    while 'typing.Annotated' in compound_type_annotation:
        compound_type_annotation = re_sub(r'typing\.Annotated\[(.*?)\,(.*?)]', r'\1', compound_type_annotation)
    
    # END OF NEW CODE    
    resolved_type_annotations = ...

This resolves goal 1 for me, but I think this is not a nice solution. Maybe the mapping of the typing.Annotated string to the internal type should not be done inside the CompoundTypeSplitter, as the CompoundTypeSplitter should already receive the "clean" input. Also, it may be luck that the CompoundTypeSplitter received the string in the first place, as typing.Annotated itself is not a compound type.

Anyway, this solution even works with more complicated pydantic structures like:

from typing import Optional

from pydantic import PositiveInt
from pydantic.dataclasses import dataclass

@dataclass
class A:
    number: Optional[list[list[PositiveInt]]]

Maybe you could take this example as an idea and implement it properly in the right place? :)
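
One limitation of the regular expression in Update 1: the second group stops at the first closing bracket and the first group stops at the first comma, so metadata whose text contains brackets, or a wrapped type such as dict[str, int], would not be stripped correctly. A possible bracket-aware variant is sketched below. The function name strip_annotated is hypothetical and not part of py2puml, and the sketch assumes the annotation arrives as a string containing typing.Annotated[...], as in the traceback above:

```python
ANNOTATED_PREFIX = 'typing.Annotated['


def strip_annotated(annotation: str) -> str:
    """Replace every typing.Annotated[T, ...] in the string with T."""
    start = annotation.find(ANNOTATED_PREFIX)
    while start != -1:
        args_start = start + len(ANNOTATED_PREFIX)
        depth = 0           # nesting depth of [] and () inside the arguments
        first_comma = None  # index of the comma that ends the wrapped type
        end = None          # index of the bracket closing Annotated[...]
        for index in range(args_start, len(annotation)):
            char = annotation[index]
            if char in '[(':
                depth += 1
            elif char in '])':
                if depth == 0:
                    end = index
                    break
                depth -= 1
            elif char == ',' and depth == 0 and first_comma is None:
                first_comma = index
        if end is None or first_comma is None:
            raise ValueError(f'{annotation} contains a malformed typing.Annotated')
        # Keep only the wrapped type and drop the metadata (goal 1)
        wrapped_type = annotation[args_start:first_comma].strip()
        annotation = annotation[:start] + wrapped_type + annotation[end + 1:]
        start = annotation.find(ANNOTATED_PREFIX)
    return annotation


print(strip_annotated('typing.Optional[list[list[typing.Annotated[int, Gt(gt=0)]]]]'))
```

This still counts brackets naively, so metadata whose text contains an unbalanced bracket (for example inside a string literal) would confuse it.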

Update 2: "Less dirty" solution created as PR:

I have now implemented a version that is a bit less dirty (e.g. a separate function). It works not only for pydantic types but also for custom types, e.g. these two variants will both work:

from typing import Optional

from pydantic import Field, PositiveInt
from pydantic.dataclasses import dataclass
from typing_extensions import Annotated

@dataclass
class A:
    number: list[Optional[PositiveInt]]

CrazyIntWithCustomConstraints = Annotated[int, Field(ge=2, lt=25)]

@dataclass
class B:
    number: list[Optional[CrazyIntWithCustomConstraints]]

Please check #66, thanks :)
