Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML generation and parsing #43

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

HUG0-D
Copy link
Contributor

@HUG0-D HUG0-D commented Nov 19, 2024

The overall goal of this PR is to add a profile parser and a profile writer.

To do so, the Base class is extended with 3 main functions:

  • .from_xml(): creates an instance of the class by parsing an xml fragment
  • .updates_from_xml(): updates an instance of the class by parsing an xml fragment
  • .to_xml() : allows to export, for a specified profile, the python object as an etree.Element.

reader.py leverages .from_xml() and .updates_from_xml() to parse different profile (CGMES or custom) to create a dictionary of objects of the form {"mRID": corresponding_python_object}

writer.py exports the python objects to generate the xml Profiles

This PR builds upon the work of #39

@HUG0-D HUG0-D force-pushed the xml-generation-and-parsing branch from 598b920 to da4ec07 Compare November 25, 2024 08:12
@HUG0-D HUG0-D marked this pull request as ready for review November 25, 2024 08:19
tom-hg57

This comment was marked as outdated.

@HUG0-D
Copy link
Contributor Author

HUG0-D commented Nov 27, 2024

You are correct wrapped_class_comment shoud be used in enum_template. In the same way wrapped_comment should be implemented for enum instances as it is not the case currently in cimgen.py

Concerning your test in test_load_all.py, it tries to instantiate all classes in resource folder. However, classes generated with primitive_template.mustache, dataclass_template.mustache and enum_template.mustache cannot be instantiated.
One of the goals of the previous PR was to avoid creating classes that are then never instantiated.

  • Classes such as Integer or Float are therefore instances of the class Primitive.
  • Datatype classes such as CurrentFlow are instances of the class CIMDatatype. Theses classes provide information on a value (unit, multiplier, ...).
  • Enum classes such as UnitSymbol cannot be instantiated either.

To test these classes the best way is to test their attributes directly.

I fixed the pyright type test error by setting the attribute_class to the name of the class instead of the object itself.
It will make it less straightforward to retrieve the unit, multipler or type of the attribute_class but fixes the issue and simplifies the class_template.mustache

@HUG0-D HUG0-D requested a review from tom-hg57 November 29, 2024 07:58
@m-mirz
Copy link
Contributor

m-mirz commented Dec 27, 2024

@tom-hg57 Can you update your review based on the most recent commits of @HUG0-D ?

Signed-off-by: HUG0-D <[email protected]>

add attribute_namespace to json_extra to handle custom attributes

Signed-off-by: HUG0-D <[email protected]>

Extended function cgmes_attributes_in_profile to get json_extras of the attributes

Signed-off-by: HUG0-D <[email protected]>

Added function to export the python object to an xml fragment

Signed-off-by: HUG0-D <[email protected]>

Added function in base to create a class instance from an etree element. Useful to create a profile reader

Signed-off-by: HUG0-D <[email protected]>

update_from_xml allows to update a class instance by parsing additionnal profiles

Signed-off-by: HUG0-D <[email protected]>

Added reader to parse profiles and create python object using the base.py functions

Signed-off-by: HUG0-D <[email protected]>

Added writer to export profiles using the .to_xml base function

Signed-off-by: HUG0-D <[email protected]>

Improved readability of Reader

Signed-off-by: HUG0-D <[email protected]>

Improved readability of Writer
Added ability to add attributes to profile Model header

Signed-off-by: HUG0-D <[email protected]>

Fix sonar issues

Signed-off-by: HUG0-D <[email protected]>

Fix sonar issues

Signed-off-by: HUG0-D <[email protected]>

Fix sonar issues

Signed-off-by: HUG0-D <[email protected]>

Refactor functions to_xml and _parse_xml_fragment to split in different subfunction and improve readability

Signed-off-by: HUG0-D <[email protected]>

Fix sonar issues

Signed-off-by: HUG0-D <[email protected]>

Fix comments

Signed-off-by: HUG0-D <[email protected]>

Removed extra indentation levels

Signed-off-by: HUG0-D <[email protected]>

Add enum wrapped_comments

Signed-off-by: HUG0-D <[email protected]>

Fix json_dump error by setting attribute_class as str
Fix attribute_namespace to namespace to be coherant with pycgmes

Signed-off-by: HUG0-D <[email protected]>

Changed var name for writer

Signed-off-by: HUG0-D <[email protected]>

Added subfolder within resources to store primitives, enum and datatypes

Signed-off-by: HUG0-D <[email protected]>

Added function to get main profile of an attribute (useful in writer)
Removed addition of "_" in uuid

Signed-off-by: HUG0-D <[email protected]>
@HUG0-D HUG0-D force-pushed the xml-generation-and-parsing branch from 275da4c to 6ce97ec Compare January 8, 2025 13:29
cimgen/cimgen.py Outdated Show resolved Hide resolved
cimgen/cimgen.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/lang_pack.py Show resolved Hide resolved
Comment on lines +5 to +11
from .constants import NAMESPACES

from ..resources.Currency import Currency
from ..resources.UnitMultiplier import UnitMultiplier
from ..resources.UnitSymbol import UnitSymbol
from .config import cgmes_resource_config
from .constants import NAMESPACES
from .profile import BaseProfile
from ..resources.types.UnitMultiplier import UnitMultiplier
from ..resources.types.UnitSymbol import UnitSymbol
from ..resources.types.Currency import Currency
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of imports is checked by ruff in pycgmes!
There are other ruff warnings in reader.py and writer.py + pyright errors.
Please check out pycgmes, include your new generated sources and fix the problems.

tom-hg57

This comment was marked as outdated.

Copy link
Collaborator

@tom-hg57 tom-hg57 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/writer.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
Comment on lines 326 to 329
elif class_attribute["is_primitive_attribute"] or class_attribute["is_datatype_attribute"]:
attr_value = xml_attribute.text
if self.__dataclass_fields__[attr_name].type == bool:
attr_value = {"true": True, "false": False}.get(attr_value, None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elif class_attribute["is_primitive_attribute"] or class_attribute["is_datatype_attribute"]:
attr_value = xml_attribute.text
if self.__dataclass_fields__[attr_name].type == bool:
attr_value = {"true": True, "false": False}.get(attr_value, None)
elif class_attribute["is_primitive_attribute"] or class_attribute["is_datatype_attribute"]:
attr_value = eval(str(xml_attribute.text))
if self.__dataclass_fields__[attr_name].type == bool:
assert xml_attribute.text
attr_value = {"true": True, "false": False}.get(attr_value, None)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attr_value = eval(str(xml_attribute.text)) fails if a string contains -

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused.
At first, this change is not what I've committed. My correct change was:

         elif class_attribute["is_list_attribute"]:
-            attr_value = eval(xml_attribute.text)
+            attr_value = eval(str(xml_attribute.text))
         elif class_attribute["is_primitive_attribute"] or class_attribute["is_datatype_attribute"]:
+            assert xml_attribute.text
             attr_value = xml_attribute.text

But maybe that doesn't change your problem with attr_value = eval(str(xml_attribute.text)).
Pyright says that xml_attribute.text is str | None, but eval doesn' take a None.
Maybe we could include assert xml_attribute.text like in the datatype case:

         elif class_attribute["is_list_attribute"]:
+            assert xml_attribute.text
             attr_value = eval(xml_attribute.text)
        elif class_attribute["is_primitive_attribute"] or class_attribute["is_datatype_attribute"]:
+            assert xml_attribute.text
             attr_value = xml_attribute.text

But if xml_attribute.text is not None then it's a string and str() shouldn't change the value. If eval() has a problem with "-" that would also be a problem in a case with a list attribute that has a "-".
BTW: What is the purpose of this eval()?

cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/writer.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Show resolved Hide resolved
cimgen/languages/modernpython/utils/base.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/reader.py Outdated Show resolved Hide resolved
cimgen/languages/modernpython/utils/writer.py Outdated Show resolved Hide resolved
@HUG0-D

This comment was marked as outdated.

@@ -91,6 +91,8 @@ def _extract_classname_uuid(elem, class_namespace: str, namespace_rdf: str) -> t
uuid = elem.get("{%s}about" % namespace_rdf)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not guaranted that this is a UUID. I know you don't try to convert the string to a UUID. But the name of the variable is misleading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change it to rdfid

@@ -52,7 +52,7 @@ def _get_type_and_default(text: str, render: Callable[[str], str]) -> tuple[str,
attribute_txt = render(text)
attribute_json = eval(attribute_txt)
if attribute_json["is_class_attribute"]:
return ("Optional[str]", "default=None")
return ("str | None", "default=None")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be better if all are optional (float | None, int | None ...), so the writer could avoid to give them out. Currently your writer writes all primitive values to xml even if they were not in the input data of the reader.
My chevron_writer avoids this by checking the value. But that leads to another problem: it skips 0 and false values, because it could not distinguish between no value and 0/false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at first I implemented something similar to you and faced the same issue.
Having None by default would solve the problem for the writer but could this create issues when using the objects for other applications ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These attributes are optional. When the reader reads a CIM file the result should show which attribute was in the file and which was miising. And the writer should only write attributes which are really set.

@chicco785
Copy link
Collaborator

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

@tom-hg57 will you still keep such static functions in this repo? in my opinion, writer and reader should be part of the code base. changes to the generated code may always required updates to them, making very hard to maintain them otherwise.

@m-mirz
Copy link
Contributor

m-mirz commented Jan 15, 2025

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

@tom-hg57 will you still keep such static functions in this repo? in my opinion, writer and reader should be part of the code base. changes to the generated code may always required updates to them, making very hard to maintain them otherwise.

I guess we need to ask ourselves where we draw the line between the projects. Do we want to copy all code from cimpy, pycgmes etc. into cimgen? If not, what do we exclude and why? Where do we test code if most of it is already included in cimgen?
Having Reader and Writer code in cimgen and cimpy, pycgmes… could also be very confusing.

Another direction could be to include cimgen as a submodule in these projects or to include a script that checks out a specific version of cimgen, generates the code and copies it to the right folders. We need to find a structure that facilitates development and is not too confusing for users.

@tom-hg57
Copy link
Collaborator

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

@tom-hg57 will you still keep such static functions in this repo? in my opinion, writer and reader should be part of the code base. changes to the generated code may always required updates to them, making very hard to maintain them otherwise.

I guess we need to ask ourselves where we draw the line between the projects. Do we want to copy all code from cimpy, pycgmes etc. into cimgen? If not, what do we exclude and why? Where do we test code if most of it is already included in cimgen? Having Reader and Writer code in cimgen and cimpy, pycgmes… could also be very confusing.

Another direction could be to include cimgen as a submodule in these projects or to include a script that checks out a specific version of cimgen, generates the code and copies it to the right folders. We need to find a structure that facilitates development and is not too confusing for users.

In think, reader and writer should be part of cimgen. On the other hand, I test the generated code + reader/writer with pytest in pycgmes. So Markus idea to automatically take over the cimgen changes to pycgmes is not bad. There we have github workflows including pytest.

I started the discussion with the question if it would be better to take the xml code out of the base class. So we had a clean class hierarchy without lxml dependency: only Base , generated code and some handwritten primitive classes. We could put reader/writer stuff apart from this, maybe in another directory. Or really into pycgmes? So I change my opinion, if we establish the automatic take-over to pycgmes.

But we should do this not before this PR is merged. I will test the last changes in the next days.

@chicco785
Copy link
Collaborator

chicco785 commented Jan 16, 2025

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

@tom-hg57 will you still keep such static functions in this repo? in my opinion, writer and reader should be part of the code base. changes to the generated code may always required updates to them, making very hard to maintain them otherwise.

I guess we need to ask ourselves where we draw the line between the projects. Do we want to copy all code from cimpy, pycgmes etc. into cimgen? If not, what do we exclude and why? Where do we test code if most of it is already included in cimgen? Having Reader and Writer code in cimgen and cimpy, pycgmes… could also be very confusing.

Another direction could be to include cimgen as a submodule in these projects or to include a script that checks out a specific version of cimgen, generates the code and copies it to the right folders. We need to find a structure that facilitates development and is not too confusing for users.

At least in the case of python (then we can see case by case), what I think it would make sense is that cimgen generated code is released as packages that other projects import, and where as much as possible, interfaces expose by cimgen generated code stays unchanged so that impact on other projects are minimal.

Project like cimpy (or pycgmes) should provide additional functionalities and glue logic that is too complex to generate. For example, managing multiple version of cim, validation rules, etc. This would reduce duplications and compatibility issues among projects building on top of cimgen generated code. Which is crucial I believe for avoiding dispersion: the community is already small, duplicates, complexity in version management, ect will end up in making it's long term life complicated.

True is, for example, that is more than one year that cimgen supports 3.0, but cimpy is still with the old code.

@chicco785
Copy link
Collaborator

To better detail (and I know this is ok for python and javascript and may not be for other languages):

  • include a release workflow in the repo.
  • publish packages in relevant registries.
  • update dependent projects to import versioned packages from the registries.

This would simplify maintenance of dependent projects. A project is not updated? Builds will still work importing the correct package version. You don't have time to immediately update the project? Same! You have dependabot in your repo and the update is not causing breaking changes? You can automatically import the new version.

@chicco785
Copy link
Collaborator

chicco785 commented Jan 16, 2025

A consideration about readers and writers: In several languages (and to some extent this is true also for pydantic - given that json parsing and serilisation is native), you get them for free if you properly annotate the generated code. For example this is the case for golang (and we have a branch for that, which was never opened as PR because we were not sure it was of interest: https://github.com/zaphiro-technologies/cimgen/tree/golang/golang). If we assume that no serialisation / parsing functionality is ever covered in cimgen, doing a parser in some languages will become almost an impossible work.

package classes

/*
Generated from the CGMES 3 files via cimgen: https://github.com/sogno-platform/cimgen
*/

//nolint:all
type Analog struct {
	Measurement
	/*
	   Analog represents an analog Measurement.

	   positiveFlowIn: If true then this measurement is an active power, reactive power or current with the convention that
	     a positive value measured at the Terminal means power is flowing into the related
	     PowerSystemResource.
	   AnalogValues: The values connected to this measurement.
	   LimitSets: A measurement may have zero or more limit ranges defined for it.
	*/

	PositiveFlowIn  *bool       `xml:"http://iec.ch/TC57/CIM100# Analog.positiveFlowIn"`
	AnalogValuesIds []Reference `xml:"http://iec.ch/TC57/CIM100# Analog.AnalogValues"`
	LimitSetsIds    []Reference `xml:"http://iec.ch/TC57/CIM100# Analog.LimitSets"`
}

What do you think about organising a call to agree on the matter, possible involving also alliander (pycgmes)?

@tom-hg57
Copy link
Collaborator

In branch https://github.com/tom-hg57/pycgmes/tree/try-cimgen-xml-generation-and-parsing I committed some fixes for ruff/pyright warnings and added example programs for your reader and writer.
In my tests I found only two real problems:

  1. The reader doesn't read TopologicalNodes from TopologicalIsland:
  <cim:TopologicalIsland rdf:ID="N">
    <cim:IdentifiedObject.name>N</cim:IdentifiedObject.name>
    <cim:TopologicalIsland.TopologicalNodes rdf:resource="#N0" />
    <cim:TopologicalIsland.TopologicalNodes rdf:resource="#N1" />
  </cim:TopologicalIsland>
  1. The writer writes a list not correctly (see examples/writer_example.py):
  <cim:TopologicalIsland rdf:ID="N">
    <cim:IdentifiedObject.name>N</cim:IdentifiedObject.name>
    <cim:TopologicalIsland.TopologicalNodes>['N0', 'N1']</cim:TopologicalIsland.TopologicalNodes>
  </cim:TopologicalIsland>

Apart from that, the reader and writer work correctly. :)

The two problems should now be solved.

The reader is okay now. But the writer isn't:

  <cim:TopologicalIsland rdf:ID="_N">
    <cim:IdentifiedObject.name>N</cim:IdentifiedObject.name>
    <cim:TopologicalIsland.TopologicalNodes>N0</cim:TopologicalIsland.TopologicalNodes>
    <cim:TopologicalIsland.TopologicalNodes>N1</cim:TopologicalIsland.TopologicalNodes>
  </cim:TopologicalIsland>

In addition to this added a leading underscore in the rdf ID to follow rdf standards.

I don't think that this is a good idea. If you read and write a CIM file the rdf:IDs shouldn't change.

Copy link
Collaborator

@tom-hg57 tom-hg57 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know most of the changes are only formatting or similar but ruff/pyright are complaining about this. Everytime I test your code with pycgmes I have to change these first.

Why don't you use pycgmes yourselve? The checks and unit tests help to find programming errors.

@@ -131,39 +150,221 @@ def cgmes_attributes_in_profile(self, profile: BaseProfile | None) -> dict[str,
for parent in reversed(self.__class__.__mro__[:-1]):
for f in fields(parent):
shortname = f.name
qualname = f"{parent.apparent_name()}.{shortname}" # type: ignore
qualname = f"{parent.apparent_name()}.{shortname}"
infos = dict()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
infos = dict()
infos = {}

"""Parsing the mRID from etree attributes"""
mrid_dict = {}
for key, value in xml_tree.attrib.items():
if key.endswith("ID") or key.endswith("about"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if key.endswith("ID") or key.endswith("about"):
if key.endswith(("ID", "about")):

Comment on lines +331 to +332
if self.__dataclass_fields__[attr_name].type == bool:
attr_value = {"true": True, "false": False}.get(attr_value, None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.__dataclass_fields__[attr_name].type == bool:
attr_value = {"true": True, "false": False}.get(attr_value, None)
assert attr_value is not None
if self.__dataclass_fields__[attr_name].type is bool:
attr_value = {"true": True, "false": False}.get(attr_value)

Comment on lines -1 to +9
from typing import List, Optional, Union

from pydantic import Field
from pydantic.dataclasses import dataclass
from .constants import NAMESPACES

from ..resources.Currency import Currency
from ..resources.UnitMultiplier import UnitMultiplier
from ..resources.UnitSymbol import UnitSymbol
from .config import cgmes_resource_config
from .constants import NAMESPACES
from .profile import BaseProfile
from ..resources.types.UnitMultiplier import UnitMultiplier
from ..resources.types.UnitSymbol import UnitSymbol
from ..resources.types.Currency import Currency
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from pydantic import Field
from pydantic.dataclasses import dataclass

from ..resources.types.Currency import Currency
from ..resources.types.UnitMultiplier import UnitMultiplier
from ..resources.types.UnitSymbol import UnitSymbol
from .config import cgmes_resource_config
from .constants import NAMESPACES
from .profile import BaseProfile

"""
if start_dict:
self.import_result = start_dict
self.import_result["meta_info"] = dict(namespaces=self._get_namespaces(xml_files[0]), urls={})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.import_result["meta_info"] = dict(namespaces=self._get_namespaces(xml_files[0]), urls={})
self.import_result["meta_info"] = {"namespaces": self._get_namespaces(xml_files[0]), "urls": {}}

Comment on lines +22 to +23
custom_profiles: list[BaseProfile] = [],
custom_namespaces: dict[str, str] = {},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
custom_profiles: list[BaseProfile] = [],
custom_namespaces: dict[str, str] = {},
custom_profiles: list[BaseProfile] | None = None,
custom_namespaces: dict[str, str] | None = None,

:param custom_namespaces: {"namespace_prefix": "namespace_uri"}

:return: Mapping of profile to outputfile.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""
"""
custom_namespaces = custom_namespaces or {}
custom_profiles = custom_profiles or []

return profile_file_map

def _generate(
self, profile: BaseProfile, model_id: str, custom_namespaces: dict[str, str] = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self, profile: BaseProfile, model_id: str, custom_namespaces: dict[str, str] = {}
self, profile: BaseProfile, model_id: str, custom_namespaces: dict[str, str] | None = None

:param custom_namespaces: {"namespace_prefix": "namespace_uri"}

:return: etree of the profile
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""
"""
custom_namespaces = custom_namespaces or {}

Comment on lines +95 to +98
if count > 0:
output = etree.ElementTree(root)
else:
output = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if count > 0:
output = etree.ElementTree(root)
else:
output = None
output = etree.ElementTree(root) if count > 0 else None

@m-mirz
Copy link
Contributor

m-mirz commented Jan 18, 2025

In my opinion, Base should only contain the things we need as a base class for the generated classes, as in all the other languages.
At least all after apparent_name is only for import/export from/to xml. These could be moved as static functions to the reader or writer or to a common file for both, maybe xml.py or similar.

@tom-hg57 will you still keep such static functions in this repo? in my opinion, writer and reader should be part of the code base. changes to the generated code may always required updates to them, making very hard to maintain them otherwise.

I guess we need to ask ourselves where we draw the line between the projects. Do we want to copy all code from cimpy, pycgmes etc. into cimgen? If not, what do we exclude and why? Where do we test code if most of it is already included in cimgen? Having Reader and Writer code in cimgen and cimpy, pycgmes… could also be very confusing.
Another direction could be to include cimgen as a submodule in these projects or to include a script that checks out a specific version of cimgen, generates the code and copies it to the right folders. We need to find a structure that facilitates development and is not too confusing for users.

At least in the case of python (then we can see case by case), what I think it would make sense is that cimgen generated code is released as packages that other projects import, and where as much as possible, interfaces expose by cimgen generated code stays unchanged so that impact on other projects are minimal.

Project like cimpy (or pycgmes) should provide additional functionalities and glue logic that is too complex to generate. For example, managing multiple version of cim, validation rules, etc. This would reduce duplications and compatibility issues among projects building on top of cimgen generated code. Which is crucial I believe for avoiding dispersion: the community is already small, duplicates, complexity in version management, ect will end up in making it's long term life complicated.

True is, for example, that is more than one year that cimgen supports 3.0, but cimpy is still with the old code.

What you are suggesting is going to have a big impact not only on cimpy but also on pycgmes. In order to avoid code duplication, we would need to remove code from cimpy, pycgmes, etc. For cimpy, we can decide to do it this way. For pycgmes, we would need to align with alliander.

Pulling reader and writer functionality into the cimgen repo means also adding the associated tests into cimgen. Not only for python but also for other languages. I am not opposed to this idea but before we start something like this, we should have a clear idea of what goes where and involve relevant stakeholders.

So, I think it is indeed a good idea to organize a call to discuss this topic and include alliander as you suggest.

Regarding CGMES 3.0, I believe that it is just a matter of focus. We have included CGMES 3.0 in libcimpp and we are also working with CGMES 3.0 in Java. For python and CGMES 3.0, there is modernpython and pycgmes. Since you have decided to create these alternatives rather than contributing to cimpy, we do not feel that it is urgent to increase development speed on the python side. We just implement what is required for our purposes when we need it rather than trying to compete with pycgmes.
In my opinion, it is much more important to facilitate the work with CGMES in other languages rather than pushing two very similar things... Python is not the best choice for all use cases.

@m-mirz
Copy link
Contributor

m-mirz commented Jan 18, 2025

To better detail (and I know this is ok for python and javascript and may not be for other languages):

* include a release workflow in the repo.

* publish packages in relevant registries.

* update dependent projects to import versioned packages from the registries.

This would simplify maintenance of dependent projects. A project is not updated? Builds will still work importing the correct package version. You don't have time to immediately update the project? Same! You have dependabot in your repo and the update is not causing breaking changes? You can automatically import the new version.

I agree. These are all good suggestions. You do not have to convince me ;-) We are lacking the time to implement these points but I would be happy to merge PRs adding these features.

@m-mirz
Copy link
Contributor

m-mirz commented Jan 18, 2025

A consideration about readers and writers: In several languages (and to some extent this is true also for pydantic - given that json parsing and serilisation is native), you get them for free if you properly annotate the generated code. For example this is the case for golang (and we have a branch for that, which was never opened as PR because we were not sure it was of interest: https://github.com/zaphiro-technologies/cimgen/tree/golang/golang). If we assume that no serialisation / parsing functionality is ever covered in cimgen, doing a parser in some languages will become almost an impossible work.

package classes

/*
Generated from the CGMES 3 files via cimgen: https://github.com/sogno-platform/cimgen
*/

//nolint:all
type Analog struct {
	Measurement
	/*
	   Analog represents an analog Measurement.

	   positiveFlowIn: If true then this measurement is an active power, reactive power or current with the convention that
	     a positive value measured at the Terminal means power is flowing into the related
	     PowerSystemResource.
	   AnalogValues: The values connected to this measurement.
	   LimitSets: A measurement may have zero or more limit ranges defined for it.
	*/

	PositiveFlowIn  *bool       `xml:"http://iec.ch/TC57/CIM100# Analog.positiveFlowIn"`
	AnalogValuesIds []Reference `xml:"http://iec.ch/TC57/CIM100# Analog.AnalogValues"`
	LimitSetsIds    []Reference `xml:"http://iec.ch/TC57/CIM100# Analog.LimitSets"`
}

What do you think about organising a call to agree on the matter, possible involving also alliander (pycgmes)?

Of course we would include a PR going in this direction. We are not dogmatic about the reader/writer part. If the reader/writer does not require additional code and we are not duplicating code that is already included in another repo (e.g. cimpy, pycgmes), it is a no-brainer to accept this PR :-)

@tom-hg57 already convinced me some weeks ago to move code from libcimpp into cimgen (not reader/writer related but "base" classes). If there are good reasons to change the approach, we are open for discussion.

I would be happy to join a meeting to discuss the general direction.

@chicco785
Copy link
Collaborator

I agree. These are all good suggestions. You do not have to convince me ;-) We are lacking the time to implement these points but I would be happy to merge PRs adding these features.

Of course, we are happy to open pr to publish libraries for python in a public registry to start with, then we can see if that seems a good way and understand how to do for the rest

@chicco785
Copy link
Collaborator

@tom-hg57 @m-mirz @guillaume-alliander

here is a doodle: https://doodle.com/meeting/organize/id/en7o1PWa

(I am posting here, since I am missing Markus email)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants