Scope of the limited API #41

Open
encukou opened this issue Nov 13, 2023 · 3 comments

@encukou

encukou commented Nov 13, 2023

Victor requested adding the new PyList_Extend and PyList_Clear to the limited API: python/cpython#111862

This API can be trivially replaced by PyObject_CallMethod/PyObject_VectorcallMethod -- or in this case, by PyList_SetSlice.

I would prefer not to add API that has such a trivial Python equivalent, unless there's a clear need for it (e.g. for performance) in third-party projects.

@vstinner

Oh, I just saw this issue 10 min after creating #42, which is similar.

> This API can be trivially replaced by PyObject_CallMethod/PyObject_VectorcallMethod -- or in this case, by PyList_SetSlice.

As a user who wants to write a C extension using the limited C API, I expect a convenient API and to be able to distribute a single wheel binary (per platform-architecture combination). "Convenient" is obviously hard to define here.

For example, for me, PyList_Extend(list, arg) is more convenient than having to call PyList_SetSlice(list, PY_SSIZE_T_MAX, PY_SSIZE_T_MAX, arg), where PY_SSIZE_T_MAX looks like a magic constant to me. In Python, I prefer calling list.extend(arg) to writing list[len(list):] = arg.
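To make the comparison concrete, here is a minimal Python sketch of the three spellings of "append all items". The function names are hypothetical and exist only for this illustration. It relies on sys.maxsize being the Python-level value of PY_SSIZE_T_MAX and on slice bounds being clamped to the list length.

```python
import sys


def extend_with_method(items, extra):
    # The spelling most people reach for.
    items.extend(extra)
    return items


def extend_with_len_slice(items, extra):
    # Assign to the empty slice that starts at the end of the list.
    items[len(items):] = extra
    return items


def extend_with_max_slice(items, extra):
    # sys.maxsize equals PY_SSIZE_T_MAX; out-of-range slice bounds are
    # clamped to len(items), so this also targets the end of the list.
    items[sys.maxsize:sys.maxsize] = extra
    return items


if __name__ == "__main__":
    for func in (extend_with_method, extend_with_len_slice, extend_with_max_slice):
        print(func.__name__, func([1, 2], (3, 4)))
```

All three produce the same list; the difference under discussion is only how readable each spelling is.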

If your goal is to write a bare minimum API such as a Native Interface API, then obviously all "convenient" APIs should go. For me, the Native Interface API is more for machines, and the limited C API is more for humans who write code by hand.


Another criterion is performance: PyObject_CallMethod(list, "extend", "O", arg) is slower than PyList_Extend(), since the bytes string "extend" has to be decoded from UTF-8 to create a temporary Python str object, the "O" format string must be parsed to create an array of C arguments, and at the end the temporary Python str object must be destroyed.

Example: a micro-benchmark on _PyLong_AsByteArray(), used to discuss whether this function deserves to become a public C API rather than calling Python int.to_bytes() from C.

When considering whether to make _PyLong_GCD() a public function, Serhiy wrote:

> If direct call of _PyLong_GCD() makes the code 7% faster than using math.gcd(), it is perhaps not worth. Note that many methods of builtin types (like str.upper()) are not exposed in the C API. General Python object or method call API is the way to use them.


IMO we should also take into account how commonly an API is used. Very commonly used APIs deserve a public API, whereas having to go through PyObject_Call...() is acceptable for rarely used APIs. That's related to providing a "convenient" API.

By the way, when the PyFrameObject members were removed from the Python 3.11 C API, I documented how to update code using PyObject_GetAttrString(frame, "<member name>"). But apparently, performance also matters, so a dedicated getter function was added for each removed member to avoid having to call the slow PyObject_GetAttrString() function.


My goal for the long term would be to treat the "C API" basically as the limited C API: that the limited C API becomes the default.

That's why I'm trying to clarify what's "private" or "internal" in the "public C API".

I don't know if it's possible, and I expect that the "C API" will always be larger than the "limited C API", since some APIs are never going to enter the limited C API by design, such as the PyTypeObject structure members.

For me, there are two clear usages of "the C API":

  • C extensions that use the public C API or the limited C API to extend Python: they accept worse performance than with the internal C API and, in exchange, expect to update their code only rarely and (in the case of the limited C API) to ship a single binary (per platform+architecture).
  • C extensions that use the internal C API for best performance and accept having to be updated more frequently.

This separation becomes more visible in Cython, which has started to support generating C code targeting the limited C API. Cython users can now choose their profile: stability/portability or performance.

My concern is that Python has many API layers:

  • Limited C API
  • Public C API
  • Unstable C API
  • Private C API
  • Internal C API

IMO it's very confusing for everybody :-( It would be better to only have two main layers:

  • Limited C API -- stable and "portable" (across CPython versions)
  • Internal C API -- unstable and fast

@encukou

encukou commented Nov 14, 2023

In the meantime, people can use pythoncapi-compat to get the macro to define PyList_Extend in terms of PyList_SetSlice. Or define the macro themselves. On any version, limited or unlimited.
There is no need to rush.

@vstinner

> In the meantime, people can use pythoncapi-compat to get the macro to define PyList_Extend in terms of PyList_SetSlice. Or define the macro themselves. On any version, limited or unlimited.

Maybe a guideline can be defined from that? Is it possible and "not too complicated" (how can that be measured? number of lines?) to reimplement the needed feature using the existing limited C API?

In the case of PyList_Extend() and PyList_Clear(), there is a way:

#define PyList_Extend(list, arg) PyList_SetSlice((list), PY_SSIZE_T_MAX, PY_SSIZE_T_MAX, (arg))
#define PyList_Clear(list) PyList_SetSlice((list), 0, PY_SSIZE_T_MAX, NULL)
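The behaviour of the two macros above can be checked without compiling an extension. The following is a rough sketch that calls the real PyList_SetSlice through ctypes.pythonapi, assuming a standard CPython build. The helper names _set_slice, list_extend and list_clear are hypothetical and only mirror the macros; an actual extension would use the C macros directly.

```python
import ctypes
import sys

# PY_SSIZE_T_MAX in C has the same value as sys.maxsize in Python.
PY_SSIZE_T_MAX = sys.maxsize

# ctypes.pythonapi exposes the C API of the running interpreter (CPython only).
_PyList_SetSlice = ctypes.pythonapi.PyList_SetSlice
_PyList_SetSlice.restype = ctypes.c_int  # 0 on success, -1 on error


def _set_slice(lst, low, high, itemlist):
    # Passing Python None to a ctypes call sends a C NULL pointer, which
    # PyList_SetSlice interprets as "delete this slice".
    items = None if itemlist is None else ctypes.py_object(itemlist)
    return _PyList_SetSlice(
        ctypes.py_object(lst),
        ctypes.c_ssize_t(low),
        ctypes.c_ssize_t(high),
        items,
    )


def list_extend(lst, arg):
    # Mirrors: PyList_SetSlice((list), PY_SSIZE_T_MAX, PY_SSIZE_T_MAX, (arg))
    return _set_slice(lst, PY_SSIZE_T_MAX, PY_SSIZE_T_MAX, arg)


def list_clear(lst):
    # Mirrors: PyList_SetSlice((list), 0, PY_SSIZE_T_MAX, NULL)
    return _set_slice(lst, 0, PY_SSIZE_T_MAX, None)


if __name__ == "__main__":
    data = [1, 2]
    list_extend(data, (3, 4))
    print(data)
    list_clear(data)
    print(data)
```

Both helpers return the C function's status code, so a caller can check for 0 just as C code using the macros would.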

pythoncapi-compat doesn't target the limited C API. Many of its functions are implemented using functions that are excluded from the limited C API, especially private functions.
