API: Expose the dtype C API #25754

Merged
merged 16 commits into from
Feb 14, 2024
Changes from 14 commits
13 changes: 13 additions & 0 deletions doc/release/upcoming_changes/25754.c_api.rst
@@ -0,0 +1,13 @@
New Public DType API
--------------------

The C implementation of the NEP 42 DType API is now public. While the DType API
has shipped in NumPy for a few versions, it was only usable in sessions with a
special environment variable set. It is now possible to write custom DTypes
outside of NumPy using the new DType API and the normal ``import_array()``
mechanism for importing the numpy C API.

See :ref:`dtype-api` for more details about the API. As always with a new
feature, please report any bugs you run into implementing or using a new
DType. It is likely that downstream C code that works with dtypes will need to
be updated to work correctly with new DTypes.
574 changes: 565 additions & 9 deletions doc/source/reference/c-api/array.rst

Large diffs are not rendered by default.

214 changes: 205 additions & 9 deletions doc/source/reference/c-api/types-and-structures.rst
@@ -48,14 +48,17 @@ supportive role: the :c:data:`PyArrayIter_Type`, the
. The :c:data:`PyArrayIter_Type` is the type for a flat iterator for an
ndarray (the object that is returned when getting the flat
attribute). The :c:data:`PyArrayMultiIter_Type` is the type of the
object returned when calling ``broadcast`` (). It handles iteration
and broadcasting over a collection of nested sequences. Also, the
object returned when calling ``broadcast`` (). It handles iteration and
Contributor

Why the ()? (not new, I know)

Member Author

no idea, deleted it

broadcasting over a collection of nested sequences. Also, the
:c:data:`PyArrayDescr_Type` is the data-type-descriptor type whose
instances describe the data. Finally, there are 21 new scalar-array
instances describe the data and :c:data:`PyArray_DTypeMeta` is the
metaclass for data-type descriptors. There are also new scalar-array
types which are new Python scalars corresponding to each of the
fundamental data types available for arrays. An additional 10 other
types are place holders that allow the array scalars to fit into a
hierarchy of actual Python types.
fundamental data types available for arrays. Additional types are
placeholders that allow the array scalars to fit into a hierarchy of
actual Python types. Finally, the :c:data:`PyArray_DTypeMeta` instances
corresponding to the NumPy built-in data types are also publicly
visible.


PyArray_Type and PyArrayObject
@@ -591,11 +594,11 @@ PyArrayDescr_Type and PyArray_Descr
A pointer to a function that scans (scanf style) one element
of the corresponding type from the file descriptor ``fd`` into
the array memory pointed to by ``ip``. The array is assumed
to be behaved.
to be behaved.
The last argument ``arr`` is the array to be scanned into.
Returns number of receiving arguments successfully assigned (which
may be zero in case a matching failure occurred before the first
receiving argument was assigned), or EOF if input failure occurs
receiving argument was assigned), or EOF if input failure occurs
before the first receiving argument was assigned.
This function should be called without holding the Python GIL, and
has to grab it for error reporting.
@@ -714,6 +717,199 @@ The :c:data:`PyArray_Type` can also be sub-typed.
:c:func:`PyUFunc_ReplaceLoopBySignature` The ``tp_str`` and ``tp_repr``
methods can also be altered using :c:func:`PyArray_SetStringFunction`.

.. _arraymethod-structs:

PyArrayMethod_Context and PyArrayMethod_Spec
--------------------------------------------

.. c:type:: PyArrayMethodObject_tag

An opaque struct used to represent the method "self" in ArrayMethod loops.

.. c:type:: PyArrayMethod_Context

A struct that is passed in to ArrayMethod loops to provide context for the
runtime usage of the loop.

.. code-block:: c

typedef struct {
    PyObject *caller;
    struct PyArrayMethodObject_tag *method;
    PyArray_Descr **descriptors;
} PyArrayMethod_Context;

.. c:member:: PyObject *caller

The caller, which is typically the ufunc that called the loop. May be
``NULL`` when a call is not from a ufunc (e.g. casts).

.. c:member:: struct PyArrayMethodObject_tag *method

The method "self". Currently this object is an opaque pointer.

.. c:member:: PyArray_Descr **descriptors

An array of descriptors for the ufunc loop, filled in by
``resolve_descriptors``. The length of the array is ``nin`` + ``nout``.

.. c:type:: PyArrayMethod_Spec

A struct used to register an ArrayMethod with NumPy. We use the slots
mechanism used by the Python limited API. See below for the slot definitions.

.. code-block:: c

typedef struct {
    const char *name;
    int nin, nout;
    NPY_CASTING casting;
    NPY_ARRAYMETHOD_FLAGS flags;
    PyArray_DTypeMeta **dtypes;
    PyType_Slot *slots;
} PyArrayMethod_Spec;

.. c:member:: const char *name

The name of the loop.

.. c:member:: int nin

The number of input operands.

.. c:member:: int nout

The number of output operands.

.. c:member:: NPY_CASTING casting

Used to indicate how minimally permissive a casting operation should
be. For example, if a cast operation might in some circumstances be safe,
but in others unsafe, then ``NPY_UNSAFE_CASTING`` should be set. Not used
for ufunc loops but must still be set.

.. c:member:: NPY_ARRAYMETHOD_FLAGS flags

The flags set for the method.

.. c:member:: PyArray_DTypeMeta **dtypes

The DTypes for the loop. Must be ``nin`` + ``nout`` in length.

.. c:member:: PyType_Slot *slots

An array of slots for the method. Slot IDs must be one of the values
below.
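
The length rule stated above (``dtypes`` must hold ``nin`` + ``nout`` entries, the same count as the ``descriptors`` array in the context) can be illustrated with a small Python sketch. This is only a toy model: ``MethodSpecSketch`` and ``check_spec`` are hypothetical names invented here, the DTypes are represented by plain strings, and nothing below calls the NumPy C API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MethodSpecSketch:
    """Hypothetical Python mirror of a few PyArrayMethod_Spec fields."""
    name: str
    nin: int
    nout: int
    # One entry per operand: inputs first, then outputs (strings stand in for DTypes).
    dtypes: List[str] = field(default_factory=list)


def check_spec(spec: MethodSpecSketch) -> int:
    """Return the operand count, or raise if ``dtypes`` has the wrong length."""
    expected = spec.nin + spec.nout
    if len(spec.dtypes) != expected:
        raise ValueError(
            f"{spec.name}: expected {expected} dtypes, got {len(spec.dtypes)}"
        )
    return expected


# A binary loop with one output needs three DType entries.
add_spec = MethodSpecSketch("add", nin=2, nout=1,
                            dtypes=["Float64", "Float64", "Float64"])
operand_count = check_spec(add_spec)
```

A spec with two inputs, one output and only one DType entry would make ``check_spec`` raise ``ValueError`` in this sketch.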

.. _dtypemeta:

PyArray_DTypeMeta and PyArrayDTypeMeta_Spec
-------------------------------------------

.. c:type:: PyArray_DTypeMeta

A largely opaque struct representing DType classes. Each instance defines a
metaclass for a single NumPy data type. Data types can either be
non-parametric or parametric. For non-parametric types, the DType class has
a one-to-one correspondence with the descriptor instance created from the
DType class. Parametric types can correspond to many different dtype
instances depending on the chosen parameters. This type is available in the
public ``numpy/dtype_api.h`` header. Currently use of this struct is not
supported in the limited CPython API, so if ``Py_LIMITED_API`` is set, this
type is a typedef for ``PyTypeObject``.

.. code-block:: c

typedef struct {
    PyHeapTypeObject super;
    PyArray_Descr *singleton;
    int type_num;
    PyTypeObject *scalar_type;
    npy_uint64 flags;
    void *dt_slots;
    void *reserved[3];
} PyArray_DTypeMeta;

.. c:member:: PyHeapTypeObject super

The superclass, providing hooks into the python object
API. Set members of this struct to fill in the functions
implementing the ``PyTypeObject`` API (e.g. ``tp_new``).

.. c:member:: PyArray_Descr *singleton

A descriptor instance suitable for use as a singleton
descriptor for the data type. This is useful for
non-parametric types representing simple plain old data type
where there is only one logical descriptor instance for all
data of the type. Can be NULL if a singleton instance is not
appropriate.

.. c:member:: int type_num

Corresponds to the type number for legacy data types. Data
types defined outside of NumPy and possibly future data types
shipped with NumPy will have ``type_num`` set to -1, so this
should be relied on to discriminate between data types.
Contributor

should not be relied on?


.. c:member:: PyTypeObject *scalar_type

The type of scalar instances for this data type.

.. c:member:: npy_uint64 flags

Flags can be set to indicate to NumPy that this data type
has optional behavior. See :ref:`dtype-flags` for a listing of
allowed flag values.

.. c:member:: void* dt_slots

An opaque pointer to a private struct containing
implementations of functions in the DType API. This is filled
in from the ``slots`` member of the ``PyArrayDTypeMeta_Spec``
instance used to initialize the DType.

.. c:type:: PyArrayDTypeMeta_Spec

A struct used to initialize a new DType with the
``PyArrayInitDTypeMeta_FromSpec`` function.

.. code-block:: c

typedef struct {
    PyTypeObject *typeobj;
    int flags;
    PyArrayMethod_Spec **casts;
    PyType_Slot *slots;
    PyTypeObject *baseclass;
} PyArrayDTypeMeta_Spec;

.. c:member:: PyTypeObject *typeobj

Either ``NULL`` or the type of the python scalar associated with
the DType. Scalar indexing into an array returns an item with this
type.

.. c:member:: int flags

Static flags for the DType class, indicating whether the DType is
parametric, abstract, or represents numeric data. The latter is
optional but is useful to set to indicate to downstream code if
the DType represents data that are numbers (ints, floats, or other
numeric data type) or something else (e.g. a string, unit, or
date).

.. c:member:: PyArrayMethod_Spec **casts

A ``NULL``-terminated array of ArrayMethod specifications for
casts defined by the DType.

.. c:member:: PyType_Slot *slots

A ``NULL``-terminated array of slot specifications for implementations
of functions in the DType API. Slot IDs must be one of the
DType slot IDs enumerated in :ref:`dtype-slots`.
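
The relationship described in this section (a metaclass whose instances are DType classes, non-parametric classes with one logical descriptor, parametric classes with many) can be mimicked in pure Python. The sketch below is a toy model under stated assumptions: ``DTypeMetaSketch``, ``Float64Sketch`` and ``BytesSketch`` are hypothetical names, and the code is not how NumPy implements ``PyArray_DTypeMeta``.

```python
class DTypeMetaSketch(type):
    """Toy stand-in for the metaclass role played by PyArray_DTypeMeta."""

    def __new__(mcls, name, bases, namespace, parametric=False):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.parametric = parametric
        # Analogue of the ``singleton`` member: filled lazily, and left as
        # None for parametric classes, where no single instance is appropriate.
        cls.singleton = None
        return cls

    def __init__(cls, name, bases, namespace, parametric=False):
        # Swallow the ``parametric`` keyword; type.__init__ does not accept it.
        super().__init__(name, bases, namespace)

    def __call__(cls, *params):
        if cls.parametric:
            # Every call creates a descriptor-like instance for these parameters.
            return super().__call__(*params)
        if params:
            raise TypeError(f"{cls.__name__} takes no parameters")
        if cls.singleton is None:
            cls.singleton = super().__call__()
        return cls.singleton


class Float64Sketch(metaclass=DTypeMetaSketch):
    """Non-parametric: one class, one logical descriptor instance."""


class BytesSketch(metaclass=DTypeMetaSketch, parametric=True):
    """Parametric: one class, many instances that differ by length."""

    def __init__(self, length):
        self.length = length


same_descriptor = Float64Sketch() is Float64Sketch()
same_class = type(BytesSketch(5)) is type(BytesSketch(10))
distinct_instances = BytesSketch(5) is not BytesSketch(10)
```

In this model the class plays the role of the DType and its instances play the role of descriptors, which is why the non-parametric class hands back a cached instance while the parametric one does not.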


PyUFunc_Type and PyUFuncObject
------------------------------
@@ -1175,7 +1371,7 @@ are ``Py{TYPE}ArrType_Type`` where ``{TYPE}`` can be
**Bool**, **Byte**, **Short**, **Int**, **Long**, **LongLong**,
**UByte**, **UShort**, **UInt**, **ULong**, **ULongLong**,
**Half**, **Float**, **Double**, **LongDouble**, **CFloat**,
**CDouble**, **CLongDouble**, **String**, **Unicode**, **Void**,
**CDouble**, **CLongDouble**, **String**, **Unicode**, **Void**,
**Datetime**, **Timedelta**, and **Object**.

These type names are part of the C-API and can therefore be created in
8 changes: 3 additions & 5 deletions numpy/_core/_add_newdocs.py
@@ -5716,11 +5716,9 @@
} ufunc_call_info;

Note that the first call only fills in the ``context``. The call to
``_get_strided_loop`` fills in all other data.
Please see the ``numpy/experimental_dtype_api.h`` header for exact
call information; the main thing to note is that the new-style loops
return 0 on success, -1 on failure. They are passed context as new
first input and ``auxdata`` as (replaced) last.
``_get_strided_loop`` fills in all other data. The main thing to note is
that the new-style loops return 0 on success, -1 on failure. They are
passed context as new first input and ``auxdata`` as (replaced) last.

Only the ``strided_loop`` signature is considered guaranteed stable
for NumPy bug-fix releases. All other API is tied to the experimental
2 changes: 1 addition & 1 deletion numpy/_core/code_generators/cversions.txt
@@ -73,5 +73,5 @@
0x00000011 = ca1aebdad799358149567d9d93cbca09

# Version 18 (NumPy 2.0.0)
0x00000012 = 61b22f14088110916885b5560ad1018d
0x00000012 = e1a344a47c4449f5fd0e3e5c7cd7a4ed

6 changes: 5 additions & 1 deletion numpy/_core/code_generators/genapi.py
@@ -63,6 +63,7 @@ def get_processor():
join('multiarray', 'dlpack.c'),
join('multiarray', 'dtypemeta.c'),
join('multiarray', 'einsum.c.src'),
join('multiarray', 'public_dtype_api.c'),
join('multiarray', 'flagsobject.c'),
join('multiarray', 'getset.c'),
join('multiarray', 'item_selection.c'),
@@ -83,10 +84,13 @@ def get_processor():
join('multiarray', 'stringdtype', 'static_string.c'),
join('multiarray', 'strfuncs.c'),
join('multiarray', 'usertypes.c'),
join('umath', 'dispatching.c'),
join('umath', 'extobj.c'),
join('umath', 'loops.c.src'),
join('umath', 'reduction.c'),
join('umath', 'ufunc_object.c'),
join('umath', 'ufunc_type_resolution.c'),
join('umath', 'reduction.c'),
join('umath', 'wrapping_array_method.c'),
]
THIS_DIR = os.path.dirname(__file__)
API_FILES = [os.path.join(THIS_DIR, '..', 'src', a) for a in API_FILES]
8 changes: 7 additions & 1 deletion numpy/_core/code_generators/generate_numpy_api.py
@@ -48,6 +48,12 @@

%s

/*
* The DType classes are inconvenient for the Python generation so exposed
* manually in the header below (may be moved).
*/
#include "numpy/_public_dtype_api_table.h"

#if !defined(NO_IMPORT_ARRAY) && !defined(NO_IMPORT)
static int
_import_array(void)
@@ -86,7 +92,7 @@
* We do not support older NumPy versions at all.
*/
if (sizeof(Py_ssize_t) != sizeof(Py_intptr_t) &&
PyArray_GetNDArrayCFeatureVersion() < NPY_2_0_API_VERSION) {
PyArray_RUNTIME_VERSION < NPY_2_0_API_VERSION) {
PyErr_Format(PyExc_RuntimeError,
"module compiled against NumPy 2.0 but running on NumPy 1.x. "
"Unfortunately, this is not supported on niche platforms where "
15 changes: 14 additions & 1 deletion numpy/_core/code_generators/numpy_api.py
@@ -91,6 +91,10 @@ def get_annotations():
'PyHalfArrType_Type': (217,),
'NpyIter_Type': (218,),
# End 1.6 API
# NOTE: The Slots 320-360 are defined in `_experimental_dtype_api.h`
# and filled explicitly outside the code generator as the metaclass
# makes them tricky to expose. (This may be refactored.)
# End 2.0 API
}

# define NPY_NUMUSERTYPES (*(int *)PyArray_API[6])
@@ -102,7 +106,7 @@ def get_annotations():
1, 4, 40, 41, 65, 66, 67, 68, 81, 82, 83,
103, 115, 117, 122, 163, 164, 171, 173, 197,
201, 202, 208, 219, 278, 291, 293, 294, 295,
301],
301] + list(range(320, 361)), # range reserves DType class slots
'PyArray_GetNDArrayCVersion': (0,),
# Unused slot 40, was `PyArray_SetNumericOps`
# Unused slot 41, was `PyArray_GetNumericOps`,
@@ -388,6 +392,11 @@ def get_annotations():
'NpyString_acquire_allocators': (317, MinVersion("2.0")),
'NpyString_release_allocator': (318, MinVersion("2.0")),
'NpyString_release_allocators': (319, MinVersion("2.0")),
# Slots 320-360 reserved for DType classes (see comment in types)
'PyArray_GetDefaultDescr': (361, MinVersion("2.0")),
'PyArrayInitDTypeMeta_FromSpec': (362, MinVersion("2.0")),
'PyArray_CommonDType': (363, MinVersion("2.0")),
'PyArray_PromoteDTypeSequence': (364, MinVersion("2.0")),
# End 2.0 API
}
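
As a quick sanity check on the slot layout in these hunks, the following sketch recomputes the reserved DType class range and confirms that the four new function slots sit just above it. The slot numbers are copied from the diff; ``find_collisions`` is a hypothetical helper written for this illustration and is not part of the NumPy code generators.

```python
# Slots reserved for DType classes, as written in the diff: 320..360 inclusive.
RESERVED_DTYPE_SLOTS = list(range(320, 361))

# New multiarray function slots added in the same file (numbers from the diff).
NEW_FUNCTION_SLOTS = {
    "PyArray_GetDefaultDescr": 361,
    "PyArrayInitDTypeMeta_FromSpec": 362,
    "PyArray_CommonDType": 363,
    "PyArray_PromoteDTypeSequence": 364,
}


def find_collisions(reserved, assigned):
    """Return the names whose slot number falls inside the reserved range."""
    reserved_set = set(reserved)
    return sorted(name for name, slot in assigned.items() if slot in reserved_set)


reserved_count = len(RESERVED_DTYPE_SLOTS)
collisions = find_collisions(RESERVED_DTYPE_SLOTS, NEW_FUNCTION_SLOTS)
```

Because ``range(320, 361)`` excludes its upper bound, the reservation ends at slot 360, which is why the first new function can take slot 361.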

@@ -444,6 +453,10 @@ def get_annotations():
# End 1.8 API
'PyUFunc_FromFuncAndDataAndSignatureAndIdentity': (42, MinVersion("1.16")),
# End 1.16 API
'PyUFunc_AddLoopFromSpec': (43, MinVersion("2.0")),
'PyUFunc_AddPromoter': (44, MinVersion("2.0")),
'PyUFunc_AddWrappingLoop': (45, MinVersion("2.0")),
'PyUFunc_GiveFloatingpointErrors': (46, MinVersion("2.0")),
}

# List of all the dicts which define the C API
4 changes: 2 additions & 2 deletions numpy/_core/include/meson.build
@@ -1,9 +1,9 @@
installed_headers = [
'numpy/_neighborhood_iterator_imp.h',
'numpy/_public_dtype_api_table.h',
'numpy/arrayobject.h',
'numpy/arrayscalars.h',
'numpy/_dtype_api.h',
'numpy/experimental_dtype_api.h',
'numpy/dtype_api.h',
'numpy/halffloat.h',
'numpy/ndarrayobject.h',
'numpy/ndarraytypes.h',