Add buffer support to c-api by bschoenmaeckers · Pull Request #8071 · RustPython/RustPython · GitHub
Skip to content

Add buffer support to c-api#8071

Draft
bschoenmaeckers wants to merge 1 commit into
RustPython:mainfrom
bschoenmaeckers:c-api-buffer
Draft

Add buffer support to c-api#8071
bschoenmaeckers wants to merge 1 commit into
RustPython:mainfrom
bschoenmaeckers:c-api-buffer

Conversation

@bschoenmaeckers

@bschoenmaeckers bschoenmaeckers commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • New Features
    • Added comprehensive buffer protocol support to the C-API, allowing C extensions to acquire buffers from Python objects, manage buffer lifecycles, validate contiguity constraints, compute element pointers, and efficiently copy buffer data across different memory layouts.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/capi/src/pybuffer.rs`:
- Around line 363-366: The functions PyBuffer_ToContiguous and
PyBuffer_FromContiguous currently bail out when
is_contiguous_for_order(view_ref, ...) returns false and then assume contiguous
memory by using dst.copy_from_slice(view_ref.buf), so they fail for
strided/indirect buffers; update both functions to handle non-contiguous views
by iterating over the view_ref (using its shape/strides/len/offset fields) and
performing element-wise reads/writes to pack/unpack into the contiguous dst
buffer according to the requested order instead of calling dst.copy_from_slice;
keep the existing contiguous fast-path (when is_contiguous_for_order is true)
and implement a fallback that computes the source/destination index using
view_ref.strides and itemsize to correctly handle strided/indirect layouts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9a37bdf8-bace-453f-b4fd-7fe323270fca

📥 Commits

Reviewing files that changed from the base of the PR and between 7f75ef1 and 9056150.

📒 Files selected for processing (2)
  • crates/capi/src/lib.rs
  • crates/capi/src/pybuffer.rs

Comment on lines +363 to +366
if !is_contiguous_for_order(view_ref, (order as u8).to_ascii_uppercase()) {
return Err(vm.new_buffer_error(
"PyBuffer_ToContiguous only supports contiguous exported buffers",
));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la
rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

# Print around the cited lines
sed -n '330,430p' crates/capi/src/pybuffer.rs | cat -n

# Also show definition of is_contiguous_for_order if present elsewhere in file
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-40))
  end=$((start+120))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

# Search for any code path that handles non-contiguous (e.g., flatten/strides) near these functions
rg -n "strid|stride|shape|indirect|view.*contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 16604


🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la
rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

# Print around the cited lines
sed -n '330,430p' crates/capi/src/pybuffer.rs | cat -n

# Also show definition of is_contiguous_for_order if present elsewhere in file
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-40))
  end=$((start+120))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

# Search for any code path that handles non-contiguous (e.g., flatten/strides) near these functions
rg -n "strid|stride|shape|indirect|view.*contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 16604


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

echo "---- around 363-406 ----"
sed -n '340,420p' crates/capi/src/pybuffer.rs | cat -n

echo "---- definition of is_contiguous_for_order ----"
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-60))
  end=$((start+180))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

echo "---- search for flatten/strided logic in file ----"
rg -n "strid|stride|flatten|indirect|non[- ]contig|non[- ]contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 6988


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

echo "---- around 363-406 ----"
sed -n '340,420p' crates/capi/src/pybuffer.rs | cat -n

echo "---- definition of is_contiguous_for_order ----"
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-60))
  end=$((start+180))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

echo "---- search for flatten/strided logic in file ----"
rg -n "strid|stride|flatten|indirect|non[- ]contig|non[- ]contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 6988


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "PyBuffer_(To|From)Contiguous" -S .
rg -n "ToContiguous\(" -S crates src Lib || true
rg -n "FromContiguous\(" -S crates src Lib || true

Repository: RustPython/RustPython

Length of output: 1556


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

PyBuffer_ToContiguous is a CPython C-API function used to copy data from a source Py_buffer structure into a contiguous destination memory block [1][2]. Its primary purpose is to flatten potentially non-contiguous, strided, or indirect (suboffset-containing) buffer data into a standard, tightly packed C-style or Fortran-style representation [1][3]. The function signature is: int PyBuffer_ToContiguous(void *buf, const Py_buffer *src, Py_ssize_t len, char order) Key semantics and behaviors include: 1. Contiguity Handling: The function first checks if the source buffer (src) is already contiguous according to the requested order ('C', 'F', or 'A' for either) using PyBuffer_IsContiguous [1][3]. If it is already contiguous, it performs a high-speed memcpy directly from the source buffer's memory [3]. 2. Complex Buffers (Strided/Indirect): If the source buffer is not already in the requested contiguous format—such as when it features non-standard strides, multi-dimensional layouts, or suboffsets (indirect pointers)—the function invokes a more complex internal copying routine [3]. This routine systematically iterates through the buffer's dimensions, strides, and suboffsets to reconstruct the data into a linear sequence in the destination buffer (buf) [3][2]. 3. Safety and Validation: - The function requires that the requested length (len) exactly matches the total length of the source buffer (src->len) [1][4][3]. If these do not match, the function fails and returns -1 [1][3]. - It ensures that the destination buffer has sufficient space, though in practice, the caller is responsible for providing a destination buffer of the correct size (matching src->len) [4][3]. 4. Usage Notes: - Order: The order parameter specifies the target memory layout ('C' for row-major, 'F' for column-major) [1][2]. - Error Handling: It returns 0 on success and -1 on error [1][5]. - Evolution: Historical implementations of this function had bugs regarding non-contiguous arrays (e.g., issue #12834) and off-by-one errors [6][7]. Modern CPython versions (3.3+) have significantly improved these routines to handle complex buffer layouts correctly by using an internal, recursive copying mechanism [8][6][3]. Essentially, PyBuffer_ToContiguous acts as a bridge for C-extensions to normalize disparate memory layouts into a predictable format required by many low-level libraries or I/O operations [2].

Citations:


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

PyBuffer_ToContiguous is a CPython C-API function used to copy data from an exported buffer (Py_buffer) into a flat, contiguous destination buffer (buf) [1][2]. Semantics for Non-Contiguous, Strided, and Indirect Buffers: 1. Contiguity Check: The function first checks if the source buffer is already contiguous in the requested order ('C' for C-style, 'F' for Fortran-style, or 'A' for either) using PyBuffer_IsContiguous [1][3]. If it is, a direct memory copy (memcpy) is performed [3][4]. 2. Non-Contiguous/Strided Handling: If the buffer is not contiguous, the function employs a more complex procedure to pack the data [3][4]. It iterates over the logical structure of the buffer using its shape, strides, and suboffsets to extract elements and place them sequentially into the destination buffer [3]. 3. Indirect Buffers (Suboffsets): If a buffer is indirect (i.e., has suboffsets), PyBuffer_ToContiguous correctly handles the dereferencing required to access the actual data pointed to by the suboffset pointers [5][3]. 4. Requirements: The caller must provide a destination buffer of at least the size specified by the source buffer's len field. The function will fail and return -1 if the provided length does not exactly match the source buffer's total length (src->len) [1][6][3]. In essence, PyBuffer_ToContiguous acts as a formatter that linearizes complex, multi-dimensional, or strided memory layouts into a flat byte sequence based on the requested memory order [3][4]. It is the underlying mechanism used by memoryview.tobytes to resolve non-contiguous memory structures [7][4].

Citations:


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

In CPython, the PyBuffer_ToContiguous function is used to copy data from a Py_buffer structure into a destination buffer as a contiguous byte array [1][2]. Its semantics regarding buffer types are as follows: 1. Contiguous Buffers: If the source buffer is already contiguous in the requested order (C-style, Fortran-style, or either), the function effectively performs a memory copy (e.g., memcpy) to the destination [3]. 2. Non-Contiguous and Strided Buffers: If the buffer is non-contiguous or employs strides, the function automatically reconstructs the logical data layout into a contiguous representation within the provided destination buffer [3]. It traverses the buffer according to its shape and strides, gathering the non-contiguous elements into a packed, contiguous memory block [3]. 3. Indirect Buffers: Buffers that utilize suboffsets (indirect buffers) are inherently considered non-contiguous in CPython's buffer protocol [4]. PyBuffer_ToContiguous handles these by resolving the pointers to collect the data into the requested contiguous format [3]. Function Signature: int PyBuffer_ToContiguous(void *buf, const Py_buffer *src, Py_ssize_t len, char order) Parameters: - buf: The destination buffer where the contiguous data will be copied. - src: The source Py_buffer containing the data (which may be strided or indirect). - len: The total number of bytes to copy. This must match src->len [1][2]. - order: The desired layout order: 'C' (C-style), 'F' (Fortran-style), or 'A' (either one) [1][2]. Return Value: Returns 0 on success and -1 on error [1][2]. Historical Note: The function has historically been subject to bugs related to off-by-one errors and incorrect handling of complex buffer layouts, particularly when dealing with non-contiguous or multi-dimensional arrays [5][3]. While these issues have been addressed in modern CPython versions, users should ensure they are using updated versions of the interpreter if relying on this API for complex buffer manipulations [5][3].

Citations:


Fix PyBuffer_{To,From}Contiguous to support non-contiguous (strided/indirect) buffers
crates/capi/src/pybuffer.rs currently rejects any view failing is_contiguous_for_order(...) (lines 363-366 and 403-406) and then performs a raw dst.copy_from_slice(src) over view_ref.buf, so strided/indirect layouts can’t be packed/filled as PyBuffer_{To,From}Contiguous is expected to do.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pybuffer.rs` around lines 363 - 366, The functions
PyBuffer_ToContiguous and PyBuffer_FromContiguous currently bail out when
is_contiguous_for_order(view_ref, ...) returns false and then assume contiguous
memory by using dst.copy_from_slice(view_ref.buf), so they fail for
strided/indirect buffers; update both functions to handle non-contiguous views
by iterating over the view_ref (using its shape/strides/len/offset fields) and
performing element-wise reads/writes to pack/unpack into the contiguous dst
buffer according to the requested order instead of calling dst.copy_from_slice;
keep the existing contiguous fast-path (when is_contiguous_for_order is true)
and implement a fallback that computes the source/destination index using
view_ref.strides and itemsize to correctly handle strided/indirect layouts.

@bschoenmaeckers bschoenmaeckers marked this pull request as draft June 10, 2026 09:55

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a part of public API? if then, does RustPython need to change buffer layout?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants