gh-129107: make bytearray free-thread safe by tom-pytel · Pull Request #129108 · python/cpython · GitHub
Skip to content

gh-129107: make bytearray free-thread safe#129108

Merged
kumaraditya303 merged 13 commits into
python:mainfrom
tom-pytel:fix-issue-129107
Feb 15, 2025
Merged

gh-129107: make bytearray free-thread safe#129108
kumaraditya303 merged 13 commits into
python:mainfrom
tom-pytel:fix-issue-129107

Conversation

@tom-pytel

@tom-pytel tom-pytel commented Jan 20, 2025

Copy link
Copy Markdown
Contributor

Most of the work has been wrapping functions that need it in critical sections (one at a time, testing each one). There were some functions which were pointed directly at stringlib methods and needed critical sections, but since stringlib is shared with str and bytes and those immutable types don't need locks, the stringlib functions were not touched. Instead, they were wrapped in functions which take object locks.

There is work left to do, testing and doing the iterator. But starting as it is, this PR passes the test suite (at least on my system and build) as well as fixing the error cases presented in the script in the associated issue.

Functions which were looked at ("-" means no critical section was needed, otherwise function was fixed in one way or another):

* bytearray_alloc               = {"__alloc__", bytearray_alloc, METH_NOARGS, alloc_doc},
* bytearray_reduce              = BYTEARRAY_REDUCE_METHODDEF
* bytearray_reduce_ex           = BYTEARRAY_REDUCE_EX_METHODDEF
* bytearray_sizeof              = BYTEARRAY_SIZEOF_METHODDEF
* bytearray_append              = BYTEARRAY_APPEND_METHODDEF
- bytearray_clear               = BYTEARRAY_CLEAR_METHODDEF
* bytearray_copy                = BYTEARRAY_COPY_METHODDEF
b bytearray_count               = BYTEARRAY_COUNT_METHODDEF
* bytearray_decode              = BYTEARRAY_DECODE_METHODDEF
b bytearray_endswith            = BYTEARRAY_ENDSWITH_METHODDEF
* bytearray_extend              = BYTEARRAY_EXTEND_METHODDEF
b bytearray_find                = BYTEARRAY_FIND_METHODDEF
- bytearray_fromhex             = BYTEARRAY_FROMHEX_METHODDEF
* bytearray_hex                 = BYTEARRAY_HEX_METHODDEF
b bytearray_index               = BYTEARRAY_INDEX_METHODDEF
* bytearray_insert              = BYTEARRAY_INSERT_METHODDEF
s bytearray_join                = BYTEARRAY_JOIN_METHODDEF
* bytearray_lstrip              = BYTEARRAY_LSTRIP_METHODDEF
- bytearray_maketrans           = BYTEARRAY_MAKETRANS_METHODDEF
s bytearray_partition           = BYTEARRAY_PARTITION_METHODDEF
* bytearray_pop                 = BYTEARRAY_POP_METHODDEF
* bytearray_remove              = BYTEARRAY_REMOVE_METHODDEF
s bytearray_replace             = BYTEARRAY_REPLACE_METHODDEF
* bytearray_removeprefix        = BYTEARRAY_REMOVEPREFIX_METHODDEF
* bytearray_removesuffix        = BYTEARRAY_REMOVESUFFIX_METHODDEF
* bytearray_reverse             = BYTEARRAY_REVERSE_METHODDEF
b bytearray_rfind               = BYTEARRAY_RFIND_METHODDEF
b bytearray_rindex              = BYTEARRAY_RINDEX_METHODDEF
* bytearray_rpartition          = BYTEARRAY_RPARTITION_METHODDEF
s bytearray_rsplit              = BYTEARRAY_RSPLIT_METHODDEF
* bytearray_rstrip              = BYTEARRAY_RSTRIP_METHODDEF
s bytearray_split               = BYTEARRAY_SPLIT_METHODDEF
s bytearray_splitlines          = BYTEARRAY_SPLITLINES_METHODDEF
b bytearray_startswith          = BYTEARRAY_STARTSWITH_METHODDEF
* bytearray_strip               = BYTEARRAY_STRIP_METHODDEF
* bytearray_translate           = BYTEARRAY_TRANSLATE_METHODDEF

* bytearray_length              = sq_length
- PyByteArray_Concat            = sq_concat
* bytearray_repeat              = sq_repeat
* bytearray_getitem             = sq_item
* bytearray_setitem             = sq_ass_item
* bytearray_contains            = sq_contains
* bytearray_iconcat             = sq_inplace_concat
* bytearray_irepeat             = sq_inplace_repeat

* bytearray_length              =
* bytearray_subscript           =
* bytearray_ass_subscript       =

* bytearray_getbuffer           =
* bytearray_releasebuffer       =

* bytearray_mod                 =

- bytearray_dealloc             = tp_dealloc
* bytearray_repr                = tp_repr
- bytearray_str                 = tp_str
- bytearray_richcompare         = tp_richcompare
- bytearray_iter                = tp_iter
- bytearray___init__            = tp_init


* bytearrayiter_length_hint     = "__length_hint__"
* bytearrayiter_reduce          = "__reduce__"
* bytearrayiter_setstate        = "__setstate__"
- bytearrayiter_dealloc         =
- bytearrayiter_traverse        = tp_traverse
* bytearrayiter_next            = tp_iternext


- PyByteArray_FromObject        =
- PyByteArray_Concat            =
- PyByteArray_FromStringAndSize =
* PyByteArray_Size              =
- PyByteArray_AsString          =
* PyByteArray_Resize            =


w stringlib_capitalize          = {"capitalize", stringlib_capitalize, METH_NOARGS, _Py_capitalize__doc__},
w stringlib_center              = STRINGLIB_CENTER_METHODDEF
w stringlib_expandtabs          = STRINGLIB_EXPANDTABS_METHODDEF
w stringlib_isalnum             = {"isalnum", stringlib_isalnum, METH_NOARGS, _Py_isalnum__doc__},
w stringlib_isalpha             = {"isalpha", stringlib_isalpha, METH_NOARGS, _Py_isalpha__doc__},
w stringlib_isascii             = {"isascii", stringlib_isascii, METH_NOARGS, _Py_isascii__doc__},
w stringlib_isdigit             = {"isdigit", stringlib_isdigit, METH_NOARGS, _Py_isdigit__doc__},
w stringlib_islower             = {"islower", stringlib_islower, METH_NOARGS, _Py_islower__doc__},
w stringlib_isspace             = {"isspace", stringlib_isspace, METH_NOARGS, _Py_isspace__doc__},
w stringlib_istitle             = {"istitle", stringlib_istitle, METH_NOARGS, _Py_istitle__doc__},
w stringlib_isupper             = {"isupper", stringlib_isupper, METH_NOARGS, _Py_isupper__doc__},
w stringlib_ljust               = STRINGLIB_LJUST_METHODDEF
w stringlib_lower               = {"lower", stringlib_lower, METH_NOARGS, _Py_lower__doc__},
w stringlib_rjust               = STRINGLIB_RJUST_METHODDEF
w stringlib_swapcase            = {"swapcase", stringlib_swapcase, METH_NOARGS, _Py_swapcase__doc__},
w stringlib_title               = {"title", stringlib_title, METH_NOARGS, _Py_title__doc__},
w stringlib_upper               = {"upper", stringlib_upper, METH_NOARGS, _Py_upper__doc__},
w stringlib_zfill               = STRINGLIB_ZFILL_METHODDEF

Comments and guidance requested.

@ZeroIntensity ZeroIntensity left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll yield to someone else, but this is one of the things that we probably shouldn't do with critical sections. bytearray is a hot object--we should keep performance in mind here. A lot (but not all!) of bytearray isn't prone to deadlocks because it doesn't typically re-enter.

I'm still undecided, but it might be worth borrowing from the FT list implementation for doing it locklessly.

Comment thread Include/cpython/bytearrayobject.h
@tom-pytel

tom-pytel commented Jan 21, 2025

Copy link
Copy Markdown
Contributor Author

Comment thread Objects/bytearrayobject.c Outdated
@kumaraditya303

Copy link
Copy Markdown
Contributor

I suggest to split this PR into 2:

  • one to fix the bytearray object
  • one to fix the iterator of bytearray

@tom-pytel

Copy link
Copy Markdown
Contributor Author

I suggest to split this PR into 2:

Makes sense, iterator is smaller (and easier to backport?) so will break that out and send it separate up later today and leave this one as the main bytearray?

@kumaraditya303

Copy link
Copy Markdown
Contributor

Yup, SGTM, ping me for reviews.

Comment thread Lib/test/test_bytes.py Outdated
Comment thread Objects/bytearrayobject.c
Comment thread Objects/bytearrayobject.c
Comment thread Objects/bytearrayobject.c Outdated
@kumaraditya303

Copy link
Copy Markdown
Contributor

@colesbury This PR fixes thread safety of bytearray with critical section; do we want to have lock free reads for this too?

@colesbury

Copy link
Copy Markdown
Contributor

do we want to have lock free reads for this too?

No, probably not. At least not for now.

@kumaraditya303 kumaraditya303 enabled auto-merge (squash) February 15, 2025 07:13
@kumaraditya303 kumaraditya303 merged commit a05433f into python:main Feb 15, 2025
@robsdedude

Copy link
Copy Markdown
Contributor

Thank you very much for the continued efforts 🙇 Are there plans to backport this (and the follow-up PRs) to 3.13?

@tom-pytel

Copy link
Copy Markdown
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants