{{ message }}
PERF: Add C++ DetectParamTypes + SQLExecuteFast pipeline#549
Draft
bewithgaurav wants to merge 9 commits into
Draft
PERF: Add C++ DetectParamTypes + SQLExecuteFast pipeline#549bewithgaurav wants to merge 9 commits into
bewithgaurav wants to merge 9 commits into
Conversation
Move parameter type detection from Python into C++ using raw CPython type checks (PyLong_CheckExact, PyFloat_CheckExact, etc.). Merge the DetectParamTypes → BindParameters → SQLExecute pipeline into a single DDBCSQLExecuteFast call so ParamInfo never crosses the pybind11 boundary. - DetectParamTypes: handles int (range-detected), float, bool, str (unicode + geometry sniffing), bytes, datetime/date/time, Decimal (MONEY range + generic numeric), UUID, None, with fallback to string - SQLExecuteFast_wrap: single pipeline with GIL release, always uses SQLPrepare for parameterized queries - cursor.py: fast path routing when no setinputsizes overrides present; old DDBCSQLExecute path preserved for setinputsizes callers - Named constants: MAX_INLINE_CHAR, MAX_INLINE_BINARY, MAX_NUMERIC_PRECISION, MONEY/SMALLMONEY ranges, PARAM_C_TYPE_TEXT platform macro
- Add complete DAE (Data-At-Execution) loop to SQLExecuteFast_wrap: SQL_NEED_DATA → SQLParamData/SQLPutData for large str/bytes/binary, matching the existing SQLExecute_wrap logic exactly - Fix DAE type assignment: non-unicode DAE strings use SQL_C_CHAR (not PARAM_C_TYPE_TEXT which maps to SQL_C_WCHAR on macOS/Linux) - Fix MONEY range lower bound: use MONEY_MIN not SMALLMONEY_MIN so negative decimals in MONEY range bind as VARCHAR (matches Python path) - Raise TypeError for unknown param types instead of silent str conversion - Add SQLFreeStmt(SQL_RESET_PARAMS) to unbind after execute
- Comment out use_prepare parameter name (C4100: unreferenced parameter) - Remove unused catch variable name (C4101: unreferenced local variable)
Add explicit null pointer and zero-length guards before memcpy in build_numeric_data to satisfy DevSkim code scanning rule DS121708.
…or attrs, parity test Six review fixes for SQLExecuteFast_wrap and DetectParamTypes: 1. Encoding key: read 'encoding' from settings dict (was 'charEncoding' which never matched). Only honor when ctype==SQL_C_CHAR so the default utf-16le doesn't corrupt SQL_C_CHAR DAE/inline byte paths. 2. Subclass support: PyLong_Check/PyFloat_Check/PyUnicode_Check/PyBytes_Check instead of *_CheckExact. Fixes user-defined int/str/bytes/float subclasses that were silently rejected with TypeError. Switched PyBytes_GET_SIZE to PyBytes_Size for subclass-safe length. 3. GIL release in DAE loop: SQLParamData and SQLPutData now release the GIL during each ODBC call, matching slow-path concurrency for large blobs/strings. 4. Preserve exec_rc: stash the SQLExecute return code before SQLFreeStmt so SUCCESS_WITH_INFO and other non-success-non-error codes are not clobbered by the unbind call. 5. Shallow-copy params: params = py::list(params) at function entry so DetectParamTypes' in-place PyList_SET_ITEM cannot mutate the caller's list under any future code path that might pass it directly. 6. Cursor attrs: SQLSetStmtAttr(SQL_ATTR_CURSOR_TYPE/CONCURRENCY) at entry to match slow-path semantics regardless of prior hstmt state. Also adds tests/test_023_fast_path_parity.py covering int/str/bytes/float subclasses, caller-list non-mutation, and unsupported-type TypeError.
Eight follow-up fixes after review feedback on c5a827f. 1. Refcount leak (BLOCKER): replace PyList_SET_ITEM (uppercase, no decref of old slot) with PyList_SetItem (decrefs old slot before stealing the new reference) in DetectParamTypes time/Decimal/UUID branches. The previous shallow-copy defense via py::list(params) was a no-op because pybind11s list constructor only inc_refs an already-list argument. 2. Geometry + DAE conflict: gate the geometry-prefix override on the not-DAE branch so a long POLYGON/POINT/LINESTRING string does not end up with isDAE=true, dataPtr set, AND a non-zero columnSize. 3. Decimal NaN/Infinity: throw ValueError instead of silently binding 0 via build_numeric_data on an empty digits tuple. 4. Time format: always emit microseconds (HH:MM:SS.ffffff), matching slow path isoformat(timespec=microseconds). 5. PyObject_IsInstance: explicit equality check so a custom __instancecheck__ that raises (returns -1) does not fall through with a Python error set. 6. Dead code: removed unused SMALLMONEY_MIN/SMALLMONEY_MAX constants and the unused utf16Len assignments in DetectParamTypes. 7. Encoding-key contract: only honor encoding_settings encoding when the user explicitly opted in via setencoding(..., ctype=SQL_C_CHAR=1). The Python layer SQL_C_CHAR constant is numerically -8 (real ODBC SQL_C_WCHAR), so by default the wide-char path is taken and encoding is irrelevant. 8. Parity test rewrite: drop the dead _force_slow_path_roundtrip helper, use the project cursor fixture instead of a hard-coded conn string, and add (a) a real fast-vs-slow parity check via setinputsizes-forced slow path, (b) a refcount-leak regression test using a Decimal subclass + weakref, (c) explicit NaN-rejection coverage.
Resolve conflicts in ddbc_bindings.cpp from main's GH-610 work: - Keep both build_numeric_data (this PR) and ResolveNullParamType (main) - Adopt main's BindParameters/BindParameterArray signatures that take SqlHandle& handle; update the SQLExecuteFast_wrap call site to pass *statementHandle so the fast path uses the per-handle NULL describe cache - Migrate SQLExecuteFast_wrap from std::wstring + WStringToSQLWCHAR to std::u16string + reinterpretU16stringAsSqlWChar (main's uniform 16-bit query/param representation), dropping the platform #ifdef in both the prepare path and the DAE wide-char put-data loop Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Work Item / Issue Reference
Summary
This pull request refactors the parameter handling logic in the
executemethod ofmssql_python/cursor.pyto introduce a more efficient "fast path" for parameter binding and execution. The new approach allows the code to skip Python-side type detection and binding when there are noinputsizesoverrides, delegating the entire process to the C++ layer for better performance. The slow path is retained for cases whereinputsizesare set.Performance improvements and code simplification:
DDBCSQLExecuteFastto handle parameter type detection, binding, and execution entirely in C++ when there are noinputsizesoverrides, improving performance for common cases. (F783d0c6L1471)inputsizesare present. [1] [2]inputsizesoverrides are specified, ensuring compatibility with advanced usage. (F783d0c6L1471)