Combined/all fixes v2 by LeeroyHannigan · Pull Request #24 · ExtendDB/extenddb · GitHub
Combined/all fixes v2 #24

Merged: pdf-amzn merged 51 commits into main from combined/all-fixes-v2 on May 15, 2026
@LeeroyHannigan

Collaborator

Combined conformance fixes — 454/576 → 562/576 (97.6%)

Summary

This is a large combined branch merging 20 individual fix branches plus additional incremental fixes made directly on the combined branch. It brings ExtendDB from 64.8% to 97.6% DynamoDB parity.

Included branches

| Branch | Category |
| --- | --- |
| fix/conditional-write-race | Concurrency |
| fix/pg-identifier-overflow | Storage |
| fix/lsi-always-synchronous | Index consistency |
| fix/condition-missing-attribute-semantics | Expression evaluation |
| fix/update-item-list-index-set | Update operations |
| fix/number-validation-normalization | Data validation |
| fix/error-message-fidelity | Error messages |
| fix/validation-error-codes | Error codes |
| fix/missing-validations | Unused attr validation |
| fix/key-condition-parentheses | Expression parsing |
| fix/transact-write-4mb-limit | Transaction limits |
| fix/error-message-format | Error message format |
| fix/begins-with-binary-key | Binary key queries |
| fix/projection-list-index | Projection expressions |
| fix/legacy-api | KeyConditions, ScanFilter, QueryFilter, AttributesToGet |
| fix/missing-validations-batch | Reserved words, batch validation |
| fix/error-messages-deep | CreateTable/PutItem error routing |
| fix/error-messages-quick | Quick message fixes |
| fix/validation-message-format | Validation message format |
| fix/lsi-query-sort-key | LSI query support |

Additional fixes on combined branch

  • ListTables sort order (COLLATE "C")
  • size() on missing attributes
  • Query Select SPECIFIC_ATTRIBUTES validation
  • Empty BS message
  • UpdateItem condition evaluation on non-existent items
  • Tags ARN validation
  • UpdateTable: throughput validation, PAY_PER_REQUEST rejection, no-op rejection, GSI attribute validation, non-existent GSI error code
  • TransactWrite condition on non-existent items
  • Reserved keyword validation for tokenize_for paths
  • PutItem table name validation ordering
  • Empty UpdateExpression message
  • CreateTable: duplicate KeySchema, >2 elements format, LSI without range key message
  • Query: reject KCE without partition key reference

Remaining (14 tests, 2.4%)

  • Validation ordering (8) — requires multi-error response support
  • Java-toString dump format (3) — SDK-version-dependent message content
  • Expression-type-aware error prefixes (1)
  • Cross-account auth denial (1)
  • CreateTable multi-error (1)

Test plan

  • All 351 Rust unit tests pass
  • Parity suite: 562/576 passing (97.6%)
  • GSI propagation delay set to 4ms for test stability

DynamoDB uses specific message patterns for validation errors that the
conformance tests verify exactly. This fixes ~34 test failures by:

- Table name validation: use "1 validation error detected: Value 'X' at
  'tableName' failed to satisfy constraint: ..." format with separate
  messages for too-short, too-long, empty, and null cases
- Batch operations: "The requestItems parameter is required for BatchGetItem/
  BatchWriteItem" and per-table limit messages
- Transactions: "Value '[]' at 'transactItems' ... greater than or equal to 1"
- Expression syntax errors: "Invalid {Type}: Syntax error; token: ..., near: ..."
- Empty expressions: "Invalid {Type}: The expression can not be empty;"
- Query Limit: capital 'L', no value quotes
- Scan segment: include actual segment/total values in error
- Add validation_error()/validation_errors() formatter functions
- Table name: length errors include value and field path
- KeySchema: >2 elements includes serialized array value
- TransactGetItems/TransactWriteItems: empty/too-many use correct format
- BatchGetItem: per-table >100 keys uses correct field path
- BatchWriteItem: >25 items includes map value representation
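
The message shape described above can be illustrated with a pair of small helpers. This is a minimal sketch, not the code from this PR: the names `validation_error` and `validation_errors` come from the commit message, but the signatures, the `FieldError` struct, and the `"; "` separator between multiple errors are assumptions made for illustration.

```rust
/// One failed constraint: the offending value, the field path, and the
/// constraint text. (Hypothetical type, not taken from the PR.)
pub struct FieldError<'a> {
    pub value: &'a str,
    pub path: &'a str,
    pub constraint: &'a str,
}

/// Formats one or more constraint failures into a single message.
/// Assumed signature; the "; " separator is also an assumption.
pub fn validation_errors(errors: &[FieldError<'_>]) -> String {
    let noun = if errors.len() == 1 { "error" } else { "errors" };
    let details: Vec<String> = errors
        .iter()
        .map(|e| {
            format!(
                "Value '{}' at '{}' failed to satisfy constraint: {}",
                e.value, e.path, e.constraint
            )
        })
        .collect();
    format!(
        "{} validation {} detected: {}",
        errors.len(),
        noun,
        details.join("; ")
    )
}

/// Convenience wrapper for the common single-error case.
pub fn validation_error(value: &str, path: &str, constraint: &str) -> String {
    validation_errors(&[FieldError { value, path, constraint }])
}
```

Routing every handler through one formatter keeps the count, the quoting, and the field path consistent, which matters when the conformance tests compare messages exactly.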

For begins_with on a sort key, the query builder formed the exclusive upper
bound by appending chr(1114111) (the maximum Unicode code point) to the
prefix. This only works for TEXT columns. On BYTEA columns it causes a
PostgreSQL type error, which surfaced as an InternalServerError.

Fix: detect binary sort key columns (sk_b) and compute the upper bound
in Rust by incrementing the prefix bytes with carry propagation. Bind both
prefix and upper bound as separate BYTEA parameters.
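
The upper-bound computation can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the function name `prefix_upper_bound` is hypothetical, and the sketch uses the truncating form of carry propagation (drop trailing 0xFF bytes, then increment the last remaining byte), assuming byte-wise lexicographic ordering of the BYTEA column.

```rust
/// Returns the smallest byte string that sorts after every byte string
/// starting with `prefix`, or `None` when no such bound exists (the prefix
/// is empty or consists only of 0xFF bytes).
pub fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut bound = prefix.to_vec();
    // Walk backwards: a trailing 0xFF cannot be incremented, so it is
    // dropped and the carry moves to the byte before it.
    while let Some(last) = bound.pop() {
        if last < 0xFF {
            bound.push(last + 1);
            return Some(bound);
        }
    }
    None
}
```

With a bound in hand, the range predicate takes the shape `sk_b >= prefix AND sk_b < upper_bound`. When the function returns `None`, only the lower bound applies.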

Projecting mylist[N] returned null because insert_nested did not handle
PathElement::Index. DynamoDB returns the projected element wrapped in a
single-element list: {"mylist": [value]}.

Verified against real DynamoDB: mylist[0] → {"mylist":{"L":[value]}},
out-of-bounds index → empty item.
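
The projection behavior described above can be sketched for a single path. This is a simplified illustration, not the project's `insert_nested`: the `Value` and `PathElement` types here are reduced stand-ins, `resolve`, `wrap`, and `project` are hypothetical names, and merging several projected paths into one result is left out.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    S(String),
    L(Vec<Value>),
    M(BTreeMap<String, Value>),
}

#[derive(Clone, Debug)]
pub enum PathElement {
    Attr(String),
    Index(usize),
}

/// Looks up the value at `path`; `None` if a step is missing or out of bounds.
fn resolve<'a>(item: &'a BTreeMap<String, Value>, path: &[PathElement]) -> Option<&'a Value> {
    let (first, rest) = path.split_first()?;
    let PathElement::Attr(name) = first else { return None };
    let mut current = item.get(name)?;
    for step in rest {
        current = match (step, current) {
            (PathElement::Attr(k), Value::M(map)) => map.get(k)?,
            (PathElement::Index(i), Value::L(list)) => list.get(*i)?,
            _ => return None,
        };
    }
    Some(current)
}

/// Rebuilds the path around `leaf`: attribute steps become single-key maps,
/// index steps become single-element lists.
fn wrap(path: &[PathElement], leaf: Value) -> Value {
    path.iter().rev().fold(leaf, |inner, step| match step {
        PathElement::Attr(k) => Value::M(BTreeMap::from([(k.clone(), inner)])),
        PathElement::Index(_) => Value::L(vec![inner]),
    })
}

/// Projects one path out of `item`; an unresolvable path yields an empty item.
pub fn project(item: &BTreeMap<String, Value>, path: &[PathElement]) -> BTreeMap<String, Value> {
    match resolve(item, path).map(|leaf| wrap(path, leaf.clone())) {
        Some(Value::M(map)) => map,
        _ => BTreeMap::new(),
    }
}
```

Projecting `mylist[1]` from an item whose `mylist` holds two strings returns `mylist` as a one-element list, and an out-of-bounds index returns an empty map, matching the two cases verified against DynamoDB.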

- Reserved keyword enforcement (configurable via enforce_reserved_keywords)
- Duplicate key detection in BatchGetItem and TransactGetItems
- LSI on hash-only table rejection
- Duplicate index name rejection in CreateTable
- Empty string key value rejection

DynamoDB allows one level of parentheses in KeyConditionExpression:
- Around individual clauses: (pk = :pk) AND (sk > :sk)
- Around the full expression: (pk = :pk AND sk > :sk)
- Nested parens are rejected (matches DynamoDB behavior)

Adds outer-paren stripping in parse_key_condition and per-clause
paren handling in parse_key_clause.
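
The outer-paren rule can be sketched on the raw expression text. This is a minimal illustration, not the PR's `parse_key_condition`: the function name `strip_outer_parens` and its error strings are hypothetical, and the sketch scans characters rather than tokens. It treats a `(` that directly follows `begins_with` as a function call rather than grouping, on the assumption that `begins_with` is the only function a key condition can contain.

```rust
/// Strips one pair of parentheses wrapping the whole expression, if present.
/// Parentheses around individual clauses are left alone; a second level of
/// grouping is rejected.
pub fn strip_outer_parens(input: &str) -> Result<&str, String> {
    let s = input.trim();
    let mut wraps_all = s.starts_with('(') && s.ends_with(')');
    // Stack of open parens: true = grouping, false = begins_with(...) call.
    let mut open: Vec<bool> = Vec::new();
    let mut group_depth = 0usize;

    for (i, c) in s.char_indices() {
        match c {
            '(' => {
                let is_call = s[..i].trim_end().ends_with("begins_with");
                if !is_call {
                    group_depth += 1;
                    if group_depth > 1 {
                        return Err("nested parentheses are not allowed".to_string());
                    }
                }
                open.push(!is_call);
            }
            ')' => match open.pop() {
                None => return Err("unbalanced parentheses".to_string()),
                Some(true) => {
                    group_depth -= 1;
                    // A group that closes before the end only wraps a clause.
                    if i + 1 < s.len() {
                        wraps_all = false;
                    }
                }
                Some(false) => {}
            },
            _ => {}
        }
    }
    if !open.is_empty() {
        return Err("unbalanced parentheses".to_string());
    }
    if wraps_all {
        Ok(s[1..s.len() - 1].trim())
    } else {
        Ok(s)
    }
}
```

Per-clause parentheses, such as `(pk = :pk) AND (sk > :sk)`, pass through unchanged here and would be handled when each clause is parsed.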

- Empty KCE/UpdateExpression: 'The expression can not be empty;'
- Tokenizer syntax errors: 'Syntax error; token: "x", near: "..."'
- Expression type prefix (ProjectionExpression, FilterExpression, etc.)
- NULL attr false: correct ValidationException message
- Empty SS/NS/BS: 'were invalid:' not 'are not valid.'
- PutItem: reject EAV/EAN without ConditionExpression
- prefix_expression_error helper for consistent formatting
- KeyType/ScalarAttributeType: custom Deserialize with DynamoDB enum error format
- Select enum: correct member order in error message
- CreateTable: route validation errors from deserialization, missing TableName message
- PutItem: route validation errors from deserialization, null TableName message

…sToGet

Older AWS SDKs use pre-expression parameters for Query and Scan. These
were previously either silently ignored or rejected. Now they are desugared
into their modern expression-based equivalents:

- KeyConditions → KeyCondition struct (EQ, GT, LT, GE, LE, BETWEEN, BEGINS_WITH)
- QueryFilter/ScanFilter → filter Expr AST (all comparison operators including
  CONTAINS, NOT_CONTAINS, BEGINS_WITH, BETWEEN, IN, NULL, NOT_NULL)
- AttributesToGet → ProjectionExpression (on Query and Scan, matching existing
  GetItem/BatchGetItem support)

Mutual exclusivity enforced: mixing legacy + expression params returns
ValidationException with the standard DynamoDB conflict message.
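
The operator mapping for legacy KeyConditions can be illustrated by rendering each condition as the equivalent expression text. This is a sketch under assumptions, not the PR's implementation: the PR desugars into a `KeyCondition` struct, whereas the hypothetical function below produces a string purely to show which legacy operator corresponds to which expression form. It also assumes the attribute name and value placeholders are already safe to embed.

```rust
/// Renders a legacy key condition as KeyConditionExpression text.
/// `op` is the legacy ComparisonOperator name and `values` holds the
/// placeholders (for example ":v0") for its AttributeValueList.
pub fn key_condition_to_expression(
    attr: &str,
    op: &str,
    values: &[&str],
) -> Result<String, String> {
    // Operators that take exactly one operand.
    let single = |render: &dyn Fn(&str) -> String| match values {
        [v] => Ok(render(v)),
        _ => Err(format!("{op} expects exactly one value")),
    };
    match op {
        "EQ" => single(&|v| format!("{attr} = {v}")),
        "LT" => single(&|v| format!("{attr} < {v}")),
        "LE" => single(&|v| format!("{attr} <= {v}")),
        "GT" => single(&|v| format!("{attr} > {v}")),
        "GE" => single(&|v| format!("{attr} >= {v}")),
        "BEGINS_WITH" => single(&|v| format!("begins_with({attr}, {v})")),
        "BETWEEN" => match values {
            [lo, hi] => Ok(format!("{attr} BETWEEN {lo} AND {hi}")),
            _ => Err("BETWEEN expects exactly two values".to_string()),
        },
        other => Err(format!("unsupported key condition operator: {other}")),
    }
}
```

Filter operators such as CONTAINS or IN are rejected here because they are valid only in QueryFilter/ScanFilter, not in key conditions.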
# Conflicts:
#	crates/storage-postgres/src/data/index.rs
# Conflicts:
#	crates/core/src/expression/mod.rs
#	crates/engine/src/transact_get_items.rs
# Conflicts:
#	crates/engine/src/query.rs
# Conflicts:
#	crates/core/src/expression/mod.rs
#	crates/core/src/validation/mod.rs
#	crates/engine/src/get_item.rs
#	crates/engine/src/query.rs
#	crates/engine/src/scan.rs
#	crates/engine/src/transact_get_items.rs
#	crates/engine/src/update_item.rs
# Conflicts:
#	crates/engine/src/create_table.rs
#	crates/engine/src/put_item.rs
# Conflicts:
#	crates/engine/src/get_item.rs
#	crates/engine/src/put_item.rs
#	crates/engine/src/query.rs
#	crates/engine/src/scan.rs
#	crates/engine/src/transact_get_items.rs
#	crates/engine/src/update_item.rs
# Conflicts:
#	crates/core/src/validation/mod.rs
#	crates/engine/src/batch_get_item.rs
#	crates/engine/src/batch_write_item.rs
#	crates/engine/src/transact_get_items.rs
#	crates/engine/src/transact_write_items.rs
…, empty BS message

- ListTables: add COLLATE "C" for byte-order sort matching DynamoDB
- size(): return 0 for missing attributes instead of erroring (filter skips)
- Query: reject Select=SPECIFIC_ATTRIBUTES without ProjectionExpression
- Empty BS: fix message to "Binary sets should not be empty"
- UpdateItem: evaluate ConditionExpression against empty item when key
  doesn't exist, not the key-only upsert placeholder
- Tags: invalid ARN format returns ValidationException
- UpdateItem: empty string UpdateExpression now reaches tokenize_for which
  returns "The expression can not be empty;" instead of being caught by
  the "must be provided" validation
- PutItem: validate table name before table_key_info lookup so invalid
  names return ValidationException not ResourceNotFoundException
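
The effect of the byte-order sort for ListTables can be seen in a small example. This is only an illustration of the ordering, not code from the PR: the function name `sort_table_names` is hypothetical, and it relies on Rust comparing strings by their UTF-8 bytes, which is the same ordering that COLLATE "C" applies in PostgreSQL.

```rust
/// Sorts table names in byte order, so uppercase ASCII letters sort before
/// lowercase ones.
pub fn sort_table_names(mut names: Vec<String>) -> Vec<String> {
    // `String`'s `Ord` implementation compares the underlying bytes.
    names.sort();
    names
}
```

Given the names `beta`, `Alpha`, `alpha`, and `Beta`, the result is `Alpha`, `Beta`, `alpha`, `beta`, whereas a locale-aware collation would typically interleave the cases.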
…TransactWrite condition on non-existent items
@pdf-amzn

Collaborator

pdf-amzn merged commit 88066a2 into main on May 15, 2026
pdf-amzn deleted the combined/all-fixes-v2 branch on May 15, 2026 18:37

2 participants