Client.load_table_from_dataframe() sometimes chooses invalid column type · Issue #1650 · googleapis/python-bigquery · GitHub
Skip to content
This repository was archived by the owner on Mar 6, 2026. It is now read-only.
This repository was archived by the owner on Mar 6, 2026. It is now read-only.

Client.load_table_from_dataframe() sometimes chooses invalid column type #1650

Description

@willsthompson

Environment details

  • OS type and version: macOS 13
  • Python version: Python 3.10.7
  • pip version: pip 23.2
  • google-cloud-bigquery version: 3.11.4

Steps to reproduce

  1. Create a table with a BIGNUMERIC type column
  2. Use Client.load_table_from_dataframe() to load a column with the value Decimal("0.12345678901234560000000000000000000000")

Code example

df = pd.DataFrame("TEST_COL", Decimal("0.12345678901234560000000000000000000000"))
with bigquery.Client(project=project, credentials=credentials) as client:
    load_job = client.load_table_from_dataframe(df, table_name)
    result = load_job.result()

Stack trace

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
  File "pyarrow/array.pxi", line 1044, in pyarrow.lib.Array.from_pandas
  File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Rescaling Decimal128 value would cause data loss

Details

I poked around a bit, and I think I see the problem. To determine the column type you create an arrow table

then lookup the type by type ID
detected_type = ARROW_SCALAR_IDS_TO_BQ.get(arrow_table.type.id)

My test decimal value can be represented by an arrow 128-bit type

>>>pyarrow.array(pd.DataFrame({"x": [Decimal('0.12345678901234560000000000000000000000')]})["x"]).type
Decimal128Type(decimal128(38, 38))

But AFAICT that has the same type ID as every other Decimal128Type, so it maps to "NUMERIC" in your lookup table:

pyarrow.decimal128(38, scale=9).id: "NUMERIC",

So when you convert to pyarrow before uploading, it tries to coerce using your hardcoded type for "NUMERIC"

and fails because it needed decimal128(38, 38) to represent the value, but is given decimal128(38, 9). I think you will need to detect if a 128 bit type can't fit in your NUMERIC subtype and upgrade those to BIGNUMERIC.

Hope that helps!

Metadata

Metadata

Assignees

No one assigned

    Labels

    api: bigqueryIssues related to the googleapis/python-bigquery API.

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions