PanicException on cloud DataFrame write #21170

Open
2 tasks done
hutch3232 opened this issue Feb 10, 2025 · 1 comment
Labels
bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars

Comments

@hutch3232

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

test_polars.py

# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "boto3",
#     "botocore<1.36.0",
#     "polars==1.22.0",
#     "s3fs",
# ]
# ///
import os

import polars as pl

os.environ["RUST_BACKTRACE"] = "1"

data = pl.DataFrame(
    {
        "col1": [1, 2, 3]
    }
)

data.write_csv(
    "s3://my-bucket/my-prefix/test_write.csv",
    storage_options={
        "profile": "my-profile",
        "endpoint_url": "https://my-endpoint.com/",
    }
)

uv run test_polars.py
Installed 10 packages in 17.18s
/mnt/code/python/test_polars.py:24: UserWarning: the configured AWS profile 'my-profile' may be ignored as it is not compatible with the provided storage_option key 'endpoint_url'. To silence this warning, pass 'aws_profile': None in storage_options.
  data.write_csv(

thread '<unnamed>' panicked at crates/polars-io/src/cloud/adaptors.rs:132:50:
called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("Generic S3 error: Error after 2 retries in 3.220692045s, max_retries:2, retry_timeout:10s, source:error sending request for url (http://<an ip address that i'm not sure the source of>/latest/api/token)"))
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: core::ptr::drop_in_place<polars_io::cloud::adaptors::CloudWriter>
   4: polars_python::dataframe::io::<impl polars_python::dataframe::PyDataFrame>::write_csv
   5: polars_python::dataframe::io::<impl polars_python::dataframe::PyDataFrame>::__pymethod_write_csv__
   6: pyo3::impl_::trampoline::trampoline
   7: polars_python::dataframe::io::_::__INVENTORY::trampoline
   8: cfunction_call
             at /usr/local/src/conda/python-3.9.18/Objects/methodobject.c:543:19
   9: _PyObject_MakeTpCall
             at /usr/local/src/conda/python-3.9.18/Objects/call.c:191:18
  10: _PyObject_VectorcallTstate
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:116:16
  11: _PyObject_VectorcallTstate
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:103:1
  12: PyObject_Vectorcall
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:127:12
  13: call_function
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:5077:13
  14: _PyEval_EvalFrameDefault
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:3537:19
  15: _PyEval_EvalFrame
             at /usr/local/src/conda/python-3.9.18/Include/internal/pycore_ceval.h:40:12
  16: _PyEval_EvalCode
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:4329:14
  17: _PyFunction_Vectorcall
             at /usr/local/src/conda/python-3.9.18/Objects/call.c:396:12
  18: _PyObject_VectorcallTstate
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:118:11
  19: method_vectorcall
             at /usr/local/src/conda/python-3.9.18/Objects/classobject.c:53:18
  20: _PyObject_VectorcallTstate
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:118:11
  21: PyObject_Vectorcall
             at /usr/local/src/conda/python-3.9.18/Include/cpython/abstract.h:127:12
  22: call_function
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:5077:13
  23: _PyEval_EvalFrameDefault
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:3537:19
  24: _PyEval_EvalFrame
             at /usr/local/src/conda/python-3.9.18/Include/internal/pycore_ceval.h:40:12
  25: _PyEval_EvalCode
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:4329:14
  26: _PyEval_EvalCodeWithName
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:4361:12
  27: PyEval_EvalCodeEx
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:4377:12
  28: PyEval_EvalCode
             at /usr/local/src/conda/python-3.9.18/Python/ceval.c:828:12
  29: run_eval_code_obj
             at /usr/local/src/conda/python-3.9.18/Python/pythonrun.c:1221:9
  30: run_mod
             at /usr/local/src/conda/python-3.9.18/Python/pythonrun.c:1242:19
  31: pyrun_file
             at /usr/local/src/conda/python-3.9.18/Python/pythonrun.c:1140:15
  32: pyrun_simple_file
             at /usr/local/src/conda/python-3.9.18/Python/pythonrun.c:450:13
  33: PyRun_SimpleFileExFlags
             at /usr/local/src/conda/python-3.9.18/Python/pythonrun.c:483:15
  34: pymain_run_file
             at /usr/local/src/conda/python-3.9.18/Modules/main.c:379:15
  35: pymain_run_python
             at /usr/local/src/conda/python-3.9.18/Modules/main.c:604:21
  36: Py_RunMain
             at /usr/local/src/conda/python-3.9.18/Modules/main.c:683:5
  37: Py_BytesMain
             at /usr/local/src/conda/python-3.9.18/Modules/main.c:1129:12
  38: __libc_start_main
  39: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Traceback (most recent call last):
  File "/mnt/code/python/test_polars.py", line 24, in <module>
    data.write_csv(
  File "/mnt/imported/data/uv/archive-v0/HVyDXyC3LTN6WTA7zDeVB/lib/python3.9/site-packages/polars/dataframe/frame.py", line 2966, in write_csv
    self._df.write_csv(
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("Generic S3 error: Error after 2 retries in 3.220692045s, max_retries:2, retry_timeout:10s, source:error sending request for url (http://<an ip address that i'm not sure the source of>/latest/api/token)"))
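
A side note on handling the failure above: as far as I can tell, pyo3's `PanicException` derives from `BaseException` rather than `Exception`, so a plain `except Exception` would not intercept it. Below is a minimal sketch of a wrapper that turns the panic into an ordinary Python error. `guarded_write` and `CloudWriteError` are hypothetical names, not part of Polars, and the check is by class name so the snippet does not need Polars to be importable.

```python
from typing import Callable


class CloudWriteError(RuntimeError):
    """Hypothetical wrapper error raised when a Rust panic surfaces in Python."""


def is_rust_panic(exc: BaseException) -> bool:
    # pyo3 reports Rust panics as `pyo3_runtime.PanicException`.
    return type(exc).__name__ == "PanicException"


def guarded_write(write: Callable[[], None]) -> None:
    """Run `write` and convert a Rust panic into an ordinary Python exception."""
    try:
        write()
    except BaseException as exc:  # PanicException is not an `Exception` subclass
        if is_rust_panic(exc):
            raise CloudWriteError(f"cloud write panicked: {exc}") from exc
        raise  # anything else (including KeyboardInterrupt) propagates unchanged
```

Usage would look like `guarded_write(lambda: data.write_csv(path, storage_options=opts))`. This only makes the failure easier to handle; it does not address the underlying `unwrap()` in the Rust code.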

Log output

Issue description

I've tried a variety of combinations to try and write to S3 based on some recent features with the credential provider, e.g., #18757

Note: I have also tried setting AWS_PROFILE and AWS_ENDPOINT_URL.

Expected behavior

It also seems odd to me that the warning says endpoint_url is incompatible with profile, yet I don't think it picks up on the endpoint, per: #18757 (comment)

Happy to run any tests! Thanks for all the work that has gone into the S3 compatibility!
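
One possible way to sidestep the profile / endpoint_url interaction described above is to read the keys from the profile and pass them explicitly in `storage_options`. This is a sketch under assumptions, not a confirmed fix for this bug: `storage_options_from_profile` is a hypothetical helper, it only reads the shared credentials file (it ignores `AWS_SHARED_CREDENTIALS_FILE`, SSO, and role-based profiles), and it assumes the profile stores static keys.

```python
import configparser
from pathlib import Path
from typing import Dict, Optional


def storage_options_from_profile(
    profile: str,
    endpoint_url: str,
    credentials_path: Optional[Path] = None,
) -> Dict[str, str]:
    """Build explicit storage_options from a profile in an AWS credentials file."""
    path = credentials_path or Path.home() / ".aws" / "credentials"
    # Disable interpolation so a '%' inside a secret is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"no credentials file at {path}")
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {path}")
    section = parser[profile]
    options = {
        "aws_access_key_id": section["aws_access_key_id"],
        "aws_secret_access_key": section["aws_secret_access_key"],
        "endpoint_url": endpoint_url,
    }
    # Temporary credentials also carry a session token.
    if "aws_session_token" in section:
        options["aws_session_token"] = section["aws_session_token"]
    return options


# Intended use with the reproducer (not executed here):
#   opts = storage_options_from_profile("my-profile", "https://my-endpoint.com/")
#   data.write_csv("s3://my-bucket/my-prefix/test_write.csv", storage_options=opts)
```

Because no `profile` key is passed, the warning about `profile` and `endpoint_url` being incompatible should not apply, though I have not verified that this avoids the panic.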

Installed versions

--------Version info---------
Polars:              1.22.0
Index type:          UInt32
Platform:            Linux-4.18.0-553.34.1.el8_10.x86_64-x86_64-with-glibc2.31
Python:              3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0]
LTS CPU:             False

----Optional dependencies----
Azure CLI            <not installed>
adbc_driver_manager  <not installed>
altair               <not installed>
azure.identity       <not installed>
boto3                1.35.99
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               2025.2.0
gevent               <not installed>
google.auth          <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         <not installed>
numpy                <not installed>
openpyxl             <not installed>
pandas               <not installed>
pyarrow              <not installed>
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
@hutch3232
Author

I looked into this a little more and found it is more nuanced than I originally thought. The code actually can run, provided the S3 profile exists with valid keys the first time it is run. If you run it and get some kind of access error, then subsequent attempts with valid keys continue to produce this panic.

What is very odd to me is that some kind of caching to disk (or something similar) seems to be happening: I'm hitting this with uv run from the CLI, so each attempt should be a fresh Python session. I even tried killing my terminal and restarting, and still got the panic.

My only resolution was restarting the Kubernetes node I was on, which wipes my home directory and /tmp among other things.
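
To narrow down what state persists between runs, a small diagnostic like the one below could be run before a working attempt and before a failing one, and the two outputs compared. It is only a sketch: `aws_state_snapshot` is a hypothetical helper, and it looks only at `AWS_*` environment variables and the two default files under `~/.aws`, which may not be where the relevant state lives. As a guess, the `/latest/api/token` path in the error resembles the EC2 instance metadata token endpoint, which might mean the profile's credentials were not resolved on the failing runs.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def aws_state_snapshot(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Dict[str, str]:
    """Summarise AWS-related env vars and config files without printing secrets."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    snapshot: Dict[str, str] = {}
    for name in sorted(env):
        if name.startswith("AWS_"):
            # Mask anything that looks like a secret; show other values as-is.
            secret = any(word in name for word in ("SECRET", "TOKEN", "KEY"))
            snapshot[name] = "<set>" if secret else env[name]
    for rel in (".aws/credentials", ".aws/config"):
        path = home / rel
        # The modification time shows whether a file changed between runs.
        snapshot[rel] = (
            f"mtime={path.stat().st_mtime_ns}" if path.exists() else "<missing>"
        )
    return snapshot


if __name__ == "__main__":
    for key, value in aws_state_snapshot().items():
        print(f"{key}: {value}")
```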
