python/adbc_driver_postgresql: integration tests are failing #1333

Closed
lidavidm opened this issue Nov 30, 2023 · 4 comments

@lidavidm
Member

=================================== FAILURES ===================================
_______________________ test_polars_write_database[ints] _______________________

postgres_uri = 'postgresql://localhost:5432/postgres?user=postgres&***'
df = shape: (4, 1)
┌──────┐
│ ints │
│ ---  │
│ i64  │
╞══════╡
│ 1    │
│ 2    │
│ 4    │
│ 8    │
└──────┘

    @pytest.mark.parametrize(
        "df",
        [
            "ints",
            "floats",
        ],
        indirect=True,
    )
    def test_polars_write_database(postgres_uri: str, df: "polars.DataFrame") -> None:
        table_name = f"polars_test_ingest_{uuid.uuid4().hex}"
        try:
>           df.write_database(
                table_name=table_name,
                connection=postgres_uri,
                # TODO(apache/arrow-adbc#541): polars doesn't map the semantics
                # properly here, and one of their modes isn't supported
                if_exists="replace",
                engine="adbc",
            )

python/adbc_driver_postgresql/tests/test_polars.py:71: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/share/miniconda3/envs/test/lib/python3.11/site-packages/polars/utils/deprecation.py:100: in wrapper
    return function(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = shape: (4, 1)
┌──────┐
│ ints │
│ ---  │
│ i64  │
╞══════╡
│ 1    │
│ 2    │
│ 4    │
│ 8    │
└──────┘
table_name = 'polars_test_ingest_23929440c8084693b505cf596fe45b46'
connection = 'postgresql://localhost:5432/postgres?user=postgres&***'

    @deprecate_renamed_parameter("connection_uri", "connection", version="0.18.9")
    def write_database(
        self,
        table_name: str,
        connection: str,
        *,
        if_exists: DbWriteMode = "fail",
        engine: DbWriteEngine = "sqlalchemy",
    ) -> None:
        """
        Write a polars frame to a database.
    
        Parameters
        ----------
        table_name
            Schema-qualified name of the table to create or append to in the target
            SQL database. If your table name contains special characters, it should
            be quoted.
        connection
            Connection URI string, for example:
    
            * "***server:port/database"
            * "sqlite:////path/to/database.db"
        if_exists : {'append', 'replace', 'fail'}
            The insert mode:
    
            * 'replace' will create a new database table, overwriting an existing one.
            * 'append' will append to an existing table.
            * 'fail' will fail if table already exists.
        engine : {'sqlalchemy', 'adbc'}
            Select the engine used for writing the data.
        """
        from polars.io.database import _open_adbc_connection
    
        def unpack_table_name(name: str) -> tuple[str | None, str]:
            """Unpack optionally qualified table name into schema/table pair."""
            from csv import reader as delimited_read
    
            table_ident = next(delimited_read([name], delimiter="."))
            if len(table_ident) > 2:
                raise ValueError(f"`table_name` appears to be invalid: {name!r}")
            elif len(table_ident) > 1:
                schema = table_ident[0]
                tbl = table_ident[1]
            else:
                schema = None
                tbl = table_ident[0]
            return schema, tbl
    
        if engine == "adbc":
            try:
                import adbc_driver_manager
    
                adbc_version = parse_version(
                    getattr(adbc_driver_manager, "__version__", "0.0")
                )
            except ModuleNotFoundError as exc:
                raise ModuleNotFoundError(
                    "adbc_driver_manager not found"
                    "\n\nInstall Polars with: pip install adbc_driver_manager"
                ) from exc
    
            if if_exists == "fail":
                # if the table exists, 'create' will raise an error,
                # resulting in behaviour equivalent to 'fail'
                mode = "create"
            elif if_exists == "replace":
                if adbc_version < (0, 7):
                    adbc_str_version = ".".join(str(v) for v in adbc_version)
>                   raise ModuleNotFoundError(
                        f"`if_exists = 'replace'` requires ADBC version >= 0.7, found {adbc_str_version}"
                    )
E                   ModuleNotFoundError: `if_exists = 'replace'` requires ADBC version >= 0.7, found 0.0.5174

/usr/share/miniconda3/envs/test/lib/python3.11/site-packages/polars/dataframe/frame.py:3491: ModuleNotFoundError

During handling of the above exception, another exception occurred:

postgres_uri = 'postgresql://localhost:5432/postgres?user=postgres&***'
df = shape: (4, 1)
┌──────┐
│ ints │
│ ---  │
│ i64  │
╞══════╡
│ 1    │
│ 2    │
│ 4    │
│ 8    │
└──────┘

    @pytest.mark.parametrize(
        "df",
        [
            "ints",
            "floats",
        ],
        indirect=True,
    )
    def test_polars_write_database(postgres_uri: str, df: "polars.DataFrame") -> None:
        table_name = f"polars_test_ingest_{uuid.uuid4().hex}"
        try:
            df.write_database(
                table_name=table_name,
                connection=postgres_uri,
                # TODO(apache/arrow-adbc#541): polars doesn't map the semantics
                # properly here, and one of their modes isn't supported
                if_exists="replace",
                engine="adbc",
            )
        finally:
            with dbapi.connect(postgres_uri) as conn:
                with conn.cursor() as cursor:
>                   cursor.execute(f"DROP TABLE {table_name}")

python/adbc_driver_postgresql/tests/test_polars.py:82: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
python/adbc_driver_manager/adbc_driver_manager/dbapi.py:669: in execute
    handle, self._rowcount = self._stmt.execute_query()
adbc_driver_manager/_lib.pyx:1106: in adbc_driver_manager._lib.AdbcStatement.execute_query
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   adbc_driver_manager.ProgrammingError: NOT_FOUND: [libpq] Failed to execute query: ERROR:  table "polars_test_ingest_23929440c8084693b505cf596fe45b46" does not exist
E   
E   Query was:DROP TABLE polars_test_ingest_23929440c8084693b505cf596fe45b46. SQLSTATE: 42P01

adbc_driver_manager/_lib.pyx:227: ProgrammingError
______________________ test_polars_write_database[floats] ______________________

postgres_uri = 'postgresql://localhost:5432/postgres?user=postgres&***'
df = shape: (4, 1)
┌────────┐
│ floats │
│ ---    │
│ f64    │
╞════════╡
│ 1.0    │
│ 2.0    │
│ 4.0    │
│ 8.0    │
└────────┘

    @pytest.mark.parametrize(
        "df",
        [
            "ints",
            "floats",
        ],
        indirect=True,
    )
    def test_polars_write_database(postgres_uri: str, df: "polars.DataFrame") -> None:
        table_name = f"polars_test_ingest_{uuid.uuid4().hex}"
        try:
>           df.write_database(
                table_name=table_name,
                connection=postgres_uri,
                # TODO(apache/arrow-adbc#541): polars doesn't map the semantics
                # properly here, and one of their modes isn't supported
                if_exists="replace",
                engine="adbc",
            )

python/adbc_driver_postgresql/tests/test_polars.py:71: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/share/miniconda3/envs/test/lib/python3.11/site-packages/polars/utils/deprecation.py:100: in wrapper
    return function(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = shape: (4, 1)
┌────────┐
│ floats │
│ ---    │
│ f64    │
╞════════╡
│ 1.0    │
│ 2.0    │
│ 4.0    │
│ 8.0    │
└────────┘
table_name = 'polars_test_ingest_7dc879158fdf47f5be7712cc17531132'
connection = 'postgresql://localhost:5432/postgres?user=postgres&***'

    @deprecate_renamed_parameter("connection_uri", "connection", version="0.18.9")
    def write_database(
        self,
        table_name: str,
        connection: str,
        *,
        if_exists: DbWriteMode = "fail",
        engine: DbWriteEngine = "sqlalchemy",
    ) -> None:
        """
        Write a polars frame to a database.
    
        Parameters
        ----------
        table_name
            Schema-qualified name of the table to create or append to in the target
            SQL database. If your table name contains special characters, it should
            be quoted.
        connection
            Connection URI string, for example:
    
            * "***server:port/database"
            * "sqlite:////path/to/database.db"
        if_exists : {'append', 'replace', 'fail'}
            The insert mode:
    
            * 'replace' will create a new database table, overwriting an existing one.
            * 'append' will append to an existing table.
            * 'fail' will fail if table already exists.
        engine : {'sqlalchemy', 'adbc'}
            Select the engine used for writing the data.
        """
        from polars.io.database import _open_adbc_connection
    
        def unpack_table_name(name: str) -> tuple[str | None, str]:
            """Unpack optionally qualified table name into schema/table pair."""
            from csv import reader as delimited_read
    
            table_ident = next(delimited_read([name], delimiter="."))
            if len(table_ident) > 2:
                raise ValueError(f"`table_name` appears to be invalid: {name!r}")
            elif len(table_ident) > 1:
                schema = table_ident[0]
                tbl = table_ident[1]
            else:
                schema = None
                tbl = table_ident[0]
            return schema, tbl
    
        if engine == "adbc":
            try:
                import adbc_driver_manager
    
                adbc_version = parse_version(
                    getattr(adbc_driver_manager, "__version__", "0.0")
                )
            except ModuleNotFoundError as exc:
                raise ModuleNotFoundError(
                    "adbc_driver_manager not found"
                    "\n\nInstall Polars with: pip install adbc_driver_manager"
                ) from exc
    
            if if_exists == "fail":
                # if the table exists, 'create' will raise an error,
                # resulting in behaviour equivalent to 'fail'
                mode = "create"
            elif if_exists == "replace":
                if adbc_version < (0, 7):
                    adbc_str_version = ".".join(str(v) for v in adbc_version)
>                   raise ModuleNotFoundError(
                        f"`if_exists = 'replace'` requires ADBC version >= 0.7, found {adbc_str_version}"
                    )
E                   ModuleNotFoundError: `if_exists = 'replace'` requires ADBC version >= 0.7, found 0.0.5174

/usr/share/miniconda3/envs/test/lib/python3.11/site-packages/polars/dataframe/frame.py:3491: ModuleNotFoundError

During handling of the above exception, another exception occurred:

postgres_uri = 'postgresql://localhost:5432/postgres?user=postgres&***'
df = shape: (4, 1)
┌────────┐
│ floats │
│ ---    │
│ f64    │
╞════════╡
│ 1.0    │
│ 2.0    │
│ 4.0    │
│ 8.0    │
└────────┘

    @pytest.mark.parametrize(
        "df",
        [
            "ints",
            "floats",
        ],
        indirect=True,
    )
    def test_polars_write_database(postgres_uri: str, df: "polars.DataFrame") -> None:
        table_name = f"polars_test_ingest_{uuid.uuid4().hex}"
        try:
            df.write_database(
                table_name=table_name,
                connection=postgres_uri,
                # TODO(apache/arrow-adbc#541): polars doesn't map the semantics
                # properly here, and one of their modes isn't supported
                if_exists="replace",
                engine="adbc",
            )
        finally:
            with dbapi.connect(postgres_uri) as conn:
                with conn.cursor() as cursor:
>                   cursor.execute(f"DROP TABLE {table_name}")

python/adbc_driver_postgresql/tests/test_polars.py:82: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
python/adbc_driver_manager/adbc_driver_manager/dbapi.py:669: in execute
    handle, self._rowcount = self._stmt.execute_query()
adbc_driver_manager/_lib.pyx:1106: in adbc_driver_manager._lib.AdbcStatement.execute_query
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   adbc_driver_manager.ProgrammingError: NOT_FOUND: [libpq] Failed to execute query: ERROR:  table "polars_test_ingest_7dc879158fdf47f5be7712cc17531132" does not exist
E   
E   Query was:DROP TABLE polars_test_ingest_7dc879158fdf47f5be7712cc17531132. SQLSTATE: 42P01

adbc_driver_manager/_lib.pyx:227: ProgrammingError
=========================== short test summary info ============================
FAILED python/adbc_driver_postgresql/tests/test_polars.py::test_polars_write_database[ints] - adbc_driver_manager.ProgrammingError: NOT_FOUND: [libpq] Failed to execute query: ERROR:  table "polars_test_ingest_23929440c8084693b505cf596fe45b46" does not exist

Query was:DROP TABLE polars_test_ingest_23929440c8084693b505cf596fe45b46. SQLSTATE: 42P01
FAILED python/adbc_driver_postgresql/tests/test_polars.py::test_polars_write_database[floats] - adbc_driver_manager.ProgrammingError: NOT_FOUND: [libpq] Failed to execute query: ERROR:  table "polars_test_ingest_7dc879158fdf47f5be7712cc17531132" does not exist

Query was:DROP TABLE polars_test_ingest_7dc879158fdf47f5be7712cc17531132. SQLSTATE: 42P01
========================= 2 failed, 20 passed in 3.40s =========================
@WillAyd
Contributor

WillAyd commented Dec 1, 2023

I think this started with this upstream PR in polars:

pola-rs/polars#12713
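
As a rough illustration of why the traceback above ends in a failure, the sketch below compares a version string against a minimum in the way the check in the log appears to. The helper names `parse_version_tuple` and `supports_replace` are hypothetical and are not polars' own implementation; only the `(0, 7)` threshold and the `0.0.5174` value are taken from the log.

```python
import re


def parse_version_tuple(version):
    """Turn a dotted version string into a tuple of leading integers.

    Hypothetical helper for illustration only; parsing stops at the first
    component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def supports_replace(version, minimum=(0, 7)):
    """Return True if `version` is at least `minimum` (tuple comparison)."""
    return parse_version_tuple(version) >= minimum


# The version reported in the log sorts below the required minimum.
print(supports_replace("0.0.5174"))
print(supports_replace("0.8.0"))
```

Because tuples compare element by element, the second component (0 versus 7) decides the outcome before the large third component is ever considered.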

It looks like in arrow-adbc we are driving the release number off of what git describe --long --always produces.

But it looks like the existing tags are not on the main branch; the commits they reference are on the maintenance branches (were these cherry-picked?), so when building from source you always end up with version 0.0.0+dirty.

Is it intentional for the tags not to exist on the main branch, or is it a mistake?
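
For context, `git describe --long` prints `<tag>-<commits since tag>-g<abbreviated hash>` when a tag is reachable from the current commit, and with `--always` it falls back to only the abbreviated hash when none is. A minimal sketch of telling the two cases apart follows. The function name `parse_describe` and the sample strings are made up for illustration, and this is not the project's actual version logic.

```python
import re

# Long format: <tag>-<distance>-g<sha>. The tag itself may contain hyphens,
# so the greedy prefix is anchored by the trailing "-<digits>-g<hex>" part.
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(output):
    """Split long-format `git describe` output into (tag, distance, sha).

    Returns None when the output is only a bare hash, which is what the
    `--always` fallback produces if no tag is reachable.
    """
    match = _DESCRIBE_RE.match(output.strip())
    if match is None:
        return None
    return match["tag"], int(match["distance"]), match["sha"]


# Hypothetical sample outputs, not taken from the repository.
print(parse_describe("example-1.2.3-4-gabc1234"))
print(parse_describe("abc1234"))
```

A build script that gets `None` here has no tag to derive a release number from, which matches the situation described above where tags live only on maintenance branches.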

@lidavidm
Member Author

lidavidm commented Dec 1, 2023

Ah... That is how the release process works (using maintenance branches), which is a holdover from the main Arrow project (which creates a branch and cherry-picks commits onto it as needed). So either we can adjust CI to force/fake a particular version, or we could consider just releasing directly off of main.

lidavidm pushed a commit that referenced this issue Dec 6, 2023
I don't think this is the better long-term option of what is described in #1333 (comment), but it should get CI green for now.
@lidavidm
Member Author

lidavidm commented Dec 6, 2023

So, continuing from #1337, we should set SETUPTOOLS_SCM_PRETEND_VERSION in our various pipelines and have the bump-version release script update it.
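
One way this could look in a pipeline step driven from Python is sketched below. It only assumes that the build reads the SETUPTOOLS_SCM_PRETEND_VERSION environment variable, as suggested above. The helper name `env_with_pretend_version` and the version string `0.9.0` are placeholders, not values taken from the project's CI.

```python
import os


def env_with_pretend_version(version, base=None):
    """Return a copy of the environment with the pretend version set.

    `base` defaults to the current process environment; it is copied so the
    caller's mapping is left unchanged.
    """
    env = dict(os.environ if base is None else base)
    env["SETUPTOOLS_SCM_PRETEND_VERSION"] = version
    return env


# Placeholder version; a release script would substitute the real one.
build_env = env_with_pretend_version("0.9.0")
print(build_env["SETUPTOOLS_SCM_PRETEND_VERSION"])
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking the build, which keeps the override scoped to that one command.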

@lidavidm lidavidm modified the milestone: ADBC Libraries 0.9.0 Dec 19, 2023
@lidavidm
Member Author

lidavidm commented Jan 2, 2024

Closing in favor of #1363

@lidavidm lidavidm closed this as completed Jan 2, 2024