[C] Research Turbodbc/Arrowdantic for developing ODBC-wrapping driver #72

lidavidm · 2022-08-19T18:27:43Z

Arrowdantic: https://github.com/jorgecarleitao/arrowdantic/
Turbodbc: https://github.com/blue-yonder/turbodbc/
arrow-odbc: https://github.com/pacman82/arrow-odbc

pacman82 · 2022-08-24T21:22:15Z

Hi there 👋. I am the author of arrow-odbc, and a typo brought me here. A quick heads-up:

Turbodbc:

Uses the ODBC C interface directly from C++ and fills Arrow arrays (official C++ implementation) in C++.
pyarrow is backed by the C++ Arrow implementation. The Python C API is used for interfacing.
For everything to work together, the C++ ABI, Python C API, Boost version, and Arrow version must all match. Somewhat fickle build process.
Scope: Complies with Python Database API Specification 2.0 (PEP 249)

Arrowdantic (to the best of my knowledge):

Uses ODBC from a Rust crate (odbc-api) and fills arrow2 (Rust crate) arrays directly in Rust.
Provides Python bindings for arrow2.
Scope: More an alternative to pyarrow with built-in ODBC support

arrow-odbc:

Uses ODBC from a Rust crate (odbc-api) which talks to Python via C-Interface.
Uses arrow (Rust crate, official implementation) and Arrow-C Interface to interface with pyarrow
Scope: Read and write pyarrow arrays with ODBC from and to databases.

Cheers,
Markus

lidavidm · 2022-08-24T21:28:45Z

Hi, sorry for typosquatting 🙂

Thanks for the breakdown! The scope here would be lower level than any of these. I suppose I'm mostly curious about how each project achieves their speed objectives. Also, the plan would be to use nanoarrow to avoid bringing in dependencies on libarrow, Boost, or anything like that.

pacman82 · 2022-08-24T21:40:58Z

> Hi, sorry for typosquatting 🙂

I don't mind.

> Also, the plan would be to use nanoarrow to avoid bringing in dependencies on libarrow, Boost, or anything like that.

Yeah, building that is a pain. Personally, I would recommend using one of the Rust implementations (either arrow or arrow2), since Rust links everything statically by default, and cargo is way more fun than any C/C++-based build system. You do you, though.

Cheers, Markus

pacman82 · 2022-08-24T21:41:46Z

> Yeah, building that is a pain

To clarify: I was referring to the dependencies. I have no experience with or knowledge of nanoarrow.

lidavidm · 2024-12-30T03:38:20Z

An additional snag is that Unix platforms need unixodbc, which is LGPL, and so I'm not sure we can take a dependency on that from an Apache project.

lidavidm · 2024-12-30T03:42:55Z

Turbodbc indeed requires Boost, but it appears to have a C++ interface, which means we could pull it via ExternalProject/FetchContent. On the other hand, arrow-odbc + our relatively new Rust API definitions is tempting. For me it boils down to whether Turbodbc's optimizations put it ahead of arrow-odbc or not. I think both libraries use SQLBindCol?

pacman82 · 2024-12-31T08:50:17Z

> An additional snag is that Unix platforms need unixodbc, which is LGPL, and so I'm not sure we can take a dependency on that from an Apache project.

I would recommend linking dynamically against the ODBC driver manager used by the system. That way you would not package unixODBC yourself; it would be installed separately, e.g. via the package manager of the Linux distribution.
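To illustrate the idea of relying on whatever driver manager the system provides, here is a minimal Python sketch. It is not taken from any of the projects discussed; the helper names `find_driver_manager` and `load_driver_manager` are hypothetical, and the library names (`odbc` on Unix-like systems, `odbc32` on Windows) are assumptions about the usual installation.

```python
import ctypes
import ctypes.util
import sys
from typing import Optional


def find_driver_manager() -> Optional[str]:
    """Return the name or path of the system ODBC driver manager, if any.

    Assumption: the driver manager is installed as `odbc` (e.g. unixODBC)
    on Unix-like systems and as `odbc32` on Windows.
    """
    name = "odbc32" if sys.platform == "win32" else "odbc"
    return ctypes.util.find_library(name)


def load_driver_manager() -> Optional[ctypes.CDLL]:
    """Load the driver manager at runtime instead of bundling it.

    Returns None when no driver manager is installed, so the caller can
    report a helpful error rather than fail at import time.
    """
    path = find_driver_manager()
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        # The library was located but could not be loaded.
        return None
```

Because the library is resolved at runtime, nothing from unixODBC has to be shipped with the package; a missing driver manager simply yields `None`.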

> I think both libraries use SQLBindCol?

Both turbodbc and arrow-odbc use column-wise block cursors via SQLBindCol.
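For readers unfamiliar with the term, a column-wise block cursor can be pictured roughly as follows. This is a pure-Python simulation under stated assumptions, not code from turbodbc or arrow-odbc: the function `fetch_blocks` and the two-column `(id, name)` row shape are made up for illustration, and no real ODBC calls are issued.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[int, str]  # hypothetical result set with columns (id, name)


def fetch_blocks(
    rows: Sequence[Row], batch_size: int
) -> Iterator[Tuple[List[int], List[str]]]:
    """Simulate a column-wise block cursor over an in-memory result set."""
    # One preallocated buffer per column, each with room for `batch_size`
    # values. In ODBC these would be the buffers bound with SQLBindCol,
    # with the rowset size set through SQL_ATTR_ROW_ARRAY_SIZE.
    id_buf: List[int] = [0] * batch_size
    name_buf: List[str] = [""] * batch_size

    for start in range(0, len(rows), batch_size):
        block = rows[start:start + batch_size]
        # Stand-in for one SQLFetch call: every bound buffer is filled
        # for the whole block of rows at once.
        for i, (id_value, name_value) in enumerate(block):
            id_buf[i] = id_value
            name_buf[i] = name_value
        fetched = len(block)
        # Hand out only the filled part; the last block may be shorter.
        yield id_buf[:fetched], name_buf[:fetched]
```

Each yielded pair is already laid out column by column, which is the layout Arrow arrays use, so a batch maps onto an Arrow record batch without a row-to-column transpose.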

Best, Markus

avhz · 2025-02-26T21:41:33Z

Coming from #2542. An ODBC bridge would be nice for sure, and more generic than SAP HANA support (my original proposal), so I assume more people will benefit.

If there is some way I can help, I would be interested in learning about Arrow in depth.

lidavidm · 2025-02-26T23:53:53Z

It needs someone to do the work, broadly. Or someone to sponsor the work, possibly.

lidavidm · 2025-02-27T01:01:30Z

Oh, and for posterity: my current preference is to build on top of arrow-odbc and set up the infra to distribute Rust-based drivers for Python et al. (though I'm not looking forward to the CMake part of that). I'd still be curious whether there's a performance comparison between Turbodbc and arrow-odbc.

WillAyd · 2025-02-27T01:11:51Z

If CMake is not a hard requirement, it might be easier to attempt that build system through Meson, since that natively supports Python and Rust.

lidavidm · 2025-02-27T02:08:40Z

I think we'd want to ship CMake definitions, and the rest of the build infra is all still CMake-based, unfortunately.

WillAyd · 2025-02-27T16:53:56Z

Sounds good. I'm half committed, but I might just prototype with Meson first and come back to CMake later if I get to the point of something stable.

Do our Rust libraries support C/C++ as well or would this just be a Rust + Python driver?

lidavidm · 2025-02-27T23:37:06Z

All the drivers work via C interop so they should support C/C++.

lidavidm added this to the 0.2.0 milestone Dec 13, 2022

lidavidm removed this from the ADBC Libraries 0.2.0 milestone Feb 2, 2023

lidavidm mentioned this issue Dec 26, 2024

ADBC for Oracle relational database #2377

Closed

WillAyd mentioned this issue Feb 26, 2025

Support SAP HANA #2542

Open
