Commit
1 parent 1e8dd9f, commit f3a1108
Showing 3 changed files with 22 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,21 @@ | ||
# arro3 | ||
A minimal Python library for Apache Arrow, connecting to the Rust arrow crate | ||
|
||
A minimal Python library for [Apache Arrow](https://arrow.apache.org/docs/index.html), binding to the [Rust Arrow implementation](https://github.com/apache/arrow-rs). | ||
|
||
## Why another Arrow library? | ||
|
||
[pyarrow](https://arrow.apache.org/docs/python/index.html) is the reference Arrow implementation in Python, but there are a few reasons for `arro3` to exist: | ||
|
||
- **Lightweight**. pyarrow is 100MB on disk, plus 35MB for its required numpy dependency. `arro3-core` is around 1MB on disk with no required dependencies. | ||
- **Minimal**. The core library (`arro3-core`) has a very small scope. Other functionality, such as compute kernels, will be distributed in other namespace packages. | ||
- **Modular**. The [Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html) makes it easier to create small Arrow libraries that communicate via zero-copy data transfer. arro3's Python functions accept Arrow data from any Python Arrow library that implements the PyCapsule interface, including `pyarrow` and `nanoarrow`. | ||
- **Extensible**. Over time, arro3 can connect to [compute kernels provided by the Rust Arrow implementation](https://docs.rs/arrow/latest/arrow/compute/index.html). | ||
- **Compliant**. Full support for the Arrow specification*, including extension types. (*Limited to what the Rust Arrow implementation supports, which does not yet include Arrow view types.) | ||
|
||
## Drawbacks | ||
|
||
In general, arro3 isn't designed for _constructing_ Arrow data from other formats, but it should enable users to manage Arrow data created by other Arrow-compatible libraries. arro3 does not implement conversion of arbitrary Python objects to Arrow. That conversion is complex and already well served by other libraries (e.g. pyarrow). | ||
|
||
## Using from Rust | ||
|
||
Refer to the [pyo3-arrow documentation](https://docs.rs/pyo3-arrow). |
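
The **Modular** point in the README relies on the Arrow PyCapsule Interface, under which a producer exposes its data through the dunder methods `__arrow_c_schema__`, `__arrow_c_array__`, and `__arrow_c_stream__`. The following is a minimal sketch of how a consumer might check which of those methods an object offers. It is not arro3's implementation; `arrow_capsule_methods` and `FakeStreamProducer` are hypothetical names introduced only for illustration.

```python
# Dunder methods defined by the Arrow PyCapsule Interface specification.
ARROW_CAPSULE_METHODS = (
    "__arrow_c_schema__",
    "__arrow_c_array__",
    "__arrow_c_stream__",
)


def arrow_capsule_methods(obj):
    """Return the PyCapsule Interface methods that `obj` exposes (hypothetical helper)."""
    return [
        name
        for name in ARROW_CAPSULE_METHODS
        if callable(getattr(obj, name, None))
    ]


class FakeStreamProducer:
    """Hypothetical stand-in for a table-like object from any Arrow library."""

    def __arrow_c_stream__(self, requested_schema=None):
        # A real producer would return a PyCapsule named "arrow_array_stream".
        raise NotImplementedError("illustration only")


print(arrow_capsule_methods(FakeStreamProducer()))  # ['__arrow_c_stream__']
print(arrow_capsule_methods(object()))  # []
```

Because the check depends only on these method names, a consumer written this way does not need to import the producing library, which is what lets small Arrow packages interoperate.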