Commit

Update base readme (#34)
kylebarron authored Jun 28, 2024
1 parent 1e8dd9f commit f3a1108
Showing 3 changed files with 22 additions and 3 deletions.
2 changes: 1 addition & 1 deletion Cargo.lock

21 changes: 20 additions & 1 deletion README.md
@@ -1,2 +1,21 @@
# arro3
A minimal Python library for Apache Arrow, connecting to the Rust arrow crate

A minimal Python library for [Apache Arrow](https://arrow.apache.org/docs/index.html), binding to the [Rust Arrow implementation](https://github.com/apache/arrow-rs).

## Why another Arrow library?

[pyarrow](https://arrow.apache.org/docs/python/index.html) is the reference Arrow implementation in Python, but there are a few reasons for `arro3` to exist:

- **Lightweight**. pyarrow is 100MB on disk, plus 35MB for its required numpy dependency. `arro3-core` is around 1MB on disk with no required dependencies.
- **Minimal**. The core library (`arro3-core`) has a very small scope. Other functionality, such as compute kernels, will be distributed in other namespace packages.
- **Modular**. The [Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html) makes it easier to create small Arrow libraries that communicate via zero-copy data transfer. arro3's Python functions accept Arrow data from any Python Arrow library that implements the PyCapsule interface, including `pyarrow` and `nanoarrow`.
- **Extensible**. Over time, arro3 can connect to the [compute kernels provided by the Rust Arrow implementation](https://docs.rs/arrow/latest/arrow/compute/index.html).
- **Compliant**. Full support for the Arrow specification*, including extension types. (*Limited to what the Rust Arrow implementation supports, which does not yet include Arrow view types.)
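
As a rough illustration of the PyCapsule-based interchange mentioned under **Modular**, the sketch below checks which Arrow PyCapsule export methods an object defines. The dunder names (`__arrow_c_schema__`, `__arrow_c_array__`, `__arrow_c_stream__`) come from the Arrow PyCapsule Interface specification; the helper `arrow_export_kinds` and the class `FakeProducer` are hypothetical names made up for this example and are not part of arro3.

```python
def arrow_export_kinds(obj):
    """Return the kinds of Arrow PyCapsule export that `obj` advertises.

    Hypothetical helper for illustration; not an arro3 function.
    """
    # Method names defined by the Arrow PyCapsule Interface specification.
    protocol_methods = {
        "schema": "__arrow_c_schema__",
        "array": "__arrow_c_array__",
        "stream": "__arrow_c_stream__",
    }
    return [
        kind
        for kind, name in protocol_methods.items()
        if callable(getattr(obj, name, None))
    ]


class FakeProducer:
    """Hypothetical stand-in for a library object that exports an Arrow stream."""

    def __arrow_c_stream__(self, requested_schema=None):
        # A real producer would return a PyCapsule named "arrow_array_stream".
        raise NotImplementedError("illustration only")


print(arrow_export_kinds(FakeProducer()))  # ['stream']
print(arrow_export_kinds(object()))        # []
```

A consumer that relies on this kind of duck typing does not need to import the producing library, which is what lets small Arrow libraries exchange data without depending on each other.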

## Drawbacks

In general, arro3 isn't designed for _constructing_ Arrow data from other formats; instead, it should let users manage Arrow data created by other Arrow-compatible libraries. arro3 does not implement conversion of arbitrary Python objects to Arrow, which is complex and already well served by other libraries (e.g. pyarrow).

## Using from Rust

Refer to the [pyo3-arrow documentation](https://docs.rs/pyo3-arrow).
2 changes: 1 addition & 1 deletion pyo3-arrow/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "pyo3-arrow"
version = "0.1.0-beta.1"
version = "0.1.0"
authors = ["Kyle Barron <kylebarron2@gmail.com>"]
edition = "2021"
description = "Arrow integration for pyo3."