[WIP] Provide geospatial support for Iceberg Spark #1830

Draft: wants to merge 2 commits into base: master

Kontinuation
Member

This PR depends on #1828

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

  • Yes, the URL of the associated JIRA ticket is https://issues.apache.org/jira/browse/SEDONA-XXX. The PR name follows the format [SEDONA-XXX] my subject.

  • No:

    • this is a documentation update. The PR name follows the format [DOCS] my subject
    • this is a CI update. The PR name follows the format [CI] my subject

What changes were proposed in this PR?

The GEOMETRY and GEOGRAPHY types have been accepted as part of the Iceberg V3 spec. We are actively working on implementing geospatial support for the apache/iceberg project. This repository demonstrates how we'll integrate Sedona with Iceberg Spark to work with geometry and geography data. The overall idea is:

  1. We'll define an SPI interface GeospatialLibraryProvider in apache/iceberg to allow third-party libraries to integrate their geospatial support into iceberg-spark.
  2. Sedona implements this GeospatialLibraryProvider interface in the sedona-spark-iceberg module.
  3. Iceberg will work with Sedona when iceberg-spark-runtime, sedona-spark-shaded and sedona-spark-iceberg are all installed into the Spark environment.
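
The three steps above can be sketched with the standard Java `java.util.ServiceLoader` mechanism. This is only an illustration under assumptions: the method `name()`, the class `SedonaGeospatialLibraryProvider`, and the helper `GeospatialLibraries` are hypothetical, and the actual GeospatialLibraryProvider interface in apache/iceberg may look different and may be discovered in a different way.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

// Step 1 (iceberg side): hypothetical shape of the SPI interface.
// The real interface would expose geometry/geography conversion hooks;
// name() is a placeholder used only for this sketch.
interface GeospatialLibraryProvider {
    String name();
}

// Step 2 (sedona-spark-iceberg side): hypothetical implementation.
// With ServiceLoader, the module would also ship a resource file named
// META-INF/services/<fully qualified interface name> that lists this class.
final class SedonaGeospatialLibraryProvider implements GeospatialLibraryProvider {
    @Override
    public String name() {
        return "sedona";
    }
}

// Step 3 (iceberg-spark side): look up whichever provider is on the classpath.
final class GeospatialLibraries {
    private GeospatialLibraries() {}

    // Returns the first registered provider, or empty if none is installed.
    static Optional<GeospatialLibraryProvider> discover() {
        Iterator<GeospatialLibraryProvider> providers =
            ServiceLoader.load(GeospatialLibraryProvider.class).iterator();
        return providers.hasNext() ? Optional.of(providers.next()) : Optional.empty();
    }
}
```

In this sketch `discover()` returns an empty `Optional` unless a jar on the classpath registers a provider, which mirrors the requirement that all three jars be installed before Iceberg can work with Sedona.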

This PR is still in a proof-of-concept stage. It relies on several unreleased dependencies to work:

  • parquet-format: for the new GEOMETRY and GEOGRAPHY logical types
  • parquet-java: for the implementation of the parquet-geo standard
  • iceberg: proof-of-concept implementation of the geo spec

How was this patch tested?

Tested locally

Did this PR include necessary documentation updates?

  • Yes, I am adding a new API. I am using the current SNAPSHOT version number in vX.Y.Z format.
  • Yes, I have updated the documentation.
  • No, this PR does not affect any public API so no need to change the documentation.

@github-actions github-actions bot added the root label Feb 28, 2025