- Minimum supported Java version is now 8 (was 7).
- Updated `circus-train-version` to 16.0.0 (was 15.0.0).
- Updated `hotels-oss-parent` version to 5.0.0 (was 4.2.0).
- Updated `hive` version to 2.3.7 (was 2.3.5). This allows Circus Train Big Query to be used on JDK>=9.
- Updated Google Cloud `libraries-bom` to 4.1.1 (was 2.9.0).
- Updated `maven-shade-plugin` to 3.2.2 (was 3.2.1).
- Removed duplicated partition column from the schema. See #37.
- Temporary BigQuery tables are cleaned up even if replication fails.
- Using `libraries-bom` instead of `google-cloud-bom`.
- Updated `circus-train-version` to 15.0.0 (was 14.0.1).
- Updated `lastcommons-test.version` to 7.0.2 (was 5.2.1).
- Updated `hotels-oss-parent` to 4.2.0 (was 4.0.1).
- Excluded `org.pentaho.pentaho-aggdesigner-algorithm` dependency as it's not available in Maven Central.
- Refactored project to remove checkstyle and findbugs warnings. This does not impact functionality.
- Upgraded `circus-train-version` to 14.0.1 (was 13.0.0).
- Upgraded `hotels-oss-parent` to 4.0.1 (was 2.3.3).
- Upgraded `google-cloud-bom` to 0.73.0-alpha (was 0.34.0-alpha).
- Shade and relocate all `com.google` packages (to fix Guava version issue).
- Upgraded `hotels-oss-parent` pom to 2.3.3 (was 2.0.6). See #23.
- Add Spring annotations to all the components that do not need to load during the housekeeping module. See #28.
- Read schema from Avro file using a stream instead of loading the entire file, to avoid OOM. See #26.
- Partition columns are set to type `string`, regardless of the type of the column that is used to partition the table.
- If an error occurs while deleting an intermediate table during cleanup, this will no longer fail the replication. Instead, it will log the error and continue.
- Circus Train version upgraded to 13.0.0 (was 12.0.0). Note that this change is not backwards compatible as this BigQuery extension now needs to be explicitly added to the Circus Train classpath using Circus Train's standard extension loading mechanism. See #20.
- Replicated tables are now exported as Avro files instead of CSV files. This allows BigQuery tables to be replicated without any schema or data change. See #18.
- Please note that this version is not backwards compatible for partitioned tables that have been replicated using an earlier version - in this case all the previously replicated partitions will need to be replicated again from scratch.
- Support for partition generation upon replication.
- Support for partition filters.
- Support for filters in Standard SQL.
- Integers from BigQuery are treated as 64-bit (BigInt Hive type) values rather than 32-bit (Int Hive type) values.
- Jobs no longer fail when parts of the best effort temporary data cleanup fail.
- Support sharded exports to enable replications of larger tables. See #10.
- Set Hive replica table metadata to ignore header in replicated CSV files. See #12.
- Updated project to be compatible with Circus Train 12.0.0 and up.
- Improved error logging when extraction job fails or is no longer present.
- Extract table during replication rather than in listener to prevent silent failure. See #5.
- Project moved into GitHub.
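
Several entries above describe the same behaviour: temporary BigQuery tables are cleaned up even if replication fails, and a failure during that cleanup is logged rather than failing the job. The following is a minimal sketch of that pattern in plain Java, not the project's actual code. `BestEffortCleanup`, `TableDeleter`, `cleanUp` and `replicate` are hypothetical names introduced purely for illustration, and the deleter is a stand-in for whatever call really removes a table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class BestEffortCleanup {

  private static final Logger LOG = Logger.getLogger(BestEffortCleanup.class.getName());

  /** Hypothetical stand-in for the call that deletes one temporary table. */
  public interface TableDeleter {
    void delete(String tableName) throws Exception;
  }

  /**
   * Attempts to delete every table. A failed delete is logged and recorded,
   * never rethrown, so one bad table cannot abort the rest of the cleanup.
   * Returns the names of the tables that could not be deleted.
   */
  public static List<String> cleanUp(List<String> tableNames, TableDeleter deleter) {
    List<String> failed = new ArrayList<>();
    for (String tableName : tableNames) {
      try {
        deleter.delete(tableName);
      } catch (Exception e) {
        LOG.log(Level.WARNING, "Could not delete temporary table " + tableName, e);
        failed.add(tableName);
      }
    }
    return failed;
  }

  /**
   * Runs the replication step and always attempts cleanup afterwards.
   * An exception thrown by the replication still propagates to the caller,
   * but only after the cleanup has been attempted.
   */
  public static void replicate(Runnable replication, List<String> tempTables, TableDeleter deleter) {
    try {
      replication.run();
    } finally {
      cleanUp(tempTables, deleter);
    }
  }

  public static void main(String[] args) {
    // Simulated deleter that fails for one table only.
    TableDeleter flaky = name -> {
      if (name.endsWith("_b")) {
        throw new IllegalStateException("simulated delete failure");
      }
    };
    List<String> failed = cleanUp(Arrays.asList("tmp_a", "tmp_b", "tmp_c"), flaky);
    System.out.println("Tables left behind: " + failed);
  }
}
```

In this sketch the `finally` block is what makes cleanup run after a failed replication, and the per-table `try`/`catch` is what keeps a failed delete from failing the job. It sticks to Java 8 APIs, matching the minimum supported version noted above.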