Releases: steveloughran/cloudstore
Second release of -debug changes; works on CDH-7.1.7.x
When -debug is used, the log level of the azure and s3a components is switched to debug, including the aws and shaded http logs.
The bandwidth command now includes block count as a column in the CSV output.
Example: 16 MB upload against gcs through the s3a connector, 8 MB block size
hadoop jar cloudstore-1.0.jar bandwidth -block 8 -csv s3a.stevel.gcs.csv 16m s3a://stevel-gcs/bandwidth.bin
CSV file
abfs diags includes the
option; checks for prefetch safety in path capabilities scan
system properties printed by abfs and gcs include proxy info
Selected System Properties
[001] https.proxyHost = (unset)
[002] https.proxyPort = (unset)
[003] https.nonProxyHosts = (unset)
[004] https.proxyPassword = (unset)
[005] http.proxyHost = (unset)
[006] http.proxyPort = (unset)
[007] http.proxyPassword = (unset)
[008] http.nonProxyHosts = (unset)
[009] = "true"
[010] = (unset)
[011] networkaddress.cache.ttl = (unset)
[012] networkaddress.cache.negative.ttl = (unset)
[013] socksProxyHost = (unset)
[014] socksProxyPort = (unset)
[015] = (unset)
[016] = (unset)
[017] = (unset)
[018] = (unset)
[019] java.version = "1.8.0_362"
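A listing in this style can be approximated with plain JVM calls. The following is a minimal sketch, not the cloudstore implementation: the class name `SelectedSystemProperties` and the `format` helper are hypothetical, and only property names that are legible in the listing above are included.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: print selected JVM system properties as a numbered listing. */
final class SelectedSystemProperties {

    // Names copied from the listing above. Entries whose names are
    // missing from that listing are deliberately left out, not guessed.
    static final List<String> NAMES = Arrays.asList(
        "https.proxyHost",
        "https.proxyPort",
        "https.nonProxyHosts",
        "https.proxyPassword",
        "http.proxyHost",
        "http.proxyPort",
        "http.proxyPassword",
        "http.nonProxyHosts",
        "networkaddress.cache.ttl",
        "networkaddress.cache.negative.ttl",
        "socksProxyHost",
        "socksProxyPort",
        "java.version");

    /** Format one entry: quoted value if set, "(unset)" otherwise. */
    static String format(int index, String name, String value) {
        String shown = value == null ? "(unset)" : "\"" + value + "\"";
        return String.format("[%03d] %s = %s", index, name, shown);
    }

    public static void main(String[] args) {
        System.out.println("Selected System Properties");
        int index = 1;
        for (String name : NAMES) {
            // System.getProperty returns null when the property is not set
            System.out.println(format(index++, name, System.getProperty(name)));
        }
    }
}
```

Running this with, for example, `-Dhttps.proxyHost=proxy.example.org` on the JVM command line (a made-up hostname) shows whether proxy settings actually reach the process.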
release of 2023-09-29
release-2023-09-26: cloudup
Major Cloudup rework
- incremental -update operation to skip files which exist.
- improved output
- options to flush/hflush
- standalone document with more examples
- tuning for maximum s3a performance
see cloudup
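The idea behind the incremental -update operation, skipping files which already exist at the destination, can be illustrated with local paths. This is only a sketch under stated assumptions: it uses `java.nio.file` rather than the Hadoop filesystem API, checks existence only, and the names `IncrementalCopy`, `copyMissing` and `demo` are hypothetical rather than taken from cloudup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: copy only those files which are missing from the destination. */
final class IncrementalCopy {

    /**
     * Copy regular files directly under source into dest, skipping any
     * whose name already exists there. Returns the names actually copied.
     */
    static List<String> copyMissing(Path source, Path dest) throws IOException {
        Files.createDirectories(dest);
        List<Path> files;
        try (Stream<Path> listing = Files.list(source)) {
            files = listing.filter(Files::isRegularFile)
                .sorted()
                .collect(Collectors.toList());
        }
        List<String> copied = new ArrayList<>();
        for (Path file : files) {
            String name = file.getFileName().toString();
            Path target = dest.resolve(name);
            if (Files.exists(target)) {
                continue; // already present: skip rather than overwrite
            }
            Files.copy(file, target);
            copied.add(name);
        }
        return copied;
    }

    /** Run two passes over temporary directories; the second copies nothing. */
    static List<List<String>> demo() {
        try {
            Path src = Files.createTempDirectory("cloudup-src");
            Path dst = Files.createTempDirectory("cloudup-dst");
            Files.write(src.resolve("a.txt"), "a".getBytes());
            Files.write(src.resolve("b.txt"), "b".getBytes());
            List<List<String>> passes = new ArrayList<>();
            passes.add(copyMissing(src, dst));
            passes.add(copyMissing(src, dst));
            return passes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

An existence-only check is cheap, but it will not notice a source file that changed after the first upload; whether cloudup does anything beyond the existence check is not stated here.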
Release 2023-09-14. Bandwidth enhancements.
The bandwidth command now:
- can save details to a CSV file
- adds options to set: read policy, block size, whether to flush/hflush after each write
- reports whether progress callbacks were made during (possibly slow) close() operations.
Read the bandwidth documentation for details and example analysis of CSV files using
different S3A tuning parameters (including prefetching).
- analyzes signing algorithm, including warnings about v2 sdk compatibility
- a bit more v2 sdk awareness
- prints UTC timestamp of when the log was collected
release-2023-08-10: AWS SDK v2 awareness
release of 2023-08-10
s3a diagnostics now work with aws v2 sdk branches.
- looks for the different classes
- all code using aws sdk v1 classes wrapped by exception handling
- endpoint analysis recognises fs.s3a.endpoint set to an ipv4 dotted address and comments on that (https won't work, path style access doomed)
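The endpoint check in the last item can be sketched as a pattern match on the configured value. This is an illustrative approximation, not the cloudstore code: the class `EndpointCheck` and method `isIpv4Endpoint` are hypothetical names, and the handling of an optional scheme, port and path is an assumption.

```java
import java.util.regex.Pattern;

/** Sketch: detect an endpoint given as a dotted IPv4 address. */
final class EndpointCheck {

    // One decimal octet in the range 0-255.
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";

    // Four octets separated by dots, with an optional :port suffix.
    private static final Pattern IPV4 = Pattern.compile(
        "^" + OCTET + "(\\." + OCTET + "){3}(:\\d+)?$");

    /** True if the endpoint's host part is a dotted IPv4 address. */
    static boolean isIpv4Endpoint(String endpoint) {
        if (endpoint == null) {
            return false;
        }
        String host = endpoint.trim();
        int scheme = host.indexOf("://");
        if (scheme >= 0) {
            host = host.substring(scheme + 3); // drop "http://" or "https://"
        }
        int slash = host.indexOf('/');
        if (slash >= 0) {
            host = host.substring(0, slash); // drop any trailing path
        }
        return IPV4.matcher(host).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "192.168.1.10", "http://10.0.0.5:9000", "s3.eu-west-1.amazonaws.com"
        };
        for (String endpoint : samples) {
            System.out.println(endpoint + " -> "
                + (isIpv4Endpoint(endpoint) ? "IPv4 address" : "hostname"));
        }
    }
}
```

A diagnostics tool could use such a check to warn when fs.s3a.endpoint holds a bare address, since that is the case the release note flags for https and path style access.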
Note: the low-level s3 operations only work on v1 sdk hadoop releases; no immediate plans to switch