Releases: steveloughran/cloudstore
2018-02-23 release
- prints out the proxy settings used for the endpoint probe (a sketch of the idea follows below)
- a filesystem can now list the environment variables it uses; these get printed if set
- general cleanup of the output, including clearer explanations
There's not much else that can be done here beyond some bandwidth probes, which I don't plan to add.
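For illustration, here is a minimal sketch of what the proxy/env-var printing could look like. The `fs.s3a.proxy.*` key names are the standard S3A options visible in the log further down; the code itself is hypothetical, not cloudstore's actual implementation:

```java
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch, not cloudstore's actual code: print the standard
// S3A proxy options, plus any environment variables a connector declares.
public class ProxyAndEnvDiagnostics {

  static final String[] PROXY_KEYS = {
      "fs.s3a.proxy.host", "fs.s3a.proxy.port",
      "fs.s3a.proxy.username", "fs.s3a.proxy.domain",
      "fs.s3a.proxy.workstation"
  };

  static void printProxySettings(Configuration conf) {
    for (String key : PROXY_KEYS) {
      String value = conf.get(key);
      System.out.println(key + " = " + (value == null ? "(unset)" : value));
    }
  }

  // Environment variables are only printed when actually set.
  static void printEnvVars(String... names) {
    for (String name : names) {
      String value = System.getenv(name);
      if (value != null) {
        System.out.println(name + " = " + value);
      }
    }
  }
}
```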
2018-02-21 release
This improves diagnostics by:
- adding the notion of optional classes; this allows S3A to list joda-time (required for Hadoop <= 2.8, not on the classpath for trunk)
- picking up the first file in the root directory listing, if any, and reading its first character (both probes are sketched after this list)
- some tuning of messages.
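A minimal sketch of the two new probes, under the assumption that an optional class is simply one whose absence is reported rather than treated as a failure. The class and method names here are hypothetical, not cloudstore's actual code:

```java
import java.io.InputStream;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of the two probes described above.
public class ExtraProbes {

  // Optional classes are looked up like required ones, but absence
  // is only noted, never fatal.
  static void probeOptionalClass(String classname) {
    try {
      Class.forName(classname);
      System.out.println("class " + classname + " found");
    } catch (ClassNotFoundException e) {
      System.out.println("optional class " + classname + " is not on the classpath");
    }
  }

  // Pick the first file in the root directory listing, if any,
  // and read its first byte.
  static void readFirstByte(FileSystem fs) throws Exception {
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      if (status.isFile() && status.getLen() > 0) {
        try (InputStream in = fs.open(status.getPath())) {
          System.out.println("first byte of " + status.getPath() + ": " + in.read());
        }
        return;
      }
    }
    System.out.println("no files in the root directory");
  }
}
```

The joda-time check would then amount to a call like `probeOptionalClass("org.joda.time.DateTime")`.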
Example usage against Amazon's public landsat bucket:
hadoop jar cloudstore-2.8.jar s3a://landsat-pds/
Release #3, 2018-02-13
Lists the provenance of all options, including how they were inferred.
Below is the log of a run against an S3A FS which has some per-bucket settings. Censoring of secrets is automatic for properties declared as sensitive: the first and last characters are included as a hint of what is there, but overall not enough information is leaked to make including this log in a JIRA a security breach. (A sketch of the masking idea appears after the log.)
> bin/hadoop jar /Users/stevel/Projects/Misc/cloudstore/target/cloudstore-2.8.jar s3a://hwdev-steve-ireland-new
Hadoop information
==================
Hadoop 3.1.0-SNAPSHOT
Compiled by stevel on 2018-02-12T18:18Z
Compiled with protoc 2.5.0
From source with checksum 7e4b15df61f22f370718a29a319435
Diagnostics for filesystem s3a://hwdev-steve-ireland-new
========================================================
S3A FileSystem connector
ASF Filesystem Connector to Amazon S3 Storage and compatible stores
https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html
System Properties
=================
...
Selected and Sanitized Configuration Options
============================================
fs.s3a.access.key = "A******************A" [core-site.xml]
fs.s3a.secret.key = "T**************************************p" [core-site.xml]
fs.s3a.session.token = "(unset)"
fs.s3a.server-side-encryption-algorithm = "(unset)"
fs.s3a.server-side-encryption.key = "(unset)"
fs.s3a.aws.credentials.provider = "(unset)"
fs.s3a.proxy.host = "(unset)"
fs.s3a.proxy.port = "(unset)"
fs.s3a.proxy.username = "(unset)"
fs.s3a.proxy.password = "(unset)"
fs.s3a.proxy.domain = "(unset)"
fs.s3a.proxy.workstation = "(unset)"
fs.s3a.fast.upload = "true" [core-site.xml]
fs.s3a.fast.upload.buffer = "disk" [core-site.xml]
fs.s3a.fast.upload.active.blocks = "4" [core-default.xml]
fs.s3a.signing-algorithm = "(unset)"
fs.s3a.experimental.input.fadvise = "(unset)"
fs.s3a.user.agent.prefix = "(unset)"
fs.s3a.threads.max = "10" [core-default.xml]
fs.s3a.threads.keepalivetime = "60" [core-default.xml]
fs.s3a.max.total.tasks = "5" [core-default.xml]
fs.s3a.multipart.size = "8388608" [core-site.xml]
fs.s3a.buffer.dir = "/tmp/hadoop-stevel/s3a" [core-default.xml]
fs.s3a.metadatastore.impl = "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore" [fs.s3a.bucket.hwdev-steve-ireland-new.metadatastore.impl via [core-site.xml]]
fs.s3a.metadatastore.authoritative = "false" [core-default.xml]
fs.s3a.committer.magic.enabled = "true" [fs.s3a.bucket.hwdev-steve-ireland-new.committer.magic.enabled via [core-site.xml]]
Classes
=======
class org.apache.hadoop.fs.s3a.S3AFileSystem was found in file:/Users/stevel/Projects/hadoop-trunk/hadoop-dist/target/hadoop-3.1.0-SNAPSHOT/share/hadoop/tools/lib/hadoop-aws-3.1.0-SNAPSHOT.jar
class com.amazonaws.services.s3.AmazonS3 was found in file:/Users/stevel/Projects/hadoop-trunk/hadoop-dist/target/hadoop-3.1.0-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.199.jar
class com.amazonaws.ClientConfiguration was found in file:/Users/stevel/Projects/hadoop-trunk/hadoop-dist/target/hadoop-3.1.0-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.199.jar
Endpoint: https://hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com/:
======================================================================
Canonical hostname 52.218.80.195
IP address 52.218.80.195
Connecting to https://hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com/
Response: 403 : Forbidden
HTTP response 403 from https://hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com/: Forbidden
Using proxy: false
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>62634DF280843113</RequestId><HostId>vb2JXCC+L/HPGF1b6oMk+4m8Eun142Dz1hiTcbtAP/5GNj1ClRaUnHdw4FzdOahz7WTTC+edfm8=</HostId></Error>
Transfer-Encoding: chunked
null: HTTP/1.1 403 Forbidden
Server: AmazonS3
x-amz-request-id: 62634DF280843113
x-amz-id-2: vb2JXCC+L/HPGF1b6oMk+4m8Eun142Dz1hiTcbtAP/5GNj1ClRaUnHdw4FzdOahz7WTTC+edfm8=
Date: Tue, 13 Feb 2018 17:48:14 GMT
x-amz-bucket-region: eu-west-1
Content-Type: application/xml
Test filesystem s3a://hwdev-steve-ireland-new
=============================================
2018-02-13 17:48:14,997 [main] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:initialize(232)) - Initializing S3AFileSystem for hwdev-steve-ireland-new
... followed by test FS operations ...
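The masking visible above on fs.s3a.access.key and fs.s3a.secret.key can be implemented along these lines. This is a hypothetical sketch of the idea, not the actual cloudstore code:

```java
// Keep the first and last characters of a sensitive value as a hint,
// mask everything in between.
static String sanitize(String value) {
  if (value == null || value.isEmpty()) {
    return "(unset)";
  }
  if (value.length() < 4) {
    return "***";  // too short to reveal anything safely
  }
  StringBuilder masked = new StringBuilder();
  masked.append(value.charAt(0));
  for (int i = 1; i < value.length() - 1; i++) {
    masked.append('*');
  }
  masked.append(value.charAt(value.length() - 1));
  return masked.toString();
}
```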
Release #2, 2018-02-13
Updated diagnostics
- handles HDFS by looking for http/s endpoints; prints out core HDFS binding properties
- for any unknown filesystem scheme, looks up fs.SCHEMA.impl in the config as the classname to probe for (see the sketch below)
- won't try to connect to endpoints with the address "0.0.0.0"
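A sketch of how that scheme lookup and endpoint filter could work, relying only on the standard Hadoop fs.SCHEMA.impl convention; the class and method names are hypothetical, not cloudstore's actual code:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of handling an unknown filesystem scheme.
public class SchemeProbe {

  // The classname to probe for is whatever fs.<scheme>.impl is set to,
  // e.g. fs.foo.impl for a foo:// URI.
  static String implClassForScheme(Configuration conf, URI uri) {
    return conf.get("fs." + uri.getScheme() + ".impl");
  }

  // 0.0.0.0 is a wildcard/placeholder address, so there is no point
  // trying to connect to it.
  static boolean isProbeableEndpoint(String host) {
    return host != null && !"0.0.0.0".equals(host);
  }
}
```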
This is built against HDP-2.6; there's a profile for Hadoop 2.7 & 2.8, but you'd need to exclude hadoop-adl from the build. It's only there so the IDE can help with stepping through things.
2018-02-12 release
First PoC; will improve as needed.
bin/hadoop jar cloudstore-2.8.jar s3a://landsat-pds
bin/hadoop jar cloudstore-2.8.jar adl://my-adl-container