Commit 655e142

Improve CLI usage instructions (aws#59)
* Improve CLI usage instructions Expand upon the CLI instructions with an example and information on where to find additional CLI applications. Adds an example data file for ease of instruction. Closes aws#43
1 parent c7a485c commit 655e142

2 files changed

+2066
-9
lines changed

Java/README.md

+56-9
@@ -102,18 +102,60 @@ mvn package -DexcludedGroups=functional
102102

103103
## Build Command-line (CLI) usage
104104

105-
For some of the algorithms included in this package, there are CLI applications that can
106-
be used for experiments. These applications use `String::split` to read
107-
delimited data, and as such are **not intended for production use**. Instead,
108-
use these applications as example code and as a way to learn about the
109-
algorithms and their hyperparameters.
105+
> **Important.** The CLI applications use `String::split` to read delimited data
106+
> and as such are **not intended for production use**.
110107
111-
After building the project (described in the previous section), you can invoke an example CLI application by adding the
112-
core jar file to your classpath. For example:
108+
For some of the algorithms included in this package there are CLI applications
109+
that can be used for experimentation as well as a way to learn about these
110+
algorithms and their hyperparameters. After building the project you can invoke
111+
an example CLI application by adding the core jar file to your classpath.
112+
113+
In the example below we train and score a Random Cut Forest model on the
114+
three-dimensional data shown in Figure 3 of the original RCF paper
115+
([PDF][rcf-paper]). These example data can be
116+
found at `../example-data/rcf-paper.csv`:
117+
118+
```text
119+
$ tail ../example-data/rcf-paper.csv
120+
-5.0074,-0.0038,-0.0237
121+
-5.0029,0.0170,-0.0057
122+
-4.9975,-0.0102,-0.0065
123+
4.9878,0.0136,-0.0087
124+
5.0118,0.0098,-0.0057
125+
0.0158,0.0061,0.0091
126+
5.0167,0.0041,0.0054
127+
-4.9947,0.0126,-0.0010
128+
-5.0209,0.0004,-0.0033
129+
4.9923,-0.0142,0.0030
130+
```
131+
132+
(Note that there is one data point above that is not like the others.) The
133+
`AnomalyScoreRunner` application reads in each line of the input data as a
134+
vector data point, scores the data point, and then updates the model with this
135+
point. The program output appends a column of anomaly scores to the input:
113136

114137
```text
115-
% java -cp core/target/randomcutforest-core-1.0-alpha.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner --help
116-
Usage: java -cp randomcutforest-core-1.0-alpha.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner [options] < input_file > output_file
138+
$ java -cp core/target/randomcutforest-core-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner < ../example-data/rcf-paper.csv > example_output.csv
139+
$ tail example_output.csv
140+
-5.0029,0.0170,-0.0057,0.8129401629464965
141+
-4.9975,-0.0102,-0.0065,0.6591046054520615
142+
4.9878,0.0136,-0.0087,0.8552217070518414
143+
5.0118,0.0098,-0.0057,0.7224686064066762
144+
0.0158,0.0061,0.0091,2.8299054033889814
145+
5.0167,0.0041,0.0054,0.7571453322237215
146+
-4.9947,0.0126,-0.0010,0.7259960347128676
147+
-5.0209,0.0004,-0.0033,0.9119498264685114
148+
4.9923,-0.0142,0.0030,0.7310102658466711
149+
Done.
150+
```
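
The read, score, update, append loop described above can be sketched in a few lines of Java. This is a minimal illustration and not the `AnomalyScoreRunner` source: the `StreamingScorer` interface and the `RunningMeanScorer` class are hypothetical stand-ins introduced here, and the toy scorer (distance from the running mean) is not the Random Cut Forest algorithm. The parsing uses `String.split`, so the same "not intended for production use" caveat applies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;

public class ScoreThenUpdateSketch {

    /** Hypothetical stand-in for a streaming model; not part of the library. */
    public interface StreamingScorer {
        double score(double[] point);

        void update(double[] point);
    }

    /** Toy scorer (NOT Random Cut Forest): Euclidean distance from the running mean. */
    public static class RunningMeanScorer implements StreamingScorer {
        private final double[] mean;
        private long count = 0;

        public RunningMeanScorer(int dimensions) {
            mean = new double[dimensions];
        }

        @Override
        public double score(double[] point) {
            if (count == 0) {
                return 0.0; // nothing has been seen yet, so there is nothing to compare against
            }
            double sum = 0.0;
            for (int i = 0; i < mean.length; i++) {
                double d = point[i] - mean[i];
                sum += d * d;
            }
            return Math.sqrt(sum);
        }

        @Override
        public void update(double[] point) {
            count++;
            for (int i = 0; i < mean.length; i++) {
                mean[i] += (point[i] - mean[i]) / count; // incremental mean
            }
        }
    }

    /** Naive parsing: the delimiter is treated as a regex, with no quoting or escaping. */
    public static double[] parseRow(String line, String delimiter) {
        String[] fields = line.split(delimiter);
        double[] point = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            point[i] = Double.parseDouble(fields[i].trim());
        }
        return point;
    }

    /** For each input row: score the point, then update the model, then append the score. */
    public static void run(BufferedReader in, PrintWriter out, StreamingScorer scorer,
            String delimiter) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            double[] point = parseRow(line, delimiter);
            double score = scorer.score(point); // score before the model has seen this point
            scorer.update(point);
            out.println(line + delimiter + score);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Assumes three-dimensional, comma-delimited rows on standard input.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        run(in, new PrintWriter(System.out), new RunningMeanScorer(3), ",");
    }
}
```

Scoring each point before updating means the score reflects how surprising the point is relative to the data seen so far, which is the ordering the paragraph above describes.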
151+
152+
(As you can see, the anomalous data point was given a large anomaly score.) You can
153+
read additional usage instructions, including options for setting model
154+
hyperparameters, using the `--help` flag:
155+
156+
```text
157+
$ java -cp core/target/randomcutforest-core-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner --help
158+
Usage: java -cp target/random-cut-forest-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner [options] < input_file > output_file
117159
118160
Compute scalar anomaly scores from the input rows and append them to the output rows.
119161
@@ -130,6 +172,9 @@ Options:
130172
--help, -h: Print this help message and exit.
131173
```
132174

175+
Other CLI applications are available in the `com.amazon.randomcutforest.runner`
176+
package.
177+
133178
## Testing
134179

135180
The core library test suite is divided into unit tests and "functional" tests. By "functional", we mean tests that
@@ -181,3 +226,5 @@ benchmark methods will be executed.
181226
```text
182227
% java -jar benchmark/target/randomcutforest-benchmark-1.0-jar-with-dependencies.jar RandomCutForestBenchmark\.updateAndGetAnomalyScore
183228
```
229+
230+
[rcf-paper]: http://proceedings.mlr.press/v48/guha16.pdf
