@@ -102,18 +102,60 @@ mvn package -DexcludedGroups=functional
## Build Command-line (CLI) usage
- For some of the algorithms included in this package, there are CLI applications that can
- be used for experiments. These applications use `String::split` to read
- delimited data, and as such are **not intended for production use**. Instead,
- use these applications as example code and as a way to learn about the
- algorithms and their hyperparameters.
+ > **Important.** The CLI applications use `String::split` to read delimited data
+ > and as such are **not intended for production use**.

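To illustrate why delimiter-only parsing is fragile, here is a minimal sketch. It assumes the runners call `String.split` with just a delimiter, which the note implies but does not spell out; the class and method names (`SplitPitfall`, `naiveSplit`) are hypothetical and not part of the library.

```java
import java.util.Arrays;

class SplitPitfall {
    // Hypothetical helper: parse a row by splitting on the delimiter only.
    public static String[] naiveSplit(String line, String delimiter) {
        return line.split(delimiter);
    }

    public static void main(String[] args) {
        // A plain numeric row parses into three fields as expected.
        String plain = "-5.0074,-0.0038,-0.0237";
        System.out.println(Arrays.toString(naiveSplit(plain, ",")));

        // A quoted field that contains the delimiter is torn apart:
        // three logical fields become four.
        String quoted = "\"1,5\",0.0061,0.0091";
        System.out.println(naiveSplit(quoted, ",").length);

        // String.split drops trailing empty strings, so a row with an
        // empty last field yields two fields instead of three.
        System.out.println(naiveSplit("1.0,2.0,", ",").length);
    }
}
```

Input with quoted fields or missing values would need a real CSV parser, which is one reason to treat the runners as example code only.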
- After building the project (described in the previous section), you can invoke an example CLI application by adding the
- core jar file to your classpath. For example:
+ For some of the algorithms included in this package there are CLI applications
+ that can be used for experimentation and as a way to learn about these
+ algorithms and their hyperparameters. After building the project you can invoke
+ an example CLI application by adding the core jar file to your classpath.
+
+ In the example below we train and score a Random Cut Forest model on the
+ three-dimensional data shown in Figure 3 of the original RCF paper
+ ([PDF][rcf-paper]). These example data can be
+ found at `../example-data/rcf-paper.csv`:
+
+ ``` text
+ $ tail ../example-data/rcf-paper.csv
+ -5.0074,-0.0038,-0.0237
+ -5.0029,0.0170,-0.0057
+ -4.9975,-0.0102,-0.0065
+ 4.9878,0.0136,-0.0087
+ 5.0118,0.0098,-0.0057
+ 0.0158,0.0061,0.0091
+ 5.0167,0.0041,0.0054
+ -4.9947,0.0126,-0.0010
+ -5.0209,0.0004,-0.0033
+ 4.9923,-0.0142,0.0030
+ ```
+
+ (Note that there is one data point above that is not like the others.) The
+ `AnomalyScoreRunner` application reads in each line of the input data as a
+ vector data point, scores the data point, and then updates the model with this
+ point. The program output appends a column of anomaly scores to the input:

``` text
- % java -cp core/target/randomcutforest-core-1.0-alpha.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner --help
- Usage: java -cp randomcutforest-core-1.0-alpha.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner [options] < input_file > output_file
+ $ java -cp core/target/randomcutforest-core-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner < ../example-data/rcf-paper.csv > example_output.csv
+ $ tail example_output.csv
+ -5.0029,0.0170,-0.0057,0.8129401629464965
+ -4.9975,-0.0102,-0.0065,0.6591046054520615
+ 4.9878,0.0136,-0.0087,0.8552217070518414
+ 5.0118,0.0098,-0.0057,0.7224686064066762
+ 0.0158,0.0061,0.0091,2.8299054033889814
+ 5.0167,0.0041,0.0054,0.7571453322237215
+ -4.9947,0.0126,-0.0010,0.7259960347128676
+ -5.0209,0.0004,-0.0033,0.9119498264685114
+ 4.9923,-0.0142,0.0030,0.7310102658466711
+ Done.
+ ```
+
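The score-then-update order described above (each point is scored before the model has seen it, then the model is updated) can be sketched as follows. This is only an illustration of the loop: `StreamingModel` and `RunningMeanModel` are hypothetical stand-ins invented for this sketch, and the toy distance-from-mean score is not the Random Cut Forest algorithm or the library's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScoreThenUpdateSketch {
    /** Hypothetical stand-in for a streaming model; NOT the library's API. */
    interface StreamingModel {
        double score(double[] point);
        void update(double[] point);
    }

    /** Toy model: scores a point by its Euclidean distance from the running mean. */
    static class RunningMeanModel implements StreamingModel {
        private final double[] mean;
        private long count = 0;

        RunningMeanModel(int dimensions) {
            mean = new double[dimensions];
        }

        public double score(double[] point) {
            if (count == 0) {
                return 0.0; // nothing seen yet, so nothing to compare against
            }
            double sum = 0.0;
            for (int i = 0; i < mean.length; i++) {
                double d = point[i] - mean[i];
                sum += d * d;
            }
            return Math.sqrt(sum);
        }

        public void update(double[] point) {
            count++;
            for (int i = 0; i < mean.length; i++) {
                mean[i] += (point[i] - mean[i]) / count; // incremental mean
            }
        }
    }

    /** Scores each row, then updates the model with it, and appends the score as a new column. */
    public static List<String> run(List<String> rows, StreamingModel model) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(","); // same naive parsing the note warns about
            double[] point = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                point[i] = Double.parseDouble(fields[i]);
            }
            double score = model.score(point); // score first...
            model.update(point);               // ...then let the model learn the point
            out.add(row + "," + score);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("1.0,0.0,0.0", "1.0,0.0,0.0", "4.0,4.0,0.0");
        for (String line : run(rows, new RunningMeanModel(3))) {
            System.out.println(line);
        }
    }
}
```

Scoring before updating means a point never influences its own score, which is why an outlier stands out the first time it appears.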
+ (As you can see, the anomalous data point was given a large anomaly score.) You can
+ read additional usage instructions, including options for setting model
+ hyperparameters, using the `--help` flag:
+
+ ``` text
+ $ java -cp core/target/randomcutforest-core-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner --help
+ Usage: java -cp target/random-cut-forest-1.0.jar com.amazon.randomcutforest.runner.AnomalyScoreRunner [options] < input_file > output_file

Compute scalar anomaly scores from the input rows and append them to the output rows.
@@ -130,6 +172,9 @@ Options:
--help, -h: Print this help message and exit.
```
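Since the runner appends the score as the last column, the most anomalous row in a file such as `example_output.csv` can be picked out with a few lines of code. The sketch below is one possible approach; the `TopAnomaly` class is hypothetical, and the sample rows are copied from the `tail example_output.csv` listing in this README.

```java
import java.util.Arrays;
import java.util.List;

class TopAnomaly {
    /** Returns the row whose last comma-separated field (the score) is largest, or null if none parse. */
    public static String highestScoringRow(List<String> rows) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String row : rows) {
            String last = row.substring(row.lastIndexOf(',') + 1);
            double score;
            try {
                score = Double.parseDouble(last);
            } catch (NumberFormatException e) {
                continue; // skip non-data lines such as the trailing "Done."
            }
            if (score > bestScore) {
                bestScore = score;
                best = row;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "5.0118,0.0098,-0.0057,0.7224686064066762",
                "0.0158,0.0061,0.0091,2.8299054033889814",
                "5.0167,0.0041,0.0054,0.7571453322237215",
                "Done.");
        // prints 0.0158,0.0061,0.0091,2.8299054033889814
        System.out.println(highestScoringRow(rows));
    }
}
```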

+ Other CLI applications are available in the `com.amazon.randomcutforest.runner`
+ package.
+

## Testing
The core library test suite is divided into unit tests and "functional" tests. By "functional", we mean tests that
@@ -181,3 +226,5 @@ benchmark methods will be executed.
``` text
% java -jar benchmark/target/randomcutforest-benchmark-1.0-jar-with-dependencies.jar RandomCutForestBenchmark\.updateAndGetAnomalyScore
```
+
+ [rcf-paper]: http://proceedings.mlr.press/v48/guha16.pdf