- Overall workflow
- Configuration
- Building and running
- Mining
- Validation
- Export
- Building the misuse detector
- Installing and running the detector
- Downloading projects and jar files
## Overall workflow

To use the tool, we first need to configure it and provide all the necessary input. Next, we build and then run the tool. Once inside the container, we mine the rules and then validate them. When validation is complete, we export the confirmed rules and, finally, build the detector equipped with those confirmed rules.
## Configuration

To mine annotation usage rules, the tool requires three things:

- a set of input Java projects to mine the rules from
- a set of JAR files for resolving the types of imported classes, annotations, etc.
  - if not provided, the miner falls back to a heuristic, which may not always produce accurate results
- a `configuration.json` file that specifies which APIs we want to mine rules for
The first two inputs can be provided by mounting existing directories to their respective volumes in the `docker-compose.yml` file. We currently expose three volumes:

- `mining-sources` - the directory where the input Java projects reside.
- `lib-sources` - the directory where the JAR files reside.
- `exports` - the directory where the exported artifacts (candidate rules, confirmed rules, the detector, etc.) will end up.

By default, these volumes are mounted to directories of the same name at the root of this project's directory on your host machine. See the `miner.volumes` section in the `docker-compose.yml` file if you want to change any default directory.
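For reference, the relevant part of `docker-compose.yml` might look like the following sketch. The container-side paths for `mining-sources` and `exports` appear elsewhere in this README; the one for `lib-sources` is an assumption:

```yaml
services:
  miner:
    volumes:
      # host path (left) : container path (right); change the left-hand
      # side to mount your own directories
      - ./mining-sources:/pipeline/mining-sources
      - ./lib-sources:/pipeline/lib-sources   # container path assumed
      - ./exports:/pipeline/exports
```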
While you can certainly populate these input directories yourself, we also provide tools to download the input Java projects (from GitHub) and JAR files (from Maven Central). See the Downloading projects and jar files section.
The `configuration.json` file contains the necessary information about the APIs we are interested in mining. Example file:
```json
{
  "api": {
    "prefixes": [
      "org.eclipse.microprofile",
      "javax",
      "jakarta"
    ],
    "subApi": {
      "prefixes": [
        "org.eclipse.microprofile.config",
        "org.eclipse.microprofile.faulttolerance",
        "org.eclipse.microprofile.graphql",
        "org.eclipse.microprofile.health",
        "org.eclipse.microprofile.jwt",
        "org.eclipse.microprofile.metrics",
        "org.eclipse.microprofile.openapi",
        "org.eclipse.microprofile.opentracing",
        "org.eclipse.microprofile.reactive",
        "org.eclipse.microprofile.rest",
        "org.eclipse.microprofile.auth"
      ]
    }
  }
}
```
We only need two types of `prefixes` in the configuration:

- `api#prefixes` ensures that every mined rule contains at least one annotation that belongs to one of these namespaces.
- `api#subApi#prefixes` groups the annotation usages by sub-API (i.e., package) so that mining is performed for each sub-API individually. The mining parameters are automatically configured according to the size of each group, which improves the quality of the mined rules.
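As an illustration of what the prefixes match (this snippet is an example input-project class, not part of the tool itself, and assumes the MicroProfile Config and `javax.inject` APIs are on the classpath):

```java
import javax.inject.Inject;
import org.eclipse.microprofile.config.inject.ConfigProperty;

public class GreetingService {

    // The fully qualified name of @ConfigProperty starts with the api prefix
    // "org.eclipse.microprofile", so usages like this one are eligible for
    // mining, and they are grouped under the
    // "org.eclipse.microprofile.config" sub-API.
    @Inject
    @ConfigProperty(name = "greeting.message")
    String message;
}
```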
## Building and running

Once configuration is done, build and run the application. We can build the application using the `build.sh` script. Once the project is successfully built, we can run it using the `run.sh` script. A successful execution will land you in a bash shell:
```
your-host-machine> ./build.sh && ./run.sh
# some build related output
# ...
========================================================================
This tool allows you to mine candidate annotation usage rules
from the target projects located in /pipeline/mining-sources.

Once you mine candidate rules, you can review and validate them
using the Rule Validation Tool (RVT). Confirmed rules can be exported
to be used for misuse detection.

Available commands:
  mine           - Mines candidate rules from target projects located in /pipeline/mining-sources
  validate       - Uploads the mined candidate rules into the RVT for validation
  export-rules   - Exports the validated correct rules from RVT
  build-detector - Builds the misuse detector Maven plugin jar file, and provides installation directions
  download-jars  - Downloads the required jar files mentioned in /pipeline/config/configuration.json file
  clone-projects - Clones the projects mentioned in the input file into /pipeline/mining-sources directory
  info           - Shows information about the available commands

pipeline>
```
By default, every time you run the container, the `info` command runs automatically to show the available commands. You can also run the `info` command at any point to get the same information.
There are seven commands, each performing a different task. We describe each one individually below.
## Mining

Assuming that `mining-sources` and `lib-sources` have been populated correctly (meaning they contain the input Java projects and JAR files, respectively), all we need to do to start the mining process is issue the `mine` command. That's it! Once the mining process is done, the candidate rules are saved to the `/pipeline/exports` directory (the name template is `candidate_rules_{current_milliseconds}.json`).
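An illustrative run (the output shown here is paraphrased in comments, and the timestamp in the file name will differ):

```
pipeline> mine
# ... mining output ...
# when finished, the candidate rules end up in a file such as
# /pipeline/exports/candidate_rules_1663128572285.json
```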
**Note.** The `mine` command is a wrapper around the miner, which is a runnable fat JAR. If you want to provide custom configuration values, you can run the miner JAR file directly.
## Validation

Once the candidate rules are mined, we can move on to validation. To perform the validation, we need to upload the candidate rules to our Rule Validation Tool (RVT) by issuing the `validate` command. This command uploads the candidate rules to RVT and provides a random username for accessing them. Example execution:
```
pipeline> validate
Using the following rules file for the validation: /pipeline/exports/candidate_rules_1663128572285.json
No username has been provided, generating a random one...
==========================================================
Successfully loaded the mined candidate rules!

To start validating candidate rules, please head over to:
http://localhost:8888

Username: magnetic-gallery
==========================================================
```
You can also provide a custom username using the `--user USERNAME` option:
```
pipeline> validate --user microprofile-rules
```
All you have to do is enter the username in the login field on the RVT login page and hit Enter. To learn more about how to use RVT and its domain-specific language, please head over to the tutorial.
By default, the `validate` command picks the most recent candidate rules file based on the timestamp (milliseconds) in the file name. However, if we want to use a specific file, we can do so with the `--file FILEPATH` option as follows:
```
pipeline> validate --file /pipeline/examples/candidate_rules_example.json
```
**Note.** The `validate` command saves the username (provided or generated) to the `/tmp/rvt_most_recent_login` file in the container. This allows us to call the `export-rules` and `build-detector` commands (both explained next) without specifying the user, because both of these commands read `/tmp/rvt_most_recent_login` if no user is provided.
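For example, after the `validate` run above, the saved login can be inspected from inside the container (assuming the file simply contains the username):

```
pipeline> cat /tmp/rvt_most_recent_login
magnetic-gallery
```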
## Export

Once the validation is complete (which does not necessarily mean validating all the candidate rules), we can export the confirmed rules from RVT by issuing the `export-rules` command. Example command:
```
pipeline> export-rules
Exported the confirmed rules to: /pipeline/exports/confirmed-magnetic-gallery.json
```
By default, if no user is provided, `export-rules` takes the username from the `/tmp/rvt_most_recent_login` file, which is set by the `validate` command (as described above). However, since `/tmp/rvt_most_recent_login` is a temporary file, if we want to export rules for another username (or if we have restarted the container), we can supply a custom username as follows:
```
pipeline> export-rules --user microprofile-rules
Exported the confirmed rules to: /pipeline/exports/confirmed-microprofile-rules.json
```
The `export-rules` command saves the confirmed rules to the `exports` directory with the name `confirmed-{username}.json`.
## Building the misuse detector

We can use the exported confirmed rules to build a misuse detector. To do so, we only need to issue the `build-detector` command, which packages our misuse detector Maven plugin, the exported rules, and the JAR files required for type resolution into a single JAR file. This JAR file, alongside a shell script to install it, will be placed in the `/pipeline/exports/detector/` directory. You can run the installation script from your host machine or any other device that has Java and Maven installed.
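A minimal invocation might look like this (the behavior is paraphrased in comments, not actual tool output):

```
pipeline> build-detector
# packages the detector plugin JAR and an installation script into
# /pipeline/exports/detector/
```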
Similar to the `export-rules` command, you can supply a custom username to the `build-detector` command using the `--user USERNAME` option. Since the confirmed rules contain the username they belong to, this option lets us pick something other than the most recent username.
By default, the `build-detector` command uses the confirmed rules file associated with the provided or most recent username. However, if we want to provide a specific file, we can use the `--file FILEPATH` option as follows:
```
pipeline> build-detector --file /pipeline/examples/confirmed_rules_example.json
```
## Installing and running the detector

To install the detector, all we need to do is run the provided installation script.
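For example, assuming the exports volume is mounted at `./exports` on your host, and using a hypothetical script name (check the generated directory for the actual one):

```
your-host-machine> cd exports/detector
your-host-machine> ./install.sh   # hypothetical name; use the provided script
```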
Since the detector is a Maven plugin, we need to run it inside a Maven project. A very simple example:

```
# mvn ca.ualberta:violation-detector-maven-plugin:scan
```
To learn more about running the detector, please see the "How to use the plugin" section in the tutorial.
## Downloading projects and jar files

Our tool provides commands for downloading JAR files and cloning the target projects from GitHub.
To download the JARs, you can add an array called `"jarFiles"` to the `configuration.json` file, where each element represents a JAR file to be downloaded from Maven Central. The format for each entry is `groupId:artifactId:version`. The `configuration.json` file we have provided contains an example.
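For instance, the array might look like the following sketch (these coordinates are illustrative; use the ones your projects actually need):

```json
{
  "api": {
    "prefixes": ["org.eclipse.microprofile", "javax", "jakarta"]
  },
  "jarFiles": [
    "org.eclipse.microprofile.config:microprofile-config-api:2.0",
    "jakarta.platform:jakarta.jakartaee-api:9.1.0"
  ]
}
```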
Once the `configuration.json` file contains all the entries, you can issue the `download-jars` command. If you have already mounted an existing directory to the `lib-sources` volume, be aware that the `download-jars` command will not delete or re-download existing JAR files. Example command:
```
pipeline> download-jars # no parameter required
```
To clone target projects from GitHub, you can use the `clone-projects` command, which takes a path to a file listing all the projects. Check out an example file here. The format for an entry is: the first line is the GitHub link; the next line may contain a commit hash (if none is specified, the master/main branch is checked out); and `--` (dashes) are used as the separator between entries. Putting any file in the `config` directory will result in that file being copied into the container. Finally, the projects will be cloned into the `/pipeline/mining-sources` directory in the container. Example command:
```
pipeline> clone-projects --file /pipeline/config/projects.txt
```
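For reference, a projects file following the format above might look like this (the repository URLs and commit hash are illustrative):

```
https://github.com/example/project-a
3f2c1d4e5a6b7c8d9e0f1a2b3c4d5e6f7a8b9c0d
--
https://github.com/example/project-b
--
```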