
Commit 2f8161e

rbgirshick authored and facebook-github-bot committed
Initial commit
fbshipit-source-id: d6798bb3ead07e6e3da5edebc53b946e6cfa0807

203 files changed: +23801 insertions, 0 deletions


.gitignore

+24

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Shared objects
*.so

# Distribution / packaging
lib/build/
*.egg-info/
*.egg

# Temporary files
*.swn
*.swo
*.swp

# Dataset symlinks
lib/datasets/data/*
!lib/datasets/data/README.md

# Generated C files
lib/utils/cython_*.c

CONTRIBUTING.md

+39

# Contributing to Detectron
We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
Minor changes and improvements will be released on an ongoing basis. Larger
changes (e.g., changesets implementing a new paper) will be released on a more
periodic basis.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. Ensure no regressions in baseline model speed and accuracy.
7. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
GitHub issues will be largely unattended and are mainly intended as a community
forum for collectively debugging issues, hopefully leading to pull requests with
fixes when appropriate.

## Coding Style
* 4 spaces for indentation rather than tabs
* 80 character line length
* PEP8 formatting

## License
By contributing to Detectron, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.

FAQ.md

+30

# FAQ

This document covers frequently asked questions.

- For general information about Detectron, please see [`README.md`](README.md).
- For installation instructions, please see [`INSTALL.md`](INSTALL.md).
- For a quick getting started guide, please see [`GETTING_STARTED.md`](GETTING_STARTED.md).

#### Q: How do I compute validation AP during training?

**A:** Detectron does not compute validation statistics (e.g., AP) during training because doing so slows training down. Instead, we've implemented a "validation monitor": a process that polls for new model checkpoints saved by a training job and, when one is found, performs inference with it by asynchronously scheduling a `tools/test_net.py` job on free GPUs in our cluster. We have not released the validation monitor because (1) it's a relatively thin wrapper on top of `tools/train_net.py` and (2) the little code that comprises it is specific to our cluster and would not be generally useful.
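
Building an equivalent monitor yourself is straightforward. Here is a minimal single-GPU sketch of the idea; the checkpoint directory, the `model_iter*.pkl` file pattern, and the synchronous scheduling are assumptions for illustration, not the internal implementation:

```python
import glob
import os
import subprocess
import time

# Minimal checkpoint-polling sketch; paths and the model_iter*.pkl
# pattern are assumptions, and evaluations run synchronously here.
CKPT_DIR = '/tmp/detectron-output/train/coco_2014_train/generalized_rcnn'
CFG = 'configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml'

seen = set()
while True:
    for ckpt in sorted(glob.glob(os.path.join(CKPT_DIR, 'model_iter*.pkl'))):
        if ckpt not in seen:
            seen.add(ckpt)
            # Evaluate the new checkpoint with tools/test_net.py.
            subprocess.call([
                'python2', 'tools/test_net.py', '--cfg', CFG,
                'TEST.WEIGHTS', ckpt, 'NUM_GPUS', '1',
            ])
    time.sleep(60)  # poll once a minute for new checkpoints
```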

#### Q: How do I restrict Detectron to use only a subset of the GPUs on a server?

**A:** Don't modify the code; use the [`CUDA_VISIBLE_DEVICES`](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars) environment variable instead.
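
For example, to expose only the first two GPUs to a training job (the GPU ids here are illustrative):

```
CUDA_VISIBLE_DEVICES=0,1 python2 tools/train_net.py \
    --cfg configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml \
    OUTPUT_DIR /tmp/detectron-output
```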

#### Q: Why is detection on a single image slow compared to the reported performance?

**A:** Various algorithms and caches (e.g., from `cudnn`) take some time to warm up. Peak inference performance is not reached until several images have been processed.

Also potentially relevant: inference with Mask R-CNN on high-resolution images may be slow simply because substantial time is spent upsampling the predicted masks to the original image resolution (this has not been optimized). You can diagnose this issue if the `misc_mask` time reported by `tools/infer_simple.py` is high (e.g., much more than 20-90ms). The solution is to first resize your images such that the short side is around 600-800px (the exact choice does not matter) and then run inference on the resized image, as in the sketch below.
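
A minimal resizing sketch using OpenCV (the 700px target and the file names are illustrative; any short side in the 600-800px range works):

```python
import cv2

# Shrink an image so its short side is ~700px before running inference.
TARGET_SHORT_SIDE = 700.0

im = cv2.imread('input.jpg')
h, w = im.shape[:2]
scale = TARGET_SHORT_SIDE / min(h, w)
if scale < 1.0:  # only shrink; never upsample small images
    im = cv2.resize(im, None, fx=scale, fy=scale,
                    interpolation=cv2.INTER_LINEAR)
cv2.imwrite('input_resized.jpg', im)
```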

#### Q: How do I implement a custom Caffe2 CPU or GPU operator for use in Detectron?

**A:** Detectron uses a number of specialized Caffe2 operators that are distributed via the [Caffe2 Detectron module](https://github.com/caffe2/caffe2/tree/master/modules/detectron) as part of the core Caffe2 GitHub repository. If you'd like to implement a custom Caffe2 operator for your project, we have written a toy example illustrating how to add an operator under the Detectron source tree; please see [`lib/ops/zero_even_op.*`](lib/ops/) and [`tests/test_zero_even_op.py`](tests/test_zero_even_op.py). For more background on writing Caffe2 operators, please consult the [Caffe2 documentation](https://caffe2.ai/docs/custom-operators.html).
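
Once an operator is built and registered, exercising it from Python looks roughly like the following. The `ZeroEven` operator name is inferred from the toy example's file names and is an assumption; see `tests/test_zero_even_op.py` for the authoritative usage:

```python
import numpy as np
from caffe2.python import core, workspace

# Assumed op name, inferred from lib/ops/zero_even_op.*; consult
# tests/test_zero_even_op.py for the real test.
op = core.CreateOperator('ZeroEven', ['X'], ['Y'])

workspace.FeedBlob('X', np.arange(8, dtype=np.float32))
workspace.RunOperatorOnce(op)
print(workspace.FetchBlob('Y'))  # elements at even indices should be zeroed
```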

#### Q: How do I use Detectron to train a model on a custom dataset?

**A:** If possible, we strongly recommend that you first convert the custom dataset annotation format to the [COCO API json format](http://cocodataset.org/#download). Then, add your dataset to the [dataset catalog](lib/datasets/dataset_catalog.py) so that Detectron can use it for training and inference. If your dataset cannot be converted to the COCO API json format, then it's likely that more significant code modifications will be required. If the dataset you're adding is popular, please consider making the converted annotations publicly available; if code modifications are required, please consider submitting a pull request.
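
For reference, a minimal skeleton of a COCO-style detection json (all ids, sizes, and box coordinates below are illustrative placeholders; instance segmentation additionally requires a `segmentation` field per annotation):

```python
import json

# Minimal COCO-style annotation skeleton with placeholder values.
dataset = {
    'images': [
        {'id': 1, 'file_name': 'example.jpg', 'width': 640, 'height': 480},
    ],
    'annotations': [
        {
            'id': 1,
            'image_id': 1,
            'category_id': 1,
            'bbox': [10, 20, 100, 150],  # [x, y, width, height]
            'area': 100 * 150,
            'iscrowd': 0,
        },
    ],
    'categories': [
        {'id': 1, 'name': 'my_class'},
    ],
}

with open('my_dataset_train.json', 'w') as f:
    json.dump(dataset, f)
```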

GETTING_STARTED.md

+99

# Using Detectron

This document provides brief tutorials covering Detectron for inference and training on the COCO dataset.

- For general information about Detectron, please see [`README.md`](README.md).
- For installation instructions, please see [`INSTALL.md`](INSTALL.md).

## Inference with Pretrained Models

#### 1. Directory of Image Files
To run inference on a directory of image files (`demo/*.jpg` in this example), you can use the `infer_simple.py` tool. In this example, we're using an end-to-end trained Mask R-CNN model with a ResNet-101-FPN backbone from the model zoo:

```
python2 tools/infer_simple.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    --output-dir /tmp/detectron-visualizations \
    --image-ext jpg \
    --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    demo
```

Detectron should automatically download the model from the URL specified by the `--wts` argument. This tool will output visualizations of the detections in PDF format in the directory specified by `--output-dir`. Here's an example of the output you should expect to see (for copyright information about the demo images see [`demo/NOTICE`](demo/NOTICE)).

<div align="center">
  <img src="demo/output/17790319373_bd19b24cfc_k_example_output.jpg" width="700px" />
  <p>Example Mask R-CNN output.</p>
</div>

**Notes:**

- When running inference on your own high-resolution images, Mask R-CNN may be slow simply because substantial time is spent upsampling the predicted masks to the original image resolution (this has not been optimized). You can diagnose this issue if the `misc_mask` time reported by `tools/infer_simple.py` is high (e.g., much more than 20-90ms). The solution is to first resize your images such that the short side is around 600-800px (the exact choice does not matter) and then run inference on the resized image.

#### 2. COCO Dataset
This example shows how to run an end-to-end trained Mask R-CNN model from the model zoo using a single GPU for inference. As configured, this will run inference on all images in `coco_2014_minival` (which must be properly installed).

```
python2 tools/test_net.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    TEST.WEIGHTS https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    NUM_GPUS 1
```

To run inference with the same model using `$N` GPUs (e.g., `N=8`):

```
python2 tools/test_net.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    --multi-gpu-testing \
    TEST.WEIGHTS https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    NUM_GPUS $N
```

On an NVIDIA Tesla P100 GPU, inference should take about 130-140 ms per image for this example.

## Training a Model with Detectron

This is a tiny tutorial showing how to train a model on COCO. The model will be an end-to-end trained Faster R-CNN using a ResNet-50-FPN backbone. For the purposes of this tutorial, we'll use a short training schedule and a small input image size so that training and inference will be relatively fast. As a result, the box AP on COCO will be relatively low compared to our [baselines](MODEL_ZOO.md). This example is provided for instructive purposes only (i.e., not for comparing against publications).

#### 1. Training with 1 GPU

```
python2 tools/train_net.py \
    --cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml \
    OUTPUT_DIR /tmp/detectron-output
```

**Expected results:**

- Output (models, validation set detections, etc.) will be saved under `/tmp/detectron-output`
- On a Maxwell generation GPU (e.g., M40), training should take around 4.2 hours
- Inference time should be around 80ms / image (also on an M40)
- Box AP on `coco_2014_minival` should be around 22.1% (+/- 0.1% stdev measured over 3 runs)

#### 2. Multi-GPU Training

We've also provided configs to illustrate training with 2, 4, and 8 GPUs using learning schedules that are approximately equivalent to the 1-GPU schedule above. The configs are located at `configs/getting_started/tutorial_{2,4,8}gpu_e2e_faster_rcnn_R-50-FPN.yaml`. For example, launching a training job with 2 GPUs looks like this:

```
python2 tools/train_net.py \
    --multi-gpu-testing \
    --cfg configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml \
    OUTPUT_DIR /tmp/detectron-output
```

Note that we've also added the `--multi-gpu-testing` flag to instruct Detectron to parallelize inference over multiple GPUs (2 in this example; see `NUM_GPUS` in the config file) after training has finished.

**Expected results:**

- Training should take around 2.3 hours (2 x M40)
- Inference time should be around 80ms / image (but run in parallel on 2 GPUs, so half the total time)
- Box AP on `coco_2014_minival` should be around 22.1% (+/- 0.1% stdev measured over 3 runs)

To understand how learning schedules are adjusted (the "linear scaling rule"), please study these tutorial config files and read our paper [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677). **Aside from this tutorial, all of our released configs make use of 8 GPUs. If you will be using fewer than 8 GPUs for training (or do anything else that changes the minibatch size), it is essential that you understand how to manipulate training schedules according to the linear scaling rule.** A sketch of the rule follows.
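
As a concrete illustration of the rule: relative to a reference schedule defined for 8 GPUs (with a fixed number of images per GPU), training on `N` GPUs scales the base learning rate by `N/8` and the iteration counts by `8/N`, so the total number of training images seen stays constant. A minimal sketch, assuming made-up reference values rather than ones copied from a released config:

```python
# Linear scaling rule sketch. The 8-GPU reference values below are
# placeholders, not taken from a released Detectron config.
BASE_GPUS = 8
BASE_LR = 0.02
BASE_MAX_ITER = 90000
BASE_STEPS = [0, 60000, 80000]  # iterations at which the LR is decayed

def scale_schedule(num_gpus):
    """Scale the LR and iteration counts for training on num_gpus GPUs."""
    scale = float(num_gpus) / BASE_GPUS
    lr = BASE_LR * scale                   # LR scales with minibatch size
    max_iter = int(BASE_MAX_ITER / scale)  # iterations scale inversely
    steps = [int(s / scale) for s in BASE_STEPS]
    return lr, max_iter, steps

print(scale_schedule(2))  # -> (0.005, 360000, [0, 240000, 320000])
```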

**Notes:**

- This training example uses a relatively low-compute model, so the overhead from Caffe2 Python ops is relatively high. As a result, scaling as the number of GPUs is increased from 2 to 8 is relatively poor (e.g., training with 8 GPUs takes about 0.9 hours, only 4.5x faster than with 1 GPU). Scaling improves as larger, more GPU-compute-heavy models are used.
