A TensorFlow implementation of CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network.

Most of the code in this project is adapted from CTPN, tf-faster-rcnn, and text-detection-ctpn.

Results of the pretrained model on ICDAR13:

| Net | Dataset | Recall | Precision | Hmean |
| --- | --- | --- | --- | --- |
| Origin CTPN | ICDAR13 training data + ? | 73.72% | 92.77% | 82.15% |
| vgg16 | MLT17 latin/chn + ICDAR13 training data | 74.26% | 82.46% | 78.15% |
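
The Hmean column is the harmonic mean of recall and precision (the F1 score). Here is a minimal sketch, assuming that standard definition. The function name `hmean` is made up for illustration. Because the recall and precision in the table are already rounded, the recomputed values agree with the reported ones only to roughly two decimal places.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both as fractions in [0, 1])."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Rows from the table above, as (recall, precision, reported Hmean).
rows = {
    "Origin CTPN": (0.7372, 0.9277, 0.8215),
    "vgg16": (0.7426, 0.8246, 0.7815),
}

for net, (r, p, reported) in rows.items():
    print(f"{net}: computed {hmean(r, p):.4f}, reported {reported:.4f}")
```

The vgg16 model has slightly higher recall than the original CTPN but lower precision, which is why its Hmean ends up lower.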

If you want an end-to-end OCR service, check this repo:


Install dependencies:

pip3 install -r requirements.txt

Build the Cython part, which is needed for both the demo and training:

cd lib/
make clean

Quick start

Download the pre-trained CTPN model (based on vgg16) from Google Drive and put it in output/vgg16/voc_2007_trainval/default. Then run:

python3 tools/

This model was trained on a 1080Ti for 80k iterations using commit dc533e030e5431212c1d4dbca0bcd7e594a8a368.
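
Before running the demo, it can help to confirm that the downloaded model actually landed in the directory named above. The following is a small sketch, not part of this repository. The helper name `list_checkpoint_files` is hypothetical, and it only lists whatever files are present without assuming particular checkpoint file names.

```python
from pathlib import Path

# Directory named in the Quick start step above.
CHECKPOINT_DIR = Path("output/vgg16/voc_2007_trainval/default")


def list_checkpoint_files(ckpt_dir: Path) -> list:
    """Return sorted file names in ckpt_dir, or an empty list if it is missing."""
    if not ckpt_dir.is_dir():
        return []
    return sorted(p.name for p in ckpt_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    files = list_checkpoint_files(CHECKPOINT_DIR)
    if files:
        print(f"Found {len(files)} file(s) in {CHECKPOINT_DIR}:")
        for name in files:
            print("  ", name)
    else:
        print(f"No files in {CHECKPOINT_DIR}; download the model first.")
```

Run it from the repository root so that the relative path resolves correctly.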


  1. Download the training dataset from Google Drive. This dataset contains 3727 images from MLT17 (latin+chinese) and the ICDAR13 training set. Ground truth anchors are generated from the minAreaRect of each text area; see eragonruan/text-detection-ctpn#issues215 for more details. You can use tools/ to make your own training data. Put the downloaded data in ./data/VOCdevkit2007/VOC2007

  2. Download the pre-trained slim vgg16 model from here. Put the pretrained_models in ./data/pretrained_model

  3. Start training

python3 tools/

The output checkpoint file will be saved at ./output/vgg16/voc_2007_trainval/default

  4. Start tensorboard
tensorboard --logdir=./tensorboard
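
The dataset step above notes that ground truth anchors come from the minAreaRect of each text area. CTPN itself predicts fixed-width (16 px) vertical strips, so every text box has to be cut into such strips at some point. The following is a rough, simplified sketch of that slicing for an axis-aligned box only. It is not this repository's data preparation script and it does not call minAreaRect. The function name `split_into_anchors` and the grid-aligned layout are assumptions made for illustration.

```python
ANCHOR_WIDTH = 16  # fixed anchor width used by CTPN, in pixels


def split_into_anchors(xmin, ymin, xmax, ymax, width=ANCHOR_WIDTH):
    """Cut an axis-aligned text box into fixed-width vertical strips.

    Coordinates are assumed to be integer pixel positions (inclusive).
    Strips are aligned to a grid of `width` pixels, so the first and last
    strips may extend slightly beyond the box horizontally.
    Returns a list of (x0, y0, x1, y1) tuples.
    """
    if xmax < xmin or ymax < ymin:
        raise ValueError("box must satisfy xmin <= xmax and ymin <= ymax")
    first = (xmin // width) * width  # left edge of the first grid cell
    last = (xmax // width) * width   # left edge of the last grid cell
    return [
        (x0, ymin, x0 + width - 1, ymax)
        for x0 in range(first, last + width, width)
    ]


# A 50-pixel-wide box starting at x=20 touches grid cells 16, 32, 48, 64.
for strip in split_into_anchors(20, 100, 69, 130):
    print(strip)
```

A rotated text area would also need per-strip top and bottom coordinates, which this sketch leaves out.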

Test on ICDAR13

python3 tools/ --img_dir=path/to/ICDAR13/Challenge2_Test_Task12_Images/ -c=ICDAR13

When it finishes, a file will be generated in data/ICDAR_submit. Then run:

cd tools/ICDAR13
# use python2