why so slow? #50

likezjuisee · 2020-05-26T03:04:00Z

Find 1 images
6055 text boxes before nms
test/screenshot.png : detect 3372ms, restore 5ms, nms 147ms, recog 4796ms
[timing] 8.196640491485596

Pay20Y · 2020-05-26T13:48:29Z

There seem to be too many boxes before NMS. May I ask which dataset you used?

likezjuisee · 2020-05-27T02:15:51Z

The image mentioned above is my own test image, and the model is the one from your readme.md.

likezjuisee · 2020-05-27T02:16:46Z

I also found that the running time is not stable: it may be 10 s on one run and 3 s on the next.

Pay20Y · 2020-05-27T11:49:28Z

That's strange. NMS should take very little time. I think you should check your GPU first; the model may be running on the CPU. You can also debug the detection branch on its own first with EAST.
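When timings swing between runs like this, it can help to separate one-time startup cost from steady-state speed before blaming a particular stage. The sketch below is a generic, standard-library-only timing harness, not code from this repository: `time_steady_state` and `summarize` are hypothetical helper names, and `fake_inference` is a stand-in for one full inference call.

```python
import time
from statistics import median


def time_steady_state(fn, warmup=2, runs=5):
    """Call fn() `warmup` times untimed, then return `runs` timed durations in seconds."""
    for _ in range(warmup):
        fn()  # absorbs one-time costs such as lazy initialization
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the spread of the timed runs; a wide min/max gap suggests instability."""
    return {"min": min(durations), "median": median(durations), "max": max(durations)}


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with a call that runs one image end to end.
    def fake_inference():
        sum(i * i for i in range(100_000))

    print(summarize(time_steady_state(fake_inference)))
```

If the median stays far above the minimum even after warm-up, the variation is probably not just startup overhead and the device placement is worth checking next.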

likezjuisee · 2020-05-28T03:42:51Z

(fots) root@test-desktop:~/like/fots/FOTS_TF# python3.5 main_test.py --gpu_list='1' --test_data_path=test/ --checkpoint_path=checkpoints/SynthText_6_epochs/
make: Entering directory '/home/test/like/fots/FOTS_TF/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/test/like/fots/FOTS_TF/lanms'
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
pad_rois shape: Tensor("RoIrotate/TensorArrayStack/TensorArrayGatherV3:0", shape=(?, 8, ?, 32), dtype=float32)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/sparse_ops.py:1165: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
2020-05-28 11:38:23.281118: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-05-28 11:38:23.480434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:65:00.0
totalMemory: 10.76GiB freeMemory: 10.45GiB
2020-05-28 11:38:23.480467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-05-28 11:38:23.787260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-28 11:38:23.787300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2020-05-28 11:38:23.787306: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2020-05-28 11:38:23.787402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10086 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:65:00.0, compute capability: 7.5)
Restore from checkpoints/SynthText_6_epochs/model.ckpt-733268
Find 2 images
5084 text boxes before nms
test/006.jpg : detect 1903ms, restore 2ms, nms 23ms, recog 0ms
[timing] 1.9098529815673828
6055 text boxes before nms
test/screenshot.png : detect 758ms, restore 5ms, nms 77ms, recog 0ms
[timing] 0.7702224254608154

The running time is lower now, but the FPS is still below what the paper reports.
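To see where the time goes, the per-image log lines above can be broken down by stage. This is a minimal sketch that assumes the `path : stage Nms, ...` layout shown in the log; `parse_timing_line` and `stage_shares` are hypothetical helper names, not part of the repository.

```python
import re

# Matches pairs such as "detect 758ms" in the per-image log line.
STAGE_RE = re.compile(r"(\w+) (\d+)ms")


def parse_timing_line(line):
    """Split 'path : detect 758ms, restore 5ms, ...' into (path, {stage: ms})."""
    path, _, stages = line.partition(" : ")
    return path, {name: int(ms) for name, ms in STAGE_RE.findall(stages)}


def stage_shares(stages):
    """Return each stage's fraction of the summed stage time."""
    total = sum(stages.values())
    if total == 0:
        return {name: 0.0 for name in stages}
    return {name: ms / total for name, ms in stages.items()}


# Values copied from the log above.
line = "test/screenshot.png : detect 758ms, restore 5ms, nms 77ms, recog 0ms"
path, stages = parse_timing_line(line)
shares = stage_shares(stages)
fps = 1.0 / 0.7702224254608154  # reciprocal of the [timing] value for this image

print(path, stages)
print({name: round(share, 3) for name, share in shares.items()})
print(round(fps, 2))
```

For this image the detection stage accounts for roughly 90% of the summed stage time, and the end-to-end rate is about 1.3 images per second.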

The EAST result:

(east) root@test-desktop:~/like/east/EAST# python eval.py --test_data_path=/home/test/like/fots/FOTS_TF/test/ --gpu_list=0 --checkpoint_path=east_icdar2015_resnet_v1_50_rbox/ --output_dir=.
make: Entering directory '/home/test/like/east/EAST/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/test/like/east/EAST/lanms'
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
2020-05-28 11:41:11.735064: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-05-28 11:41:11.830571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635
pciBusID: 0000:17:00.0
totalMemory: 10.76GiB freeMemory: 10.45GiB
2020-05-28 11:41:11.830603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2020-05-28 11:41:12.134395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-28 11:41:12.134435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2020-05-28 11:41:12.134445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2020-05-28 11:41:12.134540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10081 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:17:00.0, compute capability: 7.5)
Restore from east_icdar2015_resnet_v1_50_rbox/model.ckpt-49491
Find 2 images
4697 text boxes before nms
/home/test/like/fots/FOTS_TF/test/006.jpg : net 1876ms, restore 7ms, nms 21ms
[timing] 1.911694049835205
6116 text boxes before nms
/home/test/like/fots/FOTS_TF/test/screenshot.png : net 625ms, restore 9ms, nms 60ms
[timing] 0.7100484371185303

Is it because ResNet50 is too heavy?

likezjuisee · 2020-05-28T06:03:20Z

Why are the recognition results for Chinese text all letters and digits?

404,407,710,414,709,472,402,464,h-1y51e7hi5
250,518,567,525,566,564,249,557,iric/-n?
220,594,488,585,489,604,220,613,M-senl
194,984,433,992,431,1027,193,1018,i2EJ1c-
284,807,587,819,585,853,282,842,isvE
73,1523,401,1536,399,1579,71,1566,Mes5E-F2t
63,1597,152,1599,148,1718,59,1715,a
242,1265,518,1255,519,1293,243,1303,-3he-7te
621,1416,870,1407,872,1459,623,1469,Netzxt
31,94,110,93,111,155,32,156,fow
965,1454,1032,1457,1030,1486,964,1483,2743
142,1584,428,1589,427,1625,142,1620,"nFhatElzsae
1025,1026,1077,1023,1079,1050,1027,1053,Xe
29,163,116,165,115,189,28,187,az
377,272,474,269,475,291,377,293,ivm
695,1254,834,1249,836,1285,697,1291,ges
575,1533,790,1528,791,1569,575,1574,cIEER-T
279,128,438,121,439,156,281,163,Heer
-3,23,169,30,167,61,-4,54,Xhn
961,25,1061,20,1062,52,962,57,09:22
553,1590,687,1585,688,1617,555,1622,Frk
111,871,280,876,279,914,110,909,KL/E"
267,873,462,878,461,914,266,909,"FeEr'"
260,1420,408,1416,409,1463,261,1467,Ths
34,245,138,248,136,301,32,298,Jets
366,688,498,684,499,716,367,720,ey
722,1731,784,1729,785,1757,722,1759,ee
715,1115,792,1118,791,1151,714,1148,tit
763,240,863,242,862,293,762,291,Bgy
245,1119,369,1116,371,1158,246,1161,>A(
720,875,863,879,862,912,719,908,ae
421,1253,665,1256,664,1292,421,1288,thigeit.
544,240,692,241,691,294,543,292,egEI
479,872,639,881,637,916,477,907,Ree
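Each result line above appears to hold eight integers (four corner points as x,y pairs) followed by the recognized string. The sketch below parses that layout and also computes an axis-aligned bounding box. The format is inferred from the output shown here rather than from the repository's documentation, and `parse_result_line` and `axis_aligned_box` are hypothetical names.

```python
def parse_result_line(line):
    """Split 'x1,y1,...,x4,y4,text' into four (x, y) corners and the text."""
    # Split at most 8 times so that commas inside the recognized text survive.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) != 9:
        raise ValueError("expected 8 coordinates followed by text: %r" % line)
    coords = [int(v) for v in parts[:8]]
    quad = list(zip(coords[0::2], coords[1::2]))
    return quad, parts[8]


def axis_aligned_box(quad):
    """Return (x_min, y_min, x_max, y_max) enclosing the quadrilateral."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return min(xs), min(ys), max(xs), max(ys)


# Line copied from the output above.
quad, text = parse_result_line("961,25,1061,20,1062,52,962,57,09:22")
print(quad, text, axis_aligned_box(quad))
```

Coordinates can be slightly negative, as in the `-3,23,...` line, so a consumer that crops the image would need to clamp the box to the image bounds.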

Pay20Y · 2020-05-28T06:58:01Z

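Hello. The model was trained on English datasets, so it cannot be used directly on Chinese data. You could modify the code and then fine-tune it. As for not reaching the FPS reported in the paper, that may be because my own ability is limited and the code is not that polished, and it may also have something to do with the hardware.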

likezjuisee · 2020-05-28T07:01:45Z

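Understood, and this is already great.
If what I want is text recognition on software UIs, where the text angle is generally 0 degrees, is there a fast method you would recommend?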

Pay20Y · 2020-05-28T07:17:43Z

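Do you mean there is only horizontal text (nothing slanted)? In that case you could try a CTPN+CTC architecture. There are many implementations online, such as this one.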

likezjuisee · 2020-05-28T07:25:37Z

Yes, as in the software UI above, it is basically all horizontal text.
Thanks, I'll take a look.

likezjuisee · 2020-05-28T09:37:57Z

I tried it; the image above takes about 2.5 seconds.
Also, this kind of two-stage model needs two GPUs, which is fairly expensive.
Are there any other approaches?

Pay20Y · 2020-05-28T11:59:12Z

CTPN itself is probably the slow part. You could try this one or this one.

likezjuisee · 2020-06-01T07:58:29Z

https://github.com/ouyanghuiyu/chineseocr_lite
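I tried this one. It is indeed fast, but the accuracy drops a lot, which makes it hard to meet my requirements.
As I understand it, the speedup comes from simplifying the model. Will there be a FOTS_lite version of FOTS?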

Pay20Y · 2020-06-02T01:32:05Z

Sorry, I have no such plan at the moment. You could look at other reimplementations of FOTS.

likezjuisee · 2020-06-02T03:59:32Z

Understood.

SkrDrag · 2022-04-28T04:15:51Z

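> Hello. The model was trained on English datasets, so it cannot be used directly on Chinese data. You could modify the code and then fine-tune it. As for not reaching the FPS reported in the paper, that may be because my own ability is limited and the code is not that polished, and it may also have something to do with the hardware.

Hello, I'd like to ask a question. I tried to fine-tune on a Chinese dataset starting from the pretrained model you released, but I get an error saying the model cannot be loaded. How should I modify the code?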
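The thread does not show the error message, so the following is only a guess at one common cause: when the character set changes, layers whose size depends on the number of characters no longer match the shapes stored in the checkpoint, and a full restore fails. A usual workaround is to restore only the variables whose name and shape still match and to train the rest from scratch. The sketch below shows just the selection step on plain dictionaries. `split_restorable` and the variable names in the example are hypothetical, not taken from this repository; in TensorFlow the checkpoint side could be obtained with `tf.train.list_variables`, and the restorable subset could be passed to `tf.train.Saver(var_list=...)`.

```python
def split_restorable(ckpt_shapes, model_shapes):
    """Partition model variables into those the checkpoint can fill and those it cannot.

    Both arguments map variable name -> shape (any sequence of ints).
    """
    restorable, skipped = [], []
    for name, shape in sorted(model_shapes.items()):
        saved = ckpt_shapes.get(name)
        if saved is not None and tuple(saved) == tuple(shape):
            restorable.append(name)
        else:
            skipped.append(name)  # missing in checkpoint or shape changed
    return restorable, skipped


# Hypothetical names and shapes, for illustration only.
ckpt = {
    "backbone/conv1/weights": [7, 7, 3, 64],
    "recog/logits/weights": [256, 96],
}
model = {
    "backbone/conv1/weights": (7, 7, 3, 64),
    "recog/logits/weights": (256, 5000),  # larger output layer for a bigger charset
    "recog/extra/bias": (5000,),
}
restorable, skipped = split_restorable(ckpt, model)
print("restore:", restorable)
print("train from scratch:", skipped)
```

Printing the skipped list is also a quick way to confirm whether a shape mismatch is actually what causes the loading error.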
