why so slow? #50

Open
likezjuisee opened this issue May 26, 2020 · 16 comments

Comments

@likezjuisee

Find 1 images
6055 text boxes before nms
test/screenshot.png : detect 3372ms, restore 5ms, nms 147ms, recog 4796ms
[timing] 8.196640491485596

screenshot

@Pay20Y
Owner

Pay20Y commented May 26, 2020

There seem to be too many boxes before NMS. May I ask which dataset you used?
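To illustrate why the number of boxes entering NMS matters, here is a minimal sketch of plain greedy NMS on axis-aligned boxes. It is not the repository's code: the build output later in this thread shows a compiled `lanms` extension, which suggests locality-aware NMS on quadrilaterals is used instead. The function names and both threshold defaults are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score); this layout is an assumption made for the sketch.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box],
               score_thresh: float = 0.8,
               iou_thresh: float = 0.2) -> List[Box]:
    """Keep the highest-scoring box of every group of overlapping boxes."""
    # Dropping low-score candidates first shrinks the quadratic loop below.
    candidates = sorted(
        (b for b in boxes if b[4] >= score_thresh),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in candidates:
        # A box survives only if it overlaps no already-kept box too much.
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept
```

The suppression loop compares each candidate against every kept box, so thousands of candidates cost noticeably more than a few hundred. A stricter score threshold is one way to cut the count, at the risk of missing faint text.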

@likezjuisee
Author

The image above is my own test image, and the model is the one from your readme.md.

@likezjuisee
Author

I also found that the run time is not stable: it may be 10 s on one run and 3 s on the next.

@Pay20Y
Owner

Pay20Y commented May 27, 2020

That's strange. NMS takes very little time. I think you should check your GPU first; the model may be running on the CPU. You can also debug the detection branch on its own first, using EAST.
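One quick way to check which device was used is to look at the startup log. The TensorFlow 1.x logs posted later in this thread contain a line beginning with `Created TensorFlow device (` that names `device:GPU:0`. The sketch below scans captured log text for such lines. The helper name is hypothetical and the pattern is tied to the log format seen in this thread, so treat a missing match as a hint that the run fell back to the CPU, not as proof.

```python
import re
from typing import List

# Matches lines such as
# "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with ..."
_GPU_DEVICE_RE = re.compile(r"Created TensorFlow device \(\S*device:GPU:(\d+)")


def gpu_devices_in_log(log_text: str) -> List[int]:
    """Return the GPU indices that the log says TensorFlow created."""
    return [int(m.group(1)) for m in _GPU_DEVICE_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "Adding visible gpu devices: 0\n"
        "Created TensorFlow device (/job:localhost/replica:0/task:0/"
        "device:GPU:0 with 10086 MB memory) -> physical GPU\n"
    )
    print(gpu_devices_in_log(sample))
```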

@likezjuisee
Author

likezjuisee commented May 28, 2020

(fots) root@test-desktop:~/like/fots/FOTS_TF# python3.5 main_test.py --gpu_list='1' --test_data_path=test/ --checkpoint_path=checkpoints/SynthText_6_epochs/
make: Entering directory '/home/test/like/fots/FOTS_TF/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/test/like/fots/FOTS_TF/lanms'
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
pad_rois shape: Tensor("RoIrotate/TensorArrayStack/TensorArrayGatherV3:0", shape=(?, 8, ?, 32), dtype=float32)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/sparse_ops.py:1165: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
2020-05-28 11:38:23.281118: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-05-28 11:38:23.480434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:65:00.0
totalMemory: 10.76GiB freeMemory: 10.45GiB
2020-05-28 11:38:23.480467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-05-28 11:38:23.787260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-28 11:38:23.787300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2020-05-28 11:38:23.787306: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2020-05-28 11:38:23.787402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10086 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:65:00.0, compute capability: 7.5)
Restore from checkpoints/SynthText_6_epochs/model.ckpt-733268
Find 2 images
5084 text boxes before nms
test/006.jpg : detect 1903ms, restore 2ms, nms 23ms, recog 0ms
[timing] 1.9098529815673828
6055 text boxes before nms
test/screenshot.png : detect 758ms, restore 5ms, nms 77ms, recog 0ms
[timing] 0.7702224254608154

The run time has gone down, but the FPS is still lower than what the paper reports.

The EAST result:

(east) root@test-desktop:~/like/east/EAST# python eval.py --test_data_path=/home/test/like/fots/FOTS_TF/test/ --gpu_list=0 --checkpoint_path=east_icdar2015_resnet_v1_50_rbox/ --output_dir=.
make: Entering directory '/home/test/like/east/EAST/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/test/like/east/EAST/lanms'
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
2020-05-28 11:41:11.735064: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-05-28 11:41:11.830571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635
pciBusID: 0000:17:00.0
totalMemory: 10.76GiB freeMemory: 10.45GiB
2020-05-28 11:41:11.830603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2020-05-28 11:41:12.134395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-28 11:41:12.134435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2020-05-28 11:41:12.134445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2020-05-28 11:41:12.134540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10081 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:17:00.0, compute capability: 7.5)
Restore from east_icdar2015_resnet_v1_50_rbox/model.ckpt-49491
Find 2 images
4697 text boxes before nms
/home/test/like/fots/FOTS_TF/test/006.jpg : net 1876ms, restore 7ms, nms 21ms
[timing] 1.911694049835205
6116 text boxes before nms
/home/test/like/fots/FOTS_TF/test/screenshot.png : net 625ms, restore 9ms, nms 60ms
[timing] 0.7100484371185303

Is it because ResNet50 is too complex?
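To compare the two runs stage by stage, the per-image timing lines can be parsed and turned into a frames-per-second figure. This is a small sketch that assumes the line format printed in the logs above (`name NNNms` pairs after ` : `); the function names are hypothetical.

```python
import re
from typing import Dict

# One "stage NNNms" pair, e.g. "detect 1903ms" or "net 625ms".
_STAGE_RE = re.compile(r"(\w+) (\d+)ms")


def parse_timing(line: str) -> Dict[str, int]:
    """Map each stage name in a timing line to its duration in milliseconds."""
    # Everything after the last " : " is the list of stages.
    _, _, stages = line.rpartition(" : ")
    return {name: int(ms) for name, ms in _STAGE_RE.findall(stages)}


def fps(stage_ms: Dict[str, int]) -> float:
    """Frames per second implied by the summed stage times."""
    total_ms = sum(stage_ms.values())
    return 1000.0 / total_ms if total_ms else float("inf")


if __name__ == "__main__":
    fots = parse_timing("test/screenshot.png : detect 758ms, restore 5ms, nms 77ms, recog 0ms")
    east = parse_timing("/home/test/like/fots/FOTS_TF/test/screenshot.png : net 625ms, restore 9ms, nms 60ms")
    print(fots, round(fps(fots), 2))
    print(east, round(fps(east), 2))
```

In both logs the network forward pass (`detect` or `net`) accounts for most of the time, and the first image of each run is much slower than the second. That pattern is consistent with one-time warm-up cost on the first call, though the logs alone do not confirm it.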

@likezjuisee
Author

Here are the recognition results for Chinese text. Why are they all letters and digits?

404,407,710,414,709,472,402,464,h-1y51e7hi5
250,518,567,525,566,564,249,557,iric/-n?
220,594,488,585,489,604,220,613,M-senl
194,984,433,992,431,1027,193,1018,i2EJ1c-
284,807,587,819,585,853,282,842,isvE
73,1523,401,1536,399,1579,71,1566,Mes5E-F2t
63,1597,152,1599,148,1718,59,1715,a
242,1265,518,1255,519,1293,243,1303,-3he-7te
621,1416,870,1407,872,1459,623,1469,Netzxt
31,94,110,93,111,155,32,156,fow
965,1454,1032,1457,1030,1486,964,1483,2743
142,1584,428,1589,427,1625,142,1620,"nFhatElzsae
1025,1026,1077,1023,1079,1050,1027,1053,Xe
29,163,116,165,115,189,28,187,az
377,272,474,269,475,291,377,293,ivm
695,1254,834,1249,836,1285,697,1291,ges
575,1533,790,1528,791,1569,575,1574,cIEER-T
279,128,438,121,439,156,281,163,Heer
-3,23,169,30,167,61,-4,54,Xhn
961,25,1061,20,1062,52,962,57,09:22
553,1590,687,1585,688,1617,555,1622,Frk
111,871,280,876,279,914,110,909,KL/E"
267,873,462,878,461,914,266,909,"FeEr'"
260,1420,408,1416,409,1463,261,1467,Ths
34,245,138,248,136,301,32,298,Jets
366,688,498,684,499,716,367,720,ey
722,1731,784,1729,785,1757,722,1759,ee
715,1115,792,1118,791,1151,714,1148,tit
763,240,863,242,862,293,762,291,Bgy
245,1119,369,1116,371,1158,246,1161,>A(
720,875,863,879,862,912,719,908,ae
421,1253,665,1256,664,1292,421,1288,thigeit.
544,240,692,241,691,294,543,292,egEI
479,872,639,881,637,916,477,907,Ree
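Each result line above appears to hold eight integer coordinates (four corner points) followed by the recognized string. A minimal parser under that assumption is sketched below. `TextBox` and `parse_result_line` are hypothetical names, not part of the repository.

```python
from typing import List, NamedTuple, Tuple


class TextBox(NamedTuple):
    quad: List[Tuple[int, int]]  # four (x, y) corner points
    text: str                    # recognized string, kept verbatim


def parse_result_line(line: str) -> TextBox:
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,text' into corners and text."""
    # Split on the first 8 commas only, so commas inside the text survive.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) != 9:
        raise ValueError("expected 8 coordinates and a text field: %r" % line)
    coords = [int(v) for v in parts[:8]]
    # Pair up even-indexed x values with odd-indexed y values.
    quad = list(zip(coords[0::2], coords[1::2]))
    return TextBox(quad=quad, text=parts[8])


if __name__ == "__main__":
    print(parse_result_line("961,25,1061,20,1062,52,962,57,09:22"))
```

Coordinates can be negative, as in the line starting with `-3,23`, so boxes may need clipping to the image bounds before cropping.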

@Pay20Y
Owner

Pay20Y commented May 28, 2020

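Hello. The model was trained on English datasets, so it cannot be applied directly to Chinese data. You can modify the code and then fine-tune it. As for not reaching the FPS reported in the paper, that may be because my own ability is limited and the code is not that polished; it may also be related to the hardware.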
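One common obstacle when fine-tuning on a different character set is that the recognition output layer has one unit per character, so its shape in the new model no longer matches the checkpoint. A usual workaround is to restore only the variables whose names and shapes agree and to re-initialize the rest. The sketch below shows just that selection step in plain Python. The function name and the variable names in the example are hypothetical, and the two dictionaries are assumed to have been built beforehand (for example, TensorFlow's `tf.train.list_variables` lists the names and shapes stored in a checkpoint).

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def restorable_variables(
    model_shapes: Dict[str, Shape],
    ckpt_shapes: Dict[str, Shape],
) -> Tuple[List[str], List[str]]:
    """Split model variables into (restore from checkpoint, re-initialize)."""
    restore: List[str] = []
    reinit: List[str] = []
    for name, shape in sorted(model_shapes.items()):
        # Restore only on an exact name and shape match.
        if ckpt_shapes.get(name) == shape:
            restore.append(name)
        else:
            reinit.append(name)
    return restore, reinit


if __name__ == "__main__":
    # Hypothetical names and shapes, for illustration only.
    model = {"backbone/conv1/weights": (7, 7, 3, 64),
             "recog/logits/weights": (512, 6000)}
    ckpt = {"backbone/conv1/weights": (7, 7, 3, 64),
            "recog/logits/weights": (512, 95)}
    print(restorable_variables(model, ckpt))
```

In TensorFlow 1.x the `restore` list would typically be used to build the `var_list` passed to `tf.train.Saver`, so that mismatched layers are skipped and trained from scratch.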

@likezjuisee
Author

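Understood. This is already great work.
If what I want is text recognition on software UIs, where the text angle is generally 0 degrees, is there a fast method you would recommend?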

@Pay20Y
Owner

Pay20Y commented May 28, 2020

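Do you mean there is only horizontal text (nothing slanted)? In that case you could try a CTPN+CTC architecture. There are many implementations online, such as this one.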

@likezjuisee
Author

Yes. As in the software UI above, the text is basically all horizontal.
Thanks, I'll take a look.

@likezjuisee
Author

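I gave it a try; the image above takes about 2.5 seconds.
Also, this kind of two-stage model needs two GPUs, which is rather expensive.
Are there any other methods?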

@Pay20Y
Owner

Pay20Y commented May 28, 2020

CTPN itself is probably the slow part. You could try this one or this one.

@likezjuisee
Author

https://github.com/ouyanghuiyu/chineseocr_lite
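I tried this one. It is indeed fast, but the accuracy drops a lot, which makes it hard to meet my requirements.
As I understand it, the way to improve speed is to simplify the model. Will there be a FOTS_lite version of FOTS?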

@Pay20Y
Owner

Pay20Y commented Jun 2, 2020

Sorry, I have no such plans at the moment. You could look at other reimplementations of FOTS.

@likezjuisee
Author

Got it.

@SkrDrag

SkrDrag commented Apr 28, 2022

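> Hello. The model was trained on English datasets, so it cannot be applied directly to Chinese data. You can modify the code and then fine-tune it. As for not reaching the FPS reported in the paper, that may be because my own ability is limited and the code is not that polished; it may also be related to the hardware.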

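Hello, I have a question. I tried to fine-tune the pretrained model you released on a Chinese dataset, but I get an error saying the model cannot be loaded. How should I modify the code?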
