Commit 49ed977

zhuo-yoyowz, Ryan Loney, paularamo, Jakub Debski, and Paula Ramos authored
Notebook 405. Updated codes to prevent flickering effect when changing objects in front of the webcam and fix a bug for text recognition (openvinotoolkit#565)
* Update README_cn.md * Update README_cn.md * Update README_cn.md * Update README_cn.md * Create test.cpp * Add files via upload * Update README.md * Delete test.cpp * Create test.cpp * Add files via upload * Delete test.cpp * Create test.cpp * Add files via upload * Delete notebooks/405-paddleOCR-webcam/model/ch_ppocr_mobile_v2.0_cls_infer directory * Create test.cpp * Create test.cpp * Add files via upload * Delete test.cpp * Create test.cpp * Add files via upload * Delete test.cpp * Delete notebooks/405-paddleOCR-webcam/model/ch_ppocr_mobile_v2.0_cls_infer directory * Create test.cpp * Add files via upload * Delete test.cpp * Delete test.cpp * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Create 405-paddleOCR-webcam.ipynb * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Add files via upload * Delete test_video.mp4 * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Update 405-paddleOCR-webcam.ipynb * Update README.md * Update README.md * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula Ramos <paula.ramos@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula Ramos <paula.ramos@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula Ramos <paula.ramos@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula 
Ramos <paula.ramos@intel.com> * Update README.md * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Delete notebooks/405-paddleOCR-webcam/data directory * Delete notebooks/405-paddleOCR-webcam/model directory * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula Ramos <paula.ramos@intel.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Paula Ramos <paula.ramos@intel.com> * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Delete 405-paddleOCR-webcam.ipynb * Add files via upload * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update README.md Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/README.md Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com> * Update pre_post_processing.py * Update README.md * Create tt * Rename notebooks/405-paddleOCR-webcam/405-paddleOCR-webcam.ipynb to 
notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb * Rename notebooks/405-paddleOCR-webcam/README.md to notebooks/405-paddle-ocr-webcam/README.md * Rename notebooks/405-paddleOCR-webcam/pre_post_processing.py to notebooks/405-paddle-ocr-webcam/pre_post_processing.py * Rename notebooks/405-paddleOCR-webcam/ppocr_keys_v1.txt to notebooks/405-paddle-ocr-webcam/data/ppocr_keys_v1.txt * Add files via upload * Delete simfang.ttf * Delete tt * Update README.md * Update pre_post_processing.py * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Delete 405-paddle-ocr-webcam.ipynb * Add files via upload * Update 405-paddleOCR-webcam.ipynb * Update 405-paddleOCR-webcam.ipynb * Update pre_post_processing.py * Rename 405-paddleOCR-webcam.ipynb to 405-paddle-ocr-webcam.ipynb * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update 405-paddle-ocr-webcam.ipynb * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb Co-authored-by: Jakub Debski <jakub.debski@intel.com> * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Delete 405-paddle-ocr-webcam.ipynb * Add files via upload * Update 405-paddle-ocr-webcam.ipynb * Update 
405-paddle-ocr-webcam.ipynb * Update README.md * Update 405-paddle-ocr-webcam.ipynb * Prevent text size flickering * Update README.md * Update README.md * Update README.md * Update 405-paddle-ocr-webcam.ipynb Update codes to prevent flickering effect when changing objects for OCR in front of the webcam * Update 405-paddle-ocr-webcam.ipynb Update codes to prevent flickering effect when changing objects for OCR in front of the webcam * Update 405-paddle-ocr-webcam.ipynb * Update 405-paddle-ocr-webcam.ipynb * Delete 405-paddle-ocr-webcam.ipynb * Restructure codes for fixing a bug Restructured codes to fix the bug of inferencing for text recognition in successive batches * Update 405-paddle-ocr-webcam.ipynb Co-authored-by: yoyowz <35246475+yoyowz@users.noreply.github.com> Co-authored-by: Ryan Loney <ryan.loney@intel.com> Co-authored-by: Paula Ramos <pjramg@gmail.com> Co-authored-by: Jakub Debski <jakub.debski@intel.com> Co-authored-by: Paula Ramos <paula.ramos@intel.com> Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com>
1 parent 73a3044 commit 49ed977

1 file changed: +56 -42 lines changed

notebooks/405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb

@@ -300,15 +300,14 @@
 " return padding_im\n",
 "\n",
 "\n",
-"def batch_text_box(dt_boxes, frame):\n",
+"def prep_for_rec(dt_boxes, frame):\n",
 " \"\"\"\n",
-" Batch the detected bounding boxes for text recognition\n",
+" Preprocessing of the detected bounding boxes for text recognition\n",
 "\n",
 " Parameters:\n",
 " dt_boxes: detected bounding boxes from text detection \n",
 " frame: original input frame \n",
 " \"\"\"\n",
-" \n",
 " ori_im = frame.copy()\n",
 " img_crop_list = [] \n",
 " for bno in range(len(dt_boxes)):\n",
@@ -321,29 +320,38 @@
 " width_list = []\n",
 " for img in img_crop_list:\n",
 " width_list.append(img.shape[1] / float(img.shape[0]))\n",
+" \n",
 " # Sorting can speed up the recognition process\n",
 " indices = np.argsort(np.array(width_list))\n",
-" rec_res = [['', 0.0]] * img_num\n",
-" batch_num = 6\n",
-"\n",
-" # For each detected text box batch, run inference for text recognition\n",
-" for beg_img_no in range(0, img_num, batch_num):\n",
-" end_img_no = min(img_num, beg_img_no + batch_num)\n",
-"\n",
-" norm_img_batch = []\n",
-" max_wh_ratio = 0\n",
-" for ino in range(beg_img_no, end_img_no):\n",
-" h, w = img_crop_list[indices[ino]].shape[0:2]\n",
-" wh_ratio = w * 1.0 / h\n",
-" max_wh_ratio = max(max_wh_ratio, wh_ratio)\n",
-" for ino in range(beg_img_no, end_img_no):\n",
-" norm_img = resize_norm_img(img_crop_list[indices[ino]], max_wh_ratio)\n",
-" norm_img = norm_img[np.newaxis, :]\n",
-" norm_img_batch.append(norm_img)\n",
+" return img_crop_list, img_num, indices\n",
+"\n",
+"\n",
+"def batch_text_box(img_crop_list, img_num, indices, beg_img_no, batch_num):\n",
+" \"\"\"\n",
+" Batch for text recognition\n",
+"\n",
+" Parameters:\n",
+" img_crop_list: processed detected bounding box images \n",
+" img_num: number of bounding boxes from text detection\n",
+" indices: sorting for bounding boxes to speed up text recognition\n",
+" beg_img_no: the beginning number of bounding boxes for each batch of text recognition inference\n",
+" batch_num: number of images for each batch\n",
+" \"\"\"\n",
+" norm_img_batch = []\n",
+" max_wh_ratio = 0\n",
+" end_img_no = min(img_num, beg_img_no + batch_num)\n",
+" for ino in range(beg_img_no, end_img_no):\n",
+" h, w = img_crop_list[indices[ino]].shape[0:2]\n",
+" wh_ratio = w * 1.0 / h\n",
+" max_wh_ratio = max(max_wh_ratio, wh_ratio)\n",
+" for ino in range(beg_img_no, end_img_no):\n",
+" norm_img = resize_norm_img(img_crop_list[indices[ino]], max_wh_ratio)\n",
+" norm_img = norm_img[np.newaxis, :]\n",
+" norm_img_batch.append(norm_img)\n",
 "\n",
 " norm_img_batch = np.concatenate(norm_img_batch)\n",
 " norm_img_batch = norm_img_batch.copy()\n",
-" return norm_img_batch, rec_res, indices, beg_img_no"
+" return norm_img_batch"
 ]
 },
 {
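
Note on the hunk above: the new `batch_text_box` builds one batch by finding the largest width/height ratio among the selected crops and normalizing every crop against it. The sketch below illustrates that idea in isolation. `pad_to_ratio` and `make_batch` are hypothetical names, and `pad_to_ratio` is only a NumPy stand-in for the notebook's `resize_norm_img`, whose body is not part of this diff; the fixed height of 32 is likewise an assumption.

```python
import numpy as np


def pad_to_ratio(img, max_wh_ratio, height=32):
    """Hypothetical stand-in for resize_norm_img.

    Nearest-neighbour resize to a fixed height, then zero-pad on the right
    so that every crop in the batch has width = height * max_wh_ratio.
    """
    h, w = img.shape[:2]
    target_w = int(np.ceil(height * max_wh_ratio))
    new_w = min(target_w, max(1, int(np.ceil(height * w / h))))
    rows = np.arange(height) * h // height   # source row for each output row
    cols = np.arange(new_w) * w // new_w     # source column for each output column
    resized = img[rows][:, cols]
    padded = np.zeros((height, target_w) + img.shape[2:], dtype=img.dtype)
    padded[:, :new_w] = resized
    return padded


def make_batch(crops, order, start, batch_size):
    """Build one batch from crops[order[start : start + batch_size]]."""
    stop = min(len(crops), start + batch_size)
    picked = [crops[order[i]] for i in range(start, stop)]
    # The widest crop (relative to its height) sets the common batch width.
    max_wh_ratio = max(c.shape[1] / c.shape[0] for c in picked)
    return np.stack([pad_to_ratio(c, max_wh_ratio) for c in picked])


crops = [
    np.ones((10, 20, 3), np.uint8),  # ratio 2.0
    np.ones((10, 50, 3), np.uint8),  # ratio 5.0
    np.ones((20, 30, 3), np.uint8),  # ratio 1.5
]
order = np.argsort([c.shape[1] / c.shape[0] for c in crops])
batch = make_batch(crops, order, start=0, batch_size=6)
print(batch.shape)
```

Sorting by aspect ratio first, as the diff does with `np.argsort`, presumably keeps crops of similar shape in the same batch so that less of each batch is padding; the source only states that sorting "can speed up the recognition process".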
@@ -461,7 +469,7 @@
 " frame = cv2.resize(src=frame, dsize=None, fx=scale, fy=scale,\n",
 " interpolation=cv2.INTER_AREA)\n",
 " # preprocess image for text detection\n",
-" test_image = image_preprocess(frame,640)\n",
+" test_image = image_preprocess(frame, 640)\n",
 " \n",
 " # measure processing time for text detection\n",
 " start_time = time.time()\n",
@@ -480,9 +488,20 @@
 "\n",
 " # Preprocess detection results for recognition\n",
 " dt_boxes = processing.sorted_boxes(dt_boxes) \n",
-" if dt_boxes:\n",
+" batch_num = 6\n",
+" img_crop_list, img_num, indices = prep_for_rec(dt_boxes, frame)\n",
+" \n",
+" # For storing recognition results, include two parts:\n",
+" # txts are the recognized text results, scores are the recognition confidence level \n",
+" rec_res = [['', 0.0]] * img_num\n",
+" txts = [] \n",
+" scores = []\n",
+"\n",
+" for beg_img_no in range(0, img_num, batch_num):\n",
+"\n",
 " # Recognition starts from here\n",
-" norm_img_batch, rec_res, indices, beg_img_no = batch_text_box(dt_boxes, frame)\n",
+" norm_img_batch = batch_text_box(\n",
+" img_crop_list, img_num, indices, beg_img_no, batch_num)\n",
 "\n",
 " # Run inference for text recognition \n",
 " rec_results = rec_compiled_model([norm_img_batch])[rec_output_layer]\n",
@@ -491,31 +510,26 @@
 " postprocess_op = processing.build_post_process(processing.postprocess_params)\n",
 " rec_result = postprocess_op(rec_results)\n",
 " for rno in range(len(rec_result)):\n",
-" rec_res[indices[beg_img_no + rno]] = rec_result[rno]\n",
-"\n",
-" # Text recognition results, rec_res, include two parts:\n",
-" # txts are the recognized text results, scores are the recognition confidence level \n",
+" rec_res[indices[beg_img_no + rno]] = rec_result[rno] \n",
 " if rec_res:\n",
-" image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))\n",
-" boxes = dt_boxes\n",
 " txts = [rec_res[i][0] for i in range(len(rec_res))] \n",
-" scores = [rec_res[i][1] for i in range(len(rec_res))] \n",
-"\n",
-" # draw text recognition results beside the image\n",
-" draw_img = processing.draw_ocr_box_txt(\n",
-" image,\n",
-" boxes,\n",
-" txts,\n",
-" scores,\n",
-" drop_score=0.5)\n",
-" else:\n",
-" draw_img = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)\n",
+" scores = [rec_res[i][1] for i in range(len(rec_res))]\n",
+" \n",
+" image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))\n",
+" boxes = dt_boxes\n",
+" # draw text recognition results beside the image\n",
+" draw_img = processing.draw_ocr_box_txt(\n",
+" image,\n",
+" boxes,\n",
+" txts,\n",
+" scores,\n",
+" drop_score=0.5)\n",
 "\n",
 " # Visualize PaddleOCR results\n",
 " f_height, f_width = draw_img.shape[:2]\n",
 " fps = 1000 / processing_time_det\n",
 " cv2.putText(img=draw_img, text=f\"Inference time: {processing_time_det:.1f}ms ({fps:.1f} FPS)\", \n",
-" org=(20, 40),fontFace=cv2.FONT_HERSHEY_COMPLEX, fontScale=f_height / 1000,\n",
+" org=(20, 40),fontFace=cv2.FONT_HERSHEY_COMPLEX, fontScale=f_width / 1000,\n",
 " color=(0, 0, 255), thickness=1, lineType=cv2.LINE_AA)\n",
 " \n",
 " # use this workaround if there is flickering\n",

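Note on the restructured loop: the diff moves the `for beg_img_no in range(0, img_num, batch_num)` loop out of `batch_text_box` and into the caller, so inference runs once per batch and each result is written back to its original position through `indices`. The following is a minimal sketch of that control flow only, under stated assumptions: `fake_recognize` and `recognize_all` are hypothetical names, the "crops" are reduced to bare aspect ratios, and the recognizer is a placeholder rather than the compiled OpenVINO model.

```python
import numpy as np


def fake_recognize(batch):
    """Hypothetical recognizer: one (text, score) pair per batch item."""
    return [(f"text-{ratio}", 0.9) for ratio in batch]


def recognize_all(ratios, batch_num=6):
    """Recognize every item in sorted batches, then restore input order."""
    img_num = len(ratios)
    indices = np.argsort(np.array(ratios))

    # List multiplication makes every slot refer to the same inner list.
    # That is harmless here because slots are reassigned, never mutated.
    rec_res = [['', 0.0]] * img_num

    for beg_img_no in range(0, img_num, batch_num):
        end_img_no = min(img_num, beg_img_no + batch_num)
        batch = [ratios[indices[ino]] for ino in range(beg_img_no, end_img_no)]
        rec_result = fake_recognize(batch)
        # Scatter this batch's results back to the original positions.
        for rno in range(len(rec_result)):
            rec_res[indices[beg_img_no + rno]] = rec_result[rno]
    return rec_res


ratios = [5.0, 1.5, 3.0, 2.0, 8.0, 4.0, 1.0, 6.0]
results = recognize_all(ratios, batch_num=3)
for ratio, (text, score) in zip(ratios, results):
    print(ratio, text, score)
```

With eight items and a batch size of three, the loop runs three times and every slot of `rec_res` ends up filled, which is the behavior the commit message describes as the fix for recognition "in successive batches".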