Revision 60d74dbf
issue #506: fix text area merge
Change-Id: Ie6fe1de9b571de4646858f4f76fee90f74843957
DTI_PID/DTI_PID/tesseract_ocr_module.py | ||
---|---|---|
153 | 153 | boundaryOcrData = pytesseract.image_to_boxes(im, config=_conf, lang=oCRLang) |
154 | 154 | bounding_boxes = boundaryOcrData.split('\n') |
155 | 155 | |
 | 156 | bounding_boxes = sorted(bounding_boxes, key=lambda param: int(param.split(' ')[4]) - int(param.split(' ')[2]), reverse=True) |
156 | 157 | merged_boxes = [] |
157 | 158 | |
158 | 159 | for box in bounding_boxes: |
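
The line added at new line 156 orders the box lines by field 4 minus field 2, largest first. Assuming the usual Tesseract box layout `char left bottom right top page`, that difference is the glyph height, so the merge loop sees the tallest boxes first. The sketch below illustrates only that ordering. The helper names and the sample string are hypothetical, and the blank-line filter is an addition that is not in the commit.

```python
def box_height(line):
    """Return top minus bottom (fields 4 and 2) for one box line."""
    fields = line.split(' ')
    return int(fields[4]) - int(fields[2])


def sort_boxes_by_height(box_text):
    """Sort box lines by glyph height, tallest first."""
    # Assumption: skip blank lines so a trailing newline in the input
    # cannot raise IndexError in box_height.
    lines = [ln for ln in box_text.split('\n') if ln.strip()]
    return sorted(lines, key=box_height, reverse=True)


# Hypothetical sample in "char left bottom right top page" form.
sample = "a 10 20 18 30 0\nT 20 20 30 44 0\ni 32 20 35 38 0\n"
sorted_lines = sort_boxes_by_height(sample)
```

With this sample the heights are 10, 24 and 18, so the `T` line sorts first and the `a` line last.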