Revision af64ec0d
issue #624: tesseract version up(5.0 alpha 20200223)
Change-Id: Ia9dc4cf3bb55a71e9870db43a73c23a8982c7a0f
DTI_PID/DTI_PID/Tesseract-OCR/doc/AUTHORS | ||
---|---|---|
| 35 | 35 | James R Barlow |
| 36 | 36 | Amit Dovev |
| 37 | 37 | Martin Ettl |
|  | 38 | Shree Devi Kumar |
|  | 39 | Noah Metzger |
| 38 | 40 | Tom Morris |
| 39 | 41 | Tobias Müller |
| 40 | 42 | Egor Pugin |
|  | 43 | Robert Sachunsky |
|  | 44 | Raf Schietekat |
| 41 | 45 | Sundar M. Vaidya |
| 42 | 46 | Stefan Weil |
|  | 47 | Alexander Zaitsev |
DTI_PID/DTI_PID/Tesseract-OCR/doc/COPYING | ||
---|---|---|
1 |
This package contains the Tesseract Open Source OCR Engine. |
|
2 |
Originally developed at Hewlett Packard Laboratories Bristol and |
|
3 |
at Hewlett Packard Co, Greeley Colorado, all the code |
|
4 |
in this distribution is now licensed under the Apache License: |
|
5 |
|
|
6 |
** Licensed under the Apache License, Version 2.0 (the "License"); |
|
7 |
** you may not use this file except in compliance with the License. |
|
8 |
** You may obtain a copy of the License at |
|
9 |
** http://www.apache.org/licenses/LICENSE-2.0 |
|
10 |
** Unless required by applicable law or agreed to in writing, software |
|
11 |
** distributed under the License is distributed on an "AS IS" BASIS, |
|
12 |
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
13 |
** See the License for the specific language governing permissions and |
|
14 |
** limitations under the License. |
|
15 |
|
|
16 |
|
|
17 |
Other Dependencies and Licenses: |
|
18 |
================================ |
|
19 |
|
|
20 |
Tesseract uses Leptonica library (http://leptonica.com/) which essentially |
|
21 |
uses a BSD 2-clause license. (http://leptonica.com/about-the-license.html) |
DTI_PID/DTI_PID/Tesseract-OCR/doc/LICENSE | ||
---|---|---|
1 |
|
|
2 |
Apache License |
|
3 |
Version 2.0, January 2004 |
|
4 |
http://www.apache.org/licenses/ |
|
5 |
|
|
6 |
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION |
|
7 |
|
|
8 |
1. Definitions. |
|
9 |
|
|
10 |
"License" shall mean the terms and conditions for use, reproduction, |
|
11 |
and distribution as defined by Sections 1 through 9 of this document. |
|
12 |
|
|
13 |
"Licensor" shall mean the copyright owner or entity authorized by |
|
14 |
the copyright owner that is granting the License. |
|
15 |
|
|
16 |
"Legal Entity" shall mean the union of the acting entity and all |
|
17 |
other entities that control, are controlled by, or are under common |
|
18 |
control with that entity. For the purposes of this definition, |
|
19 |
"control" means (i) the power, direct or indirect, to cause the |
|
20 |
direction or management of such entity, whether by contract or |
|
21 |
otherwise, or (ii) ownership of fifty percent (50%) or more of the |
|
22 |
outstanding shares, or (iii) beneficial ownership of such entity. |
|
23 |
|
|
24 |
"You" (or "Your") shall mean an individual or Legal Entity |
|
25 |
exercising permissions granted by this License. |
|
26 |
|
|
27 |
"Source" form shall mean the preferred form for making modifications, |
|
28 |
including but not limited to software source code, documentation |
|
29 |
source, and configuration files. |
|
30 |
|
|
31 |
"Object" form shall mean any form resulting from mechanical |
|
32 |
transformation or translation of a Source form, including but |
|
33 |
not limited to compiled object code, generated documentation, |
|
34 |
and conversions to other media types. |
|
35 |
|
|
36 |
"Work" shall mean the work of authorship, whether in Source or |
|
37 |
Object form, made available under the License, as indicated by a |
|
38 |
copyright notice that is included in or attached to the work |
|
39 |
(an example is provided in the Appendix below). |
|
40 |
|
|
41 |
"Derivative Works" shall mean any work, whether in Source or Object |
|
42 |
form, that is based on (or derived from) the Work and for which the |
|
43 |
editorial revisions, annotations, elaborations, or other modifications |
|
44 |
represent, as a whole, an original work of authorship. For the purposes |
|
45 |
of this License, Derivative Works shall not include works that remain |
|
46 |
separable from, or merely link (or bind by name) to the interfaces of, |
|
47 |
the Work and Derivative Works thereof. |
|
48 |
|
|
49 |
"Contribution" shall mean any work of authorship, including |
|
50 |
the original version of the Work and any modifications or additions |
|
51 |
to that Work or Derivative Works thereof, that is intentionally |
|
52 |
submitted to Licensor for inclusion in the Work by the copyright owner |
|
53 |
or by an individual or Legal Entity authorized to submit on behalf of |
|
54 |
the copyright owner. For the purposes of this definition, "submitted" |
|
55 |
means any form of electronic, verbal, or written communication sent |
|
56 |
to the Licensor or its representatives, including but not limited to |
|
57 |
communication on electronic mailing lists, source code control systems, |
|
58 |
and issue tracking systems that are managed by, or on behalf of, the |
|
59 |
Licensor for the purpose of discussing and improving the Work, but |
|
60 |
excluding communication that is conspicuously marked or otherwise |
|
61 |
designated in writing by the copyright owner as "Not a Contribution." |
|
62 |
|
|
63 |
"Contributor" shall mean Licensor and any individual or Legal Entity |
|
64 |
on behalf of whom a Contribution has been received by Licensor and |
|
65 |
subsequently incorporated within the Work. |
|
66 |
|
|
67 |
2. Grant of Copyright License. Subject to the terms and conditions of |
|
68 |
this License, each Contributor hereby grants to You a perpetual, |
|
69 |
worldwide, non-exclusive, no-charge, royalty-free, irrevocable |
|
70 |
copyright license to reproduce, prepare Derivative Works of, |
|
71 |
publicly display, publicly perform, sublicense, and distribute the |
|
72 |
Work and such Derivative Works in Source or Object form. |
|
73 |
|
|
74 |
3. Grant of Patent License. Subject to the terms and conditions of |
|
75 |
this License, each Contributor hereby grants to You a perpetual, |
|
76 |
worldwide, non-exclusive, no-charge, royalty-free, irrevocable |
|
77 |
(except as stated in this section) patent license to make, have made, |
|
78 |
use, offer to sell, sell, import, and otherwise transfer the Work, |
|
79 |
where such license applies only to those patent claims licensable |
|
80 |
by such Contributor that are necessarily infringed by their |
|
81 |
Contribution(s) alone or by combination of their Contribution(s) |
|
82 |
with the Work to which such Contribution(s) was submitted. If You |
|
83 |
institute patent litigation against any entity (including a |
|
84 |
cross-claim or counterclaim in a lawsuit) alleging that the Work |
|
85 |
or a Contribution incorporated within the Work constitutes direct |
|
86 |
or contributory patent infringement, then any patent licenses |
|
87 |
granted to You under this License for that Work shall terminate |
|
88 |
as of the date such litigation is filed. |
|
89 |
|
|
90 |
4. Redistribution. You may reproduce and distribute copies of the |
|
91 |
Work or Derivative Works thereof in any medium, with or without |
|
92 |
modifications, and in Source or Object form, provided that You |
|
93 |
meet the following conditions: |
|
94 |
|
|
95 |
(a) You must give any other recipients of the Work or |
|
96 |
Derivative Works a copy of this License; and |
|
97 |
|
|
98 |
(b) You must cause any modified files to carry prominent notices |
|
99 |
stating that You changed the files; and |
|
100 |
|
|
101 |
(c) You must retain, in the Source form of any Derivative Works |
|
102 |
that You distribute, all copyright, patent, trademark, and |
|
103 |
attribution notices from the Source form of the Work, |
|
104 |
excluding those notices that do not pertain to any part of |
|
105 |
the Derivative Works; and |
|
106 |
|
|
107 |
(d) If the Work includes a "NOTICE" text file as part of its |
|
108 |
distribution, then any Derivative Works that You distribute must |
|
109 |
include a readable copy of the attribution notices contained |
|
110 |
within such NOTICE file, excluding those notices that do not |
|
111 |
pertain to any part of the Derivative Works, in at least one |
|
112 |
of the following places: within a NOTICE text file distributed |
|
113 |
as part of the Derivative Works; within the Source form or |
|
114 |
documentation, if provided along with the Derivative Works; or, |
|
115 |
within a display generated by the Derivative Works, if and |
|
116 |
wherever such third-party notices normally appear. The contents |
|
117 |
of the NOTICE file are for informational purposes only and |
|
118 |
do not modify the License. You may add Your own attribution |
|
119 |
notices within Derivative Works that You distribute, alongside |
|
120 |
or as an addendum to the NOTICE text from the Work, provided |
|
121 |
that such additional attribution notices cannot be construed |
|
122 |
as modifying the License. |
|
123 |
|
|
124 |
You may add Your own copyright statement to Your modifications and |
|
125 |
may provide additional or different license terms and conditions |
|
126 |
for use, reproduction, or distribution of Your modifications, or |
|
127 |
for any such Derivative Works as a whole, provided Your use, |
|
128 |
reproduction, and distribution of the Work otherwise complies with |
|
129 |
the conditions stated in this License. |
|
130 |
|
|
131 |
5. Submission of Contributions. Unless You explicitly state otherwise, |
|
132 |
any Contribution intentionally submitted for inclusion in the Work |
|
133 |
by You to the Licensor shall be under the terms and conditions of |
|
134 |
this License, without any additional terms or conditions. |
|
135 |
Notwithstanding the above, nothing herein shall supersede or modify |
|
136 |
the terms of any separate license agreement you may have executed |
|
137 |
with Licensor regarding such Contributions. |
|
138 |
|
|
139 |
6. Trademarks. This License does not grant permission to use the trade |
|
140 |
names, trademarks, service marks, or product names of the Licensor, |
|
141 |
except as required for reasonable and customary use in describing the |
|
142 |
origin of the Work and reproducing the content of the NOTICE file. |
|
143 |
|
|
144 |
7. Disclaimer of Warranty. Unless required by applicable law or |
|
145 |
agreed to in writing, Licensor provides the Work (and each |
|
146 |
Contributor provides its Contributions) on an "AS IS" BASIS, |
|
147 |
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or |
|
148 |
implied, including, without limitation, any warranties or conditions |
|
149 |
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A |
|
150 |
PARTICULAR PURPOSE. You are solely responsible for determining the |
|
151 |
appropriateness of using or redistributing the Work and assume any |
|
152 |
risks associated with Your exercise of permissions under this License. |
|
153 |
|
|
154 |
8. Limitation of Liability. In no event and under no legal theory, |
|
155 |
whether in tort (including negligence), contract, or otherwise, |
|
156 |
unless required by applicable law (such as deliberate and grossly |
|
157 |
negligent acts) or agreed to in writing, shall any Contributor be |
|
158 |
liable to You for damages, including any direct, indirect, special, |
|
159 |
incidental, or consequential damages of any character arising as a |
|
160 |
result of this License or out of the use or inability to use the |
|
161 |
Work (including but not limited to damages for loss of goodwill, |
|
162 |
work stoppage, computer failure or malfunction, or any and all |
|
163 |
other commercial damages or losses), even if such Contributor |
|
164 |
has been advised of the possibility of such damages. |
|
165 |
|
|
166 |
9. Accepting Warranty or Additional Liability. While redistributing |
|
167 |
the Work or Derivative Works thereof, You may choose to offer, |
|
168 |
and charge a fee for, acceptance of support, warranty, indemnity, |
|
169 |
or other liability obligations and/or rights consistent with this |
|
170 |
License. However, in accepting such obligations, You may act only |
|
171 |
on Your own behalf and on Your sole responsibility, not on behalf |
|
172 |
of any other Contributor, and only if You agree to indemnify, |
|
173 |
defend, and hold each Contributor harmless for any liability |
|
174 |
incurred by, or claims asserted against, such Contributor by reason |
|
175 |
of your accepting any such warranty or additional liability. |
|
176 |
|
|
177 |
END OF TERMS AND CONDITIONS |
|
178 |
|
|
179 |
APPENDIX: How to apply the Apache License to your work. |
|
180 |
|
|
181 |
To apply the Apache License to your work, attach the following |
|
182 |
boilerplate notice, with the fields enclosed by brackets "[]" |
|
183 |
replaced with your own identifying information. (Don't include |
|
184 |
the brackets!) The text should be enclosed in the appropriate |
|
185 |
comment syntax for the file format. We also recommend that a |
|
186 |
file or class name and description of purpose be included on the |
|
187 |
same "printed page" as the copyright notice for easier |
|
188 |
identification within third-party archives. |
|
189 |
|
|
190 |
Copyright [yyyy] [name of copyright owner] |
|
191 |
|
|
192 |
Licensed under the Apache License, Version 2.0 (the "License"); |
|
193 |
you may not use this file except in compliance with the License. |
|
194 |
You may obtain a copy of the License at |
|
195 |
|
|
196 |
http://www.apache.org/licenses/LICENSE-2.0 |
|
197 |
|
|
198 |
Unless required by applicable law or agreed to in writing, software |
|
199 |
distributed under the License is distributed on an "AS IS" BASIS, |
|
200 |
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
201 |
See the License for the specific language governing permissions and |
|
202 |
limitations under the License. |
DTI_PID/DTI_PID/Tesseract-OCR/doc/README | ||
---|---|---|
1 |
How to run UNLV tests. |
|
2 |
|
|
3 |
The scripts in this directory make it possible to duplicate the tests |
|
4 |
published in the Fourth Annual Test of OCR Accuracy. |
|
5 |
See http://www.isri.unlv.edu/downloads/AT-1995.pdf |
|
6 |
but first you have to get the tools and data from UNLV: |
|
7 |
|
|
8 |
Step 1: to download the images go to |
|
9 |
http://www.isri.unlv.edu/ISRI/OCRtk |
|
10 |
and get 3b.tgz, Bb.tgz, Mb.tgz and Nb.tgz. |
|
11 |
|
|
12 |
Step 2: extract the files. It doesn't really matter where |
|
13 |
in your filesystem you put them, but they must go under a common |
|
14 |
root so you have directories 3, B, M and N in, for example, |
|
15 |
/users/me/ISRI-OCRtk. |
|
16 |
|
|
17 |
Step 3: Reorg the files |
|
18 |
The lack of tif extensions on the images is inconvenient, so there |
|
19 |
is a script to reorganize the data to match the rest of the test |
|
20 |
scripts. |
|
21 |
cd to /users/me/ISRI-OCRtk or wherever 3, B, M and N ended up and run |
|
22 |
/blah/blah/tesseract-ocr/testing/reorgdata.sh 3B |
|
23 |
This makes directories doe3.3B, bus.3B, mag.3B and news.3B. |
|
24 |
You can now get rid of 3, B, M, and N unless you want to get some of the |
|
25 |
other scanning resolutions out of them. |
|
26 |
|
|
27 |
Step 4: Download the ISRI toolkit from: |
|
28 |
http://www.isri.unlv.edu/downloads/ftk-1.0.tgz |
|
29 |
|
|
30 |
Step 5: If they work for you, use the binaries directly from the bin |
|
31 |
directory and put them in tesseract-ocr/testing/unlv |
|
32 |
otherwise build the tools for yourself and put them there. |
|
33 |
|
|
34 |
Step 6: cd back to your main tesseract-ocr dir and Build tesseract. |
|
35 |
|
|
36 |
Step 7: run testing/runalltests.sh with the root data dir and testname: |
|
37 |
testing/runalltests.sh /users/me/ISRI-OCRtk tess2.0 |
|
38 |
and go to the gym, have lunch etc. |
|
39 |
|
|
40 |
Step 8: There should be a file |
|
41 |
testing/reports/tess2.0.summary that contains the final summarized accuracy |
|
42 |
report and comparison with the 1995 results. |
|
43 |
|
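
Reviewer note on the UNLV README above: Steps 3, 7 and 8 name a fixed set of directories, one script invocation and one report path. The sketch below only assembles those paths so they can be checked before a long run is started; it executes nothing. The function names `plan_unlv_run` and `missing_dirs` are hypothetical helpers introduced here, and the assumption that the scripts live under `testing/` of the Tesseract checkout is taken from the README text, not verified against this commit.

```python
from pathlib import Path

# Test-set directory names quoted in Step 3 of the README for the 3B set.
TEST_SETS = ("doe3.3B", "bus.3B", "mag.3B", "news.3B")


def plan_unlv_run(data_root, test_name, tesseract_dir="."):
    """Collect the paths and command that Steps 3, 7 and 8 refer to.

    Hypothetical helper: nothing is executed, the values are only assembled.
    """
    root = Path(data_root)
    base = Path(tesseract_dir)
    script = base / "testing" / "runalltests.sh"
    return {
        # Directories that reorgdata.sh is described as creating (Step 3).
        "expected_dirs": [root / name for name in TEST_SETS],
        # Invocation shown in Step 7: script, root data dir, test name.
        "command": [script.as_posix(), root.as_posix(), test_name],
        # Summary report location described in Step 8.
        "summary": base / "testing" / "reports" / f"{test_name}.summary",
    }


def missing_dirs(plan):
    """List the expected test-set directories that do not exist yet."""
    return [d for d in plan["expected_dirs"] if not d.is_dir()]
```

With the README's own example values, `plan_unlv_run("/users/me/ISRI-OCRtk", "tess2.0")` yields the command from Step 7, and `missing_dirs` reports which of the four directories still need to be produced by Step 3.
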
DTI_PID/DTI_PID/Tesseract-OCR/doc/README.md | ||
---|---|---|
1 |
# Tesseract OCR |
|
2 |
|
|
3 |
[![Build Status](https://travis-ci.org/tesseract-ocr/tesseract.svg?branch=master)](https://travis-ci.org/tesseract-ocr/tesseract) |
|
4 |
[![Build status](https://ci.appveyor.com/api/projects/status/miah0ikfsf0j3819/branch/master?svg=true)](https://ci.appveyor.com/project/zdenop/tesseract/) |
|
5 |
![Build status](https://github.com/tesseract-ocr/tesseract/workflows/windows/badge.svg)<br> |
|
6 |
[![Coverity Scan Build Status](https://scan.coverity.com/projects/tesseract-ocr/badge.svg)](https://scan.coverity.com/projects/tesseract-ocr) |
|
7 |
[![Code Quality: Cpp](https://img.shields.io/lgtm/grade/cpp/g/tesseract-ocr/tesseract.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/tesseract-ocr/tesseract/context:cpp) |
|
8 |
[![Total Alerts](https://img.shields.io/lgtm/alerts/g/tesseract-ocr/tesseract.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/tesseract-ocr/tesseract/alerts) |
|
9 |
[![OSS-Fuzz](https://img.shields.io/badge/oss--fuzz-fuzzing-brightgreen)](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:tesseract-ocr) |
|
10 |
<br/> |
|
11 |
[![GitHub license](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](https://raw.githubusercontent.com/tesseract-ocr/tesseract/master/LICENSE) |
|
12 |
[![Downloads](https://img.shields.io/badge/download-all%20releases-brightgreen.svg)](https://github.com/tesseract-ocr/tesseract/releases/) |
|
13 |
|
|
14 |
## About |
|
15 |
|
|
16 |
This package contains an **OCR engine** - `libtesseract` and a **command line program** - `tesseract`. |
|
17 |
Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused |
|
18 |
on line recognition, but also still supports the legacy Tesseract OCR engine of |
|
19 |
Tesseract 3 which works by recognizing character patterns. Compatibility with |
|
20 |
Tesseract 3 is enabled by using the Legacy OCR Engine mode (--oem 0). |
|
21 |
It also needs traineddata files which support the legacy engine, for example |
|
22 |
those from the tessdata repository. |
|
23 |
|
|
24 |
The lead developer is Ray Smith. The maintainer is Zdenko Podobny. |
|
25 |
For a list of contributors see [AUTHORS](https://github.com/tesseract-ocr/tesseract/blob/master/AUTHORS) |
|
26 |
and GitHub's log of [contributors](https://github.com/tesseract-ocr/tesseract/graphs/contributors). |
|
27 |
|
|
28 |
Tesseract has **Unicode (UTF-8) support**, and can **recognize more than 100 languages** "out of the box". |
|
29 |
|
|
30 |
Tesseract supports **various output formats**: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV. The master branch also has experimental support for ALTO (XML) output. |
|
31 |
|
|
32 |
You should note that in many cases, in order to get better OCR results, |
|
33 |
you'll need to **[improve the quality](https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html) of the image** you are giving Tesseract. |
|
34 |
|
|
35 |
This project **does not include a GUI application**. |
|
36 |
If you need one, please see the [3rdParty](https://tesseract-ocr.github.io/tessdoc/User-Projects-%E2%80%93-3rdParty.html) documentation. |
|
37 |
|
|
38 |
Tesseract **can be trained to recognize other languages**. |
|
39 |
See [Tesseract Training](https://tesseract-ocr.github.io/tessdoc/Training-Tesseract.html) for more information. |
|
40 |
|
|
41 |
## Brief history |
|
42 |
|
|
43 |
Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and |
|
44 |
at Hewlett-Packard Co, Greeley Colorado between 1985 and 1994, with some |
|
45 |
more changes made in 1996 to port to Windows, and some C++izing in 1998. |
|
46 |
In 2005 Tesseract was open-sourced by HP. Since 2006 it has been developed by Google. |
|
47 |
|
|
48 |
The latest (LSTM based) stable version is **[4.1.1](https://github.com/tesseract-ocr/tesseract/releases/tag/4.1.1)**, released on December 26, 2019. |
|
49 |
Latest source code is available from [master branch on GitHub](https://github.com/tesseract-ocr/tesseract/tree/master). |
|
50 |
Open issues can be found in [issue tracker](https://github.com/tesseract-ocr/tesseract/issues), |
|
51 |
and [planning documentation](https://tesseract-ocr.github.io/tessdoc/Planning.html). |
|
52 |
|
|
53 |
The latest 3.0x version is **[3.05.02](https://github.com/tesseract-ocr/tesseract/releases/tag/3.05.02)**, released on June 19, 2018. Latest source code for 3.05 is available from [3.05 branch on GitHub](https://github.com/tesseract-ocr/tesseract/tree/3.05). |
|
54 |
There is no development for this version, but it can be used for special cases (e.g. see [Regression of features from 3.0x](https://tesseract-ocr.github.io/tessdoc/Planning.html#regression-of-features-from-30x)). |
|
55 |
|
|
56 |
See **[Release Notes](https://tesseract-ocr.github.io/tessdoc/ReleaseNotes.html)** |
|
57 |
and **[Change Log](https://github.com/tesseract-ocr/tesseract/blob/master/ChangeLog)** for more details of the releases. |
|
58 |
|
|
59 |
## Installing Tesseract |
|
60 |
|
|
61 |
You can either [Install Tesseract via pre-built binary package](https://tesseract-ocr.github.io/tessdoc/Home.html) |
|
62 |
or [build it from source](https://tesseract-ocr.github.io/tessdoc/Compiling.html). |
|
63 |
|
|
64 |
Supported Compilers are: |
|
65 |
|
|
66 |
* GCC 4.8 and above |
|
67 |
* Clang 3.4 and above |
|
68 |
* MSVC 2015, 2017, 2019 |
|
69 |
|
|
70 |
Other compilers might work, but are not officially supported. |
|
71 |
|
|
72 |
## Running Tesseract |
|
73 |
|
|
74 |
Basic **[command line usage](https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html)**: |
|
75 |
|
|
76 |
tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...] |
|
77 |
|
|
78 |
For more information about the various command line options use `tesseract --help` or `man tesseract`. |
|
79 |
|
|
80 |
Examples can be found in the [documentation](https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html#simplest-invocation-to-ocr-an-image). |
|
81 |
|
|
82 |
## For developers |
|
83 |
|
|
84 |
Developers can use `libtesseract` [C](https://github.com/tesseract-ocr/tesseract/blob/master/include/tesseract/capi.h) or |
|
85 |
[C++](https://github.com/tesseract-ocr/tesseract/blob/master/include/tesseract/baseapi.h) API to build their own application. |
|
86 |
If you need bindings to `libtesseract` for other programming languages, please see the |
|
87 |
[wrapper](https://tesseract-ocr.github.io/tessdoc/AddOns.html#tesseract-wrappers) section in the AddOns documentation. |
|
88 |
|
|
89 |
Documentation of Tesseract generated from source code by doxygen can be found on [tesseract-ocr.github.io](https://tesseract-ocr.github.io/). |
|
90 |
|
|
91 |
## Support |
|
92 |
|
|
93 |
Before you submit an issue, please review **[the guidelines for this repository](https://github.com/tesseract-ocr/tesseract/blob/master/CONTRIBUTING.md)**. |
|
94 |
|
|
95 |
For support, first read the [documentation](https://tesseract-ocr.github.io/tessdoc/), |
|
96 |
particularly the [FAQ](https://tesseract-ocr.github.io/tessdoc/FAQ.html) to see if your problem is addressed there. |
|
97 |
If not, search the [Tesseract user forum](https://groups.google.com/d/forum/tesseract-ocr), the [Tesseract developer forum](https://groups.google.com/d/forum/tesseract-dev) and [past issues](https://github.com/tesseract-ocr/tesseract/issues), and if you still can't find what you need, ask for support in the mailing-lists. |
|
98 |
|
|
99 |
Mailing-lists: |
|
100 |
* [tesseract-ocr](https://groups.google.com/d/forum/tesseract-ocr) - For tesseract users. |
|
101 |
* [tesseract-dev](https://groups.google.com/d/forum/tesseract-dev) - For tesseract developers. |
|
102 |
|
|
103 |
Please report an issue only for a **bug**, not for asking questions. |
|
104 |
|
|
105 |
## License |
|
106 |
|
|
107 |
The code in this repository is licensed under the Apache License, Version 2.0 (the "License"); |
|
108 |
you may not use this file except in compliance with the License. |
|
109 |
You may obtain a copy of the License at |
|
110 |
|
|
111 |
http://www.apache.org/licenses/LICENSE-2.0 |
|
112 |
|
|
113 |
Unless required by applicable law or agreed to in writing, software |
|
114 |
distributed under the License is distributed on an "AS IS" BASIS, |
|
115 |
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
116 |
See the License for the specific language governing permissions and |
|
117 |
limitations under the License. |
|
118 |
|
|
119 |
**NOTE**: This software depends on other packages that may be licensed under different open source licenses. |
|
120 |
|
|
121 |
Tesseract uses [Leptonica library](http://leptonica.com/) which essentially |
|
122 |
uses a [BSD 2-clause license](http://leptonica.com/about-the-license.html). |
|
123 |
|
|
124 |
## Dependencies |
|
125 |
|
|
126 |
Tesseract uses [Leptonica library](https://github.com/DanBloomberg/leptonica) |
|
127 |
for opening input images (i.e. image files, not documents such as PDF). |
|
128 |
It is suggested to use leptonica with built-in support for [zlib](https://zlib.net), |
|
129 |
[png](https://sourceforge.net/projects/libpng) and |
|
130 |
[tiff](http://www.simplesystems.org/libtiff) (for multipage tiff). |
|
131 |
|
|
132 |
## Latest Version of README |
|
133 |
|
|
134 |
For the latest online version of the README.md see: |
|
135 |
|
|
136 |
https://github.com/tesseract-ocr/tesseract/blob/master/README.md |
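
Reviewer note on the "Running Tesseract" section of README.md above: the synopsis `tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]` can be mirrored by a small argument builder. This is a minimal sketch, not part of the commit; `build_tesseract_command` is a hypothetical name, and the file names, language code and config name in the usage note are illustrative assumptions.

```python
def build_tesseract_command(image, output_base, lang=None, oem=None,
                            psm=None, configs=()):
    """Assemble an argument list in the order given by the README synopsis.

    Hypothetical helper: it only builds the list and does not run Tesseract.
    Optional parts are left out when the corresponding argument is None.
    """
    cmd = ["tesseract", str(image), str(output_base)]
    if lang is not None:
        cmd += ["-l", lang]          # [-l lang]
    if oem is not None:
        cmd += ["--oem", str(oem)]   # [--oem ocrenginemode]
    if psm is not None:
        cmd += ["--psm", str(psm)]   # [--psm pagesegmode]
    cmd += list(configs)             # [configfiles...]
    return cmd
```

For example, `build_tesseract_command("page.png", "page", lang="eng", oem=0)` selects the Legacy OCR Engine mode that the README describes as `--oem 0`; the resulting list could then be passed to `subprocess.run(..., check=True)` on a machine where the `tesseract` binary and suitable traineddata files are installed.
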
DTI_PID/DTI_PID/Tesseract-OCR/lstmeval.1.html | ||
---|---|---|
1 |
<!DOCTYPE html> |
|
2 |
<html lang="en"> |
|
3 |
<head> |
|
4 |
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
|
5 |
<meta name="generator" content="AsciiDoc 8.6.10"> |
|
6 |
<title>LSTMEVAL(1)</title> |
|
7 |
<style type="text/css"> |
|
8 |
/* Shared CSS for AsciiDoc xhtml11 and html5 backends */ |
|
9 |
|
|
10 |
/* Default font. */ |
|
11 |
body { |
|
12 |
font-family: Georgia,serif; |
|
13 |
} |
|
14 |
|
|
15 |
/* Title font. */ |
|
16 |
h1, h2, h3, h4, h5, h6, |
|
17 |
div.title, caption.title, |
|
18 |
thead, p.table.header, |
|
19 |
#toctitle, |
|
20 |
#author, #revnumber, #revdate, #revremark, |
|
21 |
#footer { |
|
22 |
font-family: Arial,Helvetica,sans-serif; |
|
23 |
} |
|
24 |
|
|
25 |
body { |
|
26 |
margin: 1em 5% 1em 5%; |
|
27 |
} |
|
28 |
|
|
29 |
a { |
|
30 |
color: blue; |
|
31 |
text-decoration: underline; |
|
32 |
} |
|
33 |
a:visited { |
|
34 |
color: fuchsia; |
|
35 |
} |
|
36 |
|
|
37 |
em { |
|
38 |
font-style: italic; |
|
39 |
color: navy; |
|
40 |
} |
|
41 |
|
|
42 |
strong { |
|
43 |
font-weight: bold; |
|
44 |
color: #083194; |
|
45 |
} |
|
46 |
|
|
47 |
h1, h2, h3, h4, h5, h6 { |
|
48 |
color: #527bbd; |
|
49 |
margin-top: 1.2em; |
|
50 |
margin-bottom: 0.5em; |
|
51 |
line-height: 1.3; |
|
52 |
} |
|
53 |
|
|
54 |
h1, h2, h3 { |
|
55 |
border-bottom: 2px solid silver; |
|
56 |
} |
|
57 |
h2 { |
|
58 |
padding-top: 0.5em; |
|
59 |
} |
|
60 |
h3 { |
|
61 |
float: left; |
|
62 |
} |
|
63 |
h3 + * { |
|
64 |
clear: left; |
|
65 |
} |
|
66 |
h5 { |
|
67 |
font-size: 1.0em; |
|
68 |
} |
|
69 |
|
|
70 |
div.sectionbody { |
|
71 |
margin-left: 0; |
|
72 |
} |
|
73 |
|
|
74 |
hr { |
|
75 |
border: 1px solid silver; |
|
76 |
} |
|
77 |
|
|
78 |
p { |
|
79 |
margin-top: 0.5em; |
|
80 |
margin-bottom: 0.5em; |
|
81 |
} |
|
82 |
|
|
83 |
ul, ol, li > p { |
|
84 |
margin-top: 0; |
|
85 |
} |
|
86 |
ul > li { color: #aaa; } |
|
87 |
ul > li > * { color: black; } |
|
88 |
|
|
89 |
.monospaced, code, pre { |
|
90 |
font-family: "Courier New", Courier, monospace; |
|
91 |
font-size: inherit; |
|
92 |
color: navy; |
|
93 |
padding: 0; |
|
94 |
margin: 0; |
|
95 |
} |
|
96 |
pre { |
|
97 |
white-space: pre-wrap; |
|
98 |
} |
|
99 |
|
|
100 |
#author { |
|
101 |
color: #527bbd; |
|
102 |
font-weight: bold; |
|
103 |
font-size: 1.1em; |
|
104 |
} |
|
105 |
#email { |
|
106 |
} |
|
107 |
#revnumber, #revdate, #revremark { |
|
108 |
} |
|
109 |
|
|
110 |
#footer { |
|
111 |
font-size: small; |
|
112 |
border-top: 2px solid silver; |
|
113 |
padding-top: 0.5em; |
|
114 |
margin-top: 4.0em; |
|
115 |
} |
|
116 |
#footer-text { |
|
117 |
float: left; |
|
118 |
padding-bottom: 0.5em; |
|
119 |
} |
|
120 |
#footer-badges { |
|
121 |
float: right; |
|
122 |
padding-bottom: 0.5em; |
|
123 |
} |
|
124 |
|
|
125 |
#preamble { |
|
126 |
margin-top: 1.5em; |
|
127 |
margin-bottom: 1.5em; |
|
128 |
} |
|
129 |
div.imageblock, div.exampleblock, div.verseblock, |
|
130 |
div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock, |
|
131 |
div.admonitionblock { |
|
132 |
margin-top: 1.0em; |
|
133 |
margin-bottom: 1.5em; |
|
134 |
} |
|
135 |
div.admonitionblock { |
|
136 |
margin-top: 2.0em; |
|
137 |
margin-bottom: 2.0em; |
|
138 |
margin-right: 10%; |
|
139 |
color: #606060; |
|
140 |
} |
|
141 |
|
|
142 |
div.content { /* Block element content. */ |
|
143 |
padding: 0; |
|
144 |
} |
|
145 |
|
|
146 |
/* Block element titles. */ |
|
147 |
div.title, caption.title { |
|
148 |
color: #527bbd; |
|
149 |
font-weight: bold; |
|
150 |
text-align: left; |
|
151 |
margin-top: 1.0em; |
|
152 |
margin-bottom: 0.5em; |
|
153 |
} |
|
154 |
div.title + * { |
|
155 |
margin-top: 0; |
|
156 |
} |
|
157 |
|
|
158 |
td div.title:first-child { |
|
159 |
margin-top: 0.0em; |
|
160 |
} |
|
161 |
div.content div.title:first-child { |
|
162 |
margin-top: 0.0em; |
|
163 |
} |
|
164 |
div.content + div.title { |
|
165 |
margin-top: 0.0em; |
|
166 |
} |
|
167 |
|
|
168 |
div.sidebarblock > div.content { |
|
169 |
background: #ffffee; |
|
170 |
border: 1px solid #dddddd; |
|
171 |
border-left: 4px solid #f0f0f0; |
|
172 |
padding: 0.5em; |
|
173 |
} |
|
174 |
|
|
175 |
div.listingblock > div.content { |
|
176 |
border: 1px solid #dddddd; |
|
177 |
border-left: 5px solid #f0f0f0; |
|
178 |
background: #f8f8f8; |
|
179 |
padding: 0.5em; |
|
180 |
} |
|
181 |
|
|
182 |
div.quoteblock, div.verseblock { |
|
183 |
padding-left: 1.0em; |
|
184 |
margin-left: 1.0em; |
|
185 |
margin-right: 10%; |
|
186 |
border-left: 5px solid #f0f0f0; |
|
187 |
color: #888; |
|
188 |
} |
|
189 |
|
|
190 |
div.quoteblock > div.attribution { |
|
191 |
padding-top: 0.5em; |
|
192 |
text-align: right; |
|
193 |
} |
|
194 |
|
|
195 |
div.verseblock > pre.content { |
|
196 |
font-family: inherit; |
|
197 |
font-size: inherit; |
|
198 |
} |
|
199 |
div.verseblock > div.attribution { |
|
200 |
padding-top: 0.75em; |
|
201 |
text-align: left; |
|
202 |
} |
|
203 |
/* DEPRECATED: Pre version 8.2.7 verse style literal block. */ |
|
204 |
div.verseblock + div.attribution { |
|
205 |
text-align: left; |
|
206 |
} |
|
207 |
|
|
208 |
div.admonitionblock .icon { |
|
209 |
vertical-align: top; |
|
210 |
font-size: 1.1em; |
|
211 |
font-weight: bold; |
|
212 |
text-decoration: underline; |
|
213 |
color: #527bbd; |
|
214 |
padding-right: 0.5em; |
|
215 |
} |
|
216 |
div.admonitionblock td.content { |
|
217 |
padding-left: 0.5em; |
|
218 |
border-left: 3px solid #dddddd; |
|
219 |
} |
|
220 |
|
|
221 |
div.exampleblock > div.content { |
|
222 |
border-left: 3px solid #dddddd; |
|
223 |
padding-left: 0.5em; |
|
224 |
} |
|
225 |
|
|
226 |
div.imageblock div.content { padding-left: 0; } |
|
227 |
span.image img { border-style: none; vertical-align: text-bottom; } |
|
228 |
a.image:visited { color: white; } |
|
229 |
|
|
230 |
dl { |
|
231 |
margin-top: 0.8em; |
|
232 |
margin-bottom: 0.8em; |
|
233 |
} |
|
234 |
dt { |
|
235 |
margin-top: 0.5em; |
|
236 |
margin-bottom: 0; |
|
237 |
font-style: normal; |
|
238 |
color: navy; |
|
239 |
} |
|
240 |
dd > *:first-child { |
|
241 |
margin-top: 0.1em; |
|
242 |
} |
|
243 |
|
|
244 |
ul, ol { |
|
245 |
list-style-position: outside; |
|
246 |
} |
|
247 |
ol.arabic { |
|
248 |
list-style-type: decimal; |
|
249 |
} |
|
250 |
ol.loweralpha { |
|
251 |
list-style-type: lower-alpha; |
|
252 |
} |
|
253 |
ol.upperalpha { |
|
254 |
list-style-type: upper-alpha; |
|
255 |
} |
|
256 |
ol.lowerroman { |
|
257 |
list-style-type: lower-roman; |
|
258 |
} |
|
259 |
ol.upperroman { |
|
260 |
list-style-type: upper-roman; |
|
261 |
} |
|
262 |
|
|
263 |
div.compact ul, div.compact ol, |
|
264 |
div.compact p, div.compact p, |
|
265 |
div.compact div, div.compact div { |
|
266 |
margin-top: 0.1em; |
|
267 |
margin-bottom: 0.1em; |
|
268 |
} |
|
269 |
|
|
270 |
tfoot { |
|
271 |
font-weight: bold; |
|
272 |
} |
|
273 |
td > div.verse { |
|
274 |
white-space: pre; |
|
275 |
} |
|
276 |
|
|
277 |
div.hdlist { |
|
278 |
margin-top: 0.8em; |
|
279 |
margin-bottom: 0.8em; |
|
280 |
} |
|
281 |
div.hdlist tr { |
|
282 |
padding-bottom: 15px; |
|
283 |
} |
|
284 |
dt.hdlist1.strong, td.hdlist1.strong { |
|
285 |
font-weight: bold; |
|
286 |
} |
|
287 |
td.hdlist1 { |
|
288 |
vertical-align: top; |
|
289 |
font-style: normal; |
|
290 |
padding-right: 0.8em; |
|
291 |
color: navy; |
|
292 |
} |
|
293 |
td.hdlist2 { |
|
294 |
vertical-align: top; |
|
295 |
} |
|
296 |
div.hdlist.compact tr { |
|
297 |
margin: 0; |
|
298 |
padding-bottom: 0; |
|
299 |
} |
|
300 |
|
|
301 |
.comment { |
|
302 |
background: yellow; |
|
303 |
} |
|
304 |
|
|
305 |
.footnote, .footnoteref { |
|
306 |
font-size: 0.8em; |
|
307 |
} |
|
308 |
|
|
309 |
span.footnote, span.footnoteref { |
|
310 |
vertical-align: super; |
|
311 |
} |
|
312 |
|
|
313 |
#footnotes { |
|
314 |
margin: 20px 0 20px 0; |
|
315 |
padding: 7px 0 0 0; |
|
316 |
} |
|
317 |
|
|
318 |
#footnotes div.footnote { |
|
319 |
margin: 0 0 5px 0; |
|
320 |
} |
|
321 |
|
|
322 |
#footnotes hr { |
|
323 |
border: none; |
|
324 |
border-top: 1px solid silver; |
|
325 |
height: 1px; |
|
326 |
text-align: left; |
|
327 |
margin-left: 0; |
|
328 |
width: 20%; |
|
329 |
min-width: 100px; |
|
330 |
} |
|
331 |
|
|
332 |
div.colist td { |
|
333 |
padding-right: 0.5em; |
|
334 |
padding-bottom: 0.3em; |
|
335 |
vertical-align: top; |
|
336 |
} |
|
337 |
div.colist td img { |
|
338 |
margin-top: 0.3em; |
|
339 |
} |
|
340 |
|
|
341 |
@media print { |
|
342 |
#footer-badges { display: none; } |
|
343 |
} |
|
344 |
|
|
345 |
#toc { |
|
346 |
margin-bottom: 2.5em; |
|
347 |
} |
|
348 |
|
|
349 |
#toctitle { |
|
350 |
color: #527bbd; |
|
351 |
font-size: 1.1em; |
|
352 |
font-weight: bold; |
|
353 |
margin-top: 1.0em; |
|
354 |
margin-bottom: 0.1em; |
|
355 |
} |
|
356 |
|
|
357 |
div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 { |
|
358 |
margin-top: 0; |
|
359 |
margin-bottom: 0; |
|
360 |
} |
|
361 |
div.toclevel2 { |
|
362 |
margin-left: 2em; |
|
363 |
font-size: 0.9em; |
|
364 |
} |
|
365 |
div.toclevel3 { |
|
366 |
margin-left: 4em; |
|
367 |
font-size: 0.9em; |
|
368 |
} |
|
369 |
div.toclevel4 { |
|
370 |
margin-left: 6em; |
|
371 |
font-size: 0.9em; |
|
372 |
} |
|
373 |
|
|
374 |
span.aqua { color: aqua; } |
|
375 |
span.black { color: black; } |
|
376 |
span.blue { color: blue; } |
|
377 |
span.fuchsia { color: fuchsia; } |
|
378 |
span.gray { color: gray; } |
|
379 |
span.green { color: green; } |
|
380 |
span.lime { color: lime; } |
|
381 |
span.maroon { color: maroon; } |
|
382 |
span.navy { color: navy; } |
|
383 |
span.olive { color: olive; } |
|
384 |
span.purple { color: purple; } |
|
385 |
span.red { color: red; } |
|
386 |
span.silver { color: silver; } |
|
387 |
span.teal { color: teal; } |
|
388 |
span.white { color: white; } |
|
389 |
span.yellow { color: yellow; } |
|
390 |
|
|
391 |
span.aqua-background { background: aqua; } |
|
392 |
span.black-background { background: black; } |
|
393 |
span.blue-background { background: blue; } |
|
394 |
span.fuchsia-background { background: fuchsia; } |
|
395 |
span.gray-background { background: gray; } |
|
396 |
span.green-background { background: green; } |
|
397 |
span.lime-background { background: lime; } |
|
398 |
span.maroon-background { background: maroon; } |
|
399 |
span.navy-background { background: navy; } |
|
400 |
span.olive-background { background: olive; } |
|
401 |
span.purple-background { background: purple; } |
|
402 |
span.red-background { background: red; } |
|
403 |
span.silver-background { background: silver; } |
|
404 |
span.teal-background { background: teal; } |
|
405 |
span.white-background { background: white; } |
|
406 |
span.yellow-background { background: yellow; } |
|
407 |
|
|
408 |
span.big { font-size: 2em; } |
|
409 |
span.small { font-size: 0.6em; } |
|
410 |
|
|
411 |
span.underline { text-decoration: underline; } |
|
412 |
span.overline { text-decoration: overline; } |
|
413 |
span.line-through { text-decoration: line-through; } |
|
414 |
|
|
415 |
div.unbreakable { page-break-inside: avoid; } |
|
416 |
|
|
417 |
|
|
418 |
/* |
|
419 |
* xhtml11 specific |
|
420 |
* |
|
421 |
* */ |
|
422 |
|
|
423 |
div.tableblock { |
|
424 |
margin-top: 1.0em; |
|
425 |
margin-bottom: 1.5em; |
|
426 |
} |
|
427 |
div.tableblock > table { |
|
428 |
border: 3px solid #527bbd; |
|
429 |
} |
|
430 |
thead, p.table.header { |
|
431 |
font-weight: bold; |
|
432 |
color: #527bbd; |
|
433 |
} |
|
434 |
p.table { |
|
435 |
margin-top: 0; |
|
436 |
} |
|
437 |
/* Because the table frame attribute is overridden by CSS in most browsers. */ |
|
438 |
div.tableblock > table[frame="void"] { |
|
439 |
border-style: none; |
|
440 |
} |
|
441 |
div.tableblock > table[frame="hsides"] { |
|
442 |
border-left-style: none; |
|
443 |
border-right-style: none; |
|
444 |
} |
|
445 |
div.tableblock > table[frame="vsides"] { |
|
446 |
border-top-style: none; |
|
447 |
border-bottom-style: none; |
|
448 |
} |
|
449 |
|
|
450 |
|
|
451 |
/* |
|
452 |
* html5 specific |
|
453 |
* |
|
454 |
* */ |
|
455 |
|
|
456 |
table.tableblock { |
|
457 |
margin-top: 1.0em; |
|
458 |
margin-bottom: 1.5em; |
|
459 |
} |
|
460 |
thead, p.tableblock.header { |
|
461 |
font-weight: bold; |
|
462 |
color: #527bbd; |
|
463 |
} |
|
464 |
p.tableblock { |
|
465 |
margin-top: 0; |
|
466 |
} |
|
467 |
table.tableblock { |
|
468 |
border-width: 3px; |
|
469 |
border-spacing: 0px; |
|
470 |
border-style: solid; |
|
471 |
border-color: #527bbd; |
|
472 |
border-collapse: collapse; |
|
473 |
} |
|
474 |
th.tableblock, td.tableblock { |
|
475 |
border-width: 1px; |
|
476 |
padding: 4px; |
|
477 |
border-style: solid; |
|
478 |
border-color: #527bbd; |
|
479 |
} |
|
480 |
|
|
481 |
table.tableblock.frame-topbot { |
|
482 |
border-left-style: hidden; |
|
483 |
border-right-style: hidden; |
|
484 |
} |
|
485 |
table.tableblock.frame-sides { |
|
486 |
border-top-style: hidden; |
|
487 |
border-bottom-style: hidden; |
|
488 |
} |
|
489 |
table.tableblock.frame-none { |
|
490 |
border-style: hidden; |
|
491 |
} |
|
492 |
|
|
493 |
th.tableblock.halign-left, td.tableblock.halign-left { |
|
494 |
text-align: left; |
|
495 |
} |
|
496 |
th.tableblock.halign-center, td.tableblock.halign-center { |
|
497 |
text-align: center; |
|
498 |
} |
|
499 |
th.tableblock.halign-right, td.tableblock.halign-right { |
|
500 |
text-align: right; |
|
501 |
} |
|
502 |
|
|
503 |
th.tableblock.valign-top, td.tableblock.valign-top { |
|
504 |
vertical-align: top; |
|
505 |
} |
|
506 |
th.tableblock.valign-middle, td.tableblock.valign-middle { |
|
507 |
vertical-align: middle; |
|
508 |
} |
|
509 |
th.tableblock.valign-bottom, td.tableblock.valign-bottom { |
|
510 |
vertical-align: bottom; |
|
511 |
} |
|
512 |
|
|
513 |
|
|
514 |
/* |
|
515 |
* manpage specific |
|
516 |
* |
|
517 |
* */ |
|
518 |
|
|
519 |
body.manpage h1 { |
|
520 |
padding-top: 0.5em; |
|
521 |
padding-bottom: 0.5em; |
|
522 |
border-top: 2px solid silver; |
|
523 |
border-bottom: 2px solid silver; |
|
524 |
} |
|
525 |
body.manpage h2 { |
|
526 |
border-style: none; |
|
527 |
} |
|
528 |
body.manpage div.sectionbody { |
|
529 |
margin-left: 3em; |
|
530 |
} |
|
531 |
|
|
532 |
@media print { |
|
533 |
body.manpage div#toc { display: none; } |
|
534 |
} |
|
535 |
|
|
536 |
|
|
537 |
</style> |
|
538 |
<script type="text/javascript"> |
|
539 |
/*<![CDATA[*/ |
|
540 |
var asciidoc = { // Namespace. |
|
541 |
|
|
542 |
///////////////////////////////////////////////////////////////////// |
|
543 |
// Table Of Contents generator |
|
544 |
///////////////////////////////////////////////////////////////////// |
|
545 |
|
|
546 |
/* Author: Mihai Bazon, September 2002 |
|
547 |
* http://students.infoiasi.ro/~mishoo |
|
548 |
* |
|
549 |
* Table Of Content generator |
|
550 |
* Version: 0.4 |
|
551 |
* |
|
552 |
* Feel free to use this script under the terms of the GNU General Public |
|
553 |
* License, as long as you do not remove or alter this notice. |
|
554 |
*/ |
|
555 |
|
|
556 |
/* modified by Troy D. Hanson, September 2006. License: GPL */ |
|
557 |
/* modified by Stuart Rackham, 2006, 2009. License: GPL */ |
|
558 |
|
|
559 |
// toclevels = 1..4. |
|
560 |
toc: function (toclevels) { |
|
561 |
|
|
562 |
function getText(el) { |
|
563 |
var text = ""; |
|
564 |
for (var i = el.firstChild; i != null; i = i.nextSibling) { |
|
565 |
if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants. |
|
566 |
text += i.data; |
|
567 |
else if (i.firstChild != null) |
|
568 |
text += getText(i); |
|
569 |
} |
|
570 |
return text; |
|
571 |
} |
|
572 |
|
|
573 |
function TocEntry(el, text, toclevel) { |
|
574 |
this.element = el; |
|
575 |
this.text = text; |
|
576 |
this.toclevel = toclevel; |
|
577 |
} |
|
578 |
|
|
579 |
function tocEntries(el, toclevels) { |
|
580 |
var result = new Array; |
|
581 |
var re = new RegExp('[hH]([1-'+(toclevels+1)+'])'); |
|
582 |
// Function that scans the DOM tree for header elements (the DOM2 |
|
583 |
// nodeIterator API would be a better technique but not supported by all |
|
584 |
// browsers). |
|
585 |
var iterate = function (el) { |
|
586 |
for (var i = el.firstChild; i != null; i = i.nextSibling) { |
|
587 |
if (i.nodeType == 1 /* Node.ELEMENT_NODE */) { |
|
588 |
var mo = re.exec(i.tagName); |
|
589 |
if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") { |
|
590 |
result[result.length] = new TocEntry(i, getText(i), mo[1]-1); |
|
591 |
} |
|
592 |
iterate(i); |
|
593 |
} |
|
594 |
} |
|
595 |
} |
|
596 |
iterate(el); |
|
597 |
return result; |
|
598 |
} |
|
599 |
|
|
600 |
var toc = document.getElementById("toc"); |
|
601 |
if (!toc) { |
|
602 |
return; |
|
603 |
} |
|
604 |
|
|
605 |
// Delete existing TOC entries in case we're reloading the TOC. |
|
606 |
var tocEntriesToRemove = []; |
|
607 |
var i; |
|
608 |
for (i = 0; i < toc.childNodes.length; i++) { |
|
609 |
var entry = toc.childNodes[i]; |
|
610 |
if (entry.nodeName.toLowerCase() == 'div' |
|
611 |
&& entry.getAttribute("class") |
|
612 |
&& entry.getAttribute("class").match(/^toclevel/)) |
|
613 |
tocEntriesToRemove.push(entry); |
|
614 |
} |
|
615 |
for (i = 0; i < tocEntriesToRemove.length; i++) { |
|
616 |
toc.removeChild(tocEntriesToRemove[i]); |
|
617 |
} |
|
618 |
|
|
619 |
// Rebuild TOC entries. |
|
620 |
var entries = tocEntries(document.getElementById("content"), toclevels); |
|
621 |
for (var i = 0; i < entries.length; ++i) { |
|
622 |
var entry = entries[i]; |
|
623 |
if (entry.element.id == "") |
|
624 |
entry.element.id = "_toc_" + i; |
|
625 |
var a = document.createElement("a"); |
|
626 |
a.href = "#" + entry.element.id; |
|
627 |
a.appendChild(document.createTextNode(entry.text)); |
|
628 |
var div = document.createElement("div"); |
|
629 |
div.appendChild(a); |
|
630 |
div.className = "toclevel" + entry.toclevel; |
|
631 |
toc.appendChild(div); |
|
632 |
} |
|
633 |
if (entries.length == 0) |
|
634 |
toc.parentNode.removeChild(toc); |
|
635 |
}, |
|
636 |
|
|
637 |
|
|
638 |
///////////////////////////////////////////////////////////////////// |
|
639 |
// Footnotes generator |
|
640 |
///////////////////////////////////////////////////////////////////// |
|
641 |
|
|
642 |
/* Based on footnote generation code from: |
|
643 |
* http://www.brandspankingnew.net/archive/2005/07/format_footnote.html |
|
644 |
*/ |
|
645 |
|
|
646 |
footnotes: function () { |
|
647 |
// Delete existing footnote entries in case we're reloading the footnotes. |
|
648 |
var i; |
|
649 |
var noteholder = document.getElementById("footnotes"); |
|
650 |
if (!noteholder) { |
|
651 |
return; |
|
652 |
} |
|
653 |
var entriesToRemove = []; |
|
654 |
for (i = 0; i < noteholder.childNodes.length; i++) { |
|
655 |
var entry = noteholder.childNodes[i]; |
|
656 |
if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote") |
|
657 |
entriesToRemove.push(entry); |
|
658 |
} |
|
659 |
for (i = 0; i < entriesToRemove.length; i++) { |
|
660 |
noteholder.removeChild(entriesToRemove[i]); |
|
661 |
} |
|
662 |
|
|
663 |
// Rebuild footnote entries. |
|
664 |
var cont = document.getElementById("content"); |
|
665 |
var spans = cont.getElementsByTagName("span"); |
|
666 |
var refs = {}; |
|
667 |
var n = 0; |
|
668 |
for (i=0; i<spans.length; i++) { |
|
669 |
if (spans[i].className == "footnote") { |
|
670 |
n++; |
|
671 |
var note = spans[i].getAttribute("data-note"); |
|
672 |
if (!note) { |
|
673 |
// Use [\s\S] in place of . so multi-line matches work. |
|
674 |
// Because JavaScript has no s (dotall) regex flag. |
|
675 |
note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1]; |
|
676 |
spans[i].innerHTML = |
|
677 |
"[<a id='_footnoteref_" + n + "' href='#_footnote_" + n + |
|
678 |
"' title='View footnote' class='footnote'>" + n + "</a>]"; |
|
679 |
spans[i].setAttribute("data-note", note); |
|
680 |
} |
|
681 |
noteholder.innerHTML += |
|
682 |
"<div class='footnote' id='_footnote_" + n + "'>" + |
|
683 |
"<a href='#_footnoteref_" + n + "' title='Return to text'>" + |
|
684 |
n + "</a>. " + note + "</div>"; |
|
685 |
var id =spans[i].getAttribute("id"); |
|
686 |
if (id != null) refs["#"+id] = n; |
|
687 |
} |
|
688 |
} |
|
689 |
if (n == 0) |
|
690 |
noteholder.parentNode.removeChild(noteholder); |
|
691 |
else { |
|
692 |
// Process footnoterefs. |
|
693 |
for (i=0; i<spans.length; i++) { |
|
694 |
if (spans[i].className == "footnoteref") { |
|
695 |
var href = spans[i].getElementsByTagName("a")[0].getAttribute("href"); |
|
696 |
href = href.match(/#.*/)[0]; // Because IE returns the full URL. |
|
697 |
n = refs[href]; |
|
698 |
spans[i].innerHTML = |
|
699 |
"[<a href='#_footnote_" + n + |
|
700 |
"' title='View footnote' class='footnote'>" + n + "</a>]"; |
|
701 |
} |
|
702 |
} |
|
703 |
} |
|
704 |
}, |
|
705 |
|
|
706 |
install: function(toclevels) { |
|
707 |
var timerId; |
|
708 |
|
|
709 |
function reinstall() { |
|
710 |
asciidoc.footnotes(); |
|
711 |
if (toclevels) { |
|
712 |
asciidoc.toc(toclevels); |
|
713 |
} |
|
714 |
} |
|
715 |
|
|
716 |
function reinstallAndRemoveTimer() { |
|
717 |
clearInterval(timerId); |
|
718 |
reinstall(); |
|
719 |
} |
|
720 |
|
|
721 |
timerId = setInterval(reinstall, 500); |
|
722 |
if (document.addEventListener) |
|
723 |
document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false); |
|
724 |
else |
|
725 |
window.onload = reinstallAndRemoveTimer; |
|
726 |
} |
|
727 |
|
|
728 |
} |
|
729 |
asciidoc.install(); |
|
730 |
/*]]>*/ |
|
731 |
</script> |
|
732 |
</head> |
|
733 |
<body class="manpage"> |
|
734 |
<div id="header"> |
|
735 |
<h1> |
|
736 |
LSTMEVAL(1) Manual Page |
|
737 |
</h1> |
|
738 |
<h2>NAME</h2> |
|
739 |
<div class="sectionbody"> |
|
740 |
<p>lstmeval - |
|
741 |
Evaluation program for LSTM-based networks. |
|
742 |
</p> |
|
743 |
</div> |
|
744 |
</div> |
|
745 |
<div id="content"> |
|
746 |
<div class="sect1"> |
|
747 |
<h2 id="_synopsis">SYNOPSIS</h2> |
|
748 |
<div class="sectionbody"> |
|
749 |
<div class="paragraph"><p><strong>lstmeval</strong> --model <em>lang.lstm|modelname_checkpoint|modelname_N.NN_NN_NN.checkpoint</em> [--traineddata lang/lang.traineddata] --eval_listfile <em>lang.eval_files.txt</em> [--verbosity N] [--max_image_MB NNNN]</p></div> |
|
750 |
</div> |
|
751 |
</div> |
|
752 |
<div class="sect1"> |
|
753 |
<h2 id="_description">DESCRIPTION</h2> |
|
754 |
<div class="sectionbody"> |
|
755 |
<div class="paragraph"><p>lstmeval(1) evaluates LSTM-based networks. Either a recognition model or a training checkpoint can be given as input for evaluation along with a list of lstmf files. If evaluating a training checkpoint, <em>--traineddata</em> should also be specified. Intermediate training checkpoints can also be used.</p></div> |
|
756 |
</div> |
|
757 |
</div> |
|
758 |
<div class="sect1"> |
|
759 |
<h2 id="_options">OPTIONS</h2> |
|
760 |
<div class="sectionbody"> |
|
761 |
<div class="dlist"><dl> |
|
762 |
<dt class="hdlist1"> |
|
763 |
<em>--model FILE</em> |
|
764 |
</dt> |
|
765 |
<dd> |
|
766 |
<p> |
|
767 |
Name of model file (training or recognition) (type:string default:) |
|
768 |
</p> |
|
769 |
</dd> |
|
770 |
<dt class="hdlist1"> |
|
771 |
<em>--traineddata FILE</em> |
|
772 |
</dt> |
|
773 |
<dd> |
|
774 |
<p> |
|
775 |
If model is a training checkpoint, then traineddata must be the traineddata file that was given to the trainer (type:string default:) |
|
776 |
</p> |
|
777 |
</dd> |
|
778 |
<dt class="hdlist1"> |
|
779 |
<em>--eval_listfile FILE</em> |
|
780 |
</dt> |
|
781 |
<dd> |
|
782 |
<p> |
|
783 |
File listing sample files in lstmf training format. (type:string default:) |
|
784 |
</p> |
|
785 |
</dd> |
|
786 |
<dt class="hdlist1"> |
|
787 |
<em>--max_image_MB INT</em> |
|
788 |
</dt> |
|
789 |
<dd> |
|
790 |
<p> |
|
791 |
Max memory to use for images. (type:int default:2000) |
|
792 |
</p> |
|
793 |
</dd> |
|
794 |
<dt class="hdlist1"> |
|
795 |
<em>--verbosity INT</em> |
|
796 |
</dt> |
|
797 |
<dd> |
|
798 |
<p> |
|
799 |
Amount of diagnostic information to output (0-2). (type:int default:1) |
|
800 |
</p> |
|
801 |
</dd> |
|
802 |
</dl></div> |
|
803 |
</div> |
|
804 |
</div> |
|
805 |
<div class="sect1"> |
|
806 |
<h2 id="_history">HISTORY</h2> |
|
807 |
<div class="sectionbody"> |
|
808 |
<div class="paragraph"><p>lstmeval(1) was first made available in Tesseract 4.00.00alpha.</p></div> |
|
809 |
</div> |
|
810 |
</div> |
|
811 |
<div class="sect1"> |
|
812 |
<h2 id="_resources">RESOURCES</h2> |
|
813 |
<div class="sectionbody"> |
|
814 |
<div class="paragraph"><p>Main web site: <a href="https://github.com/tesseract-ocr">https://github.com/tesseract-ocr</a><br> |
|
815 |
Information on training tesseract LSTM: <a href="https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html">https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html</a></p></div> |
|
816 |
</div> |
|
817 |
</div> |
|
818 |
<div class="sect1"> |
|
819 |
<h2 id="_see_also">SEE ALSO</h2> |
|
820 |
<div class="sectionbody"> |
|
821 |
<div class="paragraph"><p>tesseract(1)</p></div> |
|
822 |
</div> |
|
823 |
</div> |
|
824 |
<div class="sect1"> |
|
825 |
<h2 id="_copying">COPYING</h2> |
|
826 |
<div class="sectionbody"> |
|
827 |
<div class="paragraph"><p>Copyright (C) 2012 Google, Inc. |
|
828 |
Licensed under the Apache License, Version 2.0</p></div> |
|
829 |
</div> |
|
830 |
</div> |
|
831 |
<div class="sect1"> |
|
832 |
<h2 id="_author">AUTHOR</h2> |
|
833 |
<div class="sectionbody"> |
|
834 |
<div class="paragraph"><p>The Tesseract OCR engine was written by Ray Smith and his research groups |
|
835 |
at Hewlett Packard (1985-1995) and Google (2006-present).</p></div> |
|
836 |
</div> |
|
837 |
</div> |
|
838 |
</div> |
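The SYNOPSIS and OPTIONS sections of the man page above can be illustrated with a small helper that assembles an lstmeval command line. This is only a sketch: it uses just the flags documented in the page (`--model`, `--traineddata`, `--eval_listfile`, `--verbosity`, `--max_image_MB`) and does not run the tool. The function name `build_lstmeval_command`, the file names `eng.lstm`, `eng_checkpoint`, `eng/eng.traineddata` and `eng.eval_files.txt`, and the rule that a model name ending in `checkpoint` denotes a training checkpoint are assumptions made for illustration, not part of Tesseract.

```python
import shlex


def build_lstmeval_command(model, eval_listfile, traineddata=None,
                           verbosity=None, max_image_mb=None):
    """Return the argv list for an lstmeval run without executing it.

    Hypothetical helper: only the flag names come from the man page.
    """
    # Assumption: both `modelname_checkpoint` and `*.checkpoint` names from
    # the SYNOPSIS end in "checkpoint", so that suffix marks a checkpoint.
    is_checkpoint = model.endswith("checkpoint")
    if is_checkpoint and traineddata is None:
        # The DESCRIPTION says --traineddata should be given for checkpoints.
        raise ValueError("a training checkpoint needs --traineddata")

    argv = ["lstmeval", "--model", model]
    if traineddata is not None:
        argv += ["--traineddata", traineddata]
    argv += ["--eval_listfile", eval_listfile]
    if verbosity is not None:
        # The OPTIONS section documents the range 0-2.
        if verbosity not in (0, 1, 2):
            raise ValueError("verbosity must be 0, 1 or 2")
        argv += ["--verbosity", str(verbosity)]
    if max_image_mb is not None:
        argv += ["--max_image_MB", str(max_image_mb)]
    return argv


if __name__ == "__main__":
    # Recognition model: no traineddata required.
    print(shlex.join(build_lstmeval_command("eng.lstm", "eng.eval_files.txt")))
    # Training checkpoint: traineddata supplied, extra diagnostics requested.
    print(shlex.join(build_lstmeval_command(
        "eng_checkpoint", "eng.eval_files.txt",
        traineddata="eng/eng.traineddata", verbosity=2)))
```

The returned list could be passed to `subprocess.run` on a machine where lstmeval is installed. Building it as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.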