Tutorial 12. OCR Example

Python is used to run an OCR engine, tesseract-ocr, over all image files in a folder and produce a single text file as output. (Python file at pythonaudio.blogspot.com)
Closed Caption:

Slide 1: This is an example of a Python application. OCR stands for optical character recognition. Typically you have some image files, maybe from scanning or from using the Print Screen key on the keyboard. The objective is to process all of them with an OCR engine and convert them to a text file.
Slide 2: A good one is tesseract-ocr. You may get it at this web page. You can see from the information on this page that the program was developed at HP Labs between 1985 and 1995. Later on that page you can see that after 2006 it was further developed at Google. Furthermore, it is a very accurate OCR engine, and it is free.
Slide 3: On this page there are links to download the setup files for all operating systems. For Windows, this is the setup file.
Slide 4: After agreeing to the license and selecting the defaults, the OCR engine is successfully installed.
Slide 5: After installing it, open the command prompt. We can see that the program was correctly installed by typing its name with the version flag.
Slide 6: You can do the same thing in Python. You have to import the os module. The os module is for interacting with the operating system. One of its most useful functions is system, which allows passing any string to the command prompt so that it will be executed.
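A minimal sketch of the call described above. The echo command is only a stand-in chosen for illustration, because it exists on both Windows and Unix-like shells; it is not taken from the tutorial's program.

```python
import os

# os.system hands the string to the operating system's command processor
# and returns the exit status reported for the command.
# "echo" is a harmless stand-in command (an assumption for this sketch).
status = os.system("echo hello from the command prompt")

# A status of 0 conventionally means the command succeeded.
print("exit status:", status)
```

Any other command line, including a tesseract call, would be passed in the same way as a single string.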
Slide 7: To use tesseract-ocr, we have to write tesseract, then the image file name, and finally the text file name. For the image file we have to give the extension; for the text file name we do not, as tesseract always assumes it is txt.
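The command syntax above can be sketched as a small helper. The function name tesseract_command and the file name p09.png are hypothetical, introduced only for illustration; the command is executed only if tesseract is on the PATH and the image actually exists.

```python
import os
import shutil


def tesseract_command(image_name, text_base):
    """Build the command line: tesseract <image file> <output base name>.

    The image name carries its extension; the output name does not,
    because tesseract appends .txt itself.
    """
    return "tesseract %s %s" % (image_name, text_base)


# p09.png is a hypothetical file name used only for illustration.
command = tesseract_command("p09.png", "p09")
print(command)

# Only run the command if tesseract is installed and the image exists.
if shutil.which("tesseract") and os.path.exists("p09.png"):
    os.system(command)
```

File names containing spaces would need quoting before being placed in the string; this sketch assumes simple names.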
Slide 8: We can think of tesseract-ocr as the back end, and here we are using Python as the front end. That is, we interact with the front end, and it deals with the back end.
Slide 9: In our example, this is the content of a folder. Here we have ocr.py, the Python file to be explained later, and some image files.
Slide 10: We run the program using the %run command in IPython. We get outputs from our program as well as those from the OCR program.
Slide 11: Now the program ocr.py is described. First it reads the names of all image files and puts them in a Python list. This is done in lines 1 to 11. The format of the image file name is some prefix, some number, and finally the image extension, which is png in our example.
Slide 12: We have to find the maximum size, in characters, of the number field. We also have to find the prefix. We generate some helpful printouts so we know everything is working okay. Then we have a loop that runs for each image file. This step will generate as many text files as we have input image files. Next, all text files are read into a list. Finally, that list is saved to the file out.txt.
Slide 13: After the module os is imported, the current directory is found, and then we find the list of files in the current directory and assign it to the files list. Finally, this list is filtered so that only those files which have the correct extension are in the image-file list.
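The listing and filtering step can be sketched as follows. The names list_image_files and image_files are hypothetical and are not claimed to match the identifiers in ocr.py.

```python
import os


def list_image_files(folder, extension=".png"):
    """Return the names in `folder` that end with `extension`."""
    names = os.listdir(folder)
    # Keep only the names with the wanted extension.
    return [name for name in names if name.endswith(extension)]


current_dir = os.getcwd()
image_files = list_image_files(current_dir)
print(len(image_files), "image file(s) found in", current_dir)
```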
Slide 14: We find the length, in characters, of the longest image file name. This is done so certain functions can work, as we shall see later. A temporary list is created by a list comprehension. It holds the string lengths of all the image file names. We find the maximum length and subtract four from it. The reason we subtract is that the string .png takes four characters. This size covers the prefix as well as the number.
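A short sketch of this length calculation, using hypothetical file names (prefix p followed by a two-digit number) chosen only for illustration:

```python
# Hypothetical file names for illustration: prefix "p" plus a number field.
image_files = ["p09.png", "p10.png", "p11.png"]

# Temporary list of string lengths, built with a list comprehension.
lengths = [len(name) for name in image_files]

# ".png" is four characters, so subtracting 4 leaves prefix + number.
stem_size = max(lengths) - len(".png")
print(stem_size)  # 3: one character of prefix and two digits
```

Subtracting the length of the prefix from this value then gives the width of the number field alone.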
Slide 15: Now we find the prefix. An empty list is created, and then we iterate over all characters in the file name. We use the first image file, element 0 of the image-file list, since it always exists. We could always test for file existence, which would be a better design. Each character that is not a digit is appended to the empty list. Once a digit is found, we break out of the for loop and use the string function join to join all the characters in the list.
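The prefix search can be sketched like this. The function name find_prefix and the sample names are hypothetical; the guard on the list is one possible form of the safer design mentioned above.

```python
def find_prefix(file_name):
    """Collect the leading characters up to the first digit."""
    chars = []
    for ch in file_name:
        if ch.isdigit():
            break  # the number field starts here
        chars.append(ch)
    # Join the collected characters back into one string.
    return "".join(chars)


image_files = ["p09.png", "p10.png", "p11.png"]  # hypothetical names

# Guard against an empty list instead of assuming element 0 exists.
if image_files:
    prefix = find_prefix(image_files[0])
    print(prefix)
```

Note that a name with no digit at all would be returned whole, extension included, so real code may want to check for that case.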
Slide 16: Now we have two print statements, which serve as a check. We can immediately see whether the program calculated the prefix correctly and found the correct number of image files. You might want to print everything, such as the contents of the image-file list, if you want.
Slide 17: Now, for each image file in the image-file list, the OCR is run. A tuple with four elements is built. The first is the current looping variable, and thus is one of the image files. The second is the prefix, such as p. The third element is the size of the number field in characters. Finally, the number is extracted from the file name. This tuple will populate the four placeholders in the string expression. The first %s is the first element, the file name. The second %s refers to the prefix. Then we write the number, which is in zero-padded format. The symbol star indicates that the size is calculated at run time, since it might be different depending on the file names. The symbol star will get the maximum number-field length, so it will write nine as 09 in our example. The %d is the integer corresponding to the fourth element of the tuple.
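A sketch of the formatting step described above. The exact format string in ocr.py is not shown in the transcript, so the one below is an assumption that matches the description: two %s placeholders, then a zero-padded %d whose width is supplied by the star. The function name ocr_command and the way the number is sliced out of the name are also hypothetical.

```python
def ocr_command(file_name, prefix, size):
    """Build the tesseract command for one image file."""
    # The number sits between the prefix and the 4-character ".png".
    number = int(file_name[len(prefix):-4])
    values = (file_name, prefix, size, number)
    # %s -> file name, %s -> prefix, * -> field width taken from the
    # tuple, %d -> the number, zero-padded to that width.
    return "tesseract %s %s%0*d" % values


print(ocr_command("p9.png", "p", 2))   # tesseract p9.png p09
print(ocr_command("p10.png", "p", 2))  # tesseract p10.png p10
```

With a width of 2, the image p9.png produces the output name p09, so the resulting text files sort correctly next to p10 and p11.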
Slide 18: With the new files created, we again inquire about what files exist, and then we keep only the text files in the text-file list. Finally, this list is sorted so it is in the correct order, like 09, 10, 11.
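The filtering and sorting step can be sketched as below. The directory listing is hypothetical, and sorted_text_files is an illustrative name. The second print shows why the zero padding matters for string sorting.

```python
def sorted_text_files(names, extension=".txt"):
    """Keep only names ending in `extension`, in sorted order."""
    return sorted(name for name in names if name.endswith(extension))


# Hypothetical directory listing after the OCR loop has run.
listing = ["p10.txt", "ocr.py", "p09.txt", "p9.png", "p11.txt"]
print(sorted_text_files(listing))  # ['p09.txt', 'p10.txt', 'p11.txt']

# Without zero padding, string sorting would misplace the 9.
print(sorted(["p9.txt", "p10.txt", "p11.txt"]))  # ['p10.txt', 'p11.txt', 'p9.txt']
```

One thing to watch for with this approach: an out.txt left over from an earlier run also ends in .txt and would be picked up by the filter unless it is excluded or deleted first.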
Slide 19: Next, all the text files are read and then inserted into a list. The length of the list is not important; however, it is the number of lines in all text files.
Slide 20: Finally, that list is written to a new file called out.txt. The source code for ocr.py is at pythonaudio.blogspot.com.
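The read-then-write steps can be sketched as one helper. The function name merge_text_files and the sample file contents are hypothetical; the demonstration works in a temporary folder so it does not touch any real files.

```python
import os
import tempfile


def merge_text_files(text_files, out_path):
    """Read every file line by line and write all lines to `out_path`."""
    lines = []
    for name in text_files:
        with open(name, "r") as handle:
            lines.extend(handle.readlines())
    with open(out_path, "w") as out:
        out.writelines(lines)
    # The list length is the total number of lines in all text files.
    return len(lines)


# Demonstration with two small hypothetical files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for name, text in [("p09.txt", "page nine\n"), ("p10.txt", "page ten\n")]:
        path = os.path.join(folder, name)
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)
    out_path = os.path.join(folder, "out.txt")
    line_count = merge_text_files(paths, out_path)
    with open(out_path) as handle:
        merged = handle.read()

print(line_count)
print(merged)
```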
Slide 21: You will find additional information, including a larger image of the slides and the text of the audio, at pythonaudio.blogspot.com.

Video Length: 07:41
Uploaded By: PythonAudio
View Count: 3,498
