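VietOCR.NET is a character recognition program written in C#. First convert the JPG, GIF, or BMP images to TIFF; the built-in Windows Paint application is enough for this. Then use VietOCR.NET-4.3 to merge the multiple TIFF files into one. VietOCR.NET is an OCR-based application designed to help you convert typewritten, printed, or scanned images into editable text.
The main window consists of two panels. The left panel shows the image you want to process and lets you extract the text it contains. The right panel previews the recognized text and is also where you make any necessary corrections.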
Features
Java & .NET GUI frontends for Tesseract OCR engine
Supports all languages provided by Tesseract
Supports automatic download and installation of language packs
PDF, TIFF, JPEG, GIF, PNG, BMP image formats
Paste image from clipboard
Selection box for Region of Interest (ROI)
File drag-and-drop
Bulk & batch operations
Text replacement postprocessing
Integrated scanning support
Spellcheck with Hunspell
Make Box Files
Open a command prompt in the directory that contains orderNo.tif and enter:
C:\Program Files\Tesseract-OCR>tesseract.exe lang.jhy.exp8.TIF lang.jhy.exp8 batch.nochop makebox
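The same call can be scripted when there are many images to process. The following is a minimal Python sketch, assuming `tesseract` is on the PATH. The helper names `build_makebox_command` and `make_box_file` are hypothetical and not part of Tesseract or VietOCR; the arguments are the ones used in the command above.

```python
import subprocess
from pathlib import Path


def build_makebox_command(image_path, tesseract_exe="tesseract"):
    """Return the argument list for generating a box file from an image.

    Hypothetical helper: it only mirrors the command line shown above.
    """
    image = Path(image_path)
    # Tesseract appends ".box" to the output base, so drop the image suffix.
    output_base = image.with_suffix("")
    return [tesseract_exe, str(image), str(output_base), "batch.nochop", "makebox"]


def make_box_file(image_path, tesseract_exe="tesseract"):
    """Run the command; raises CalledProcessError if Tesseract fails."""
    subprocess.run(build_makebox_command(image_path, tesseract_exe), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even on a machine without Tesseract installed.
    print(build_makebox_command("lang.jhy.exp8.TIF"))
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces, such as `C:\Program Files\Tesseract-OCR\tesseract.exe`.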
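Open orderNo.tif with jTessBoxEditor. Keep in mind that the orderNo.box file generated in step 2 must be in the same directory as orderNo.tif. Correct the characters one by one, then save.
Download the jTessBoxEditor tool to correct each character (note that the editor has a next-page control, so work through the corrections page by page).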
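Before opening the editor, it can help to check that the box file really sits next to the image and to look at its contents. The sketch below assumes the plain-text box format written by Tesseract 3.x, where each line is `symbol left bottom right top page` and coordinates are measured from the bottom-left corner of the image. The names `BoxEntry`, `parse_box_line`, `box_path_for`, and `load_boxes` are hypothetical helpers, not part of jTessBoxEditor.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BoxEntry:
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line):
    """Parse one box-file line of the assumed form 'symbol l b r t page'."""
    # Split from the right so the symbol itself is left untouched.
    parts = line.rstrip("\r\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected box line: {line!r}")
    symbol, *numbers = parts
    left, bottom, right, top, page = (int(n) for n in numbers)
    return BoxEntry(symbol, left, bottom, right, top, page)


def box_path_for(image_path):
    """Return the box-file path expected beside the image (same base name)."""
    return Path(image_path).with_suffix(".box")


def load_boxes(image_path):
    """Load all entries from the box file that accompanies image_path."""
    box_path = box_path_for(image_path)
    if not box_path.exists():
        raise FileNotFoundError(f"{box_path} must be in the same directory as the image")
    with box_path.open(encoding="utf-8") as handle:
        return [parse_box_line(line) for line in handle if line.strip()]
```

For example, `load_boxes("orderNo.tif")` looks for `orderNo.box` in the same directory and returns one `BoxEntry` per recognized character; the `page` field shows which page of a multi-page TIFF each box belongs to.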
Official description:
PDF, TIFF, JPEG, GIF, PNG, BMP image formats
Multi-page TIFF images
Screenshots
Selection box
File drag-and-drop
Paste image from clipboard
Postprocessing for Vietnamese to boost accuracy rate
Vietnamese input methods
Localized user interface for many languages (Localization project)
Integrated scanning support
Watch folder monitor for support of batch processing
Custom text replacement in postprocessing
Spellcheck with Hunspell
Support for downloading and installing language data packs and appropriate spell dictionaries