sureshot Posted July 31, 2013 (edited)

I'm trying to write a script to screencapture a window of on-screen text (a single line of between 1 and 10 characters, depending), and return the OCR'd result to a variable that I can then do whatever I want with. Sample image is attached.

So far I've used Tesseract UDF (actually, a modified one called "Simple Tesseract" found on these forums), and can successfully screencapture and crop to the desired region, and send the resulting image through the tesseract engine. However, Tesseract always returns an empty string. I've tried with MODI OCR too, and it can't recognize any text either.

"Simple Tesseract" UDF: (Requires Tesseract OCR to be installed)

#include-once
#Include <Array.au3>
#Include <File.au3>
#include <GDIPlus.au3>
#include <ScreenCapture.au3>
#include <WinAPI.au3>
#include <ScrollBarConstants.au3>
#include <WindowsConstants.au3>
#Include <GuiComboBox.au3>
#Include <GuiListBox.au3>
#EndRegion Header
#Region Global Variables and Constants
Global $last_capture
Global $tesseract_temp_path = "C:\"
#EndRegion Global Variables and Constants
#Region Core functions
; #FUNCTION# ;===============================================================================
;
; Name...........: _TesseractTempPathSet()
; Description ...: Sets the location where Tesseract functions temporary store their files.
;                  You must have read and write access to this location.
;                  The default location is "C:\".
; Syntax.........: _TesseractTempPathSet($temp_path)
; Parameters ....: $temp_path - The path to use for temporary file storage.
;                               This path must not contain any spaces (see "Remarks" below).
; Return values .: On Success - Returns 1.
;                  On Failure - Returns 0.
; Author ........: seangriffin
; Modified.......:
; Remarks .......: The current version of Tesseract doesn't support paths with spaces.
; Related .......:
; Link ..........:
; Example .......: No
;
; ;==========================================================================================
func _TesseractTempPathSet($temp_path)

    $tesseract_temp_path = $temp_path

    Return 1
EndFunc

; #FUNCTION# ;===============================================================================
;
; Name...........: _TesseractScreenCapture()
; Description ...: Captures text from the screen.
; Syntax.........: _TesseractScreenCapture($get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0, $show_capture = 0)
; Parameters ....: $get_last_capture - Retrieve the text of the last capture, rather than
;                                      performing another capture. Useful if the text in
;                                      the window or control hasn't changed since the last capture.
;                                      0 = do not retrieve the last capture (default)
;                                      1 = retrieve the last capture
;                  $delimiter        - Optional: The string that delimits elements in the text.
;                                      A string of text will be returned if this isn't provided.
;                                      An array of delimited text will be returned if this is provided.
;                                      Eg. Use @CRLF to return the items of a listbox as an array.
;                  $cleanup          - Optional: Remove invalid text recognised
;                                      0 = do not remove invalid text
;                                      1 = remove invalid text (default)
;                  $scale            - Optional: The scaling factor of the screenshot prior to text recognition.
;                                      Increase this number to improve accuracy.
;                                      The default is 2.
;                  $iLeft            - x-Left coordinate
;                  $iTop             - y-Top coordinate
;                  $iRight           - x-Right coordinate
;                  $iBottom          - y-Bottom coordinate
;                  $show_capture     - Display screenshot and text captures
;                                      (for debugging purposes).
;                                      0 = do not display the screenshot taken (default)
;                                      1 = display the screenshot taken and exit
; Return values .: On Success - Returns an array of text that was captured.
;                  On Failure - Returns an empty array.
; Author ........: seangriffin
; Modified.......:
; Remarks .......: Use the default values for first time use. If the text recognition accuracy is low,
;                  I suggest setting $show_capture to 1 and rerunning. If the screenshot of the
;                  window or control includes borders or erroneous pixels that may interfere with
;                  the text recognition process, then use $left_indent, $top_indent, $right_indent and
;                  $bottom_indent to adjust the portion of the screen being captured, to
;                  exclude these non-textural elements.
;                  If text accuracy is still low, increase the $scale parameter. In general, the higher
;                  the scale the clearer the font and the more accurate the text recognition.
; Related .......:
; Link ..........:
; Example .......: No
;
; ;==========================================================================================
func _TesseractScreenCapture($get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $iLeft = 0, $iTop = 0, $iRight = 1, $iBottom = 1, $show_capture = 0)

    Local $tInfo
    dim $aArray, $final_ocr[1], $xyPos_old = -1, $capture_scale = 3
    Local $tSCROLLINFO = DllStructCreate($tagSCROLLINFO)
    DllStructSetData($tSCROLLINFO, "cbSize", DllStructGetSize($tSCROLLINFO))
    DllStructSetData($tSCROLLINFO, "fMask", $SIF_ALL)

    If $last_capture = "" Then
        $last_capture = ObjCreate("Scripting.Dictionary")
    EndIf

    ; if last capture is requested, and one exists.
    If $get_last_capture = 1 And $last_capture.item(0) <> "" Then
        Return $last_capture.item(0)
    EndIf

    $capture_filename = _TempFile($tesseract_temp_path, "~", ".tif")
    $ocr_filename = StringLeft($capture_filename, StringLen($capture_filename) - 4)
    $ocr_filename_and_ext = $ocr_filename & ".txt"

    CaptureToTIFF("", "", "", $capture_filename, $scale, $iLeft , $iTop , $iRight , $iBottom )

    ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename & " digits")

    ; If no delimter specified, then return a string
    If StringCompare($delimiter, "") = 0 Then
        $final_ocr = FileRead($ocr_filename_and_ext)
    Else
        _FileReadToArray($ocr_filename_and_ext, $aArray)
        _ArrayDelete($aArray, 0)

        ; Append the recognised text to a final array
        _ArrayConcatenate($final_ocr, $aArray)
    EndIf

    ; If the captures are to be displayed
    If $show_capture = 1 Then
        GUICreate("Tesseract Screen Capture. Note: image displayed is not to scale", 640, 480, 0, 0, $WS_SIZEBOX + $WS_SYSMENU) ; will create a dialog box that when displayed is centered
        GUISetBkColor(0xE0FFFF)
        $Obj1 = ObjCreate("Preview.Preview.1")
        $Obj1_ctrl = GUICtrlCreateObj($Obj1, 0, 0, 640, 480)
        $Obj1.ShowFile ($capture_filename, 1)
        GUISetState()

        If IsArray($final_ocr) Then
            _ArrayDisplay($aArray, "Tesseract Text Capture")
        Else
            MsgBox(0, "Tesseract Text Capture", $final_ocr)
        EndIf

        GUIDelete()
    EndIf

    FileDelete($ocr_filename & ".*")

    ; Cleanup
    If IsArray($final_ocr) And $cleanup = 1 Then

        ; Cleanup the items
        For $final_ocr_num = 1 to (UBound($final_ocr)-1)

            ; Remove erroneous characters
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ".", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], "'", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ",", "")
            $final_ocr[$final_ocr_num] = StringStripWS($final_ocr[$final_ocr_num], 3)
        Next

        ; Remove duplicate and blank items
        For $each in $final_ocr

            $found_item = _ArrayFindAll($final_ocr, $each)

            ; Remove blank items
            If IsArray($found_item) Then
                If StringCompare($final_ocr[$found_item[0]], "") = 0 Then
                    _ArrayDelete($final_ocr, $found_item[0])
                EndIf
            EndIf

            ; Remove duplicate items
            For $found_item_num = 2 to UBound($found_item)
                _ArrayDelete($final_ocr, $found_item[$found_item_num-1])
            Next
        Next
    EndIf

    ; Store a copy of the capture
    If $last_capture.item(0) = "" Then
        $last_capture.item(0) = $final_ocr
    EndIf

    Return $final_ocr
EndFunc

; #FUNCTION# ;===============================================================================
;
; Name...........: CaptureToTIFF()
; Description ...: Captures an image of the screen, a window or a control, and saves it to a TIFF file.
; Syntax.........: CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0)
; Parameters ....: $win_title     - The title of the window to capture an image of.
;                  $win_text      - Optional: The text of the window to capture an image of.
;                  $ctrl_id       - Optional: The ID of the control to capture an image of.
;                                   An image of the window will be returned if one isn't provided.
;                  $sOutImage     - The filename to store the image in.
;                  $scale         - Optional: The scaling factor of the capture.
;                  $iLeft         - x-Left coordinate
;                  $iTop          - y-Top coordinate
;                  $iRight        - x-Right coordinate
;                  $iBottom       - y-Bottom coordinate
;                  $bottom_indent - A number of pixels to indent the screen capture from the
;                                   bottom of the window or control.
; Return values .: None
; Author ........: seangriffin
; Modified.......:
; Remarks .......:
; Related .......:
; Link ..........:
; Example .......: No
;
; ;==========================================================================================
Func CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $iLeft = 0, $iTop = 0, $iRight = 1, $iBottom = 1)

    Local $hWnd, $hwnd2, $hDC, $hBMP, $hImage1, $hGraphic, $CLSID, $tParams, $pParams, $tData, $i = 0, $hImage2, $pos[4]
    Local $Ext = StringUpper(StringMid($sOutImage, StringInStr($sOutImage, ".", 0, -1) + 1))
    Local $giTIFColorDepth = 24
    Local $giTIFCompression = $GDIP_EVTCOMPRESSIONNONE

    ; If capturing a control
    if StringCompare($ctrl_id, "") <> 0 Then
        $hwnd2 = ControlGetHandle($win_title, $win_text, $ctrl_id)
        $pos[0] = 0
        $pos[1] = 0
        $pos[2] = $iRight - $iLeft
        $pos[3] = $iBottom - $iTop
    Else
        ; If capturing a window
        if StringCompare($win_title, "") <> 0 Then
            $hwnd2 = WinGetHandle($win_title, $win_text)
            $pos[0] = 0
            $pos[1] = 0
            $pos[2] = $iRight - $iLeft
            $pos[3] = $iBottom - $iTop
        Else
            ; If capturing the desktop
            $hwnd2 = ""
            $pos[0] = 0
            $pos[1] = 0
            $pos[2] = $iRight - $iLeft
            $pos[3] = $iBottom - $iTop
        EndIf
    EndIf

    ; Capture an image of the window / control
    if IsHWnd($hwnd2) Then
        WinActivate($win_title, $win_text)
        $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, $iLeft, $iTop, $iRight, $iBottom, False)
    Else
        $hBitmap2 = _ScreenCapture_Capture("", $iLeft, $iTop, $iRight, $iBottom, False)
    EndIf

    _GDIPlus_Startup ()

    ; Convert the image to a bitmap
    $hImage2 = _GDIPlus_BitmapCreateFromHBITMAP ($hBitmap2)

    $hWnd = _WinAPI_GetDesktopWindow()
    $hDC = _WinAPI_GetDC($hWnd)
    $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, $pos[2] * $scale , $pos[3] * $scale)

    _WinAPI_ReleaseDC($hWnd, $hDC)
    $hImage1 = _GDIPlus_BitmapCreateFromHBITMAP ($hBMP)
    $hGraphic = _GDIPlus_ImageGetGraphicsContext($hImage1)
    _GDIPLus_GraphicsDrawImageRect($hGraphic, $hImage2, 0 , 0 , $pos[2] * $scale, $pos[3] * $scale)
    $CLSID = _GDIPlus_EncodersGetCLSID($Ext)

    ; Set TIFF parameters
    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")
    DllStructSetData($tData, "ColorDepth", $giTIFColorDepth)
    DllStructSetData($tData, "Compression", $giTIFCompression)
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))
    If IsDllStruct($tParams) Then $pParams = DllStructGetPtr($tParams)

    ; Save TIFF and cleanup
    _GDIPlus_ImageSaveToFileEx($hImage1, $sOutImage, $CLSID, $pParams)
    _GDIPlus_ImageDispose($hImage1)
    _GDIPlus_ImageDispose($hImage2)
    _GDIPlus_GraphicsDispose ($hGraphic)
    _WinAPI_DeleteObject($hBMP)
    _GDIPlus_Shutdown()
EndFunc

Test Code: (The 626,148,654,167 parameters specify the screen coordinates to crop the screencapture to. The resulting image is a white "18" on a red background, and is attached to the post)

#include <SimpleTesseract.au3>
$OCR_Result = _TesseractScreenCapture(0,"",1,1,626,148,654,167,1)

However, if I put the screen captured image through http://www.free-ocr.com/ (which itself uses Tesseract), the text always works with 100% accurate results. The FAQ at www.free-ocr.com website says that the only pre-processing they do prior to Tesseract is reducing background noise, and adjusting resolution. This leads me to believe that I need to perform some pre-OCR processing.

So, my question is... how can I perform this OCR image pre-processing through autoit? (Maybe through GDI Plus, or through a command line interface). Of course, I'm open to alternatives to Tesseract or OCR altogether if the right solution comes along. I'm relatively new to autoit, and am not too familiar with a lot of the deeper, built-in functionality and interfacing autoit can do with windows, etc.

Thanks!

OCR Test Image.bmp

Edited July 31, 2013 by sureshot
JohnOne Posted July 31, 2013

Post your code, all of it runnable, along with the udf you are using, for better help.
sureshot Posted July 31, 2013

Edited my first post. Thanks
JohnOne Posted July 31, 2013

Try scaling up the image:

$OCR_Result = _TesseractScreenCapture(0,"",1,3,626,148,654,167,1)
sureshot Posted July 31, 2013 (edited)

Scaling up the image doesn't change the result. Still nothing. I suspect that simply blowing up the image won't improve the OCR result, but if I can 1) increase the image resolution, and 2) filter the background noise and edges, through autoit, then I suspect tesseract will find the correct result. (This is what www.free-ocr.com does, anyway, and it works like a charm)

Edited July 31, 2013 by sureshot
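Editor's sketch of the first idea above (raising the resolution with real interpolation rather than a plain stretch). It assumes AutoIt 3.3.10 or later, where `_GDIPlus_ImageResize` ships with GDIPlus.au3; the function name `_UpscaleForOCR`, its parameters, and the default factor of 4 are hypothetical and not part of the UDF in the first post. Whether this alone is enough for Tesseract to read the sample image is not established in this thread.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: enlarge an image file using high-quality bicubic
; interpolation and save the result under a new name.
; $sInFile, $sOutFile and $iFactor are illustrative names only.
Func _UpscaleForOCR($sInFile, $sOutFile, $iFactor = 4)
    _GDIPlus_Startup()

    Local $hImage = _GDIPlus_ImageLoadFromFile($sInFile)
    If @error Then
        _GDIPlus_Shutdown()
        Return SetError(1, 0, False)
    EndIf

    ; Target size = original size multiplied by the scale factor
    Local $iW = _GDIPlus_ImageGetWidth($hImage) * $iFactor
    Local $iH = _GDIPlus_ImageGetHeight($hImage) * $iFactor

    ; Returns a new GDI+ bitmap; the source image is left untouched
    Local $hBig = _GDIPlus_ImageResize($hImage, $iW, $iH, $GDIP_INTERPOLATIONMODE_HIGHQUALITYBICUBIC)

    ; The encoder is chosen from the file extension of $sOutFile
    Local $bSaved = _GDIPlus_ImageSaveToFile($hBig, $sOutFile)

    _GDIPlus_ImageDispose($hBig)
    _GDIPlus_ImageDispose($hImage)
    _GDIPlus_Shutdown()
    Return $bSaved
EndFunc
```

The output file must differ from the input file, because GDI+ keeps the source file locked while the image is loaded.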
JohnOne Posted July 31, 2013

Change

ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename & " digits")

to

ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", '"' & $capture_filename & '" "' & $ocr_filename & '"')

Worked for me.
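Editor's sketch of the change suggested above, wrapped in a small function so that the quoting and the result file are handled in one place. The name `_RunTesseract` and its parameters are hypothetical; the install path is the one the UDF in the first post uses and may differ on other machines. Note that the suggested replacement also drops the `digits` config argument, so it is optional here.

```autoit
; Hypothetical wrapper around the Tesseract command line.
; $sImage     - full path of the image to recognise
; $sOutBase   - output path WITHOUT extension (Tesseract appends ".txt")
; $sExtraArgs - optional trailing arguments, e.g. "digits"
Func _RunTesseract($sImage, $sOutBase, $sExtraArgs = "")
    Local $sExe = @ProgramFilesDir & "\tesseract-OCR\tesseract.exe"
    If Not FileExists($sExe) Then Return SetError(1, 0, "")

    ; Quote both paths so that spaces in them do not split the arguments
    Local $sArgs = '"' & $sImage & '" "' & $sOutBase & '"'
    If $sExtraArgs <> "" Then $sArgs &= " " & $sExtraArgs

    ShellExecuteWait($sExe, $sArgs, "", "", @SW_HIDE)

    ; No output file means Tesseract did not run or could not read the image
    If Not FileExists($sOutBase & ".txt") Then Return SetError(2, 0, "")

    ; Strip leading and trailing whitespace (flag 3) from the recognised text
    Return StringStripWS(FileRead($sOutBase & ".txt"), 3)
EndFunc
```

Checking `@error` after the call separates "Tesseract never ran" (1) and "no output file" (2) from "ran but recognised nothing" (empty string with `@error` = 0), which is the case described in this thread.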
sureshot Posted July 31, 2013

Hi John,

Unfortunately I am unable to replicate your success - my _TesseractScreenCapture function call still returns nothing, even with the syntax change that you suggested above. Thanks for your help. Can you please confirm that you are able to get the numeric value 18 returned from the Tesseract function call and stored in the variable $OCR_Result? If so then I must be missing something...

Thanks again
sureshot Posted July 31, 2013

I should clarify that I can get the above autoit script to work in general. For example, if I specify the coordinates around the word "Google" in my desktop's Google Chrome icon, then Tesseract correctly returns the string "Google". However, in the example that I provided in the initial post, Tesseract cannot determine "18" from the attached image of a white 18 on a red background. This is my motivation for performing image pre-processing through autoit before passing it to Tesseract for OCR.
JohnOne Posted July 31, 2013

Provide the whole exact code you personally are using.
sureshot Posted July 31, 2013

That is the whole exact code that I am using. When I use the code:

#include <SimpleTesseract.au3>
$OCR_Result = _TesseractScreenCapture(0,"",1,1,626,148,654,167,1)

The _TesseractScreenCapture function (via the SimpleTesseract.au3 UDF in the original post) gets the correct image from my display (a white 18 on a red background, image file attached). However, in this case, when the screen-captured image file gets sent to Tesseract, Tesseract doesn't recognize any text (much less the number 18 that I'm trying to get from the function). Even if I send the image directly through Tesseract - not through autoit but from command line interface - Tesseract doesn't recognize the text. However, if I send it through www.free-ocr.com, which supposedly pre-processes the image (filtering, resolution) before utilizing the same Tesseract OCR engine, the number 18 is successfully returned.

ScreenCaptureImage.bmp
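Editor's sketch of a direct command-line run that tells Tesseract the image holds a single line of text, which matches the "single line of between 1 and 10 characters" described in the first post. It assumes a Tesseract 3.x build, where the page segmentation option is spelled `-psm` (mode 7 treats the image as a single text line), and that the installed build can read the attached BMP. The image location and the output base name are hypothetical. Nobody in this thread tried this option, so treat it as something to test, not as a confirmed fix.

```autoit
; Paths below are assumptions for illustration
Local $sExe = @ProgramFilesDir & "\tesseract-OCR\tesseract.exe"
Local $sImage = @ScriptDir & "\ScreenCaptureImage.bmp" ; attachment name from the post, location assumed
Local $sOutBase = @TempDir & "\ocr_single_line"        ; Tesseract appends ".txt"

; -psm 7 : treat the image as a single text line (Tesseract 3.x syntax)
; digits : the same config file the UDF passes, restricting output to numbers
Local $sCmd = '"' & $sExe & '" "' & $sImage & '" "' & $sOutBase & '" -psm 7 digits'
RunWait($sCmd, "", @SW_HIDE)

If FileExists($sOutBase & ".txt") Then
    ConsoleWrite("OCR result: [" & StringStripWS(FileRead($sOutBase & ".txt"), 3) & "]" & @CRLF)
Else
    ConsoleWrite("Tesseract produced no output file" & @CRLF)
EndIf
```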
JohnOne Posted July 31, 2013

Maybe you should convert the image to black and white. With that number 18 image, it works fine for me with scaling of 2.
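Editor's sketch of a black-and-white conversion done in AutoIt itself instead of by hand. It assumes AutoIt 3.3.10 or later (for `_GDIPlus_BitmapGetPixel` and `_GDIPlus_BitmapSetPixel`) and a non-indexed source image such as a 24-bit BMP. The function name, the luminance weights, and the threshold of 128 are illustrative choices: pure white has a luminance of 255 and pure red about 76, so white digits on a red background end up as black text on a white page.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: threshold an image to pure black and white.
; Pixels brighter than $iThreshold (the light text) become black,
; everything else (the darker background) becomes white.
Func _BinarizeForOCR($sInFile, $sOutFile, $iThreshold = 128)
    _GDIPlus_Startup()

    Local $hBitmap = _GDIPlus_ImageLoadFromFile($sInFile)
    If @error Then
        _GDIPlus_Shutdown()
        Return SetError(1, 0, False)
    EndIf

    Local $iW = _GDIPlus_ImageGetWidth($hBitmap)
    Local $iH = _GDIPlus_ImageGetHeight($hBitmap)
    Local $iARGB, $iR, $iG, $iB, $nLum

    For $iY = 0 To $iH - 1
        For $iX = 0 To $iW - 1
            $iARGB = _GDIPlus_BitmapGetPixel($hBitmap, $iX, $iY)

            ; Split the 0xAARRGGBB value into its colour channels
            $iR = BitAND(BitShift($iARGB, 16), 0xFF)
            $iG = BitAND(BitShift($iARGB, 8), 0xFF)
            $iB = BitAND($iARGB, 0xFF)

            ; Weighted luminance, 0 (black) to 255 (white)
            $nLum = 0.299 * $iR + 0.587 * $iG + 0.114 * $iB

            If $nLum > $iThreshold Then
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFF000000) ; light text -> black
            Else
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFFFFFFFF) ; background -> white
            EndIf
        Next
    Next

    ; Save under a different name: the source file stays locked while loaded
    Local $bSaved = _GDIPlus_ImageSaveToFile($hBitmap, $sOutFile)

    _GDIPlus_ImageDispose($hBitmap)
    _GDIPlus_Shutdown()
    Return $bSaved
EndFunc
```

Per-pixel access is slow on large images, but the capture discussed here is only 28 by 19 pixels before scaling. For dark text on a light background the two colours in the `If` branch would need to be swapped.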
Geir1983 Posted August 1, 2013

What version of Tesseract have you installed? I have previously tested the UDF and it worked with 2.01.
sureshot Posted August 1, 2013

I still can't replicate John's success with the sample image. I've done much testing, and both the script and tesseract-ocr work with certain other texts I'm looking for (although not reliably), but I still cannot get Tesseract to recognize the text in the sample image (white 18 on red background), regardless of the $scale parameter in the _TesseractScreenCapture function call. I tried John's suggestion of converting the image to black-and-white (manually, using MS Paint), and to greyscale (through autoit, using a GDI Plus function).

I have Tesseract 3.01 installed (up-to-date).

If anyone can get the number 18 returned to a variable in autoit, using Tesseract or otherwise, can you please post, in detail, exactly what steps/code were used. (John, can you please elaborate on your success?)
JohnOne Posted August 1, 2013

I opened the picture in Paint and read it from the screen. What scales have you tried? You will probably need a higher scale for a higher-resolution screen.
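Editor's sketch for answering the "what scales have you tried?" question systematically: it runs the capture from the first post at several scale factors and logs what each one returns. The coordinates are the ones from the test code in the first post; the upper limit of 6 is an arbitrary choice, and SimpleTesseract.au3 is assumed to be on the include path.

```autoit
#include <SimpleTesseract.au3>

Local $sText

; Same screen region as the original test code, with $scale varied from 1 to 6
; and $show_capture left at 0 so that no preview window interrupts the loop
For $iScale = 1 To 6
    $sText = _TesseractScreenCapture(0, "", 1, $iScale, 626, 148, 654, 167, 0)
    ConsoleWrite("scale " & $iScale & ": [" & StringStripWS($sText, 3) & "]" & @CRLF)
Next
```

Empty brackets at every scale would suggest that scaling alone is not the problem, which is what sureshot reports above.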
Geir1983 Posted August 2, 2013 (edited)

Try installing the 2.01 version instead; the UDF might not be compatible with 3.01.

Edit: Tested with 3.01 and it works (I did not try your image, just some random text).

Edited August 2, 2013 by Geir1983