KaFu Posted April 17, 2022

Try “fa-IR”? Also check if it is supported by your OS and install additional language pack if required (and available).

Edited April 17, 2022 by KaFu

OS: Win10-22H2 - 64bit - German, AutoIt Version: 3.3.16.1, AutoIt Editor: SciTE, Website: https://funk.eu
AMT - Auto-Movie-Thumbnailer (2024-Oct-13) BIC - Batch-Image-Cropper (2023-Apr-01) COP - Color Picker (2009-May-21) DCS - Dynamic Cursor Selector (2024-Oct-13) HMW - Hide my Windows (2024-Oct-19) HRC - HotKey Resolution Changer (2012-May-16) ICU - Icon Configuration Utility (2018-Sep-16) SMF - Search my Files (2024-Oct-20) - THE file info and duplicates search tool SSD - Set Sound Device (2017-Sep-16)
SaeidN Posted April 22, 2022

On 4/17/2022 at 11:55 AM, KaFu said:
> Try “fa-IR”? Also check if it is supported by your OS and install additional language pack if required (and available).

Yes, I tried fa-IR, farsi-IR, Per-IR, persian-IR, fa, persian, per. None of them worked. I've also installed the additional Persian language pack on my Windows 11.
KaFu Posted April 22, 2022

    #include <Array.au3>
    #include <UWPOCR.au3>
    _ArrayDisplay(_UWPOCR_GetSupportedLanguages())

This should display all supported languages and codes.
SaeidN Posted April 22, 2022

2 hours ago, KaFu said:
>     #include <Array.au3>
>     #include <UWPOCR.au3>
>     _ArrayDisplay(_UWPOCR_GetSupportedLanguages())
>
> This should display all supported languages and codes.

I tried fa, and using UWPOCR nothing showed.
KaFu Posted April 23, 2022

I've installed Persian and tested it. The GetText call throws an error 7 here:

    _UWPOCR_Log("FAIL __UWPOCR_GetText -> WaitForAsync IOcrResult")

So the OCR engine does not seem to respond. That's where it lost me :), sorry, I have no further clue. Here's a test sentence:

Edited April 23, 2022 by KaFu
VAG Posted June 2, 2022

I used this UDF to OCR the content of a Command Prompt window. After I adjusted the ClearType settings in Control Panel, the recognition rate became very poor. Now I can't get back to the initial result even after disabling it. Is there any requirement stated in the API reference? I also noticed that OCR of number strings is not very accurate.
KaFu Posted June 2, 2022

I don't know about the ClearType setting, but maybe using a different font type and size for the command prompt will increase accuracy? Create a shortcut to cmd.exe; in its right-click properties you can adjust the layout settings.
Patrik96 Posted June 27, 2022

Is there a minimum size the picture has to be? For small pictures (139x26 in my case) it doesn't work. However, if I make the picture bigger without changing the size of the text, it detects the text properly. I have attached both files. The first one (Test.jpg) doesn't work. The second, bigger one (Test2.jpg) does work. Any clue what's going on?

Edited June 27, 2022 by Patrik96
Nick3399 Posted September 15, 2022

Hello, I am interested in trying this out on my own program, but I have a quick question. I will be trying to use this on an application (I would prefer not to take screenshots), so I would use the second example from @mLipok.

1) If I am trying to find the location of a given text, e.g. “hello”, and then use those location details to eventually left-click the center of that word, how would I add that?

Edited September 15, 2022 by Nick3399
g0gcd Posted September 25, 2022

Superb toolset, many thanks.

In my process, I'm trying to read small boxes of black text on a white background, placed on a page-wide graphic. The decode is about 75% reliable and I'm looking for tips to improve that. The code is still way too messy to post here. The process is, in summary:

1. Use a WebCapture routine to capture the full page to a 1280x768 bitmap on a hidden window.
2. Convert the bmp, using its handle, to an image using _GDIPlus_BitmapCreateFromHBITMAP($hBmp).
3. Crop the image to extract the required box, using _GDIPlus_BitmapCloneArea().
4. Create a 100x200 blank white canvas and merge the cropped image into the middle of it, using _GDIPlus_ImageGetGraphicsContext() and _GDIPlus_GraphicsDrawImage(). I do this because the OCR is unhappy about small images (but will detect small text on a large enough file!).
5. Finally, use _UWPOCR_GetText() to extract the text.

I have tried enlarging the cropped image, using _GDIPlus_ImageResize() instead of step 4, but this introduced extraneous noise (randomly coloured pixels) around the character edges, which affected decode reliability.

Any suggestions on process techniques to maximise the OCR reliability, whilst retaining the inherent simplicity of using the built-in Win10 OCR capabilities? I'm not, on this occasion, looking for code; it's more about suggestions on whether, for example, I should try capturing a bigger web page first, or whether there's a way of specifying the font/size/content type which would give the OCR module a tighter focus, etc.

Thanks
John

Win10 x64 Autoit 3.3.14.5 (compiling to x86)
Danyfirex (Author) Posted September 25, 2022

It would be great to see the input image to be sure what suggestions to give you.

Saludos

Danysys.com AutoIt... UDFs: VirusTotal API 2.0 UDF - libZPlay UDF - Apps: Guitar Tab Tester - VirusTotal Hash Checker Examples: Text-to-Speech ISpVoice Interface - Get installed applications - Enable/Disable Network connection PrintHookProc - WINTRUST - Mute Microphone Level - Get Connected NetWorks - Create NetWork Connection ShortCut
g0gcd Posted September 26, 2022

Thanks Dany,

A typical source snip is attached as file TIM2.bmp, with GDIPlus_ImageT.jpg showing how it appears just before it is submitted to UWPOCR for processing, as follows:

    #include <UWPOCR.au3>

    Local $sTIMTextResult = _UWPOCR_GetText(@ScriptDir & "\GDIPlus_ImageT.jpg", "en-GB", False) ; True)
    MsgBox(0, "Capture Time", $sTIMTextResult)

    ; Strip separators and map common misreads of digits back to digits
    $sTIMTextResult = StringReplace($sTIMTextResult, " ", "")
    $sTIMTextResult = StringReplace($sTIMTextResult, ":", "")
    $sTIMTextResult = StringReplace($sTIMTextResult, ".", "")
    $sTIMTextResult = StringReplace($sTIMTextResult, "O", "0")
    $sTIMTextResult = StringReplace($sTIMTextResult, "C", "0")
    $sTIMTextResult = StringReplace($sTIMTextResult, "I", "1")

    ; Rebuild the time as HH:MM:SS
    $sTIMTextResult = StringLeft($sTIMTextResult, 2) & ":" & StringMid($sTIMTextResult, 3, 2) & ":" & StringMid($sTIMTextResult, 5, 2)
    MsgBox(0, "Modified Capture Time", $sTIMTextResult)

I've used the cleanup code with some success to allow for misreads of the colons and of 0/1 as O, C or I.

Since posting yesterday, I experimented further, a bit more systematically, and found that:

- So long as the overall image was big enough, UWPOCR would at least try to decode the image.
- The size of the text in the image, or of the image itself, made little difference to the success.
- The language setting and the "UseOCRLine" parameter made no observable difference.

What seems to be most promising at the moment is that I have just introduced a filter to force each pixel in the source image to either black or white: a pixel becomes white if R, G and B are all greater than 240, otherwise black. The decode reliability has increased dramatically. I can get away with the extra time used because the function is called infrequently and has a long window of opportunity to complete; also, the images involved are comparatively small.

So I think I have a solution to the immediate problem. I believe that my source image may not be pure black and white, and that this is at the root of the poor decoding. To me, that points to the Windows 10 native OCR being weak. The UDF is certainly working well and is remarkably easy to understand, use and integrate.

I'd certainly be interested in your thoughts.

Thanks
John

//Edit: Well, that theory has just been blown out of the water. I realised that the attached images were captured after I had applied the filter. So I turned it off to run new images... and the OCR is working fine!?! So new images are attached, without the filter described above.

Attached: TIM2.bmp

Edited September 26, 2022 by g0gcd (Updated information)
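The black/white filter described in the post above could be sketched roughly as follows. This is not g0gcd's actual code, which was not posted; `_ForceBlackWhite` is a hypothetical helper name, and the threshold of 240 is taken from the post. It only uses standard GDIPlus.au3 pixel functions and is slow on large images, which matches the remark that the images involved are small.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper (assumption, not part of UWPOCR): force every pixel of a
; GDI+ bitmap to pure black or pure white. A pixel becomes white only if its
; R, G and B values are all greater than $iThreshold; otherwise it becomes black.
Func _ForceBlackWhite($hBitmap, $iThreshold = 240)
    Local $iW = _GDIPlus_ImageGetWidth($hBitmap)
    Local $iH = _GDIPlus_ImageGetHeight($hBitmap)
    Local $iARGB, $iR, $iG, $iB
    For $iY = 0 To $iH - 1
        For $iX = 0 To $iW - 1
            $iARGB = _GDIPlus_BitmapGetPixel($hBitmap, $iX, $iY)
            ; Split the ARGB value into its colour channels
            $iR = BitAND(BitShift($iARGB, 16), 0xFF)
            $iG = BitAND(BitShift($iARGB, 8), 0xFF)
            $iB = BitAND($iARGB, 0xFF)
            If $iR > $iThreshold And $iG > $iThreshold And $iB > $iThreshold Then
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFFFFFFFF) ; opaque white
            Else
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFF000000) ; opaque black
            EndIf
        Next
    Next
EndFunc   ;==>_ForceBlackWhite
```

A caller would wrap this in `_GDIPlus_Startup()` and `_GDIPlus_Shutdown()`, load the image with `_GDIPlus_ImageLoadFromFile()`, apply the filter, and save the result with `_GDIPlus_ImageSaveToFile()` before passing the file path to `_UWPOCR_GetText()`. As the edit in the post shows, whether the filter helps at all seems to depend on the source image.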
Danyfirex (Author) Posted September 26, 2022

Hello @g0gcd

What I would do is append your image to an image with a similar text pattern, so that the OCR engine can get a better result. You end up with a joined image, which you then process with the OCR.

Test code:

    #include <ScreenCapture.au3>
    #include <GDIPlus.au3>
    #include <StringConstants.au3>
    #include "..\UWPOCR.au3"

    _Example()

    Func _Example()
        _GDIPlus_Startup()

        ;hImage/hBitmap GDI
        Local $hTimer = TimerInit()
        Local $sImageFilePath = @ScriptDir & "\JoinedImage.jpg"
        Local $sImageTIM2FilePath = @ScriptDir & "\TIM2.bmp"
        Local $sText = "0123456789"
        Local Const $iW = 270, $iH = 40

        Local $hImageToProcess = _GDIPlus_ImageLoadFromFile($sImageTIM2FilePath)
        Local $hBitmap = _GDIPlus_BitmapCreateFromScan0($iW, $iH) ;create an empty bitmap
        Local $hBmpCtxt = _GDIPlus_ImageGetGraphicsContext($hBitmap) ;get the graphics context of the bitmap
        _GDIPlus_GraphicsSetSmoothingMode($hBmpCtxt, $GDIP_SMOOTHINGMODE_HIGHQUALITY)
        _GDIPlus_GraphicsClear($hBmpCtxt, 0xFFFFFFFF) ;clear bitmap with color white
        _GDIPlus_GraphicsDrawString($hBmpCtxt, $sText, 0, 0, "Arial", 18) ;draw some text to the bitmap
        _GDIPlus_GraphicsDrawImage($hBmpCtxt, $hImageToProcess, 140, -16)
        _GDIPlus_ImageSaveToFile($hBitmap, $sImageFilePath) ;save bitmap to disk

        Local $sOCRTextResult = _UWPOCR_GetText($sImageFilePath)
        ;remove the helper text and all whitespace from the OCR result
        MsgBox(0, "Time Elapsed: " & TimerDiff($hTimer), StringStripWS(StringReplace($sOCRTextResult, $sText, ""), $STR_STRIPALL))

        _GDIPlus_GraphicsDispose($hBmpCtxt)
        _GDIPlus_ImageDispose($hImageToProcess)
        _GDIPlus_BitmapDispose($hBitmap)
        _GDIPlus_Shutdown()
    EndFunc   ;==>_Example

Saludos
g0gcd Posted September 27, 2022

That's inspired! ✔️ I'll try that approach and let you know how it goes.

Brilliant, thanks
John

EDIT: Thank you so much DanyFirex! By experiment I have found:

1. The canvas (combined image) itself needs to be of a significant size. I found that 200x200 pixels was the minimum; anything less caused intermittent decoding, irrespective of the image quality. My solution uses 900 wide by 300 tall.
2. The helper text provided assistance wherever it was placed on the canvas, but the best improvement came from placing it immediately to the left of the source image text, with a gap of about 1 or 2 "spaces" between the helper text and the source image text. No further improvement came from adding alphabetic characters or punctuation to the helper text. (Note: en-gb used.)
3. The helper text helped immensely whatever font and size was used for it, but in my case the best improvement came from using the same font and size as the source (Arial, 27pt).
4. I found that the source image text font size should be between 12 and 30. Too big and the OCR misreads; too small and the OCR doesn't see small characteristic differences. My source was 9pt from the capture process, which I enlarged to 27pt.
5. With poor source images, it does help to force pixels into pure black/white before merging into the canvas. There may be a UDF out there that does that, but I did it by hand as I needed to find the edge of my source "white" box within a coloured image anyway.
6. The OCR seems to "like" a substantial white border around the two elements. If the helper text was too close to any edge, I had problems decoding. Between 50 and 100 pixels seemed to be the minimum acceptable border. (Note: this may also explain 1.)
7. I inserted __UWPOCR_Initialize() before each _UWPOCR_GetText() function call (I process several boxes on each run). I can't see why this might be helpful but, whilst running repeated, frequent tests, I found that several results were corrupted with previous or nonsensical values. I will continue to hunt my code for an error on my part, where I haven't cleared a variable or have inadvertently re-used it!

I hope these notes are helpful to anyone else who is struggling with decoding a less than perfect source image. My infinite thanks to DanyFirex for the pointer towards using "helper" text, as that unlocked a massive improvement in reliability.

Best Regards / Saludos
John

Edited September 28, 2022 by g0gcd (Update with findings)
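Findings 1 and 6 above (a minimum canvas size and a generous white border) could be combined into a small padding step along these lines. This is only a sketch under assumptions: `_PadOnWhiteCanvas` is a hypothetical helper name, the defaults of 100 and 200 pixels are taken from g0gcd's notes rather than from any documented requirement of the OCR engine, and the helper text from finding 2 is left out for brevity.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper (assumption, not part of UWPOCR): centre $hImage on a white
; canvas that leaves at least $iBorder pixels of white on every side and is at
; least $iMinSize x $iMinSize pixels overall.
; Returns a new bitmap handle; the caller releases it with _GDIPlus_BitmapDispose().
Func _PadOnWhiteCanvas($hImage, $iBorder = 100, $iMinSize = 200)
    Local $iW = _GDIPlus_ImageGetWidth($hImage)
    Local $iH = _GDIPlus_ImageGetHeight($hImage)
    Local $iCanvasW = $iW + 2 * $iBorder
    Local $iCanvasH = $iH + 2 * $iBorder
    If $iCanvasW < $iMinSize Then $iCanvasW = $iMinSize
    If $iCanvasH < $iMinSize Then $iCanvasH = $iMinSize

    Local $hCanvas = _GDIPlus_BitmapCreateFromScan0($iCanvasW, $iCanvasH)
    Local $hGfx = _GDIPlus_ImageGetGraphicsContext($hCanvas)
    _GDIPlus_GraphicsClear($hGfx, 0xFFFFFFFF) ; fill the canvas with opaque white
    ; Draw at the original pixel size, centred on the canvas
    _GDIPlus_GraphicsDrawImageRect($hGfx, $hImage, Int(($iCanvasW - $iW) / 2), Int(($iCanvasH - $iH) / 2), $iW, $iH)
    _GDIPlus_GraphicsDispose($hGfx)
    Return $hCanvas
EndFunc   ;==>_PadOnWhiteCanvas
```

The padded bitmap can then be written with `_GDIPlus_ImageSaveToFile()` and the resulting path passed to `_UWPOCR_GetText()`, in the same way as the joined image in Danyfirex's test code. The same step may also help with very small inputs such as the 139x26 image reported earlier in the thread, although that has not been confirmed here.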
CommZ3 Posted January 30, 2023

Can this script be used to return the coordinates of a specific word in an image please?
Danyfirex (Author) Posted January 30, 2023

@CommZ3 check Example 4.

Saludos
ScorpX Posted March 19, 2023

Hello, I'm wondering if it's possible to obtain the coordinates of text detected by this OCR. I have written a piece of code that captures a screenshot from an emulator and then uses this OCR to extract the text from that image. However, I am unsure how to retrieve the exact location or coordinates of the identified text. Could you please advise me on how to accomplish this task?

Also, if what I am asking is possible, what if there are multiple texts? How can I get the coordinates of a specified text? The coordinates should be relative to the image. If you have a better idea on how to do this, please let me know.

    #include <UWPOCR.au3>

    ; Take a screenshot inside the emulator, then pull it to the script directory
    Run("adb -s emulator-5554 shell screencap -p /storage/emulated/0/picz/screenshot.png", @SW_HIDE)
    Sleep(1000)
    Run("adb -s emulator-5554 pull /storage/emulated/0/picz/screenshot.png " & @ScriptDir & "\try.png", @SW_HIDE)

    Local $sOCRTextResult = _UWPOCR_GetText(@ScriptDir & "\try.png", Default, True)
    MsgBox(0, "OCR", $sOCRTextResult)
Werty Posted March 19, 2023

@ScorpX, that question was asked two posts above yours, and answered by @Danyfirex: "Check Example 4".

Edited March 19, 2023 by Werty

Some guy's script + some other guy's script = my script!
ScorpX Posted March 19, 2023

11 hours ago, Werty said:
> @ScorpX, that question was asked two posts above yours, and answered by @Danyfirex: "Check Example 4".

Thanks for the UDF.

Edited March 20, 2023 by ScorpX (I found what I was looking for)
ScorpX Posted March 20, 2023

If you want to move the mouse to the location of some text and you found Example 4 complicated, here is a simpler code example that moves the mouse to the location of the text. Change $sTargetWord to the text you want. @Werty @CommZ3 Maybe it's not efficient, but you don't have to use GDI+ if you don't know about it.

A shorter version:

    #include <ScreenCapture.au3>
    #include <UWPOCR.au3>

    _ScreenCapture_Capture(@ScriptDir & "\screenshot1.png", 0, 0, @DesktopWidth, @DesktopHeight)
    Local $aWords = _UWPOCR_GetWordsRectTo2DArray(@ScriptDir & "\screenshot1.png")
    Local $sTargetWord = "New"
    For $i = 0 To UBound($aWords) - 1
        If $aWords[$i][0] = $sTargetWord Then
            MouseMove($aWords[$i][1] + $aWords[$i][3] / 2, $aWords[$i][2] + $aWords[$i][4] / 2)
            ExitLoop
        EndIf
    Next

A longer version with explanation:

    #include <ScreenCapture.au3>
    #include <UWPOCR.au3>

    ; Set the path and file name for the screenshot
    Local $sScreenshotPath = @ScriptDir & "\screenshot1.png"

    ; Capture a screenshot of the desktop and save it to a file
    _ScreenCapture_Capture($sScreenshotPath, 0, 0, @DesktopWidth, @DesktopHeight)

    ; Extract the text from the screenshot using UWPOCR
    Local $sText = _UWPOCR_GetText($sScreenshotPath)

    ; Set the target word to be searched
    Local $sTargetWord = "change_this_to_word_to_find"

    ; Look for the target word
    Local $aWords = _UWPOCR_GetWordsRectTo2DArray($sScreenshotPath)

    ; Loop through each word in the array
    For $i = 0 To UBound($aWords) - 1
        ; If the word matches the target word
        If $aWords[$i][0] = $sTargetWord Then
            ; Calculate the center of the rectangle enclosing the word
            Local $iX = $aWords[$i][1] + $aWords[$i][3] / 2
            Local $iY = $aWords[$i][2] + $aWords[$i][4] / 2
            ; Move the mouse to the center of the rectangle
            MouseMove($iX, $iY)
            ; Exit the loop
            ExitLoop
        EndIf
    Next

Edited March 20, 2023 by ScorpX
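The lookup in the post above works directly because the whole desktop is captured, so image coordinates equal screen coordinates. If only a region or a window is captured, the region's top-left corner has to be added back before moving or clicking. A minimal sketch of that, assuming the array layout used in ScorpX's code (column 0 = text, 1 = x, 2 = y, 3 = width, 4 = height); `_FindWordCenter` is a hypothetical helper name, and the demo array stands in for real OCR output so the snippet runs without UWPOCR:

```autoit
; Hypothetical helper (assumption, not part of UWPOCR).
; $aWords layout as used in the post above: [n][0] = text, [n][1] = x, [n][2] = y,
; [n][3] = width, [n][4] = height, all relative to the captured image.
; $iOffsetX / $iOffsetY are the screen coordinates of the image's top-left corner.
; Returns a 2-element array [x, y] in screen coordinates, or sets @error = 1.
Func _FindWordCenter(ByRef $aWords, $sTarget, $iOffsetX = 0, $iOffsetY = 0)
    Local $aCenter[2]
    For $i = 0 To UBound($aWords) - 1
        ; "=" compares strings case-insensitively in AutoIt; use "==" for an exact match
        If $aWords[$i][0] = $sTarget Then
            $aCenter[0] = $iOffsetX + Int($aWords[$i][1] + $aWords[$i][3] / 2)
            $aCenter[1] = $iOffsetY + Int($aWords[$i][2] + $aWords[$i][4] / 2)
            Return $aCenter
        EndIf
    Next
    Return SetError(1, 0, 0)
EndFunc   ;==>_FindWordCenter

; Demo with made-up rectangles instead of real OCR output
Local $aDemo[2][5] = [["Open", 10, 10, 40, 12], ["New", 60, 10, 30, 12]]
Local $aPos = _FindWordCenter($aDemo, "new", 100, 200)
If Not @error Then ConsoleWrite($aPos[0] & "," & $aPos[1] & @CRLF) ; prints 175,216
```

For the left-click asked about earlier in the thread, `MouseMove` can be replaced with `MouseClick("left", $aPos[0], $aPos[1])`. Returning on the first match is a simplification; if the same word appears several times, the loop would need to collect all matches instead.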