Easiest text recognition

goldieczr · January 13, 2022

Hello.

I'm trying to create a bot that, when you press a key, scans a window and, if it finds certain words, moves the mouse to that position and clicks on the word.
The window I use is SCRCPY, a mirror that lets you use your phone on your computer, connected via USB. For this reason, the window helper doesn't show any visible / hidden text.
I've been looking for scripts around this forum for a couple of days but everything seems very complicated and I barely understand anything. Any ideas how I can achieve this while also keeping the code simple and making it scan fast enough (I can only give it a couple of seconds before the text disappears)?

Nine · January 13, 2022

On Win 7, I currently use MODI OCR (a COM object) and it is working quite well.

You can also look at UWPOCR.

And there is Tesseract...

goldieczr · January 13, 2022

From what I read, UWPOCR can only read text from image / bitmap. I'd like it to do it in realtime on my screen since the text only appears for a couple of seconds. Either that, or I don't understand how it works, which may be very possible as well.

No idea how MODI OCR or Tesseract work either. I'm a complete newbie to this. The most I've been able to do was 'when you press <button>, mouse goes there, clicks, scrolls, etc', basic stuff.

Nine · January 13, 2022

See 2nd example of UWPOCR, it is reading from screen...

goldieczr · January 13, 2022

I see, as I said, complete beginner, no idea what bitmap / GDI etc really meant.

Looking at the code I can see that it's gonna read the screen, but how do I configure it?

I want it to scan the active window, check for 9 words and if any of these appear click on them, if not click elsewhere. I kinda know the 'logistics' of how it should work - script takes a screenshot, scans it for words, finds a word, gets coords, inputs coords into mousemove and clicks, sleeps the specified time to account for any lags in the program, returns to scanning for the rest of the words. I just have absolutely 0 idea how to put it into code.

Nine · January 13, 2022

OK, let's start at the beginning. Show an image of the screen you want to interact with, so we can understand what you are talking about.

goldieczr · January 13, 2022

https://prnt.sc/26dcu1r - This is a screenshot of the window I'm working with. As a back story, this is SCRPY, a software that allows you to see & control your phone from your computer via USB cable (with USB debugging and ADB). I run multiple of these windows at once. The app is called Snapchat, a popular chatting mobile app. We use it to interact with our followers and clients and I want to semi-automate my process - when people send a message, I will be opening their messages and depending on what they have to say I have premade messages to send them. Automatically sending messages at the push of a button was already done. The next step is:

Snapchat automatically deletes all messages in a chat when you leave that chat, unless you save them. In order to save them, you have to simply tap on the message and it gets saved (the second message in the screenshot is saved - the background is darker and the red line on the left thicker, showing it is saved). Now, unless you unsave it, it will remain there and won't ever be deleted.

After we send our premade messages, I want to push a button, the script would search for that message (for example, in our screenshot, if we want to save the first message it would search for the word 'test'), it would drive the mouse there and click on it. However, we usually send more premade messages one after the other so once it saved the first message we need it to continue scanning for other messages in case there is anything left to be saved.

We also need this script to run only on the specified window. I already did this in my script with 'AutoItSetOption("MouseCoordMode",0)' but I don't know if it will work. We need this because, as stated earlier, I personally run 4 phones connected at the same time with 4 windows opened. I need the script to only save the messages when I press the button and only on the window that's active in that moment.

In summary: button pressed, script scans image for words, finds words, presses on them, continues scanning for others; if there are others, save them too, else stop (I don't want the script to exit, just to pause and wait for other keys to be pressed; for example, after we send the messages and save them in a conversation, we move to another conversation and repeat).

I hope this answers your questions, if not, let me know.

goldieczr · January 13, 2022

EDIT: The name of the software is actually SCRCPY (SCReen CoPY). We need to run it because Snapchat doesn't allow their app to run on any emulator so we're required to run actual phones with that software between them and our computers.

goldieczr · January 16, 2022

Any help?

Another idea:
When you send a message there's a red line next to your message on the left side. Wouldn't it be easier to make a script that would search the first column of pixels and once it finds a red pixel, it would move the mouse to the right of it and click?

Also, since our messages always look the same, I can also screenshot the message and make the script search for it instead of reading the text. Would that be easier?

Nine · January 16, 2022

10 minutes ago, goldieczr said:

Any help?

Not sure what type of help you're expecting. Start coding with one of the options I already gave you. When you have a script, we certainly will try to help you. Otherwise it is just a chat convo....

goldieczr · January 16, 2022

56 minutes ago, Nine said:

Not sure what type of help you're expecting. Start coding with one of the options I already gave you. When you have a script, we certainly will try to help you. Otherwise it is just a chat convo....

As I said, I have absolutely no clue where to begin with. Zero experience in making scripts for AutoIt, zero ideas on how to make what I want possible. I searched for tutorials and threads for hours and hours and didn't find anything that could guide me where I need to go. Even UWPOCR, for me it's just a bunch of text with some words that sometimes I understand. I have no clue how to make it work for the active window, how to make it scan my screen, how to make it search for words and drive my mouse there, nothing.

junkew · January 16, 2022

Look at the If This Then That (IFTTT) app, maybe it is possible with that, or at native APIs from Snapchat. From AutoIt you can only use OCR apps, and to use those you have to learn how they work.

TheSaint · January 16, 2022

OP - What I guess Nine is suggesting to you, is learn to walk before you run.

Get to know the basics of coding first, and like good folk we will help you.

A bit like someone teaching another to fish, but not doing the fishing for them.

Kind of - we help you to help yourself.

All of us here are unpaid volunteers, with a varying level of expertise, and don't worry about looking foolish, as we have all been there ... a beginner who doesn't know or understand much.

This site is not a Code It For You site, it is more a teaching interactive site. We work with you to get to where you want to go.

So make a start.

TheDcoder · January 16, 2022

1 hour ago, TheSaint said:

Get to know the basics of coding first

Absolutely agree.

@goldieczr You may want to read the FREE book on AutoIt:

Or if you are a visual learner, you can watch this series on YouTube: https://www.youtube.com/playlist?list=PLNpExbvcyUkOJvgxtCPcKsuMTk9XwoWum

Best of luck in your coding journey

goldieczr · January 18, 2022

I don't want this to come out as rude but I work 3 jobs, barely get any sleep, and I don't really have time to learn a programming language from scratch. I don't expect anyone to just write me a script but I was expecting a more direct way of doing it, something more along the lines of 'yeah check this tutorial copy that code then do this and that' rather than 'oh just start learning programming and hopefully after a year or two you'll be able to do it'. This isn't a hobby of mine, it's a necessity to make my work easier. This would shave around an hour off my daily work so I don't see any worth in learning something for months just for that.

TheDcoder · January 18, 2022

2 hours ago, goldieczr said:

yeah check this tutorial copy that code then do this and that'

Unfortunately programming doesn't quite work like that, even in the best case scenario you'd have to write code to bring it all together, there is no magical solution here.

2 hours ago, goldieczr said:

I don't see any worth in learning something for months just for that.

This is a misconception you have. Depending on the level of skill you have with computers, you can easily get started within a couple of weeks, assuming you dedicate some time each day towards learning it.

AutoIt is especially easy to use even for beginners.

Learning to code is also a useful skill in general.

Having said that, if you still don't want to take any time for learning, then there are other no-code automation solutions out there. I haven't used them personally so I can't help you with recommendations in that regard.

junkew · January 18, 2022

You are asking for one of the more complicated things, namely text recognition, but let me give you some direction on the easy bits and parts.

1. Create a bot: install AutoIt and see the many examples in the help file on how to make a program; most examples are copy/paste.
2. Move the mouse: https://www.autoitscript.com/autoit3/docs/functions/MouseMove.htm
3. Action based on a certain key combination: https://www.autoitscript.com/autoit3/docs/functions/HotKeySet.htm
4. Click: https://www.autoitscript.com/autoit3/docs/functions/MouseClick.htm
5. OCR: you have had an excellent UDF as the first answer, and if you google OCR you will hopefully understand this is the most complex area. Read in detail the first answer you received and the direction on how to read from screen: "See 2nd example of UWPOCR, it is reading from screen..."
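
Whatever OCR engine is used, the step after recognition reduces to matching the recognized words against a target list and turning each match into a click position. A minimal, engine-neutral sketch of that logic follows, written in Python rather than AutoIt purely for illustration. The `OcrWord` shape and the `click_points` helper are hypothetical and are not part of UWPOCR, MODI, or Tesseract; a real OCR result would have to be mapped into this form first.

```python
from typing import NamedTuple


class OcrWord(NamedTuple):
    """One recognized word plus its bounding box (hypothetical result shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def click_points(words, targets):
    """Return the center (x, y) of every recognized word that matches a target.

    Matching is case-insensitive and exact on whole words.
    """
    wanted = {t.lower() for t in targets}
    points = []
    for w in words:
        if w.text.lower() in wanted:
            points.append((w.left + w.width // 2, w.top + w.height // 2))
    return points


# Made-up sample data standing in for one OCR pass over the window.
sample = [
    OcrWord("hello", 40, 100, 60, 20),
    OcrWord("test", 40, 200, 50, 20),
]
print(click_points(sample, ["test"]))  # [(65, 210)]
```

An empty result means none of the target words were found in this pass, which is the condition under which the script would stop and wait for the next hotkey.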

Even if you google around and find that Snapchat has better APIs to do this, it's still a lot of work.

Easier would be if you do not need the words

Check if pixels change in certain area https://www.autoitscript.com/autoit3/docs/functions/PixelChecksum.htm

goldieczr · January 18, 2022

I already completed the first 4 steps in the first 5 minutes after installing AutoIt because they're really extremely easy to use. However, the 5th step with OCR seems like it needs quite some learning of other stuff before I can understand how it works and apply it in my script, which is where I got stuck.

I may actually be interested in the PixelChecksum method; however, there's a dilemma. As already said, the message sent by me will have a red line on the left of the screen. However, the location (height-wise) of that red line isn't fixed, since it depends on the length of the message before it (if the person that sent us a message wrote a long message, our reply will be shown lower on the screen). Is it possible to make the script scan a rectangular piece of the screen when a key is pressed and, once it detects a red pixel, move the mouse a bit to the right and click?

Unfortunately Snapchat is very conservative with their app usage. For that reason, no emulators work for snap and we have to use real phones connected to the computer, so there's no API to help us with that.

junkew · January 18, 2022

yes, you need the win* functions for that to get the area

And based on that you can start with the pixel* functions to find your red pixels
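
The scan being discussed (look through a rectangle, stop at the first red pixel, click a little to its right) can be sketched without any screen access. The Python below works on an in-memory grid of RGB tuples and is only an illustration of the logic, not AutoIt code; in AutoIt the built-in `PixelSearch` function searches a rectangle for a color and `WinGetPos` returns the window area. The `is_red` thresholds and the `offset_x` value are assumptions that would need tuning against the real red line.

```python
def is_red(rgb, min_red=200, max_other=80):
    """Treat a pixel as 'red' when red is strong and green/blue are weak.

    The thresholds are illustrative guesses, not measured values.
    """
    r, g, b = rgb
    return r >= min_red and g <= max_other and b <= max_other


def find_red_marker(pixels, left, top, right, bottom, offset_x=30):
    """Scan the rectangle row by row, top to bottom (bounds inclusive).

    pixels is indexed as pixels[y][x]. Returns the point offset_x pixels to
    the right of the first red pixel found, or None if there is none.
    """
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            if is_red(pixels[y][x]):
                return (x + offset_x, y)
    return None


# Made-up 6x5 white image with a single red pixel at x=1, y=3.
WHITE = (255, 255, 255)
grid = [[WHITE] * 6 for _ in range(5)]
grid[3][1] = (230, 30, 40)

print(find_red_marker(grid, 0, 0, 2, 4, offset_x=3))  # (4, 3)
```

Because the scan runs top to bottom, the first hit is the topmost red line in the rectangle, so the vertical position of the message does not need to be known in advance.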

To get into full pixel detail you have to use BitBlt, but I would classify this as advanced stuff too. Search the forum for BitBlt and GDI for more advanced screen pixel things.
Very long ago I made the below; maybe there are some things you can reuse.
