jimg Posted March 24, 2022
I'm trying to automate a repetitious task in a third-party app, and neither the x86 nor the x64 Spy can see anything on the page except mouse motions. The app includes multiple screens (that look like windows), and some links and buttons I need to press based on displayed text. It's an SQL-based app, so learning enough SQL to perform the task is more than I'm ready to tackle. In the enclosed example, I need to click through the list of entries, and each entry offers different options on a subsequent screen. I've got the script to the point where I can click through all the links and move between pages based solely on mouse coordinates, but I have to pause for human interaction after each click to respond to varying but predictable questions with easy answers, typically clicking some boxes or clicking a button. Mouse coordinates aren't useful there if I can't identify the questions. Can someone point me at the functions that might help with this endeavor?
Melba23 (Moderators) Posted March 24, 2022
Moved to the appropriate forum.
Moderation Team
Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.
My UDFs:
ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area
Nine Posted March 24, 2022
Test with UIASpy.au3 to see if you can reach the different controls...
“They did not know it was impossible, so they did it” ― Mark Twain
Block all input without UAC
Save/Retrieve Images to/from Text
Monitor Management (VCP commands)
Tool to search in text (au3) files
Date Range Picker
Virtual Desktop Manager
Sudoku Game 2020
Overlapped Named Pipe IPC
HotString 2.0 - Hot keys with string
x64 Bitwise Operations
Multi-keyboards HotKeySet
Recursive Array Display
Fast and simple WCD IPC
Multiple Folders Selector
Printer Manager
GIF Animation (cached)
Screen Scraping
Multi-Threading Made Easy
jimg Posted March 26, 2022
Using UIASpy, F1 and F2 result in the same instance/control name for everything in the list, with no sign of content. If I could read all the content in a block, I could parse it, but I'll need to read the user guide a little more to see if there is more buried somewhere.
Danp2 Posted March 26, 2022
@jimg From your screenshots, it appears that you are trying to control an app within an RDP session. If so, that would explain why you are unable to access the individual controls. Have you tried running your script from inside the RDP session?
Latest Webdriver UDF Release Webdriver Wiki FAQs
jimg Posted March 26, 2022
I am running the app and UIASpy on the remote computer's desktop. I am connected (at the moment) via RDP, but I don't think that should interfere. I'll be at the office later to try it locally. I'm guessing the app is opening a pseudo-window and using an SQL query to create the table. Perhaps content generated that way doesn't obey the usual rules for windows/instances/controls? Although the screenshot looks like multiple windows, you can't iconify them or change the window order, so they aren't "real" windows.
Nine Posted March 26, 2022
Hmmm. If F1/F2 didn't give you any content, then you will have to resort to OCR.
jimg Posted March 26, 2022 (edited)
Ugh. Is Tesseract the method of choice?
Nine Posted March 26, 2022
On Win10 there is:
On Win7 I used the MODI.document COM object quite successfully (very easy to implement). Of course, Tesseract is an obvious option. You can create your own specialized OCR script based on my Screen Scraping UDF (see signature). I have done that twice, very effectively. When you know what you are looking for (numeric vs. alphabetic) and the colors (foreground and background), it is not that hard, believe me. But my preferred solution in your situation would be to query the database directly with SQL statements.
jimg Posted March 26, 2022 (edited)
Unfortunately, UWPOCR (or the built-in Windows software) doesn't parse underscores. A single "_" gets converted to a space, and a double underscore ("__") is treated as a delimiter, which scrambles my whole result. Also, given the font size, it frequently misrecognizes some characters ("D" and "O" in particular). I've tried using _ScreenCapture_Capture to capture a single line, but it won't recognize small areas, for unknown reasons. I've looked at the SQL approach, and given that my app has 1400 interwoven tables, it's not for the meek.
Nine Posted March 26, 2022
I think you are giving up too easily. There is always a solution, always. But you need to stay positive. Your reaction is quite negative at the moment. Anyway, unless you start showing some code you have written, with specific examples of the DB schema or a precise image of what you have tried to automate so far, I am afraid our help will end here.
jimg Posted March 26, 2022
I have no documentation of the SQL data, and there are no database diagrams (I assume those might be enlightening). Decoding the 1400 primary tables to extract the appropriate entries and find the links to the files is beyond my capabilities to accomplish in just a few days. This is a 30 GB database. I'm a retired hardware guy, not a programmer.
UWPOCR was very promising until I realized the OCR couldn't deal with underscores. In a further twist, it won't recognize the string "XML" or "xml" either; it simply acts as if it is not there. It's almost as if it is hiding known filename extensions...
My screen capture produces "scrape.jpg", and the single line of OCR code fails to recognize it, as shown in "list.txt". I've overlapped the file manager window in the capture to show that it does recognize that part of the screen.
_ScreenCapture_Capture("./scrape.jpg", 365, 190, 420, 700, 0)
$imagefilepath1 = "./scrape.jpg"
$list = _UWPOCR_GetText($imagefilepath1)
I've played with the area dimensions to include more or less space, eliminate the artifacts from the column separators, etc.
Attached: list.txt
jimg Posted March 27, 2022
Curiously, if I create the same data using Notepad, it translates fine. I've enhanced the brightness/contrast, eliminated the alternating background colors, etc., and the attached image still doesn't work (on several Win10 desktops). Mystified.
jimg Posted April 3, 2022
I've managed to solve my problem by using Tesseract on very small screen regions. If I get all the table lines out of the image, Tesseract is quite good at recognition despite background color, etc. Luckily, in my application, once I know the location of a cell in the table, the other cell locations are just math. So I have a calibration function that allows the user to tweak coordinates using a sort of homebrew magnifier function.
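The approach described in this post (compute each cell's rectangle from a calibrated origin, capture only that cell, and hand it to Tesseract) could look roughly like the sketch below. It is not jimg's actual script. The origin, row height, column edges and inset are made-up placeholder values, `_CellRect` and `_ReadCell` are hypothetical helper names, and it assumes the `tesseract` command-line program is installed and on the PATH.

```autoit
#include <ScreenCapture.au3>

; Hypothetical calibration values: replace them with the numbers your own
; calibration step produces. None of them come from the original posts.
Global Const $g_iTableLeft = 365   ; x of the left edge of the first column
Global Const $g_iTableTop = 190    ; y of the top edge of the first row
Global Const $g_iRowHeight = 17    ; height of one table row in pixels
Global Const $g_iInset = 2         ; pixels trimmed per side to keep grid lines out

; Left edge of each column relative to $g_iTableLeft, plus one final entry
; for the right edge of the last column (so this describes three columns).
Global $g_aColEdges[4] = [0, 55, 180, 260]

; Returns [left, top, right, bottom] in screen coordinates for the cell at
; zero-based $iRow/$iCol, shrunk by $g_iInset so the cell borders are excluded.
Func _CellRect($iRow, $iCol)
    Local $aRect[4]
    $aRect[0] = $g_iTableLeft + $g_aColEdges[$iCol] + $g_iInset
    $aRect[1] = $g_iTableTop + $iRow * $g_iRowHeight + $g_iInset
    $aRect[2] = $g_iTableLeft + $g_aColEdges[$iCol + 1] - $g_iInset
    $aRect[3] = $g_iTableTop + ($iRow + 1) * $g_iRowHeight - $g_iInset
    Return $aRect
EndFunc

; Captures one cell to a PNG, runs Tesseract on it, and returns the text.
Func _ReadCell($iRow, $iCol)
    Local $aRect = _CellRect($iRow, $iCol)
    Local $sImage = @TempDir & "\cell.png"
    Local $sOutBase = @TempDir & "\cell"
    ; Remove any stale result so a failed run cannot return old text.
    FileDelete($sOutBase & ".txt")
    _ScreenCapture_Capture($sImage, $aRect[0], $aRect[1], $aRect[2], $aRect[3], False)
    ; --psm 7 asks Tesseract to treat the image as a single line of text.
    ; Tesseract writes its result to <output base>.txt.
    RunWait('tesseract "' & $sImage & '" "' & $sOutBase & '" --psm 7', "", @SW_HIDE)
    ; Flag 3 strips leading and trailing whitespace.
    Return StringStripWS(FileRead($sOutBase & ".txt"), 3)
EndFunc

; Example: read the first cell of the first row.
ConsoleWrite(_ReadCell(0, 0) & @CRLF)
```

Keeping the rectangle math in its own function means a calibration step only has to adjust the handful of global values, and the inset is one simple way to keep the table lines out of the captured image.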
junkew Posted April 3, 2022
Isn't there an export menu, or some folders within the file system with XML files you could read? Did you try MSAA mode in UIASpy?
FAQ 31 How to click some elements, FAQ 40 Test automation with AutoIt, Multithreading CLR .NET Powershell CMDLets
jimg Posted April 3, 2022
I'm not sure what MSAA mode is. It's not anywhere obvious. The entire table lives in one Pane, and I don't get any data as I move the cursor around the table. There's no export function, and the data is the result of an SQL query, which I don't have access to, since it's a proprietary EHR package.
junkew Posted April 4, 2022
Within inspect.exe and UIASpy you can enable the MSAA accessibility mode in the menu, which can sometimes help to identify more on the screen. But it's hard to learn anything from an uncooperative control element.
jimg Posted April 4, 2022
inspect.exe?
junkew Posted April 4, 2022
You are in the area of an application that's not easily recognizable, and as such you have to google and read a lot. Google for inspect.exe and you will find it. But also check FAQ 31: https://www.autoitscript.com/wiki/FAQ#How_can_I_control_.28click.2C_edit_etc.29_an_external_.28html.29_application.3F
Tell us what's recognized with the different spy tools. If you give the class names of the windows and say what gets highlighted and what doesn't, maybe someone can help further. But at first sight it looks like you will be stuck with OCR, BitBlt functions and calculations.