goldenix, posted April 24, 2010 (edited)

I have this download manager that downloaded a large number of files. The file names look like the example below. I had to add spaces, or the browser would convert my text into Japanese. I can't rename the files manually because it would take forever. How can I convert those strings into Japanese strings, the way the browser does, so I can rename my files? What options do I have?

    &#12354 ;&#12427 ;&#12414 ;&#12376 ;&#12429 ;&#12358 ;&#21029 ;&#12473 ;&#12461 ;&#12515 ;&#12531 ;.jpg

Edited April 24, 2010 by goldenix

My Projects:
* Guide - ytube step by step tut for reading memory with autoitscript + samples
* WinHide - tool to show hide windows, Skinned With GDI+
* Virtualdub batch job list maker - Batch Process all files with same settings
* Exp calc - Exp calculator for online games
* Automated Microsoft SQL Server 2000 installer
* Image sorter helper for IrfanView - 1 click opens img & move ur mouse to close opened img
JohnOne, posted April 24, 2010

Well, if it converts it to Japanese when you don't add spaces, then the easiest way to convert it to Japanese would be not to add spaces.

AutoIt Absolute Beginners Require a serial Pause Script Video Tutorials by Morthawt ipify
Monkey's are, like, natures humans.
goldenix (author), posted April 24, 2010

Quoting JohnOne: "Well, if it converts it to Japanese when you don't add spaces, then the easiest way to convert it to Japanese would be not to add spaces."

The browser does this, but the source remains the same. The question is: how do I get what I see, not what is in the page source?
JohnOne, posted April 24, 2010

StringSplit perhaps, using ; as the delimiter.
doudou, posted April 24, 2010

Quoting goldenix: "The browser does this, but the source remains the same. The question is: how do I get what I see, not what is in the page source?"

Try:

    $img = $htmlDoc.images("someID")
    $fileName = $img.getAttribute("src")
    MsgBox(0, "Hi Nippon", "File name is: " & $fileName)

UDFS & Apps:
* DDEML.au3 - DDE Client + Server
* Localization.au3 - localize your scripts
* TLI.au3 - type information on COM objects (TLBINF emulation)
* TLBAutoEnum.au3 - auto-import of COM constants (enums)
* AU3Automation - export AU3 scripts via COM interfaces
* TypeLibInspector - OleView was yesterday
Coder's last words before final release: WE APOLOGIZE FOR INCONVENIENCE
goldenix (author), posted April 24, 2010

Quoting doudou:

    $img = $htmlDoc.images("someID")
    $fileName = $img.getAttribute("src")
    MsgBox(0, "Hi Nippon", "File name is: " & $fileName)

This is only half of the code, and I don't understand how to use it. What is someID? What is $htmlDoc? Must it be a file name?
doudou, posted April 24, 2010

Quoting goldenix: "This is only half of the code, and I don't understand how to use it. What is someID? What is $htmlDoc? Must it be a file name?"

You haven't posted any code at all yet. In order to build a common base of understanding, it would be helpful to see some of it. I just assumed you use MSHTML.HTMLDocument; that is what $htmlDoc is. If I am wrong, show what you really do.
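For readers with the same question about $htmlDoc and "someID", here is a minimal sketch of what a complete script might look like. It assumes Internet Explorer's COM automation is still available on the machine and uses the standard IE.au3 UDF to obtain the document. The HTML snippet, the id "someID", and the entity-encoded file name are all made up for illustration; they are not taken from the original poster's page.

```autoit
#include <IE.au3>

; Hypothetical test markup: one <img> whose id is "someID" and whose file
; name is written as numeric HTML entities.
Local $sHtml = '<img id="someID" src="&#12354;&#12427;.jpg">'

; Start an invisible IE instance (third parameter 0 = not visible).
Local $oIE = _IECreate("about:blank", 0, 0)
_IEBodyWriteHTML($oIE, $sHtml)

; $oIE.document is the MSHTML document object that $htmlDoc stands for.
Local $oImg = _IEGetObjById($oIE, "someID")
If IsObj($oImg) Then
    ; MSHTML documents flag 2 as "return the value as written in the source",
    ; which avoids getting a resolved absolute URL back.
    MsgBox(0, "Decoded src", $oImg.getAttribute("src", 2))
EndIf

_IEQuit($oIE)
```

The point of the sketch is only that the HTML parser decodes the entities before the script ever sees the attribute value. Whether the flag behaves identically in every IE document mode is not something this thread confirms.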
goldenix (author), posted April 24, 2010 (edited)

I took your idea and made this for now, but I don't like it. I was thinking maybe I could do it quietly with your code, without using IE. MSHTML.HTMLDocument: I don't have the slightest clue how to use it. I have never seen this before. And what is someID?

    #include <IE.au3>
    $oIE = _IECreate ("file:///C:/Documents%20and%20Settings/biteme/Desktop/New%20AutoIt%20v3%20Script.html")
    $oImgs = _IEImgGetCollection ($oIE)
    $iNumImg = @extended
    For $oImg In $oImgs
        ConsoleWrite($oImg.src & @CRLF)
        $split = StringSplit($oImg.src,'/',1) ; get file name
        $filename = $split[$split[0]]
        ConsoleWrite($filename & @CRLF)
        FileMove('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.rar', _
                $filename)
    Next
    _IEQuit($oIE)
    Exit

Edit: OK, it's just that I have never seen this before. I think I figured it out. But I still need to create an IE window, so it's the same as what I made. I guess I'll just have to list all files in one HTML file and loop over it. If I'm wrong, feel free to correct me. Thanks.

    $ObjIE=ObjCreate("InternetExplorer.Application")
    With $ObjIE
        .Visible = True
        .Navigate("C:\Documents and Settings\biteme\Desktop\New AutoIt v3 Script.html")
        While .ReadyState <> 4 ; 4 = READYSTATE_COMPLETE
            Sleep(50)
        WEnd
    EndWith
    $document = $objIE.document
    $img = $document.getElementsByTagName("img").item(0)
    $fileName = $img.getAttribute("src")
    $split = StringSplit($fileName,'/',1) ; get file name
    $filename = $split[$split[0]]
    ConsoleWrite($filename & @CRLF)
    FileMove('[&#12354;.rar', _
            $filename)

Edited April 24, 2010 by goldenix
jchd, posted April 24, 2010 (edited)

The issue is simple: the sequence &#12345; is the HTML way to denote Unicode codepoint 12345 (decimal). It's about the same as "&amp;" coding for the ampersand character itself, "&". So when you get such a sequence, regexp it into ChrW(12345), then execute, and it should work. Try this:

    Local $s = 'Unicode html string with some fractional ((2n+1)/8) html codepoints &#8539;=1/8 &#8540;=3/8 &#8541;=5/8 &#8542;=7/8'
    $s = Execute('"' & StringRegexpReplace($s, "&#(\d+);", '"&chrw($1)&"') & '"')
    MsgBox(0, "Html Unicode test", $s & @LF)

Edited April 24, 2010 by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.
SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
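One caveat with the Execute() approach: it builds an AutoIt expression out of the input, so a double quote inside the input string would unbalance the generated expression. Windows file names cannot contain a double quote, so this may not matter for the renaming task in this thread. For other inputs, a loop-based variant avoids Execute() altogether. The sketch below is an alternative, not the method posted above; the function name _DecodeNumericEntities is hypothetical, and the hexadecimal form (&#x3042;) is an addition the thread does not discuss. ChrW() only covers code points up to 65535, so anything above that is not handled here.

```autoit
; Decode decimal (&#12354;) and hexadecimal (&#x3042;) numeric entities
; without Execute(). _DecodeNumericEntities is a hypothetical helper name.
Func _DecodeNumericEntities($sText)
    ; Flag 3 returns an array holding the captured digits of every match.
    Local $aDec = StringRegExp($sText, "&#(\d+);", 3)
    If Not @error Then
        For $i = 0 To UBound($aDec) - 1
            ; StringReplace swaps every occurrence of this exact entity.
            $sText = StringReplace($sText, "&#" & $aDec[$i] & ";", ChrW(Number($aDec[$i])))
        Next
    EndIf

    Local $aHex = StringRegExp($sText, "(?i)&#x([0-9a-f]+);", 3)
    If Not @error Then
        For $i = 0 To UBound($aHex) - 1
            ; Dec() converts the hexadecimal digits to a number for ChrW().
            $sText = StringReplace($sText, "&#x" & $aHex[$i] & ";", ChrW(Dec($aHex[$i])))
        Next
    EndIf

    Return $sText
EndFunc

; Usage: the embedded double quotes are left untouched.
ConsoleWrite(_DecodeNumericEntities('say "&#72;&#x69;"') & @CRLF) ; say "Hi"
```

The trade-off is one StringReplace call per match instead of a single regular-expression pass, which should be negligible for file names.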
KaFu, posted April 24, 2010

Quoting jchd: "So when you get such a sequence, regexp it into ChrW(12345), then execute, and it should work."

Really nice solution. I remembered that you had written something similar some weeks ago, but I couldn't get it to work in this case ...

OS: Win10-22H2 - 64bit - German, AutoIt Version: 3.3.16.1, AutoIt Editor: SciTE, Website: https://funk.eu
* AMT - Auto-Movie-Thumbnailer (2024-Oct-13)
* BIC - Batch-Image-Cropper (2023-Apr-01)
* COP - Color Picker (2009-May-21)
* DCS - Dynamic Cursor Selector (2024-Oct-13)
* HMW - Hide my Windows (2024-Oct-19)
* HRC - HotKey Resolution Changer (2012-May-16)
* ICU - Icon Configuration Utility (2018-Sep-16)
* SMF - Search my Files (2024-Oct-20) - THE file info and duplicates search tool
* SSD - Set Sound Device (2017-Sep-16)
jchd, posted April 24, 2010

Which version/target string did you try? I use this about 100000 times a day and Execute has never gone on strike. (OK, I admit I bribe it with plenty of $nnn with nnn > 0)
goldenix (author), posted April 25, 2010 (edited)

Quoting jchd: "So when you get such a sequence, regexp it into ChrW(12345), then execute, and it should work."

Oh, I see, it works. If you don't mind, can you please explain why we need Execute? And this is what I do not understand: what are these for? I cannot grasp the logic here: (\d+) and "&chrw($1)&"

    StringRegexpReplace($s, "&#(\d+);", '"&chrw($1)&"')

Edited April 25, 2010 by goldenix
doudou, posted April 25, 2010 (edited)

People, I don't get it: why bother with RegExp when the MSHTML DOM's getAttribute() gives you all entities decoded to the system codepage for free? RegExp not only consumes many more resources but is also proven to be unreliable in decoding entities, especially because you have to catch all possible forms and encodings: "&named;", "&#12345 ;" (16 bit), " ; ;" (8 bit) etc.

Edited April 25, 2010 by doudou
jchd, posted April 25, 2010

@Doudou,
The OP didn't reference any _IE functions at all. I also don't believe it consumes many resources, and I hardly see how "&#(\d+);" could possibly match anything other than an HTML Unicode codepoint (that, or the HTML grammar is flawed beyond repair, which should have been noticed before, BTW).

Since you mention it, decoding HTML codepoints into the system codepage sounds weird to me. Do you mean it would decode characters in the 0x80..0xFF range as per the codepage, but any codepoint above that as per Unicode? That would be wrong, very wrong.

I agree that regexps are not the only truth on earth, but when specific issues like these arise they prove handy, nothing less.

@goldenix,
No black magic involved. Let's take it in small pieces. The sequences you mentioned are composed of an ampersand, followed by a sharp, followed by a decimal value, followed by a semicolon. The goal is to pick up the decimal value and feed it to ChrW(), like you would do in a simple AutoIt statement such as Local $char = ChrW(169) to produce a copyright sign.

If we stick to this task, one simple way to do it is to use a regular expression:

    StringRegExpReplace($s, "&#(\d+);", '"&chrw($1)&"')

The pattern is &#(\d+); where
* & matches an ampersand
* # matches a sharp
* () the parentheses form a capturing group, the first one, so it can be referred to later as $1
* \d+ is what we capture: \d stands for a decimal digit and + means one or more
* ; matches a semicolon

In short, the pattern recognizes the sequence we are after and captures the decimal string within it under the "name" $1.

Now let's look at the replace part. StringRegExpReplace will ... replace the matched parts of the string with the replace pattern, which is "&chrw($1)&". We've seen that $1 contains the decimal value that we want, and we want to feed it to ChrW(). That's what the replace pattern is doing! But it's just concatenating the required strings for doing so; nothing has been executed yet.

Now, we need to enclose the whole string returned by StringRegExpReplace in double quotes. Why? Because the ChrW(<value>) call has not been executed yet, and the parts we have need opening and closing quotes to be valid AutoIt grammar:

    Local $s = 'abc&#123;def' ;; ChrW(123) = '{'
    $s = StringRegExpReplace($s, "&#(\d+);", '" & ChrW($1) & "') ;; I put whitespace to make things clearer

will produce exactly abc" & ChrW(123) & "def, so you see why we need the enclosing quotes: to obtain "abc" & ChrW(123) & "def", which is now valid syntax.

The final bit is to execute this very expression, as if you had typed it in a line of source, and assign the result to a variable (which can still be $s). $s = Execute('"abc" & ChrW(123) & "def"') produces $s = "abc{def", which is what we wanted.

To preserve compactness, and once such constructs don't surprise you anymore, you can chain them in a single statement as I did.
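To connect the explanation back to the original renaming question, here is a rough sketch of how the one-liner could be applied to a folder of files. Everything around the one-liner is an assumption: the folder path is hypothetical, the wildcard simply looks for names that contain "&#" followed later by ";", and $bDryRun is an illustrative switch that keeps the script from touching any file until the printed mapping has been checked.

```autoit
; Hypothetical folder holding the downloaded files; adjust before use.
Local $sDir = "C:\Downloads\ToRename"
; Illustrative safety switch: True only prints the planned renames.
Local $bDryRun = True

Local $hSearch = FileFindFirstFile($sDir & "\*&#*;*")
If $hSearch = -1 Then Exit

; Collect the names first so renamed files are not enumerated again.
; "|" cannot appear in a Windows file name, so it is a safe separator.
Local $sList = ""
While 1
    Local $sName = FileFindNextFile($hSearch)
    If @error Then ExitLoop
    If @extended Then ContinueLoop ; skip folders
    $sList &= $sName & "|"
WEnd
FileClose($hSearch)
If $sList = "" Then Exit

Local $aNames = StringSplit(StringTrimRight($sList, 1), "|")
For $i = 1 To $aNames[0]
    Local $sOld = $aNames[$i]
    ; The one-liner from the thread: turn each &#NNN; into ChrW(NNN).
    Local $sNew = Execute('"' & StringRegExpReplace($sOld, "&#(\d+);", '" & ChrW($1) & "') & '"')
    If $sNew = "" Or $sNew == $sOld Then ContinueLoop
    ConsoleWrite($sOld & " -> " & $sNew & @CRLF)
    ; Flag 0: do not overwrite an existing file with the same name.
    If Not $bDryRun Then FileMove($sDir & "\" & $sOld, $sDir & "\" & $sNew, 0)
Next
```

Running it first with $bDryRun left at True shows the old and new names side by side. The console may need to be set to a Unicode encoding for the Japanese characters to display correctly.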