hamohd70 Posted September 5, 2014
How can I use the StringRegExp function to extract any line that contains one of the following words from a file: http, rtmp, rtmps, https? Thanks.

JohnOne Posted September 5, 2014
I'm no regexpert, but I think it's wise to post the file as it is. Otherwise you can spend all day guessing what it uses as line endings.

hamohd70 Posted September 5, 2014 (Author)
here is the file..

BrewManNH Posted September 5, 2014
hamohd70 said: "here is the file.."
I think you missed something.

MikahS Posted September 5, 2014
hamohd70 said: "here is the file.."
I think you forgot the file.

JonathanD Posted September 5, 2014

    #include <Array.au3>
    ; Read the whole file into one string (note the backslash after @ScriptDir)
    Global $fileRead = FileRead(@ScriptDir & '\teste.txt')
    ; Flag 3 returns every match: each line that contains http, https, rtmp or rtmps
    Global $results = StringRegExp($fileRead, '.*(?:https?|rtmps?).*', 3)
    _ArrayDisplay($results)

Just replace @ScriptDir & '\teste.txt' with your text file path.

jguinch Posted September 5, 2014 (edited)
For extracting lines containing URLs:

    $aLines = StringRegExp($sContent, "(?mi)(\N*(?:(?:http)|(?:https)|(?:rtmp)|(?:rtmps)):\N*)", 3)
    _ArrayDisplay($aLines)

hamohd70 Posted September 5, 2014 (Author)
Sorry, I forgot to attach the file. Thanks jguinch for the snippet. I did it this way:

    $file = FileOpenDialog("Select your file", @DesktopDir, "Text document (*.txt)|All files (*.*)")
    $data = FileRead($file)
    $lines = StringSplit($data, @LF)
    ; Counters must exist before they are incremented
    $upscount = 0
    $downscount = 0
    If IsArray($lines) Then
        $linecount = $lines[0]
    Else
        $linecount = 0
    EndIf
    For $i = 1 To $linecount
        ; Strip leading whitespace from the current line
        $txt = StringStripWS(FileReadLine($file, $i), 1)
        If StringInStr($txt, '"') Then $txt = StringReplace($txt, '"', "")
        If StringRegExp($txt, "(?mi)(\N*(?:(http)|(https)|(rtmp)|(rtmps)):\N*)") Then
            ; Keep only the part of the line that starts at the protocol name
            If StringInStr($txt, "http") Then
                $txt = StringMid($txt, StringInStr($txt, "http"))
            ElseIf StringInStr($txt, "https") Then
                $txt = StringMid($txt, StringInStr($txt, "https"))
            ElseIf StringInStr($txt, "rtmp") Then
                $txt = StringMid($txt, StringInStr($txt, "rtmp"))
            ElseIf StringInStr($txt, "rtmps") Then
                $txt = StringMid($txt, StringInStr($txt, "rtmps"))
            EndIf
            $upscount = $upscount + 1
            ConsoleWrite($txt & @CRLF)
        EndIf
        If StringInStr($txt, '#EXT') Then
            $downscount = $downscount + 1
        EndIf
    Next
    Exit

Works just fine. Any comments to make it more efficient are welcome!
Attachment: boc.txt
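
One possible way to tighten the loop above, offered only as a sketch: the file content is already in memory after FileRead, so the lines can be taken from the split array instead of calling FileReadLine for every index. The sample text below is hypothetical and merely stands in for the attached boc.txt, whose exact contents are not shown in this thread; the variable names are illustrative as well.

```autoit
; Hypothetical sample text (NOT the real boc.txt).
Local $sData = '#EXTINF:0,Channel One' & @CRLF & _
        '  "http://example.com/one.m3u8"' & @CRLF & _
        'just a comment line' & @CRLF & _
        'rtmp://example.com/live/two' & @CRLF

; Drop carriage returns, then split once on line feeds.
Local $aLines = StringSplit(StringStripCR($sData), @LF)
Local $iUrls = 0

For $i = 1 To $aLines[0]
    ; Flag 1 returns an array of captured groups; @error is non-zero when nothing matched.
    Local $aUrl = StringRegExp($aLines[$i], '(?i)((?:http|rtmp)s?:[^"\r\n]+)', 1)
    If @error Then ContinueLoop
    $iUrls += 1
    ConsoleWrite($aUrl[0] & @CRLF)
Next

ConsoleWrite('URL lines found: ' & $iUrls & @CRLF)
```

The capturing group already starts at the protocol name and stops at a double quote, so the StringInStr/StringMid chain and the quote-stripping step are not needed in this variant.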

mikell Posted September 5, 2014 (edited)

    #include <Array.au3>
    $text = FileRead("boc.txt")
    $aRes = StringRegExp($text, '(?:http|rtmp)s?[^"\r\n]+', 3)
    _ArrayDisplay($aRes)

This gets the URLs. To get the whole lines, use this:

    $aRes = StringRegExp($text, '\N*(?:http|rtmp)s?[^"\r\n]+', 3)
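
To compare the two patterns side by side without the attachment, here is a minimal sketch that runs both on an inline string. The sample text and variable names are assumptions made for illustration, not the contents of boc.txt.

```autoit
; Hypothetical sample text (NOT the real boc.txt).
Local $sText = 'Channel One,http://example.com/one.m3u8' & @CRLF & _
        'no url on this line' & @CRLF & _
        'Channel Two,rtmps://example.com/live/two' & @CRLF

; Pattern 1: the match starts at the protocol name, so only the URL is returned.
Local $aUrls = StringRegExp($sText, '(?:http|rtmp)s?[^"\r\n]+', 3)
If @error Then
    ConsoleWrite('No URL found' & @CRLF)
Else
    For $sUrl In $aUrls
        ConsoleWrite('URL : ' & $sUrl & @CRLF)
    Next
EndIf

; Pattern 2: the leading \N* also consumes the text in front of the URL on the same line.
Local $aLines = StringRegExp($sText, '\N*(?:http|rtmp)s?[^"\r\n]+', 3)
If Not @error Then
    For $sLine In $aLines
        ConsoleWrite('Line: ' & $sLine & @CRLF)
    Next
EndIf
```

Checking @error right after each call avoids looping over a non-array when nothing matches.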

jguinch Posted September 5, 2014 (edited)
Thanks Mikell for the simplified expression. Maybe we can add single quotes?

    $aRes = StringRegExp($text, '(?:http|rtmp)s?[^"''\r\n]+', 3)

Solution: mikell Posted September 5, 2014 (edited)
It doesn't seem very useful, as there are no single quotes in the OP's file.
BTW, using this file, this one is funny too:

    #include <Array.au3>
    $text = FileRead("boc.txt")
    $aRes = StringRegExp($text, '(?m)^(?:.*?(?<=\A|\v{4}|,|el=)([^":,\r\n]+))|(?:http|rtmp|rtsp)s?[^"\r\n]+', 3)
    ; Results alternate channel, URL, channel, URL...: fold them into a 2-column array
    Dim $a[UBound($aRes)/2][2]
    For $i = 0 To UBound($aRes)-1
        If Mod($i, 2) = 0 Then
            $a[$i/2][0] = $aRes[$i]
        Else
            $a[($i-1)/2][1] = $aRes[$i]
        EndIf
    Next
    _ArrayDisplay($a)

Quite unsafe if the file changes, so it is intended for playing only.
Edit: expression simplification

jguinch Posted September 5, 2014

    $mikell = "RegExpMan"

hamohd70 Posted September 6, 2014 (Author)
Interesting code. Can you please explain the StringRegExp part?

    $aRes = StringRegExp($text, '(?m)^(?:.*?(?<=\A|\v{4}|,|el=)([^":,\r\n]+))|(?:http|rtmp|rtsp)s?[^"\r\n]+', 3)

mikell Posted September 6, 2014 (edited)

    $aRes = StringRegExp($text, '(?mx) ^ (?:.*? (?<=\A|\v{4}|,|el=) ([^":,\v]+) ) | (?:http|rtmp|rtsp)s?[^"\v]+', 3)

(?m) multiline: allows ^ to match at the start of each line.
The main | causes the regex to match alternatively a channel or a URL.

First part (channels): (?:.*? matches everything, until the capturing group matching the sequence defined by the character class ([^":,\v]+) and preceded (?<= by either \A (beginning of text), \v{4} (i.e. @CRLF & @CRLF), a comma, or el=. Returns the capturing group.

Second part (URLs): matches http, rtmp, rtsp AND an optional s AND the sequence defined by the character class [^"\v]+. The lack of a capturing group in this part causes the regex to return the whole match.

Edit: For details and more definitions, please have a look at the helpfile, where jchd wrote a very nice chapter explaining StringRegExp.
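
The difference between "returns the capturing group" and "returns the whole match" can be seen in isolation with a much smaller pattern. This is only an illustrative sketch; the sample string and variable names are made up and do not come from the OP's file.

```autoit
Local $sText = 'name=alpha,name=beta'

; With a capturing group, flag 3 returns only what the group captured.
Local $aGroups = StringRegExp($sText, 'name=(\w+)', 3)
For $sItem In $aGroups
    ConsoleWrite('group: ' & $sItem & @CRLF) ; alpha, then beta
Next

; Without any capturing group, flag 3 returns each whole match.
Local $aWhole = StringRegExp($sText, 'name=\w+', 3)
For $sItem In $aWhole
    ConsoleWrite('whole: ' & $sItem & @CRLF) ; name=alpha, then name=beta
Next
```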

hamohd70 Posted September 6, 2014 (Author)
Thanks.

jchd Posted September 6, 2014
For anyone interested, you can read a step-by-step, correct, plain-English breakdown of a PCRE pattern by pasting it there. Applied to the above pattern, you get this:

    /(?m)^(?:.*?(?<=\A|\v{4}|,|el=)([^":,\r\n]+))|(?:http|rtmp|rtsp)s?[^"\r\n]+/

    1st Alternative: (?m)^(?:.*?(?<=\A|\v{4}|,|el=)([^":,\r\n]+))
        (?m) Match the remainder of the pattern with the following options:
            m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
        ^ assert position at start of a line
        (?:.*?(?<=\A|\v{4}|,|el=)([^":,\r\n]+)) Non-capturing group
            .*? matches any character (except newline)
                Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
            (?<=\A|\v{4}|,|el=) Positive Lookbehind - Assert that the regex below can be matched
                1st Alternative: \A
                    \A assert position at start of the string
                2nd Alternative: \v{4}
                    \v{4} matches any vertical whitespace character
                        Quantifier: Exactly 4 times
                3rd Alternative: ,
                    , matches the character , literally
                4th Alternative: el=
                    el= matches the characters el= literally (case sensitive)
            1st Capturing group ([^":,\r\n]+)
                [^":,\r\n]+ match a single character not present in the list below
                    Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
                    ":, a single character in the list ":, literally (case sensitive)
                    \r matches a carriage return (ASCII 13)
                    \n matches a line-feed (newline) character (ASCII 10)
    2nd Alternative: (?:http|rtmp|rtsp)s?[^"\r\n]+
        (?:http|rtmp|rtsp) Non-capturing group
            1st Alternative: http
                http matches the characters http literally (case sensitive)
            2nd Alternative: rtmp
                rtmp matches the characters rtmp literally (case sensitive)
            3rd Alternative: rtsp
                rtsp matches the characters rtsp literally (case sensitive)
        s? matches the character s literally (case sensitive)
            Quantifier: Between zero and one time, as many times as possible, giving back as needed [greedy]
        [^"\r\n]+ match a single character not present in the list below
            Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
            " a single character in the list " literally (case sensitive)
            \r matches a carriage return (ASCII 13)
            \n matches a line-feed (newline) character (ASCII 10)

Actually, you get even more if you leave colorizing ON. Option 3 can be simulated by typing g in the modifier input. You can use the very useful "regex debugger" tool to follow stepwise how the engine proceeds through subject and pattern.