BAM5 Posted October 23, 2009 (edited)

Hello all,

I'm creating a library that will allow you to do all things HTTP (hopefully). I'm trying to make it user friendly, so a URL can be entered as "http://etc", "etc", "http://etc/etc", "http://etc/", and so on. To make this easy on myself, I thought I'd use some SRE. Unfortunately, I don't know SRE, so I read up: I went through the StringRegExp tutorial by neogia and downloaded the SRE Tester by Szhlopp. I've been experimenting, and so far I've come up with:

"(?i)(?:http://|)(.*)/"

This works with http://www.google.com/ but not with http://www.google.com. I understand why, just not how to get around it. I also need that last forward slash to be allowed to be the end of the line instead. "[/|\z]"?

You might also get to answer my future questions about getting pages and so on, but I think I'll be able to handle that once I have this figured out.

Thanks in advance, hope to get some meaningful replies.
memoryoverflow Posted October 23, 2009 (edited)

Quoting BAM5: "(?i)(?:http://|)(.*)/" ... "[/|\z]"?

I'm not absolutely sure what the desired result is, but:

In your first pattern, you wouldn't need the null-string alternative, just a ? quantifier (but the null alternative spoils it anyway).

In the latter fragment, the alternation specifier is wrong: a character class [] already alternates between the characters it contains, so | is a literal pipe inside it, and an anchor such as \z does not work as an anchor there.

So, something like

Local $i = 0, $asTest[4] = ['http://www.google.com/','http://www.google.com','www.google.com','www.google.com/sub']
For $i = 0 To UBound($asTest) - 1
    ConsoleWrite($asTest[$i] & ' => ' & StringRegExpReplace($asTest[$i], '(?i)(?>http\://|\A)([^/]*)(?>/\z|\z)','$1') & @crlf )
Next

would strip the surrounds and let you work on with a unified representation.

@ResNullius: I bet there's a simpler pattern that does the same job?

Edit: typos
ResNullius Posted October 23, 2009

Don't know if it's simpler, just more compact:

Local $i = 0, $asTest[5] = ['http://www.google.com/','http://www.google.com','www.google.com','www.google.com/sub','http://google.com']
For $i = 0 To UBound($asTest) - 1
    ConsoleWrite($asTest[$i] & ' => ' & StringRegExpReplace($asTest[$i], '(?:https*:\/\/)([\w.]+)\/?[\w\.$]*','\1') & @crlf )
Next
BAM5 (Author) Posted October 23, 2009 (edited)

I've adapted this from your previous posts:

'(?:https*://|\A)([\w.]+)'

And it works perfectly, thanks!

Edit: Revised again.
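As a minimal sketch of how the adapted pattern might be used to pull out the host part, here it is run through StringRegExp with flag 1 over the test URLs from the earlier posts. The variable names are only illustrative and not part of anyone's library.

```autoit
; Sketch only: the test URLs are borrowed from the earlier posts in this thread.
Local $asTest[4] = ['http://www.google.com/', 'http://www.google.com', 'www.google.com', 'www.google.com/sub']
Local $aMatch

For $i = 0 To UBound($asTest) - 1
    ; Flag 1 returns an array holding the captured groups of the first match.
    $aMatch = StringRegExp($asTest[$i], '(?:https*://|\A)([\w.]+)', 1)
    If @error Then
        ConsoleWrite($asTest[$i] & ' => (no match)' & @CRLF)
    Else
        ; The non-capturing prefix group is not returned, so element 0 is the host.
        ConsoleWrite($asTest[$i] & ' => ' & $aMatch[0] & @CRLF)
    EndIf
Next
```

Two things to keep in mind with this pattern: https* accepts any number of trailing "s" characters (https? would allow at most one), and [\w.] does not include the hyphen, so a host such as my-site.com would be cut off at the hyphen.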
GEOSoft Posted October 23, 2009

Or another one:

"(?i)([hf]t+ps?:[\w\./]+)"
BAM5 (Author) Posted October 23, 2009

All things HTTP, not FTP.
GEOSoft Posted October 23, 2009

Quoting BAM5: "All things HTTP, not FTP"

No matter. That pattern gets either, as well as https.
SmOke_N (Moderator) Posted October 23, 2009

This may come in handy for you down the road if you're going to be playing with URLs.

_URLSplit: http://www.autoitscript.com/forum/index.php?showtopic=36679&st=0&p=271254&#entry271254
BAM5 (Author) Posted October 23, 2009 (edited)

OK, now there's more. I want to get the path of the URL after the domain, so I have this:

(?:https*://|\A)([\w.]+)(/.+)\?{0,1}

It works, but I don't want the question mark to be captured, and it is whenever there is one.

Thanks for all your help so far!

Edit: Oh hey, thanks SmOke_N, I'll look at that when I get home. I have to get to class now.
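One possible way to keep the question mark out of the capture, offered as a sketch rather than a definitive answer: the greedy .+ consumes the ? and everything after it before \?{0,1} is ever tried, so swapping it for the negated class [^?]* stops the path at the first question mark. The test URLs and variable names below are made up for illustration, and making the path group optional is an assumption about the desired behavior.

```autoit
; Hypothetical test URLs; the pattern is a sketch, not a full URL parser.
Local $asTest[3] = ['http://www.google.com/search?q=autoit', 'www.google.com/sub/page.html', 'http://www.google.com']
; Group 1 = host, optional group 2 = path up to (but excluding) the first "?".
Local $sPattern = '(?:https*://|\A)([\w.]+)(/[^?]*)?'
Local $aMatch, $sPath

For $i = 0 To UBound($asTest) - 1
    ; Flag 1 returns an array holding the captured groups of the first match.
    $aMatch = StringRegExp($asTest[$i], $sPattern, 1)
    If @error Then ContinueLoop ; skip inputs that do not match at all

    ; Guard against the path group being absent from the result array.
    $sPath = ''
    If UBound($aMatch) > 1 Then $sPath = $aMatch[1]

    ConsoleWrite($asTest[$i] & ' => host: ' & $aMatch[0] & ', path: ' & $sPath & @CRLF)
Next
```

The UBound check is there because a URL without a path leaves the second group unmatched, and it is safer not to assume the returned array always has two elements.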