
Best way to split an URL in 2 strings



Hello,

I'd like to know the shortest or most "elegant" way to split a URL into 2 strings, like this:

$url = "http://www.mywebsite.com/folder1/folder2/index.html"

and I'd like to get it split this way:

$var1 = "www.mywebsite.com" ; So we removed HTTP:// and everything after the .com
$var2 = "/folder1/folder2/index.html"

StringSplit is not a good idea, as you never know how many "/" there could be in the URL.


@cetipabo
You may use StringRegExp():

#include <Array.au3>
#include <StringConstants.au3>

Global $strString = "http://www.mywebsite.com/folder1/folder2/index.html", _
       $arrResult


$arrResult = StringRegExp($strString, 'http://([^/]+)(.*)', $STR_REGEXPARRAYGLOBALMATCH)
If IsArray($arrResult) Then _ArrayDisplay($arrResult)

:)


or

;~ $url = "http://www.mywebsite.com/folder1/folder2/index.html"
$url = "http://www.mywebsite.com/folder1/fol/index.html"
;~ $url = "http://www.mywebsite.com/folder1/index.html"
$iPos = StringInStr($url, "/", 1, 3)
$var1 = StringTrimLeft(StringLeft($url, $iPos - 1), 7)
$var2 = StringTrimLeft($url, $iPos - 1)
MsgBox(64 + 262144, Default, "var1: " & $var1 & @LF & "var2: " & $var2, 0)

 


#include <Array.au3> ; _ArrayDisplay()
#include <StringConstants.au3> ; $STR_CASESENSE

Global $strString = "http://www.mywebsite.com/folder1/folder2/index.html", _
       $arrResult[2]

$strString = StringReplace($strString, "http://", "", 1, $STR_CASESENSE)
$i = StringInStr($strString, '/',  $STR_CASESENSE) ; first occurrence

If $i > 0 Then
    $arrResult[0] = StringLeft($strString, $i-1)
    $arrResult[1] = StringMid($strString, $i)
Else
    $arrResult[0] = $strString
    $arrResult[1] = ""
EndIf

_ArrayDisplay($arrResult)

 


Minor tweak to @FrancescoDiMuro's code to allow for multiple URL variations (with or without the scheme, with or without "www."):

#include <Array.au3>
#include <StringConstants.au3>

Global $arrResult, $arrStrings[][2] = [ _
[1, "https://www.mywebsite.com/folder1/folder2/index.html"], _
[2, "http://www.mywebsite.com/folder1/folder2/index.html"],  _
[3, "https://mywebsite.com/folder1/folder2/index.html"], _
[4, "http://mywebsite.com/folder1/folder2/index.html"], _
[5, "www.mywebsite.com/folder1/folder2/index.html"], _
[6, "mywebsite.com/folder1/folder2/index.html"], _
[7, "http://www.mywebsite.com/index.html"], _
[8, "https://mywebsite.com/index.html"] _
]

For $i = 0 To UBound($arrStrings) - 1
    $arrResult = StringRegExp($arrStrings[$i][1], '(?:https?:\/\/)?([^/]+)(.*)', $STR_REGEXPARRAYGLOBALMATCH)
    If IsArray($arrResult) Then _ArrayDisplay($arrResult, $arrStrings[$i][0] & ":- " & $arrStrings[$i][1])
Next
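
If the two parts are needed as variables rather than in an array display, the same pattern can be wrapped in a small function. This is only a sketch: `_SplitUrl()` is a made-up name, not part of any UDF, and it assumes the URL has no port, user info, or query string directly after the host (those would end up in the host part).

```autoit
#include <StringConstants.au3>

; Hypothetical helper (not from the posts above): returns a 2-element array
; [host, path], or sets @error if nothing matched.
Func _SplitUrl($sUrl)
    ; Optional http:// or https:// prefix, then everything up to the first "/", then the rest
    Local $aMatch = StringRegExp($sUrl, '(?i)^(?:https?://)?([^/]+)(.*)$', $STR_REGEXPARRAYMATCH)
    If @error Then Return SetError(1, 0, 0)

    Local $aResult[2] = [$aMatch[0], ""]
    ; Guard against a missing second element when there is no path
    If UBound($aMatch) > 1 Then $aResult[1] = $aMatch[1]
    Return $aResult
EndFunc   ;==>_SplitUrl

Local $aParts = _SplitUrl("https://mywebsite.com/index.html")
If Not @error Then ConsoleWrite("var1: " & $aParts[0] & @CRLF & "var2: " & $aParts[1] & @CRLF)
```

Returning an empty string for the path when the URL is just a host mirrors the `Else` branch of the StringInStr version earlier in the thread.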

 


33 minutes ago, cetipabo said:

@Exit

Your solution does not work as expected if I add or remove a folder in the URL.

Changed the code:

;~ $url = "http://www.mywebsite.com/folder1/folder2/index.html"
$url = "http://www.mywebsite.com/folder1/fol/index.html"
;~ $url = "http://www.mywebsite.com/folder1/index.html"
$iPos = StringInStr($url, "/", 1, 3)
$var1 = StringTrimLeft(StringLeft($url, $iPos - 1), 7)
$var2 = StringTrimLeft($url, $ipos - 1)
MsgBox(64 + 262144, Default, "var1: " & $var1 & @LF & "var2: " & $var2, 0)

 


2 hours ago, cetipabo said:

StringSplit is not a good idea, as you never know how many "/" there could be in the URL.

I don't agree  :)

#Include <Array.au3>

$strString = "http://www.mywebsite.com/folder1/folder2/index.html"
Local $arrResult[2]

$aSplit = StringSplit($strString, "/", 1)
$arrResult[0] = $aSplit[3]
$arrResult[1] = StringReplace($strString, $aSplit[1] & "//" & $aSplit[3], "")

_ArrayDisplay($arrResult)

yet my heart still belongs to regex   :P


$strString = "http://www.mywebsite.com/folder1/folder2/index.html"

execute('assign("domain" , stringleft($strString , StringInStr($strString , "/" , 0 , 3) - 1)) assign("page" , stringtrimleft($strString , StringInStr($strString , "/" , 0 , 3) - 1))')

msgbox(0, '' , eval("domain") & @LF & eval("page"))

 

Edit: I suppose it's the same as Exit's, but I was focused on 'elegant' :)


OK. Let's fable about elegance. :gathering:

$url = "http://www.mywebsite.com/folder1/folder2/index.html"
MsgBox(64 + 262144, Default, "var1:   " & StringTrimLeft(StringLeft($url, StringInStr($url, "/", 1, 3) - 1), 7) & @LF & "var2:   " & StringTrimLeft($url, StringInStr($url, "/", 1, 3) - 1), 0)

 


Or you can just use the _WinHttpCrackUrl() function in the WinHttp UDF by @trancexx and @ProgAndy.

#include "MyIncludes\WinHttp\WinHttp.au3"  ; <== Change to your location
#include <Array.au3>

#cs

winhttp udf info: https://www.autoitscript.com/forum/topic/84133-winhttp-functions/?tab=comments#comment-602598

_winhttpcrackurl(): Separates a URL into its component parts such as host name and path.

Success - Returns array with 8 elements:
                    $array[0] - scheme name
                    $array[1] - internet protocol scheme
                    $array[2] - host name
                    $array[3] - port number
                    $array[4] - user name
                    $array[5] - password
                    $array[6] - URL path
                    $array[7] - extra information
#ce

; Cracking URL
Global $aUrl = _WinHttpCrackUrl("http://www.mywebsite.com/folder1/folder2/index.html")
_ArrayDisplay($aUrl, "_WinHttpCrackUrl()")
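
To get the two strings from the original question out of that array, something like the following should do. It is a sketch under assumptions: the include path is the same placeholder as above, the indices are taken from the comment block above, and appending element 7 assumes the "extra information" is whatever follows the path (for example a query string).

```autoit
#include "MyIncludes\WinHttp\WinHttp.au3"  ; <== Change to your location (placeholder path)

Local $aUrl = _WinHttpCrackUrl("http://www.mywebsite.com/folder1/folder2/index.html")
If IsArray($aUrl) Then
    Local $var1 = $aUrl[2]             ; host name
    Local $var2 = $aUrl[6] & $aUrl[7]  ; URL path, plus extra information if any (assumption)
    MsgBox(0, "_WinHttpCrackUrl()", "var1: " & $var1 & @LF & "var2: " & $var2)
EndIf
```

The IsArray() check is there because the array indices are only meaningful when the call succeeded.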

 


2 hours ago, Exit said:

OK. Let's fable about elegance. :gathering:

Indeed, if you aren't going to actually assign them, then you can get away with a single replace (provided there is a fixed target for StringMid, here "www."; otherwise some sanitizing will be needed first).

$strString = "http://www.mywebsite.com/folder1/folder2/index.html"

msgbox(0, '' , "Domain: " & stringreplace(stringmid($strString , stringinstr($strString , "www.")) , "/" , @LF & "Page: " , 1))

 


11 hours ago, Zedna said:

My solution has the advantage of speed; I think it will be the fastest one.
RegExp is slow, so if this URL split is called many times inside a loop, then use my method ...

 

#include <StringConstants.au3>

Global $strString = "http://www.mywebsite.com/folder1/folder2/index.html"
Global $arrResult1[2], $arrResult2[2]

$t1 = timerinit()
For $i = 1 to 100
   $arrResult1 = StringRegExp($strString, 'http://([^/]+)(.*)', $STR_REGEXPARRAYGLOBALMATCH)
Next
$d1 = Timerdiff($t1)

Sleep(100)

$t2 = TimerInit()
For $k = 1 To 100
    ; Work on a copy so every iteration starts from the full URL, as the regex loop does
    $s = StringReplace($strString, "http://", "", 1, $STR_CASESENSE)
    $i = StringInStr($s, '/', $STR_CASESENSE) ; first occurrence
    If $i > 0 Then
        $arrResult2[0] = StringLeft($s, $i - 1)
        $arrResult2[1] = StringMid($s, $i)
    Else
        $arrResult2[0] = $s
        $arrResult2[1] = ""
    EndIf
Next
$d2 = TimerDiff($t2)

Msgbox(0,"", "regex way = " & $d1 & @crlf & "string way = " & $d2)

 


@mikell
I'm flattered by your words; you're not only THE cuddly cat on these forums, but behind that, I strongly believe that there is a wonderful and amazing person, and I'm really happy to know you (as the cuddly cat, obviously).
Once again, thanks :)


:)

#include <APIShPathConstants.au3>
#include <WinAPIShPath.au3>

Local $sUrl = "http://www.mywebsite.com/folder1/folder2/index.html"

Local $sHost = _WinAPI_UrlGetPart($sUrl, $URL_PART_HOSTNAME)
MsgBox(0, "Host part " & ChrW(9786), $sHost & @CRLF & _
        StringReplace(StringMid($sUrl, StringInStr($sUrl, $sHost)), $sHost, "", 1))

 
