
Questions about regex



I need to get the first path segment after the domain in a URL, using AutoIt.


Here's an idea of what the URLs can look like:

http://autoitscript.com/blabla1/
http://autoitscript.com/blabla2/blabla/
http://autoitscript.com/bla-bla-bla3/


The output should be as follows.

blabla1
blabla2
bla-bla-bla3


That is what I need the regex to match in each case: only the first portion after the slash that follows the domain, no matter how many further slashes there are.

 

Edited by youtuber

@youtuber Ironically, the first character after the first slash in your example URLs is a slash itself... (http://)


;~ $str = 'http://autoitscript.com/blabla1/'
$str = 'http://autoitscript.com/blabla2/blabla/'
;~ $str = 'http://autoitscript.com/bla-bla-bla3/'

; Flag 2 ($STR_NOCOUNT) returns a 0-based array, so index 3 is the first path segment
MsgBox(0, '', StringSplit($str, "/", 2)[3])

 


23 minutes ago, youtuber said:

I tried something similar, but I failed

Your expression is too complex, think simple, try this :): https://regex101.com/r/WjCbT1/2


I'm sorry, I forgot to mention:
if there is no slash at the end of the word, it fails.

local $aURLs[4] = ["http://autoitscript.com/blabla1", _
        "http://autoitscript.com/blabla2/bla", _
        "http://autoitscript.com/blabla3/blabla/", _
        "http://autoitscript.com/bla-bla-bla4/"]

For $i = 0 To 3
    $RegExp1 = StringRegExp($aURLs[$i], 'https?:\/\/\S+?\/(\S+?\/)', 3)
    If IsArray($RegExp1) Then ConsoleWrite($RegExp1[0] & @CRLF)
Next

Console output:

blabla2/
blabla3/
bla-bla-bla4/

What I wanted was:
blabla1
blabla2
blabla3
bla-bla-bla4


Here's another way it can be done:

#include <Array.au3>
#include <Constants.au3>

example()

Func example()
    Local $aData = [ _
                    "http://autoitscript.com/blabla1/", _
                    "http://autoitscript.com/blabla2/blabla/", _
                    "http://autoitscript.com/bla-bla-bla3/", _
                    "http://autoitscript.com", _
                    "http://autoitscript.com/" _
                   ]
    Local $sData = _ArrayToString($aData, @CRLF)

    Local $aResult = StringRegExp($sData, "https?://[^/]+/([^/]+)", $STR_REGEXPARRAYGLOBALMATCH)
    If IsArray($aResult) Then _ArrayDisplay($aResult)
EndFunc

 


This one is almost the regex version of iamtheky's code :

local $aURLs[4] = ["http://autoitscript.com/blabla1", _
        "http://autoitscript.com/blabla2/bla", _
        "http://autoitscript.com/blabla3/blabla/", _
        "http://autoitscript.com/bla-bla-bla4/"]

For $i = 0 To 3
    $s = StringRegExp($aURLs[$i] , '[^/]+', 3)[2]
    If not @error Then ConsoleWrite($s & @CRLF)
Next

:)
 


Thank you for your answers.
Why do I fail to get the last 10 characters? :(

Local $aURLs[4] = ["bla-1234567890", _
        "bla-abcdefghij", _
        "1234567890-abcdefghij", _
        "asdf-autoit-bla"]

For $i = 0 To 3
    $a = StringRegExp($aURLs[$i], '.{10}$', 3)
    If Not @error Then ConsoleWrite($a & @CRLF)

    $b = StringRegExp($aURLs[$i], '(?=.{1,10}$).*', 3)
    If Not @error Then ConsoleWrite($b & @CRLF)

    $c = StringRegExp($aURLs[$i], '.{0,10}\z', 3)
    If Not @error Then ConsoleWrite($c & @CRLF)

    $d = StringRegExp($aURLs[$i], '?s).{10}\z', 3)
    If Not @error Then ConsoleWrite($d & @CRLF)

    $e = StringRegExp($aURLs[$i], '?i).{10}\z', 3)
    If Not @error Then ConsoleWrite($e & @CRLF)

    $f = StringRegExp($aURLs[$i], '\A.{0,10}\Z', 3)
    If Not @error Then ConsoleWrite($f & @CRLF)

    $g = StringRegExp($aURLs[$i], '(.{0,10})(.*?)\z', 3)
    If Not @error Then ConsoleWrite($g & @CRLF)

    $h = StringRegExp($aURLs[$i], '(.{0,10})$', 3)
    If Not @error Then ConsoleWrite($h & @CRLF)

    $i = StringRegExp($aURLs[$i], '(..........)\z', 3)
    If Not @error Then ConsoleWrite($i & @CRLF)
Next

 


First, in this case the result of your StringRegExp() call is an array, so you can't use ConsoleWrite() to display it as a whole.  What you want here is element 0 of the array.

Second, you are modifying your loop variable ($i) inside the loop, so your loop would never have ended.

 

Local $aURLs[4] = ["bla-1234567890", _
        "bla-abcdefghij", _
        "1234567890-abcdefghij", _
        "asdf-autoit-bla"]

For $i = 0 To 3
    $a = StringRegExp($aURLs[$i], '.{10}$', 3)
    If Not @error Then ConsoleWrite($a[0] & @CRLF)

    $b = StringRegExp($aURLs[$i], '(?=.{1,10}$).*', 3)
    If Not @error Then ConsoleWrite($b[0] & @CRLF)

    $c = StringRegExp($aURLs[$i], '.{0,10}\z', 3)
    If Not @error Then ConsoleWrite($c[0] & @CRLF)

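    ; The inline option needs its opening parenthesis: (?s), not ?s)
    $d = StringRegExp($aURLs[$i], '(?s).{10}\z', 3)
    If Not @error Then ConsoleWrite($d[0] & @CRLF)

    ; Same fix for the case-insensitive option: (?i)
    $e = StringRegExp($aURLs[$i], '(?i).{10}\z', 3)
    If Not @error Then ConsoleWrite($e[0] & @CRLF)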

    $f = StringRegExp($aURLs[$i], '\A.{0,10}\Z', 3)
    If Not @error Then ConsoleWrite($f[0] & @CRLF)

    $g = StringRegExp($aURLs[$i], '(.{0,10})(.*?)\z', 3)
    If Not @error Then ConsoleWrite($g[0] & @CRLF)

    $h = StringRegExp($aURLs[$i], '(.{0,10})$', 3)
    If Not @error Then ConsoleWrite($h[0] & @CRLF)

    $j = StringRegExp($aURLs[$i], '(..........)\z', 3)
    If Not @error Then ConsoleWrite($j[0] & @CRLF)
Next

 

Edited by TheXman

I call this the "2 1 2" knock-out method.  It eliminates all the text that is not required.

Local $aURLs[] = [ _
        "http://autoitscript.com/blabla1", _
        "http://autoitscript.com/blabla2/bla", _
        "http://autoitscript.com/blabla3/blabla/", _
        "http://autoitscript.com/bla-bla-bla4/" _
        ]

For $i = 0 To UBound($aURLs) - 1
    $s = StringRegExpReplace($aURLs[$i], '^([^/]+(/))(?2)(?1)|(?2).*$', "")
    ;     Where (?1) re-runs the pattern of the 1st capture group, "([^/]+(/))"; and,
    ;           (?2) re-runs the pattern of the 2nd capture group, "(/)".

    If Not @error Then ConsoleWrite($s & @CRLF)
Next

#cs ; Returns:-
blabla1
blabla2
blabla3
bla-bla-bla4
#ce
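
For readers unfamiliar with the (?1) and (?2) subroutine calls, here is a minimal sketch that runs the pattern above next to a hand-expanded form, in which each call is replaced by the group pattern it re-runs. The expanded pattern and the variable names are my own illustration, not part of the post above; both calls should leave only the first path segment.

```autoit
; Illustration only: compare the compact pattern with a hand-expanded equivalent.
Local $sUrl = "http://autoitscript.com/blabla3/blabla/"

; Compact form from the post above, using the subroutine calls (?1) and (?2).
Local $sCompact = StringRegExpReplace($sUrl, '^([^/]+(/))(?2)(?1)|(?2).*$', "")

; Expanded form (assumption: written out by hand for comparison):
;   ^[^/]+//[^/]+/   removes the scheme and host, up to the slash after the host
;   /.*$             removes everything from the next slash to the end
Local $sExpanded = StringRegExpReplace($sUrl, '^[^/]+//[^/]+/|/.*$', "")

ConsoleWrite($sCompact & " | " & $sExpanded & @CRLF) ; prints: blabla3 | blabla3
```

Note that for a URL with no slash after the host (for example "http://autoitscript.com"), the first alternative cannot match, so this approach is best suited to URLs shaped like the ones in the question.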

 


10 hours ago, youtuber said:

Why do I fail to get the last 10 characters?

In addition to the 2 reasons mentioned by TheXman, only 1 expression is really safe :

$c = StringRegExp($aURLs[$i], '.{0,10}$', 3)[0]

because you never get an error, even if the string is empty. If the string is fewer than 10 characters long, all of its characters are returned.
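
A minimal sketch of that claim, using three made-up test strings (empty, shorter than 10 characters, longer than 10 characters). The array and variable names are hypothetical; the pattern and flag are the ones from the expression above.

```autoit
; Illustration only: '.{0,10}$' can match zero characters, so it always finds a match
; and element [0] can be read without checking @error first.
Local $aTests[3] = ["", "short", "1234567890-abcdefghij"]

For $i = 0 To UBound($aTests) - 1
    ; Flag 3 returns an array of global matches; [0] is the first (leftmost) match.
    Local $aMatch = StringRegExp($aTests[$i], '.{0,10}$', 3)
    ; Quotes make an empty result visible in the console.
    ConsoleWrite('"' & $aMatch[0] & '"' & @CRLF)
Next
```

For "short" the whole string comes back, and for the long string only its last 10 characters ("abcdefghij"). By contrast, a pattern such as '.{10}$' has nothing to match in the first two strings, so indexing its result directly would not be safe.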


I don't know whether you are asking why your example didn't work or for a suggestion that will work.  If it is the latter, then the snippet below is one way that you could do it.

#include <Array.au3>
#include <Constants.au3>

example()

Func example()
    Local $aData = [ _
                    "http://autoitscript.com/blabla1/", _
                    "http://autoitscript.com/blabla2/blabla/", _
                    "http://autoitscript.com/bla-bla-bla3/", _
                    "http://autoit%20script.com", _
                    "http://autoit_script.com/" _
                   ]
    Local $sData = _ArrayToString($aData, @CRLF)

    Local $aResult = StringRegExp($sData, "https?://([-\w.~%]+)", $STR_REGEXPARRAYGLOBALMATCH)
    If IsArray($aResult) Then _ArrayDisplay($aResult)
EndFunc

 


@TheXman That's not what I want.
I would like the script below to output the list shown after it.

#include <Array.au3>
#include <Constants.au3>

example()

Func example()
    Local $aData = [ _
                    "http://autoit.script.com/blabla1/", _
                    "http://autoit-script.com/blabla2/blabla/", _
                    "http://autoitscript.com/bla-bla-bla3/", _
                    "http://autoit%20script.com", _
                    "http://autoit_script.com/" _
                   ]
    Local $sData = _ArrayToString($aData, @CRLF)

    Local $aResult = StringRegExp($sData, "https?:\/\/(?:www.)?([^.]+)", $STR_REGEXPARRAYGLOBALMATCH)
    If IsArray($aResult) Then _ArrayDisplay($aResult)
EndFunc

 

autoit.script
autoit-script
autoitscript
autoit%20script
autoit_script

 

Edited by youtuber

Is it always .com?

Edit: I was messing this up fierce.

#include <Array.au3>
#include <Constants.au3>

example()

Func example()
    Local $aData = [ _
                    "http://autoit.script.com/blabla1/", _
                    "http://autoit-script.com/blabla2/blabla/", _
                    "http://autoitscript.com/bla-bla-bla3/", _
                    "http://autoit%20script.com", _
                    "http://autoit_script.com/" _
                   ]
    Local $sData = _ArrayToString($aData, @CRLF)

    Local $aResult = StringRegExp($sData, "\/\/(.+?)\.com", $STR_REGEXPARRAYGLOBALMATCH)

    If IsArray($aResult) Then _ArrayDisplay($aResult)
EndFunc

 

Edited by iamtheky


And if they are not all .com, do they all end with 'script' as in the examples?

I don't know why it doesn't work on those sites; there are lots of flavors of PCRE.  Most of my AutoIt regexes don't work with Splunk, and vice versa.

#include <Array.au3>
#include <Constants.au3>

example()

Func example()
    Local $aData = [ _
                    "http://autoit.script.com/blabla1/", _
                    "http://autoit-script.com/blabla2/blabla/", _
                    "http://autoitscript.com/bla-bla-bla3/", _
                    "http://autoit%20script.com", _
                    "http://autoit_script.com/" _
                   ]
    Local $sData = _ArrayToString($aData, @CRLF)

    Local $aResult = StringRegExp($sData, "\/\/(.+?script)", $STR_REGEXPARRAYGLOBALMATCH)

    If IsArray($aResult) Then _ArrayDisplay($aResult)
EndFunc

 

Edited by iamtheky
