
Count links in a txt file



How large is the file?

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


Links don't always start with https; they can also start with http or something else. There may be a function for this, but it depends on how the links are formatted within the txt file.

Edited by RaiNote
  • C++/AutoIt/OpenGL Easy Coder
  • I will be Kind to you and try to help you
  • till what you want isn't against the Forum
  • Rules~

 


The OP posted that all his links start with "https://"

"https://" which is in every link.


If the file isn't too long I would do it this way

Global $sFile = FileRead("Your filename goes here") ; Read the whole file into a variable
StringReplace($sFile, "https://", "https://") ; Replace the link with itself
ConsoleWrite("Number of links in the file: " & @extended) ; @extended holds the number of replacements

 


Did you insert the space intentionally?

"https: //"

 


@water A little question: what exactly does @extended do? Does it return the count of operations a function performed, or something else?

Edited by RaiNote

@extended is set by StringReplace and returns the number of replacements that have been done.


It is described in the help file for StringReplace:

Return Value

Returns the new string with the number of replacements performed stored in the @extended macro.


Sorry if my question is really stupid, but how can I save the counted number of links into a variable?
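One way to keep the count, as a minimal sketch: copy @extended into a variable directly after the StringReplace call, because the next function call sets @extended back to 0. The helper name _CountLinks and the variable names are made up for this example; the file name is the same placeholder used earlier in the thread.

```autoit
; Hypothetical helper: counts how often "https://" occurs in a string.
Func _CountLinks($sText)
    StringReplace($sText, "https://", "https://") ; result discarded, only the count matters
    Local $iCount = @extended ; copy at once: the next function call resets @extended
    Return $iCount
EndFunc   ;==>_CountLinks

Local $sFile = FileRead("Your filename goes here") ; Read the whole file into a variable
Local $iLinks = _CountLinks($sFile)
ConsoleWrite("Number of links in the file: " & $iLinks & @CRLF)
```

After this, $iLinks can be used like any other variable, for example in a comparison or a message box.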


Try this (code tested and verified). It finds the links and, optionally, checks whether they are real. The check requires an internet connection; I added it to make your functions better.

#include <string.au3>
#include <Array.au3>

$occur = _FindlinkOcuurance("https://www.autoitscript.com text in between https://www.autoitscript.com just some text " & @CRLF & "https://www.google.com https://thereisnoserverlikethis.com")
_ArrayDisplay($occur)

; #FUNCTION# ====================================================================================================================
; Name ..........: _FindlinkOcuurance
; Description ...: Counts the "https://" links in a string and optionally checks whether each host answers a ping
; Syntax ........: _FindlinkOcuurance($string[, $check = True[, $timout = 4000]])
; Parameters ....: $string              - the main string to be checked
;                  $check               - [optional] True or False. Default is True. If True, each link is
;                                         checked for existence on the internet (requires a data connection)
;                  $timout              - [optional] the timeout in milliseconds when checking a link on the internet;
;                                         set a large value for a poor network connection and vice versa
; Return values .: $find                - A two-dimensional array. [0][0] is the number of links found and [0][1]
;                                         holds the links found, separated by "|". [1][0] is the number of links
;                                         that exist on the internet and [1][1] holds those links, separated by "|"
; Author ........: Surya Saradhi.B
; Modified ......: 05/09/15
; Remarks .......: Requires an internet connection if the links are to be checked; [1][0] and [1][1] are only set
;                  when $check is True. Only links ending in ".com" are listed and pinged
; ===============================================================================================================================
Func _FindlinkOcuurance($string, $check = True, $timout = 4000)
    ; Treat line breaks as separators, then split the text into words
    Local $strs = StringSplit(StringReplace($string, @CRLF, " "), " ")
    Local $find[2][2] = [[0, ""], [0, ""]]
    Local $subs, $link
    For $i = 1 To $strs[0]
        If StringInStr($strs[$i], "https://") Then
            $find[0][0] += 1
            ; Host name between "https://" and ".com"; @error is set if there is no ".com"
            $subs = _StringBetween($strs[$i], "https://", ".com")
            If Not @error Then
                $link = $subs[0] & ".com"
                $find[0][1] &= "|https://" & $link
                If $check Then
                    Ping($link, $timout) ; sets @error if the host does not answer
                    If Not @error Then
                        $find[1][0] += 1
                        $find[1][1] &= "|https://" & $link
                    EndIf
                EndIf
            EndIf
        EndIf
    Next
    ; Remove the leading "|" from both lists
    $find[1][1] = StringTrimLeft($find[1][1], 1)
    $find[0][1] = StringTrimLeft($find[0][1], 1)
    Return $find
EndFunc   ;==>_FindlinkOcuurance

 

No matter what the challenge may be, control of the outcome is on you; it always has been.

MY UDF: Transpond UDF (Send variables to programs), Utter UDF (Speech Recognition)


Just because a link starts with https:// does not mean it ends with .com; moreover, I would not assume that it can be pinged. I don't really know what a solid method would be. Maybe test the @extended from _INetGetSource for a value greater than 0?

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)
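
To avoid the ".com" assumption, the links could be pulled out with a regular expression instead of _StringBetween. This is only a sketch: the helper name _ExtractLinks is made up, and the pattern simply treats everything up to the next whitespace, double quote or angle bracket as part of the link, which may be too greedy or too strict for a given file. It does not check whether a link is reachable.

```autoit
; Hypothetical helper: returns an array of all http:// and https:// links in $sText.
; Sets @error to 1 and returns 0 if no link is found.
Func _ExtractLinks($sText)
    ; Flag 3 returns an array of all matches
    Local $aLinks = StringRegExp($sText, 'https?://[^\s"<>]+', 3)
    If @error Then Return SetError(1, 0, 0) ; no link found
    Return $aLinks
EndFunc   ;==>_ExtractLinks

Local $aLinks = _ExtractLinks("https://www.autoitscript.com some text https://example.org/page")
If Not @error Then
    ConsoleWrite("Links found: " & UBound($aLinks) & @CRLF)
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite($aLinks[$i] & @CRLF)
    Next
EndIf
```

UBound of the returned array gives the link count, and each element can then be passed to whatever existence check turns out to be reliable.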


$sContent = FileRead("source.html")

$timer = TimerInit()
StringReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

 


If you bring it in with the whitespace stripped, it's even quicker, naturally.

 

#include <Inet.au3>
 $sContent = _INetGetSource("https://autoitscript.com")


$timer = TimerInit()
StringReplace($sContent, "https://", "https://")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( Stringsplit($sContent, "https://", 3)) - 1
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

; Same tests after stripping all whitespace (flag 8)

$sContent = StringStripWS(_INetGetSource("https://autoitscript.com"), 8)

$timer = TimerInit()
StringReplace($sContent, "https://", "https://")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( Stringsplit($sContent, "https://", 3)) - 1
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

 


These StringRegExpReplace examples count all the links in the HTML document, not just the links within the body tag.
The "s?" in the RE pattern means "s" can appear once or not at all, so the pattern matches both "http://" and "https://".

#include <Inet.au3>

$sContent = _INetGetSource("https://autoitscript.com")

StringRegExpReplace($sContent, 'https://', "")
$count = @extended
ConsoleWrite($count & @TAB & 'https://' & @CRLF)

StringRegExpReplace($sContent, 'https?://', "")
$count = @extended
ConsoleWrite($count & @TAB & 'https?://' & @CRLF)

StringRegExpReplace($sContent, '"https?://', "")
$count = @extended
ConsoleWrite($count & @TAB & '"https?://' & @CRLF)

#cs ; Returns:
    80  https://
    89  https?://
    71  "https?://
#ce

 
