
_INetGetSource returns only a three character string despite StringLen saying otherwise


Seminko

I have this code:

#include <Inet.au3>
#include <Debug.au3>
#include <Array.au3>

$pageSource = _INetGetSource("https://www.csfd.cz/film/508676-triple-threat/")
If @error Then
    MsgBox(16, "", "inet error")
ElseIf StringLen($pageSource) < 50 Then
    MsgBox(16, "short source", $pageSource)
ElseIf StringLen($pageSource) >= 50 Then
    MsgBox(1, "", StringLen($pageSource)) ; shows 12681 characters
    $arr = StringRegExp($pageSource, '(?is)(?:<h1 itemprop="name">\s*|<li>\s*<img[^>]+>\s*<h3>)(.*?)\s*<', 3)
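    If @error Then
        MsgBox(16, "regex error", $pageSource) ; shows only three characters: ‹ between two unprintable ones
    Else
        ; only reached when the regex matched, so $arr is a real array here
        MsgBox(1, "", BinaryToString(StringToBinary(_ArrayToString($arr), 1), 4))
    EndIf
    ClipPut($pageSource) ; puts only those same three characters on the clipboard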
EndIf

When I check the length of the string returned by _INetGetSource, it is 12681 characters. However, when I put $pageSource into a MsgBox or onto the clipboard, only three characters show up (a '‹' between two unprintable ones), and the regex fails. When I open the page manually and copy the source, the regex works.

Any ideas what might be causing it?

 

EDIT: I should add that this happens only for a handful of URLs out of the set, seemingly at random.


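Most of the time the site returns a varying run of non-text bytes, including strings of NULs (0x00). MsgBox and ClipPut stop at the first NUL, which is why you see only three characters while StringLen reports the full length.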
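The three-character symptom can be reproduced without any download. A minimal sketch, assuming only built-in functions; the sample string and variable name are made up for illustration:

```autoit
; A string with an embedded NUL: three visible characters, a NUL, then more text.
Local $sSample = "abc" & Chr(0) & "def"

; StringLen works on AutoIt's own string, so the NUL and the tail are counted.
ConsoleWrite("StringLen: " & StringLen($sSample) & @LF)

; MsgBox hands the text to Windows, which stops reading at the first NUL,
; so only the part before the NUL is displayed.
MsgBox(0, "NUL demo", $sSample)
```

The same mismatch between the reported length and the visible text is what the original script runs into with the downloaded data.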

Fetch the data as binary and convert it to a string yourself before use; otherwise the conversion is forced by whatever functions you call.

Local $pageSource = _INetGetSource("https://www.csfd.cz/film/508676-triple-threat/", 0)
If @error Then
    MsgBox(16, "", "inet error")
    Exit
EndIf
ConsoleWrite(BinaryLen($pageSource) & @LF & $pageSource & @LF)

Both the length and the content vary from run to run.

Sample successive runs:

13319
0x1F8B0800000000000003ED7D5B731B57B6DEB3FC2BF6608E87E40C1AF70B419174C9926C59B6648D45DB734652B...

12947
0x1F8B0800000000000003ED7D49931B4796E699FC152E54AB32B384C0BE24929929E326519448B1446A2991B4340...

12592
0x1F8B0800000000000003ED7D5B731B4796E6B3F42BB2D1E3263946E17E2128920E5D6CCBB225AB2D4AEEB6A4602...

13568
0x1F8B0800000000000003ED9D4B731C579698D7D4AFC8AE1935C0E97A3F0190808222A5564B22C51129A9A745062...

 

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Thanks @jchd.

Have you ever encountered something like that? Any idea what might be causing it and if it can be avoided?

The first thing that comes to mind is to try a plain GET request.

EDIT: WinHTTP shows the same behavior, and a basic curl GET doesn't return anything.


I've no idea.  Maybe compressed content or a picture or ... who knows?

Browsers know how to handle that but only web gurus can tell.


Probably, but what's strange is that the received binary is always one of the 4 types I dumped above. Maybe it's a hidden ad or something.
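All four dumps earlier in the thread start with the bytes 1F 8B 08. 1F 8B is the gzip magic number and 08 is the deflate compression method, so the response is most likely the page itself, gzip-compressed, rather than an ad or an image. The fourth byte is 00, which matches the display stopping after three characters. A minimal sketch of checking for this before treating the data as text follows; the variable name is illustrative, and decompression itself is left out because it needs a zlib-capable UDF or DLL that this thread does not mention:

```autoit
#include <Inet.au3>

; Ask for the raw bytes (second parameter False) so nothing is converted behind our back.
Local $dSource = _INetGetSource("https://www.csfd.cz/film/508676-triple-threat/", False)
If @error Then
    MsgBox(16, "", "inet error")
    Exit
EndIf

; Compare the first three bytes with the gzip signature seen in the dumps.
If BinaryLen($dSource) >= 3 And String(BinaryMid($dSource, 1, 3)) = "0x1F8B08" Then
    ConsoleWrite("Response looks gzip-compressed (" & BinaryLen($dSource) & " bytes); decompress before parsing." & @LF)
Else
    ; Plain response: decode the bytes as UTF-8 (flag 4) and use the text as usual.
    ConsoleWrite(BinaryToString($dSource, 4) & @LF)
EndIf
```

Checking the signature first keeps the regex from ever running on compressed bytes, so a failed match can be told apart from a page that arrived compressed.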
