mocro Posted November 17, 2004 Hello, I was looking to write a program to extract text from an HTML file. Most HTML files downloaded using "InetGet" contain some unusually long lines, and when I use the "FileReadLine" function it only seems to grab a maximum of 65,534 characters. Is this a limitation of the function? Any help would be appreciated. Thanks, Mocro
this-is-me Posted November 18, 2004 Use FileRead($filename, FileGetSize($filename)) and StringSplit the result by @CRLF. Who else would I be?
scriptkitty Posted November 18, 2004 (edited) Just a side note: you can't split a string by a delimiter of more than one character (@CRLF is two), so I usually use something like this:

; use $x = File2Array("c:\mytempfile.html")
; returns $x as an array.
; by ScriptKitty
Func File2Array($filename)
    $_file = FileRead($filename, FileGetSize($filename))
    $_file = StringReplace($_file, @LF, @CR) ; make every @LF into @CR
    $_file = StringReplace($_file, @CR & @CR, @CR) ; remove the extras you made from @CRLF pairs
    $_file = StringSplit($_file, @CR) ; split on the @CR character
    Return $_file
EndFunc

I usually do it in two lines, but this explains it a bit more. Web documents are written in many forms: some use @CR, some @LF, and some @CRLF as line endings. This splits them all pretty well. Edited November 18, 2004 by scriptkitty AutoIt3, the MACGYVER Pocket Knife for computers.
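The normalize-then-split function above might be used like this (a minimal sketch; the file path is a placeholder, and note that StringSplit stores the element count in index 0):

```autoit
; Sketch only: assumes File2Array() from the post above is defined
; and that the placeholder file "c:\mytempfile.html" exists.
$lines = File2Array("c:\mytempfile.html")

; StringSplit returns the number of elements in $lines[0]
For $i = 1 To $lines[0]
    ConsoleWrite($lines[$i] & @CRLF) ; process each line, e.g. echo it
Next
```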
mocro (Author) Posted November 18, 2004 Thanks for all the help. I also went looking in the library reference and found the "_FileReadToArray" function, which may also be of help to noobs like me.
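For anyone landing here later, _FileReadToArray is a standard UDF from the File.au3 include. A minimal sketch of its use (the file path is a placeholder):

```autoit
#include <File.au3>

; Sketch only: "c:\mytempfile.html" is a placeholder path.
Dim $lines
If _FileReadToArray("c:\mytempfile.html", $lines) Then
    ; element 0 holds the line count
    For $i = 1 To $lines[0]
        ConsoleWrite($lines[$i] & @CRLF)
    Next
EndIf
```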