BatMan22, posted October 20:

Hi all, I've been playing with my own home server, and one of the cool things I've been trying is running my own LLM, for privacy and mainly just to screw around. The only programming language that I really 'know' is AutoIt, and I would love to use my LLM to help me with it. To really get it to understand the language, I would need to scrape the forums and help files. I'm sure I could upload the help files if that was OK, but I'm not comfortable scraping the forums, especially without permission. Would it be possible to get something like monthly 'scrapes' instead, so I don't have to do my own? I imagine that scraping every page uses a decent amount of resources, and I don't want to do that without permission, sharing, or finding a better alternative. Any ideas? Admins? Better options? Also, does anyone know how to turn all the help files into PDFs or something for easier LLM digestion?
Werty, posted October 20:

As per forum rules...

Quote: "Other abuses which are either examples of excessive bandwidth usage or automation of the site."
argumentum, posted October 20:

35 minutes ago, BatMan22 said: "Also, does anyone know how to turn all the help files into PDFs or something for easier LLM digestion?"

Yes. The help file is just HTML compiled into a CHM file (https://www.autoitscript.com/autoit3/files/archive/autoit/autoit-docs-v3.3.16.1-src.zip).
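Since the help file is HTML underneath, one option for "easier LLM digestion" is to reduce each page to plain text rather than to PDF. The following is only a minimal sketch in Python using the standard library's `html.parser`. The names `TextExtractor`, `html_to_text`, and `SAMPLE_PAGE` are made up for illustration, and the sample markup is hypothetical; the actual layout of the pages inside the archive is not described in this thread.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &amp;
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the text content of an HTML string, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical page, NOT copied from the real help file.
SAMPLE_PAGE = """<html><head><title>Function Example</title>
<style>body { color: black; }</style></head>
<body><h1>ExampleFunc</h1>
<p>Returns a value &amp; sets @error.</p>
</body></html>"""

if __name__ == "__main__":
    print(html_to_text(SAMPLE_PAGE))
```

A known limitation of this approach is that inline tags such as bold or links split a sentence across several output lines, so a real conversion would probably want smarter joining.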
TheXman, posted October 20 (edited October 20):

1 hour ago, BatMan22 said: "I would need to scrape the forums...Any ideas?"

A while back I created a script that does something similar. It periodically checks the forum activity for new topics whose titles or text contain one or more specified keywords. My keywords relate to my UDFs and areas of interest. For instance, if a new topic's title contains a keyword like CryptoNG, jq, encryption, hash, json, or HTTPAPI, the script sends me an email alert.

Instead of scraping the whole forum, I just use the forum's RSS feeds. The RSS activity feeds contain something like the last 25 posts, and include each post's date, time, title, and text in XML format. Since a feed only has the last 25 or so entries, it isn't a lot of data to parse, and it uses far less bandwidth than a normal user who hangs out monitoring, reading, or searching the forums. Depending on the tools you use for parsing and processing, going through the feeds can be quite fast. I convert the XML feed to JSON and process that JSON using jq. Of course, you can use whatever tools and processes you're most comfortable with.

For example, here's the URL for the ALL ACTIVITY feed: https://www.autoitscript.com/forum/discover/all.xml
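The keyword check described above can be sketched roughly as follows. This is not TheXman's script (which converts the XML to JSON and uses jq); it is a Python stand-in using only the standard library, and it does no network access: it parses a feed that has already been saved as a string. The element names (`item`, `title`, `description`, `pubDate`) assume a conventional RSS 2.0 layout, and `SAMPLE_FEED`, `KEYWORDS`, and `matching_items` are hypothetical names with made-up data.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an already-downloaded feed. The element names
# assume a standard RSS 2.0 layout; the real feed may differ.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example activity feed</title>
    <item>
      <title>Help parsing JSON with jq</title>
      <description>How do I filter an array?</description>
      <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>GUI button does not resize</title>
      <description>My button stays the same size.</description>
      <pubDate>Mon, 01 Jan 2024 01:00:00 +0000</pubDate>
    </item>
    <item>
      <title>AES encryption question</title>
      <description>Which hash should I use?</description>
      <pubDate>Mon, 01 Jan 2024 02:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""

# Keywords taken from the post above, lowercased for matching.
KEYWORDS = ("cryptong", "jq", "encryption", "hash", "json", "httpapi")


def matching_items(feed_xml, keywords=KEYWORDS):
    """Return feed items whose title or text contains any keyword.

    Matching is a case-insensitive substring test, so "jq" would also
    match "jquery"; a real script might prefer whole-word matching.
    """
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        text = item.findtext("description", default="")
        haystack = f"{title} {text}".lower()
        found = sorted(k for k in keywords if k in haystack)
        if found:
            hits.append({
                "title": title,
                "date": item.findtext("pubDate", default=""),
                "keywords": found,
            })
    return hits


if __name__ == "__main__":
    for hit in matching_items(SAMPLE_FEED):
        print(hit["date"], "|", hit["title"], "|", ", ".join(hit["keywords"]))
```

Each hit could then be passed to whatever alerting step is preferred, such as the email alert mentioned in the post.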
argumentum, posted October 20:

41 minutes ago, BatMan22 said: "running my own LLM for privacy purposes plus to mainly screw around."

What's your setup (hardware and software)?
Jos (Developers), posted October 20:

No forum scraping, as per our forum rules. *click*