sebgg Posted November 28, 2011

So just a little project I want to do: find the longest/most frequent word in a massive (several billion characters) list of seemingly random letters. So I need a list of known words to compare against. I wondered if anyone has one compiled already, or if anyone knows where I could get hold of one?

Cheers, Sebastian.

GC - Program to rapidly manipulate DNA Sequences
RotaMol - Program to measure Protein Size

Mat Posted November 28, 2011 (edited)

http://www.manythings.org/vocabulary/lists/l/

I imagine you may want to find a better way to store them... And I'd do a lot of reading into string search algorithms (the one I know of is by three guys, one called Pratt). I'd also consider a language other than AutoIt (not something I suggest very often). Given that the task is pretty simple, avoiding the overhead you'll get with AutoIt should be easy.

Edited November 28, 2011 by Mat

AutoIt Project Listing

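The algorithm "by three guys, one called Pratt" is presumably Knuth-Morris-Pratt (KMP). As a rough illustration only, here is a minimal Python sketch of KMP that returns every position where one word occurs in a text. The function names `build_failure_table` and `kmp_find_all` are made up for this example and come from no library or from anyone in the thread.

```python
def build_failure_table(pattern: str) -> list[int]:
    """For each prefix of pattern, length of the longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every (possibly overlapping) occurrence of pattern in text."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going so overlapping matches are found
    return matches
```

KMP scans the text once per search word, so with a word list of well over 100,000 entries it would mean that many passes over the data. For this particular task, a set lookup over substrings or a multi-pattern algorithm is likely a better fit; the sketch is only meant to show what the suggested reading covers.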
Melba23 (Moderators) Posted November 28, 2011 (edited)

sebgg,

You cannot be arsed to search - why should we? But you might want to click here.

M23

Edit: typo.

Edited November 28, 2011 by Melba23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:
ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

GEOSoft Posted November 28, 2011 (edited)

Just follow the link to my website and you will find it on there in both zip and txt formats. It's in the Miscellaneous section.

Edited November 28, 2011 by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.
Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.
*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.
Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. It is soon to be closed and replaced with something else.
"Old age and treachery will always overcome youth and skill!"

sebgg (Author) Posted November 28, 2011

[quote]sebgg, You cannot be arsed to search - why should we? But you might want to click here. M23 Edit: typo.[/quote]

Because then I wouldn't have the chance to chat to all you lovely people!

Thanks all for the help. In the end I went with a 17x,xxx word list, space separated, perfect!

sebs

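With a space-separated word list in hand, the original task (longest and most frequent word in a long run of letters) can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not anyone's actual script: the names `load_words`, `scan_for_words` and `longest_and_most_frequent` are hypothetical, the text is assumed to be lower-case letters already held in memory, and one-letter words are skipped by default.

```python
from collections import Counter


def load_words(raw: str, min_length: int = 2) -> set[str]:
    """Split a space-separated word list into a lower-cased set for fast lookups."""
    return {w.lower() for w in raw.split() if len(w) >= min_length}


def scan_for_words(text: str, words: set[str]) -> Counter:
    """Count every dictionary word that appears as a substring of text."""
    counts: Counter = Counter()
    if not words:
        return counts
    # Only substring lengths that actually occur in the word list are worth testing.
    lengths = sorted({len(w) for w in words})
    n = len(text)
    for start in range(n):
        for length in lengths:
            end = start + length
            if end > n:
                break  # longer candidates would run past the end of the text
            candidate = text[start:end]
            if candidate in words:
                counts[candidate] += 1
    return counts


def longest_and_most_frequent(counts: Counter):
    """Return (longest word, most frequent word); ties are broken alphabetically."""
    if not counts:
        return None, None
    longest = max(counts, key=lambda w: (len(w), w))
    most_frequent = max(counts, key=lambda w: (counts[w], w))
    return longest, most_frequent
```

For example, `scan_for_words("xthenqcatzthe", load_words("cat at the he hen then"))` counts "the" twice and finds "then" as the longest hit. For several billion characters the text would have to be read in overlapping chunks rather than as one string, which this sketch does not attempt.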
GEOSoft Posted November 28, 2011 (edited)

If there are only 170K words, it's nowhere near complete.

EDIT: Out of curiosity, where did you find your word list (link)? Every time I find a new list I just merge it into mine. Sometimes I actually find a few words that are missing from my list.

Edited November 28, 2011 by GEOSoft

kylomas Posted November 28, 2011

GEOSoft,

[quote]If there are only 170K of words it's no where near complete.[/quote]

The list at your site has approx. 121K words... what would you consider a complete list?

kylomas

Forum Rules
Procedure for posting code
"I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill

Mat Posted November 28, 2011

You could circumbobulate and obambulate around this topic, but it would be a case of acrasia. You are better off aucupating... Let's face it, any dictionary is going to be macilent, given that our language is motatorious. It's all rather ostrobogulous... (pandiculates due to delassation). I imagine this would be a good topic for a deipnosophist though

kylomas Posted November 28, 2011

Thank you, Mat, precisely what I was thinking.

kylomas

Thornhunt Posted November 29, 2011 (edited)

[quote]You could circumbobulate and obambulate around this topic, but it would be a case of acrasia. You are better off aucupating... Let's face it, any dictionary is going to be macilent, given that our language is motatorious. It's all rather ostrobogulous... (pandiculates due to delassation). I imagine this would be a good topic for a deipnosophist though[/quote]

I think my avatar pretty much summed up my face when I read this...

Edited November 29, 2011 by Thornhunt

Budweiser + room = warm beer
warm beer + fridge = too long!
warm beer + CO2 fire extinguisher = Perfect!
[quote]Protect the easily offended ... BAN EVERYTHING[/quote]
^^ hmm works for me :D

sebgg (Author) Posted November 29, 2011

[quote]If there are only 170K of words it's no where near complete. EDIT: Out of curiosity where did you find your word list (link)? Every time I find a new list I just merge it into mine. Sometimes I actually find a few words that are missing from my list.[/quote]

I went with this one as it was longer than the one on your webpage, but combining them might add a few, I have no idea. And yes, 170k is far from complete, but it's a nice start for fun.

http://homepage.ntlworld.com/adam.bozon/Dictionary.htm

seb

GEOSoft Posted November 29, 2011 (edited)

Damn! I thought I had updated that file. It should be just over 188K words and should not include any single-character words, since it was written to update my wife's Scrabble program. I'll update it in a few minutes.

EDIT: It's been updated. I also have a copy of the list that does include single-character words if you need it. I know what happened with that original list too: it only included words that my script was able to verify on a couple of dictionary sites. I knew that when I was working on the script I was over 188K words, because I remember mentioning that figure to SmOke_N at the time in an IM.

Edited November 29, 2011 by GEOSoft

BrewManNH Posted November 29, 2011

You can also check this site for a word list that might be of use. I looked at the SCOWL list and there are over 290,000 lines in it, but a lot of the words are plurals or possessives of other words in the list, so it might take some culling to get a good list out of it.

If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!
I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.
Back up and restore Windows user files
_Array.au3 - Modified array functions that include support for 2D arrays.
ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.
Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.
_FileGetProperty - Retrieve the properties of a file
SciTE Toolbar - A toolbar demo for use with the SciTE editor
GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.
Latin Square password generator

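The kind of culling described here could start with something like the Python sketch below. It is a deliberately naive heuristic, assuming one word per line: entries ending in 's are dropped as possessives, and a word ending in "s" is dropped as a plural only when the same word without the "s" is also in the list. The function name `cull_word_list` is hypothetical.

```python
def cull_word_list(lines):
    """Drop possessives and naive plurals from an iterable of one-word lines."""
    cleaned = set()
    for line in lines:
        word = line.strip()
        # Skip blank lines and possessive forms such as "cat's".
        if not word or word.endswith("'s"):
            continue
        cleaned.add(word)
    # A word counts as a plural here only if its singular (word minus "s") is listed too.
    kept = {w for w in cleaned if not (w.endswith("s") and w[:-1] in cleaned)}
    return sorted(kept)
```

Called on `["cat", "cats", "cat's", "glass", "bus", ""]` this keeps "bus", "cat" and "glass". The plural rule gives false positives (if "a" is listed, "as" is dropped) and misses plurals in "-es" or "-ies", so the output would still need checking by hand.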
GEOSoft Posted November 29, 2011

It will take a lot of culling, since it also contains things like Acer, which is a tree genus and would not normally be included in a word list. Thanks for the link though. I'll run through them with a different script and see if I can add any to my list.

GEOSoft Posted November 29, 2011

@BrewManNH

Thanks to a link I found via that page you linked to, my new list is 255,329 words. I'll be uploading it later today. Thanks.

BrewManNH Posted November 29, 2011

I think the Oxford English Dictionary only has that many words in it.

GEOSoft Posted November 29, 2011

Probably correct, but with thousands or even tens of thousands of wordlists available it takes time to generate a new one. Every once in a while I come across another list that is worth checking, and this time it was thanks to you.

sebgg (Author) Posted November 30, 2011

Cheers Geo, I'll try with the updated word list you've got. Thanks all for the help.

Seb

Chimaera Posted November 30, 2011

[quote]i think my avatar pretty much summed up my face when i read this...[/quote]

I think I need to temporarily borrow your avatar too

If I've just helped you ... miracles do happen.
Chimaera
CopyRobo() * Hidden Admin Account Enabler * Software Location From Registry * Find Display Resolution * _ChangeServices()

Sn3akyP3t3 Posted November 30, 2011

It seems like your compiled list of words is quite massive and this information may no longer assist you, but check out the spell check dictionaries and language packs provided by the Mozilla foundation. The en-US dictionary consists of around 62,000 words, some of which you may find don't exist in your list.

Also, what are you using to compare and import text so duplicates are not brought in?
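The thread does not say how the lists were actually merged. One common way to import a new list without bringing in duplicates is to normalise every word and collect the results in a set, as in this minimal Python sketch. The names `merge_word_lists` and `new_words` are hypothetical, and lower-casing is an assumption that treats "Apple" and "apple" as the same word.

```python
def merge_word_lists(*lists):
    """Merge any number of word lists into one sorted list with no duplicates."""
    merged = set()
    for words in lists:
        for word in words:
            normalised = word.strip().lower()
            if normalised:  # ignore blank entries
                merged.add(normalised)
    return sorted(merged)


def new_words(candidate, existing):
    """Return the words in candidate that are not already in existing."""
    known = {w.strip().lower() for w in existing}
    found = {w.strip().lower() for w in candidate}
    return sorted(found - known - {""})
```

`new_words` is handy for checking whether a freshly found list is worth merging at all: `new_words(["fig", "Apple", "kiwi"], ["apple"])` returns only "fig" and "kiwi".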