Gui Posted October 6, 2012
Essentially, my goal is to retrieve the most popular (most repeated) word within a list of data, such as the contents of a text file. Are there any already super-efficient ways of accomplishing this? Thanks, Gui
FireFox Posted October 6, 2012
Hi, a quick reply, because someone may already be building you a snippet of what you want: split the file into words, then go through the words and add each one to an array if it has not been added yet; otherwise increment its count (the subindex). Br, FireFox.
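The approach described above could be sketched roughly as follows. This is only a minimal illustration, not FireFox's own code: the sample text, the variable names, and the choice of "\w+" as the definition of a word are all assumptions.

```autoit
#include <Array.au3>

; Hypothetical sample text; in practice this might come from FileRead().
Local $sText = "the cat and the dog and the bird"

; Assumption: a word is any run of letters, digits or underscores.
Local $aWords = StringRegExp($sText, "\w+", 3)
If @error Then Exit ; no words found

; Column 0 holds the word, column 1 holds its count.
Local $aCounts[UBound($aWords)][2]
Local $iUsed = 0, $iFound

For $i = 0 To UBound($aWords) - 1
    ; Look for the word among those already stored.
    $iFound = -1
    For $j = 0 To $iUsed - 1
        If $aCounts[$j][0] = $aWords[$i] Then ; "=" compares strings case-insensitively
            $iFound = $j
            ExitLoop
        EndIf
    Next
    If $iFound = -1 Then
        ; First occurrence: add the word with a count of 1.
        $aCounts[$iUsed][0] = $aWords[$i]
        $aCounts[$iUsed][1] = 1
        $iUsed += 1
    Else
        ; Already present: increment its count.
        $aCounts[$iFound][1] += 1
    EndIf
Next

ReDim $aCounts[$iUsed][2] ; trim the unused rows
_ArraySort($aCounts, 1, 0, 0, 1) ; descending, by the count column
_ArrayDisplay($aCounts, "Word counts")
```

The inner linear search keeps the sketch simple, but it makes the whole thing quadratic in the number of unique words, so it is probably only suitable for small inputs.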
Gui Posted October 6, 2012 (Author)
Hmm, thanks, I think that'll do the trick! I'll just have to split everything into words.
jchd Posted October 6, 2012
Correct, and to do so you first need to define precisely what "word" means to you in the problem at hand.
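To illustrate why the definition matters, here is a small sketch that splits the same text under two different definitions of "word". The sample text and both patterns are assumptions chosen for illustration, not something proposed in the thread.

```autoit
#include <Array.au3>

; Hypothetical sample text, chosen to expose the differences.
Local $sText = "Don't re-run test_1 twice"

; Definition 1: a word is a run of letters, digits or underscores.
Local $aWordChars = StringRegExp($sText, "\w+", 3)
ConsoleWrite(_ArrayToString($aWordChars, "|") & @CRLF) ; Don|t|re|run|test_1|twice

; Definition 2: a word is a run of ASCII letters, apostrophes or hyphens.
Local $aLetters = StringRegExp($sText, "[A-Za-z'-]+", 3)
ConsoleWrite(_ArrayToString($aLetters, "|") & @CRLF) ; Don't|re-run|test|twice
```

Neither definition is the correct one in general; which pattern fits depends on whether contractions, hyphenated terms, numbers, or accented letters should count as words in your data.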
UEZ Posted October 6, 2012
Something like this?

    #include <Array.au3>

    $sText = 'AutoIt v3 is a freeware BASIC-like scripting language designed for automating the Windows GUI and general scripting.' & @CRLF & _
            'It uses a combination of simulated keystrokes, mouse movement and window/control manipulation in order to automate tasks in a way not possible or reliable with other languages (e.g. VBScript and SendKeys).' & @CRLF & _
            'AutoIt is also very small, self-contained and will run on all versions of Windows out-of-the-box with no annoying "runtimes" required!'
    $aTest = MostRepeatedWords($sText)
    _ArrayDisplay($aTest)

    ; Returns a 2D array: [n][0] = word, [n][1] = count, sorted by count (descending).
    ; Element [0][0] holds the number of unique words.
    Func MostRepeatedWords($sText)
        Local $aSplit = StringRegExp($sText, "(\w+)", 3) ; every run of word characters
        Local $aUnique = _ArrayUnique($aSplit)
        Local $aResult[UBound($aUnique)][2], $i, $c, $aTmp
        For $i = 1 To $aUnique[0]
            $aResult[$i][0] = $aUnique[$i]
            ; Count the occurrences of this word in the full word list.
            $aTmp = _ArrayFindAll($aSplit, $aUnique[$i], 0, 0, 2)
            $aResult[$i][1] = UBound($aTmp)
        Next
        $aResult[0][0] = $aUnique[0]
        _ArraySort($aResult, 1, 1, 0, 1)
        Return $aResult
    EndFunc

Br, UEZ
Spiff59 Posted October 6, 2012 (edited)
Don't have time to refine it, off to a Huskers versus Buckeyes party, but here's a conceptual start:

    #include <Array.au3>

    Global $str = "Now is the time for all good men to come to the aid of their country"
    Global $array = StringSplit($str, " ")
    Global $count, $idx

    For $x = 0 to UBound($array) - 1 ; count 'em up
        $y = "__" & $array[$x]
        If IsDeclared($y) Then
            $z = Eval($y) + 1
            If $z > $count Then $count = $z
            Assign($y, $z)
        Else
            Assign($y, 1)
        EndIf
    Next

    For $x = 0 to UBound($array) - 1 ; crunch 'em
        If Eval("__" & $array[$x]) = $count Then
            $idx += 1
            $array[$idx] = $array[$x]
        EndIf
    Next
    Redim $array[$idx + 1]
    $array[0] = $count
    _ArrayDisplay($array)

A cleaned-up version:

    #include <Array.au3>

    Global $str = "I am the eggman, they are the eggmen, I am the walrus, coo coo ka choo."
    $str = StringRegExpReplace($str, "[.,?]", "") ; remove punctuation
    Global $array = StringSplit($str, " ")
    Global $count = 1, $idx

    For $x = 1 to $array[0] ; count 'em up, drop dupes
        $y = "__" & $array[$x]
        If IsDeclared($y) Then
            $z = Eval($y) + 1
            If $z > $count Then $count = $z
            Assign($y, $z)
        Else
            Assign($y, 1)
            $idx += 1
            $array[$idx] = $array[$x]
        EndIf
    Next
    Redim $array[$idx + 1]
    $array[0] = $idx

    $idx = 0
    For $x = 1 to $array[0] ; pick the winners
        If Eval("__" & $array[$x]) = $count Then
            $idx += 1
            $array[$idx] = $array[$x]
        EndIf
    Next
    Redim $array[$idx + 1]
    $array[0] = $idx
    _ArrayDisplay($array, "occurs " & $count & " times")

I think the Assign() / IsDeclared() trick, which I first saw Yashied use in an alternate ArrayUnique() function, is the closest thing to "super efficient" you'll find.
Edited October 7, 2012 by Spiff59
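For comparison, the same count-by-key idea can also be sketched with the Windows Scripting.Dictionary COM object instead of Assign() / IsDeclared(). This is a swapped-in technique, not something proposed in the thread, and no claim is made here about which is faster. The variable names and the "\w+" word pattern are assumptions.

```autoit
; Same sample text as the cleaned-up version above.
Local $sText = "I am the eggman, they are the eggmen, I am the walrus, coo coo ka choo."

; Assumption: a word is any run of letters, digits or underscores.
Local $aWords = StringRegExp($sText, "\w+", 3)
If @error Then Exit ; no words found

Local $oCounts = ObjCreate("Scripting.Dictionary")
$oCounts.CompareMode = 1 ; 1 = text compare, so "The" and "the" share one entry

Local $iMax = 0
For $sWord In $aWords
    If $oCounts.Exists($sWord) Then
        $oCounts.Item($sWord) = $oCounts.Item($sWord) + 1
    Else
        $oCounts.Add($sWord, 1)
    EndIf
    ; Track the highest count seen so far.
    If $oCounts.Item($sWord) > $iMax Then $iMax = $oCounts.Item($sWord)
Next

; Collect every word that reached the highest count.
Local $sWinners = ""
For $sKey In $oCounts.Keys()
    If $oCounts.Item($sKey) = $iMax Then $sWinners &= $sKey & " "
Next
ConsoleWrite("occurs " & $iMax & " times: " & $sWinners & @CRLF)
```

One possible advantage of a dictionary is that the counts live in a single object rather than in dynamically created variables, which may be easier to inspect and discard afterwards.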
czardas Posted October 7, 2012 (edited)
I have created a word frequency stats function, but I haven't turned it into a UDF that I can quickly post. It's a follow-up to [link]. I thought perhaps the link might be helpful.
Edited October 7, 2012 by czardas