Hash Table UDF

mrider · May 18, 2005

Attached is a hash table UDF - and my unit tests. Both are a bit too long to include in the post inline...

Many, many thanks to ezzetabi for the timely sorting post - without which I would have used recursion for my QuickSort.

The Hash UDF has the following functions:

_HashF_KillAll() Destroys the hash and releases the memory it consumes. Returns nothing.

_HashF_NewInstance() Creates a new hash holder. Returns a pointer to the new hash created. Use the pointer for all methods which call for "$ptr". $ptr will always be zero if only one hash exists, and so can be ignored in those circumstances.

_HashF_Put($key, $val[, $ptr]) Puts a key->val pair in the hash. Overwrites the current pair if it exists.

_HashF_Get($key[, $ptr]) Gets the value associated with the key passed. $ptr refers to a hash mapping instance.

_HashF_RemoveHash($ptr) Removes a hash mapping instance from the storage array. Silently fails if $ptr is not valid.

_HashF_RemovePair($key[, $ptr]) Removes a key->value mapping from the hash. Silently fails if key or pointer does not exist.

_HashF_Grow([$adder]) Grows the hash storage space by the amount supplied in $adder - or by the default amount. See _HashF_DefaultResizeAmt. This function is called internally, and is not normally needed, but can save a bit of time if called before adding a large number of items.

_HashF_Shrink([$shrinkType]) Shrinks the hash. Useful if a number of key->value pairs have been removed. Valid Shrink Types: 0 - (default) Shrinks the hash to minimum size plus the default size adder. 1 - Shrinks the hash to the absolute minimum size.

_HashF_DefaultResizeAmt($amt) Sets the default number of slots that will be added each time the hash grows. This is useful if items are added in a known quantity. For example, set this to ten if items are always added in groups of ten.

_HashF_KeyPairCount([$ptr]) Returns the number of key->value pairs in the hash pointed to by pointer.

_HashF_Keys([$ptr]) Returns an array of the keys in the hash pointed to by pointer. Element zero contains the number of elements in the array.

_HashF_Values([$ptr]) Returns an array of the values in the hash pointed to by pointer. Element zero contains the number of elements in the array.

_HashF_Entries([$ptr]) Returns a two-dimensional array of the keys and values in the hash. The first two values both contain the number of key->value pairs. The first array dimension will be sized to hold all elements, the second dimension has two elements - the first is the key, the second is the value. (The code comments has an excellent example)

_HashF_ValidatePtr($ptr) Ensures the pointer passed points to an actual hash.

_HashF_SortKeysAsc([$ptr]) Sorts the hash by key in ascending order.

_HashF_SortKeysDesc([$ptr]) Sorts the hash by key in descending order.

_HashF_SortValsAsc([$ptr]) Sorts the hash by value in ascending order.

_HashF_SortValsDesc([$ptr]) Sorts the hash by value in descending order.

ezzetabi · May 19, 2005

I am too lazy to seek in your code 0_o ...

How did you managed collisions? And what hash functions you used (I am surprised you found a good one for a generic type like autoit variants without reinterpet casting...)?

mrider · May 19, 2005

When all else fails, there's always brute force...

I create a three dimensional array:

The first dimension resolves to a hash instance - and starts out with one element. I store and return an identity integer for each hash instance. The identity number corresponds to the index when no hash instances have been deleted. When a hash instance is deleted, the last hash instance is copied to the hash instance to be deleted (unless it's the last instance), then the array is redimmed to be one element smaller.

The second dimension resolves to the last key-value pair in the hash in question (almost certainly there're a few empty slots - sometimes there are LOTS).

The third dimension has two elements: element 0 is the key, element 1 is the value.

When adding a key->value pair, I first search through the hash identified by the pointer passed using a simple == comparison. If the key is found, I update the value entry. If the key is not found, I add the key-value pair to the end of the array, and then add one to the "last entry" pointer. If adding an element would exceed the boundaries of the array, the whole thing is redimmed to be a bit larger.

Example array values:

[0][0][0] = number of elements in this hash

[0][0][1] = identity number (probably == 0)

Continues to

[n][0][0] = number of elements in this hash

[n][0][1] = identity number (probably == n)

[x][1][0] = first key in hash x

[x][1][1] = first value in hash x

Continues to

[x][n][0] = key n in hash x

[x][n][1] = value n in hash x

And

[EDIT]

[x][0][0] == n

[x][0][1] == identity number

Finding a hash instance by pointer means searching down through the array until [?][0][1] == pointer passed. From then on, the index number (?) is used to reference the actual array.

It's ugly, but it mostly works. It can waste a lot of space - and it'll probably fail if the key is something vague like a floating point number.

As an improvement, I'd like to use a series of anonymous arrays - where a two-dimensional hash array is dimmed inside a function. I would store a global array that holds references to those arrays. Originally I thought I wouldn't be able to resize the anonymous arrays - but I've since figured out that I could copy an anonymous array to a known array, redim the known array, then copy the reference back to the global array. This would have the advantage that each hash could be a different size. I need to test this in AutoIt though. I'm not sure if the memory would be released with my methodology.

I don't really know how to get past the vague key references.

Edited May 19, 2005 by mrider

Smed · May 20, 2005

I'm shocked that this topic hasn't had more replies. I read it, and thought, "I don't need it, but its cool." more than twice. I guess there aren't that many people used to programming in Perl here. I think its a great idea, but by the time I need it, I'll probably have already done it in another language.

Fur · May 20, 2005

#1 I would have loved it about a month ago. As it was, I needed this back then, so I wrote a whole set of hash functions in C++ STL and DLLCall() them all.

#2 I think most AutoIt scripters aren't classically trained programmers, and the idea of a hash is not intutive to them.

#3 It still great stuff!! Might use it in the future!

Fur · May 20, 2005

Ooops #4, maybe you could add a pair of functions to load/save the hash to a file? I think that's the only thing my stuff did yours doesn't.

mrider · May 20, 2005

Ooops #4, maybe you could add a pair of functions to load/save the hash to a file? I think that's the only thing my stuff did yours doesn't.

That's a capital idea! Sure, why not. Give me a day or two. I'd think that something like a Java Properties file would work nicely.

I'm shocked that this topic hasn't had more replies.

What makes me laugh is that my unit tests have been downloaded more frequently than the UDF.

I guess there aren't that many people used to programming in Perl here.

I suppose not. The funny thing is, I'm trying to NOT use Perl in my current project. See my original post and my follow up post for more info.

GaryFrost · May 20, 2005

More people use perl than you think

mrider · May 20, 2005

I think you are misinterpreting what I'm saying - and I'm definitely not trying to be insulting.

The point is that Perl is big on hashes, and those that use Perl regularly miss that functionality when using a language without that functionality.

The fact that only a few people have even looked at my hash UDF - and that NOBODY responded to my first post - and that I only got one response to my follow up post, suggests that not many AutoIt programmers are Perl programmers.

I personally use Perl more than AutoIt. However, I've had troubles with the freebie compilers when using the Win32 library - and my target machines don't have the Perl environment. I need AutoIt to control some functions associated with my latest project, and I don't want the hassles of Win32...

Edited May 20, 2005 by mrider

GaryFrost · May 20, 2005

didn't misinterpret and not insulted, I also use Perl more than AutoIt, and you are correct Hashes are used alot in Perl, Use alot myself.

Fortunatly my company pays for komodo, dev kit etc...

so I don't have to worry about what is on the end-user's system.

blindwig · June 4, 2005

I'm writing a UDF library for dealing with 2-dimensional arrays that I am calling "tables". I'm trying to understand your hases library. I've never used perl (or, I think I have, but obviously whatever I was doing didn't require hashes).

Anyway, I don't get why you need 3 dimensions? Is it that in a hash such as [a][c], a is the hash table number and [c] is the actual table? So [0][1][0] would be unique in [0][x][0], but could be duplicated in [1][x][0]. Am I understanding right? If that's the case, then I really don't understand - that means all your tables would have to be the same size?

Well, anyway, the routines I'm writing would be suited to andling the output of the IniReadSection function, or, if I understand correctly, your _HashF_Entries function.

JSThePatriot · June 4, 2005

I would like to understand more about "Hash Tables" and Hashing. I have just read a definition online, and it sounds to me it is basically a small database. Tell me if I am wrong in that.

I came upon the Hash Table posts when I wanted to just encrypt a string. I had heard of Hash Tables before and such. I just want to be sure I understand them properly.

JS

jftuga · June 5, 2005

It would be *really* nice if hash tables were a built-in part of the AutoIt language. I have written these is C++ before, using double-hashing. It's been so long ago that I don't remember which hash function I used.

Would it be possible to incorporate the VB Scripting.Dictionary to do this?

-John

Insolence · June 5, 2005

Be nice if we had Stacks and arraylists too

edit - confused hashtables with stacks for a second

Edited June 5, 2005 by Insolence

JSThePatriot · June 5, 2005

Be nice if we had Stacks and arraylists too
edit - confused hashtables with stacks for a second
<{POST_SNAPBACK}>

Could you explain the difference between all three things you are wanting... and how a hash table differs from a database?

Thanks,

JS

jftuga · June 5, 2005

I would love to see hash tables built into the language -- not a UDF.

So you could have code like this:

$myvar["Version"]="3.1.1"
$myvar[311]="vers."

Ideally, you could have any data type for the key and for the value. Then you could iterate over the collection like this:

for $key in $myvar
   MsgBox(0,"key,value", $key, $myvar[$key])
endfor

Since I have not looked at the AutoIt source code, I don't know if it uses STL or not. If it did, maybe it could use the HashMap container.

What do you think?

-John

Edited June 5, 2005 by jftuga

Valik · June 5, 2005

I would love to see hash tables built into the language -- not a UDF.
So you could have code like this:
$myvar["Version"]="3.1.1"
$myvar[311]="vers."
Ideally, you could have any data type for the key and for the value. Then you could iterate over the collection like this:
for $key in $myvar
   MsgBox(0,"key,value", $key, $myvar[$key])
endfor
Since I have not looked at the AutoIt source code, I don't know if it uses STL or not. If it did, maybe it could use the HashMap container.
What do you think?
-John
<{POST_SNAPBACK}>

It doesn't use STL.

Insolence · June 5, 2005

I'm not entirely sure about any of this

A Hashtable is kind of like an array, with a lookup time of O(1). At least in C# it looks something like this:

Hashtable Table = new Hashtable();
Table.Add("key","value");
System.Windows.Forms.MessageBox.Show( Table["key"] ); //Would spit out 'value'

A Stack is a First In First Out collection, example:

Stack myStack = new Stack();
myStack.Push("value");
System.Windows.Forms.MessageBox.Show( myStack.Pop ); //Would spit out 'value'

An ArrayList is an array that doesn't have a defined 'end', so you can always add stuff to it:

ArrayList Array = new ArrayList();
Array.Add ("value");
foreach ( string value in Array) 
System.Windows.Forms.MessageBox.Show( value  ); //Would spit out 'value'

Wrote these all off the top of my head before going to work, don't be surprised if you see errors =\

Valik · June 5, 2005

A Stack is a First In First Out collection, example:

Correction, a stack is a last in first out container. The last element added to the container is the first element popped off. The very first element added to a stack will be the very last element to be popped off.

Also, a stack does not always return the value being popped. Pop() is supposed to remove the last element from the stack. In some situations, yes, it would be perfectly safe to return the value, but in other circumstances, it wouldn't be. The convention for this is probably language dependent, though.

Insolence · June 5, 2005

Ah yes my bad, I was writing that as fast as I could Thanks for checking those Valik.

Also, a stack does not always return the value being popped. Pop() is supposed to remove the last element from the stack. In some situations, yes, it would be perfectly safe to return the value, but in other circumstances, it wouldn't be. The convention for this is probably language dependent, though.

I think in C# it's pretty safe. You can use <stack>.Peek() to look at the top value without popping it, if that's what you mean.

Sign In

Hash Table UDF

Recommended Posts

mrider

ezzetabi

mrider

Smed

Fur

Fur

mrider

GaryFrost

mrider

GaryFrost

blindwig

JSThePatriot

jftuga

Insolence

JSThePatriot

jftuga

Valik

Insolence

Valik

Insolence

Create an account or sign in to comment

Create an account

Sign in

Browse

AutoIt Resources

Release

Beta