Implementing Simple HashTable

Implementing Simple HashTable

Edit this problem on GitHub

Hash table is a very important and interesting structure. If I were you, I would love to have my own hash table implemented. So let’s get started with the most simple implementation. The following graph might help:

As everyone already knows, hash table contains `(key, value)` pairs. For example, when you take “John Smith” as key of hash table, there will be a hash function to transform the `String` type key value to a `Integer` as index. With the index, you could easily get the value. In this case, the `(key, value)`  pair will be called as entry of hash table.

Now you are ready to implement it. Straight forward huh. Let’s get to the simplest part.

The code for hash table entry is:

```public class HashEntry {
private String key;
private int value;

HashEntry(String key, int value) {
this.key = key;
this.value = value;
}

public String getKey() {
return key;
}

public int getValue() {
return value;
}
}```

The code for hash table is:

```public class SimpleHashTable {
private int TABLE_SIZE;
private int size;
private HashEntry[] table;

/* Constructor */
public SimpleHashTable(int ts) {
size = 0;
TABLE_SIZE = ts;
table = new HashEntry[TABLE_SIZE];
for (int i = 0; i < TABLE_SIZE; i++)
table[i] = null;
}
/* Function to get number of key-value pairs */
public int getSize() {
return size;
}
/* Function to clear hash table */
public void clean() {
for (int i = 0; i < TABLE_SIZE; i++)
table[i] = null;
size = 0;
}

/* function to retrieve value from the table according to key */
public int get(String key) {
int index = findIndex(key);
return table[index] == null ? -1 : table[index].getValue();
}

/* function to add value to the table */
public void put(String key, int value) {
table[findIndex(key)] = new HashEntry(key, value);
size++;
}

/* function to remove value to the table */
public void remove(String key) {
table[findIndex(key)] = null;
size--;
}

/* value to create the Hash code from he name entered */
private int calculateHashCode(String key) {
int mod = key.hashCode() % TABLE_SIZE;
return mod < 0 ? mod + TABLE_SIZE : mod;
}

private int findIndex(String key) {
int index = calculateHashCode(key);
while (table[index] != null && !table[index].getKey().equals(key)) {
index = (index + 1) % TABLE_SIZE;
}
return index;
}
}```

I guess I don’t have to explain everything, just the core, `findIndex()`  and `calculateHashCode()`.

Explanation On Important Functions

The responsibility of function `calculateHashCode()` is to transform the `String` type key value to a `Integer` as index. Java has a function `String.hashCode()` to similar things, we could exploit that.

```Take s = key.toCharArray(),
Then s.hashCode() = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1].```

Now we have a 1 to 1 map from `String` to `Integer`, the only thing to do is to transform the integer to index of hash table. We can do this by modulo:

```int mod = key.hashCode() % TABLE_SIZE;
return mod < 0 ? mod + TABLE_SIZE : mod;```

But what if we got collisions on index of hash table? For example:

```key1.hashCode() = 4;
key2.hashCode() = 8;
TABLE_SIZE = 4;```

Then we will need function `findIndex()`:

```int index = calculateHashCode(key);
while (table[index] != null && !table[index].getKey().equals(key)) {
index = (index + 1) % TABLE_SIZE;
}```

If `index` has already been used in hash table, we use the larger index. Now we can get the efficiency of this simple hash table:

```                Average	               Worst Case
Space	         O(n)                       O(n)
Search	        O(1)	                     O(n)
Insert	        O(1)	                     O(n)
Delete	        O(1)	                     O(n)```

Now if you pay enough attention, you will find out there is a flaw inside my simple hash table: I never handle hash-full situation. Basically, this hash table will enter a infinite loop in function `findIndex()` if the hash table is full. How do we handle that case?

We could fix that by using hash table chaining. Instead of having simple `(key, value)`  pairs, we will have the `value` as

a linked list. If we are in a hash full situation, we just append it to the end of linked list.

The code for hash table entry is:

```public class LinkedHashEntry {
String key;
int value;

/* Constructor */
{
this.key = key;
this.value = value;
this.next = null;
}
}```

The code for hash table is:

```public class NotSoSimpleHashTable {
private int TABLE_SIZE;
private int size;

/* Constructor */
public NotSoSimpleHashTable(int ts) {
size = 0;
TABLE_SIZE = ts;
for (int i = 0; i < TABLE_SIZE; i++)
table[i] = null;
}
/* Function to get number of key-value pairs */
public int getSize() {
return size;
}
/* Function to clear hash table */
public void clean() {
for (int i = 0; i < TABLE_SIZE; i++)
table[i] = null;
size = 0;
}
/* Function to get value of a key */
public int get(String key) {
int hash = (calculateHashCode( key ) % TABLE_SIZE);
if (table[hash] == null)
return -1;
else {
while (entry != null && !entry.key.equals(key))
entry = entry.next;
if (entry == null)
return -1;
else
return entry.value;
}
}
/* Function to insert a key value pair */
public void insert(String key, int value) {
int hash = (calculateHashCode( key ) % TABLE_SIZE);
if (table[hash] == null)
else {
while (entry.next != null && !entry.key.equals(key))
entry = entry.next;
if (entry.key.equals(key))
entry.value = value;
else
}
size++;
}

public void remove(String key) {
int hash = (calculateHashCode( key ) % TABLE_SIZE);
if (table[hash] != null) {
while (entry.next != null && !entry.key.equals(key)) {
prevEntry = entry;
entry = entry.next;
}
if (entry.key.equals(key)) {
if (prevEntry == null)
table[hash] = entry.next;
else
prevEntry.next = entry.next;
size--;
}
}
}
/* Function calculateHashCode which gives a hash value for a given string */
private int calculateHashCode(String key) {
int mod = key.hashCode() % TABLE_SIZE;
return mod < 0 ? mod + TABLE_SIZE : mod;
}
}```

Improvements on Performance

Of course, like everything else, there’s always a price for this improvement. The performance may be affected now that we have to go through a linked list instead of getting a single value. Besides, if some specific linked list is getting too long, we will need a rebalance function to improve performance. Those will not be included in this article 😀

However, we could make some other improvement. We could use bit-masking instead of modulo.

```private final int tableSize = Integer.highestOneBit(TABLE_SIZE) << 1;
private final int tableMask = tableSize - 1;```

Then we could get the remainder by `key.hashCode() & tableMask` instead of `key.hashCode() % TABLE_SIZE`.

Take the following as example:

```  decimal ->  binary
12345 ->  0011000000111001
32 ->  0000000000100000
32 - 1 ->  0000000000011111

12345 & 31 ->  0000000000011001
25 ->  0000000000011001```