HashTable

MikeAbout 2 min

About HashTable

Concept

In computing, a hash table, also known as a hash map, is a data structure that implements an associative array, also called a dictionary, which is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.

Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions are typically accommodated in some way.

Hash Function

A hash function h maps the universe U of keys h : U -> {0,...,m − 1} to array indices or slots within the table for each h(x) ∈ 0,...,m - 1 where x ∈ S and m < n. The conventional implementations of hash functions are based on the integer universe assumption that all elements of the table stem from the universe U = {0,...,u - 1}, where the bit length of u is confined within the word size of a computer architecture.

A perfect hash function h is defined as an injective function such that each element x in S maps to a unique value in 0,...,m − 1. A perfect hash function can be created if all the keys are known ahead of time.

Collision Resolution

1. Separate chaining

The collided items are chained together through a single linked list, which can be traversed to access the item with a unique search key.
separate chaining

2. Open addressing

Linear probing, in which the interval between probes is fixed (usually 1).
Quadratic probing, in which the interval between probes is increased by adding the successive outputs of a quadratic polynomial to the value given by the original hash computation. An example sequence using quadratic probing is: H + 1^2, H + 2^2, H + 3^2, H + 4^2,...,H + k^2
Double hashing, in which the interval between probes is computed by a secondary hash function.

Load Factor

A load factor is a critical statistic of a hash table, and is defined as follows:

load factor(a) = n / m, where n is the number of entries occupied in the hash table, m is the number of buckets.

The performance of the hash table deteriorates in relation to the load factor a. Therefore a hash table is resized or rehashed if the load factor a approaches 1. A table is also resized if the load factor drops below a(max) / 4. Acceptable figures of load factor a should range around 0.6 to 0.75.

Exercise

HashTable

# About HashTable

# Concept

# Hash Function

# Collision Resolution

# 1. Separate chaining

# 2. Open addressing

# Load Factor

# Exercise

# Valid Anagram

# Intersection of Two Arrays

# Happy Number

# Two Sum

# Three Sum

# Four Sum

# Summary

About HashTable

Concept

Hash Function

Collision Resolution

1. Separate chaining

2. Open addressing

Load Factor

Exercise

Valid Anagram

Intersection of Two Arrays

Happy Number

Two Sum

Three Sum

Four Sum

Summary