This content originally appeared on Level Up Coding - Medium and was authored by Anshumaan Tiwari
In hashing we map data to a specific value. Hashing is used in many applications like database, crypto, and caching because it allows for good data retrieval. It works by converting a given key to an index with help of a hash function, where the data is stored and can be rapidly accessed.
Uses of Hashing
- Databases: For Good indexing and Good retrieval of data.
- Cryptography: For good data integrity and security.
- Caches: Fast lookup in memory caches.
What is a Hash Table?
As we can see below hash table is storing key in a particular slot, due to which fast data retrival can occur. Lets say we took a box and we have a formula which determine at what index we should place the data that is basically hashing.
+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 |
+-------+-------+-------+-------+
| | Key1 | | Key2 |
+-------+-------+-------+-------+
In this example:
- Key1 is placed in Slot1.
- Key2 is placed in Slot3.
What is a Hash Function and What is the Division Method?
A hash function basically converts data into the specific index where data is to be putted. One of the most basic hash functions is the division method, which takes a key and divides it by the number of slots in the table, using the remainder as the slot number.
Example:
If the key is 10 and the table has 7 slots, we divide:
10 ÷ 7 = 1 remainder 3
So, 3 is the slot number where the key will be stored.
+-------+-------+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 | Slot5 | Slot6 |
+-------+-------+-------+-------+-------+-------+-------+
| | | | Key10 | | | |
+-------+-------+-------+-------+-------+-------+-------+
What is Simple Uniform Hashing?
Simple uniform hashing assumes that each key is equally likely to be hashed into any of the slots, resulting in an even distribution of keys. Imagine rolling a fair die — each side has an equal chance of landing face up. In practice, perfect uniform hashing is hard to achieve, but we try to approximate it using various hash functions.
Collision Resolution Techniques
Even with a good hash function, collisions can occur — when two keys hash to the same slot. This is similar to two cars trying to park in the same space. There are several ways to handle these collisions.
1. Separate Chaining
In separate chaining, each slot in the hash table contains a linked list of keys that share the same slot number.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
| | Key1 | Key2 | Key3 | |
| | ↓ | ↓ | | |
| | Key4 | Key5 | | |
+-------+-------+-------+-------+-------+
Here:
- Slot1 holds a linked list of keys: Key1, Key4.
- Slot2 holds a linked list of keys: Key2, Key5.
2. Linear Probing
In linear probing, when a collision occurs, the algorithm checks the next slot in the table sequentially until an empty slot is found.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
| | Key1 | Key2 | | |
+-------+-------+-------+-------+-------+
↓ ↓ ↓
(Probing Sequence: Next available slot)
↓ ↓ ↓
If Slot1 is occupied, the algorithm checks Slot2, and so on, until it finds an empty spot.
3. Quadratic Probing
Quadratic probing is similar to linear probing but checks slots in a quadratic sequence (1, 4, 9, 16, etc.), reducing clustering.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
↓ ↓ ↓
(Quadratic Probing: Check slot at 1, 4, 9 distance)
↓ ↓ ↓
If Slot1 is taken, it will check further away in a non-linear pattern, like jumping to Slot4, Slot9, etc.
4. Double Hashing
In double hashing, when a collision occurs, a second hash function is used to calculate a step size to determine the next slot.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
| | Key1 | Key2 | | |
↓ ↓
(Second Hash: Steps based on new hash value)
If Slot1 is occupied, the second hash function determines how many slots to skip before trying again.
What is Load Factor?
The load factor is the ratio of the number of elements in the hash table to the total number of slots. It measures how full the hash table is. A higher load factor increases the likelihood of collisions, so when the load factor becomes too high, the table may need to resize to maintain efficiency.
What is Chaining and Open Addressing?
Chaining
In chaining, each slot has a list of keys that map to that slot. If multiple keys are hashed to the same slot, they are stored in a linked list.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
| | Key1 | Key2 | Key3 | |
| | ↓ | ↓ | | |
| | Key4 | Key5 | | |
+-------+-------+-------+-------+-------+
Open Addressing
In open addressing, all elements are stored directly in the hash table. If a slot is occupied, the algorithm searches for the next available slot using a probing sequence.
+-------+-------+-------+-------+-------+
| Slot0 | Slot1 | Slot2 | Slot3 | Slot4 |
+-------+-------+-------+-------+-------+
| | Key1 | Key2 | | |
↓ ↓ ↓
(Probing Sequence: Find next available slot)
↓ ↓ ↓
Conclusion
Hashing and hash tables are crucial tools in computer science, offering efficient ways to store and retrieve data. From simple hash functions to more advanced collision resolution techniques like chaining and probing, understanding these concepts helps in designing better systems that optimize both speed and storage.
Thank You
Introduction to Hashing was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.
This content originally appeared on Level Up Coding - Medium and was authored by Anshumaan Tiwari
Anshumaan Tiwari | Sciencx (2024-09-08T14:50:33+00:00) Introduction to Hashing. Retrieved from https://www.scien.cx/2024/09/08/introduction-to-hashing/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.