This content originally appeared on DEV Community and was authored by Sonu Kumar
KUID, a universally unique identifier, is essentially a shorter version of a UUID and is commonly used to identify records in various applications.
What's UUID?
A UUID (Universally Unique Identifier) is a unique identifier that is widely used across various applications and systems to identify records. It is a 128-bit number typically represented as a sequence of 32 hexadecimal digits, grouped in 5 sections, separated by hyphens, as in the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
UUIDs are designed to be globally unique, meaning that it is extremely unlikely for two UUIDs to be the same, even when generated independently by different systems. This makes UUIDs useful for identifying and tracking records in distributed systems, databases, and other contexts where unique identification is necessary.
There are different variants and versions of UUIDs, each with its own way of generating unique identifiers. Some commonly used variants include UUIDv1, UUIDv4, and UUIDv5. The specific variant and version of a UUID can be determined from the format and content of the identifier.
KUID is generated in two steps
- Generate UUID (use any algorithm like V1,V2,V3,V4 etc)
- Encode 128-bit number
KUID = Generate UUID and encode it in the higher base like base 32, 36, or 62 or even higher.
func encoder
input: number # 128 bit number
output: encoded string
func KUID:
input UUID, encoder
output KUID
return encode.encode( UUID )
The length of the KUID depends on the encoding scheme employed. If the base62 encoding scheme ([0–9A-Za-z]) is used, KUID can be encoded to 22 bytes. However, if you wish to store KUID in upper case, then encoding it using the base36 encoding scheme, which utilizes characters [0–9A-Z], will result in an encoded representation that can be stored in 26 bytes. For this post, we'll base 62 encoding.
Why should we use KUID?
KUID generates a smaller string, which results in less storage space and faster processing of many operations.
NOTE: Following statements are only valid when KUID is treated as a string, not as 128-bit number.
- Hashing: Using KUID with its shorter length of 22 bytes reduces the amount of data that needs to be hashed. This results in fewer instructions being required to perform the hashing operation compared to UUID, which requires a hash of 36 bytes.
- Comparison: Due to the shorter length of KUID, only 22 bytes need to be compared instead of 36 bytes, resulting in fewer instructions needed for comparison. Read/Write: Using KUID, which can be encoded in 22 bytes, results in a reduction in the memory and disk space required for read and write operations. This is in contrast to UUID, which requires 36 bytes for the same operations. With KUID, fewer instructions are needed to read or write data due to its smaller size.
IO Calls: Sending a UUID over a network consumes 36 bytes of data (without considering compression), while a KUID only requires 22 bytes. Therefore, KUID is a more efficient choice for network transmission due to its smaller size.
Database indexing: Due to the smaller size of KUID, it leads to a decrease in the size of the database index. In addition, combining KUID with another column for secondary indexing can be beneficial as databases have a limited index width (e.g., 767 bytes in MySQL) that is dependent on the encoding used for the column or table. For instance, MySQL's UTF8 encoding may limit the index width to as little as 255 bytes.URL safe: Like UUID, KUID is URL safe since it only uses [0–9A-Za-z] characters to represent the data. KUID is commonly used as a unique column, which means that URLs containing KUIDs may be used to represent specific resources, such as /api/v1/products/:productId. As the KUID is included in the URL, it must be URL safe to avoid issues such as encoding errors or unintended interpretation by the web server or browser.
Smaller Memory/Disk Usage: Using KUID results in lower memory/disk usage compared to UUID. Since KUID only requires 22 bytes, it uses less storage space than UUID, which uses 36 bytes. This can be beneficial for systems that require frequent read/write operations or have limited storage capacity. For instance, if we consider the storage of 200 million records, assuming each KUID is stored as a 22-byte string, the space consumed by KUIDs would be approximately 4.4 GB. In contrast, the space required for UUIDs would be around 7.2 GB. Therefore, using KUIDs can result in a storage saving of approximately 39%.
KUID Usage
public class Main {
public static void main(String[] args) {
for (int i = 0; i < 10; i++) {
KUID kuid = KUID.randomKUID();
System.out.println(kuid.toString());
}
}
}
This will produce something like this
LJVa90PC5N0LJVa90PC5N0
8SYCyPz9NXp8SYCyPz9NXp
8ucdhnCeYdk8ucdhnCeYdk
7QaHvqldPIL7QaHvqldPIL
8031jRPF3DS8031jRPF3DS
9oec5uzuIpg9oec5uzuIpg
4EAtrqG7CmF4EAtrqG7CmF
1vywu4ZzLz71vywu4ZzLz7
19YC4JE4W7619YC4JE4W76
8al7UWmt4Q28al7UWmt4Q2
If you found this post helpful, please share and give a thumbs up.
This content originally appeared on DEV Community and was authored by Sonu Kumar
Sonu Kumar | Sciencx (2023-02-26T05:55:05+00:00) KUID: Compressed Universally Unique Identifier. Retrieved from https://www.scien.cx/2023/02/26/kuid-compressed-universally-unique-identifier/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.