TIL: Search Optimized Database (full-text search) for System Design

This content originally appeared on DEV Community and was authored by Daniel The Developer

Without a suitable index, a traditional database has to run a full table scan to find a search term. This is slow and inefficient if a table stores a large dataset (1000+ rows). To improve on this, a "search optimized database" can be used.

It uses indexing, tokenization and stemming to make search queries fast and efficient (by building "inverted indexes"):

  • Tokenization is the process of breaking a piece of text into individual words (tokens). It helps map words to the documents containing those words in the inverted index.
  • Stemming is the process of reducing words to their root form. For example, "running" and "runs" can be reduced to "run".
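
The two steps can be sketched in a few lines of Python. This is a toy illustration only: the `tokenize` and `stem` helpers and their suffix rules are assumptions made for this example, and real search engines use much more thorough analyzers (for instance Porter-style stemmers).

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def stem(word: str) -> str:
    """Toy stemmer (illustrative rules only): strip "ing" or a plural "s"."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        # Collapse a doubled final consonant left behind, e.g. "runn" -> "run".
        if word[-1] == word[-2] and word[-1] not in "aeiou":
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word

tokens = tokenize("She runs daily; running is fun.")
stems = [stem(token) for token in tokens]
print(tokens)
print(stems)
```

Both "runs" and "running" end up as the same term, so a query for either form can match documents containing the other.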

One thing to note is the data structure underlying the search mechanism of a search optimized database: the inverted index.

It is a data structure that maps words to the documents that contain them. For example:
{
"word1": [doc1,doc2,doc3],
"word2": [doc2,doc3,doc6]
}
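
A minimal sketch of building and querying such a mapping in Python follows. The document ids, the sample texts and the `build_inverted_index` and `search` helpers are hypothetical names chosen for this example, and the whitespace split stands in for a real tokenizer.

```python
from collections import defaultdict

def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical sample documents.
docs = {
    "doc1": "fast search engine",
    "doc2": "search index design",
    "doc3": "fast index",
}
index = build_inverted_index(docs)
print(search(index, "search"))
print(search(index, "fast index"))
```

A lookup touches only the entries for the query terms instead of scanning every document, which is where the speedup over a table scan comes from.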

Most search optimized databases also support "fuzzy search" out of the box as a configuration option. Fuzzy search relies on edit distance, which measures how many letters must be changed, added or removed to transform one word into another. This way, results with minor misspellings or discrepancies relative to the search term can still be returned when a user makes a typing error.
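
To make the edit distance idea concrete, here is a small sketch using the classic Levenshtein dynamic-programming approach. The `edit_distance` and `fuzzy_match` helpers and the sample vocabulary are assumptions for illustration; production engines generally avoid comparing the query against every indexed term one by one.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-letter edits to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # remove a letter from a
                current[j - 1] + 1,      # add a letter to a
                previous[j - 1] + cost,  # change a letter (or keep it)
            ))
        previous = current
    return previous[-1]

def fuzzy_match(term: str, vocabulary: list[str], max_distance: int = 1) -> list[str]:
    """Return vocabulary words within max_distance edits of the search term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_distance]

# Hypothetical vocabulary of indexed terms.
vocabulary = ["search", "index", "stemming"]
print(edit_distance("serch", "search"))
print(fuzzy_match("serch", vocabulary))
```

The misspelled "serch" is one inserted letter away from "search", so it still matches with a maximum distance of 1.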

One popular search optimized database is Elasticsearch.

Source: https://www.hellointerview.com/learn/system-design/in-a-hurry/key-technologies


Daniel The Developer | Sciencx (2024-10-16T01:16:58+00:00) TIL: Search Optimized Database (full-text search) for System Design. Retrieved from https://www.scien.cx/2024/10/16/til-search-optimized-database-full-text-search-for-system-design/
