Idea: Converting from Relational Database to Vector Database

This content originally appeared on DEV Community and was authored by Moeki Kawakami

What is a Vector Database?

Complex data is growing at breakneck speed. This is unstructured data such as documents, images, videos, and plain text on the web. Many organizations would benefit from storing and analyzing complex data, but it can be difficult to handle in traditional databases, which were built with structured data in mind. Classifying complex data with keywords and metadata alone may not be enough to represent all of its characteristics.

Fortunately, Machine Learning (ML) techniques can offer a far more helpful representation of complex data by transforming it into vector embeddings. Vector embeddings describe complex data objects as numeric values in hundreds or thousands of different dimensions.

Vector databases are purpose-built to handle the unique structure of vector embeddings. They index vectors for easy search and retrieval by comparing values and finding those that are most similar to one another.
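To make "finding those that are most similar" concrete, here is a minimal sketch of similarity search using cosine similarity over a brute-force scan. The three-dimensional vectors and the names `toy_index` and `most_similar` are made up for illustration. Real embeddings come from an ML model and have hundreds or thousands of dimensions, and a real vector database would use an index rather than comparing against every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, index, top_k=1):
    """Return the keys of the top_k vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:top_k]]


# Hypothetical toy "embeddings", not produced by any real model.
toy_index = {
    "cat photo": [0.9, 0.1, 0.0],
    "dog photo": [0.8, 0.3, 0.1],
    "tax form": [0.0, 0.1, 0.9],
}

nearest = most_similar([0.85, 0.2, 0.0], toy_index, top_k=2)
```

With this toy data, a query vector close to the two photo vectors ranks them ahead of the unrelated "tax form" entry.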

With the rise of machine learning, vector search has become more powerful than structured data search

With the release of the OpenAI API, tools such as LangChain, and vector database services such as Pinecone, vector search has become far more accessible than before.

Over the past few months I have seen a lot of people start working with unstructured data and experimenting heavily. So what about structured data?

The approach offered by LLM toolchains such as LangChain looks very simple: have the language model generate SQL queries. This seems amazing at first glance, but when you actually try it, it is like switching a game controller from wired to wireless: the image on the screen does not change at all. In other words, while vector search returns results in a fairly natural way, relational database search seems to have been left behind by the times.

Some people will object that relational databases are structured precisely for rigorous searches, so it is only natural that they do not respond well to natural language.

Is this really the case?

Mainly unstructured documents and the time and person associated with them

I think documents and messages suffer the most from this problem. They are strongly unstructured in themselves, but they always carry information about when they were updated and who wrote them, so both unstructured and structured perspectives are necessary. I suspect this is related to the way ChatGPT sometimes confuses old and new information.

Of course, this is less of a problem when the document itself is likely to contain date and author information, as on the web. The problem arises with internal documents and internal chats that are managed entirely in RDBs.

Proposed Solution

I have decided to take the following tentative step to address this problem with documents and chats: embed the relevant structured information in the text to be vectorized.

Title: {{title}}
Author: {{author}}
UpdatedAt: {{updatedAt}}
Body: {{body}}
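As a rough sketch of how this template could be filled from an RDB, the following reads rows with Python's built-in `sqlite3` module and renders each one into the text that would then be vectorized. The `documents` table, its column names, and the sample row are assumptions made for illustration. The embedding step itself is left out, since it depends on which model or service you use.

```python
import sqlite3

# Same fields as the template above, written as a Python format string.
TEMPLATE = "Title: {title}\nAuthor: {author}\nUpdatedAt: {updated_at}\nBody: {body}"


def row_to_embedding_text(row):
    """Render one database row into the text to be vectorized."""
    return TEMPLATE.format(
        title=row["title"],
        author=row["author"],
        updated_at=row["updated_at"],
        body=row["body"],
    )


def load_embedding_texts(conn):
    """Read every document and return one rendered text per row."""
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    rows = conn.execute(
        "SELECT title, author, updated_at, body FROM documents ORDER BY id"
    ).fetchall()
    return [row_to_embedding_text(row) for row in rows]


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, title TEXT, author TEXT, updated_at TEXT, body TEXT)"
)
conn.execute(
    "INSERT INTO documents (title, author, updated_at, body) VALUES (?, ?, ?, ?)",
    ("Deploy guide", "alice", "2023-04-01", "Run the deploy script."),
)

texts = load_embedding_texts(conn)
```

Each string in `texts` carries the author and update time alongside the body, so that information ends up inside the embedding instead of being left behind in separate columns.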

In my experiments, this made it possible to get reasonable answers to questions such as who is knowledgeable about which information and which information is correct.

This is a suggestion. I would like to hear your opinions.


Moeki Kawakami | Sciencx (2023-04-16T09:11:21+00:00) Idea: Converting from Relational Database to Vector Database. Retrieved from https://www.scien.cx/2023/04/16/idea-converting-from-relational-database-to-vector-database/
