This content originally appeared on DEV Community and was authored by Abraham Mekonnen
Before diving into the world of AI within Spring Boot, and especially before reading any Spring AI documentation, it's worth grasping a few fundamental concepts. In this guide, I'll explain these ideas the way I understand them, and I hope it proves useful to anyone reading.
Let's start with the core technology behind most text-based AI chat tools (and, in adapted forms, image generation systems): LLMs (Large Language Models). According to IBM, "Large Language Models (LLMs) are a category of foundation models trained on vast amounts of data, enabling them to understand and generate natural language and other types of content for a wide range of tasks." So how do we integrate them into projects with Spring AI? Before answering that in detail, let's cover a few points about LLMs that will give you a deeper understanding.
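As a quick preview of where we're headed, here is a minimal sketch of calling an LLM from a Spring Boot application. It assumes Spring AI's ChatClient API roughly as of the 1.0 release, plus a model starter such as spring-ai-openai-spring-boot-starter on the classpath (which auto-configures a ChatClient.Builder); details may differ in your version.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class ChatController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder when a model starter
    // (e.g. spring-ai-openai-spring-boot-starter) is on the classpath.
    ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/ask")
    String ask(@RequestParam String question) {
        return chatClient.prompt()
                .user(question)   // the user's message
                .call()           // blocking call to the model
                .content();       // extract the text of the response
    }
}
```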
The first thing to know is that LLMs have a training-data cutoff date: the model is trained only on data up to a specific point in time and is unaware of anything that happened afterward. To address this limitation, several techniques have been developed, including:
- Prompt Stuffing
- RAG (Retrieval Augmented Generation)
- Function Calling
- Fine-tuning
Before diving into these solutions, let's talk about tokens. In the world of LLMs, tokens are essentially the currency. LLMs process tokens rather than raw words, so every prompt sent to a model is first converted into tokens, and there is a limit (the context window) to how many tokens an LLM can handle in a single request.
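To make tokens concrete, here is a small sketch that counts them using the JTokkit library (my choice for illustration; any BPE tokenizer would do, and the exact count depends on the model's tokenizer):

```java
import com.knuddels.jtokkit.Encodings;
import com.knuddels.jtokkit.api.Encoding;
import com.knuddels.jtokkit.api.EncodingRegistry;
import com.knuddels.jtokkit.api.EncodingType;

public class TokenCount {
    public static void main(String[] args) {
        EncodingRegistry registry = Encodings.newDefaultEncodingRegistry();
        // cl100k_base is the encoding used by several OpenAI chat models
        Encoding encoding = registry.getEncoding(EncodingType.CL100K_BASE);

        String prompt = "What is the recommended dosage of amoxicillin?";
        int tokens = encoding.countTokens(prompt);
        // Token count is usually close to, but not equal to, the word count.
        System.out.println(tokens + " tokens");
    }
}
```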
Now, let's explore the solutions mentioned earlier:
1. Prompt Stuffing:
This involves including relevant information alongside the user's question in the prompt itself. For example, if a user asks about the dosage of a specific medication, your code fetches data about that medication and adds it to the prompt, so the LLM can refer to the provided information when formulating its response, as sketched below.
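Here is a minimal prompt-stuffing sketch with Spring AI's ChatClient, assuming a hypothetical MedicationRepository that looks up the reference data. The essence is that our code, not the model, supplies the facts:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

// Hypothetical data source for this example.
interface MedicationRepository {
    String findFactsFor(String medication);
}

@Service
class DosageService {

    private final ChatClient chatClient;
    private final MedicationRepository repository;

    DosageService(ChatClient.Builder builder, MedicationRepository repository) {
        this.chatClient = builder.build();
        this.repository = repository;
    }

    String askDosage(String medication, String question) {
        // Our own lookup; this text gets "stuffed" into the prompt.
        String facts = repository.findFactsFor(medication);

        return chatClient.prompt()
                .system("""
                        Answer ONLY from the reference data below.
                        If the answer is not in the data, say you don't know.

                        Reference data:
                        %s
                        """.formatted(facts))
                .user(question)
                .call()
                .content();
    }
}
```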
2. RAG (Retrieval Augmented Generation):
This technique works around the token limit. Suppose a user wants to ask questions about a 900-page book (roughly 120,000 tokens), but the LLM only accepts 90,000 tokens per request. Stuffing the entire book into the prompt is impossible. Enter embedding: a method of converting content into numerical vectors that capture its meaning, which are then stored in a vector database. Unlike traditional databases, a vector database performs similarity searches. When a question is asked, only the most relevant chunks are retrieved from the vector database via similarity search, and those chunks are stuffed into the prompt to help the LLM generate an answer.
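Here is a condensed RAG sketch using Spring AI's VectorStore abstraction (pgvector, Chroma, and others implement it). It assumes the book has already been chunked, embedded, and loaded into the store, and it uses the 1.0-style SearchRequest builder and Document.getText(), both of which have shifted between milestones:

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
class BookQaService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore; // e.g. backed by pgvector or Chroma

    BookQaService(ChatClient.Builder builder, VectorStore vectorStore) {
        this.chatClient = builder.build();
        this.vectorStore = vectorStore;
    }

    String ask(String question) {
        // 1. Similarity search: fetch only the chunks relevant to the question.
        List<Document> chunks = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(5).build());

        String context = chunks.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n---\n"));

        // 2. Stuff only those few chunks into the prompt: well under the
        //    token limit, unlike the whole 900-page book.
        return chatClient.prompt()
                .system("Answer using only this excerpt of the book:\n" + context)
                .user(question)
                .call()
                .content();
    }
}
```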
3. Function Calling:
In this approach, the LLM is told about several functions it can ask your application to call. When the LLM encounters a question it cannot answer from its training data, such as today's weather or a database lookup, it requests the appropriate function call; your code runs the function and returns the result, which the LLM then uses to answer the question.
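A sketch of function calling using Spring AI's @Tool annotation (the 1.0-style API; earlier milestones wired java.util.function beans instead). WeatherTools and its lookup are hypothetical stand-ins for a real integration:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;

class WeatherTools {

    @Tool(description = "Get the current temperature in Celsius for a city")
    String currentTemperature(@ToolParam(description = "City name") String city) {
        // Hypothetical lookup; in a real app, call a weather API here.
        return "18°C in " + city;
    }
}

class WeatherExample {
    String ask(ChatClient chatClient) {
        // The model cannot know today's weather from its training data,
        // so it asks us to run the tool and then uses the result.
        return chatClient.prompt()
                .user("Should I take a jacket in Addis Ababa today?")
                .tools(new WeatherTools())
                .call()
                .content();
    }
}
```

Note that the model only decides when a tool is needed; our application executes it and feeds the result back into the conversation.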
4. Fine-Tuning:
Fine-tuning means further training an already pre-trained LLM for a specific role or use case. This technique is mainly the domain of data scientists and, as of this writing, is not typically necessary when working on Spring AI projects.