Improving Text Embeddings with Large Language Models: Analysis of Training Hyperparameters


:::info Authors:

(1) Liang Wang, Microsoft Corporation, and Correspondence to (wangliang@microsoft.com);

(2) Nan Yang, Microsoft Corporation, and Correspondence to (nanya@microsoft.com);

(3) Xiaolong Huang, Microsoft Corporation;

(4) Linjun Yang, Microsoft Corporation;

(5) Rangan Majumder, Microsoft Corporation;

(6) Furu Wei, Microsoft Corporation and Correspondence to (fuwei@microsoft.com).

:::

Abstract and 1 Introduction

2 Related Work

3 Method

3.1 Synthetic Data Generation

3.2 Training

4 Experiments

4.1 Statistics of the Synthetic Data

4.2 Model Fine-tuning and Evaluation

4.3 Main Results

4.4 Multilingual Retrieval

5 Analysis

5.1 Is Contrastive Pre-training Necessary?

5.2 Extending to Long Text Embeddings and 5.3 Analysis of Training Hyperparameters

6 Conclusion and References

A Implementation Details

B Test Set Contamination Analysis

C Prompts for Synthetic Data Generation

D Instructions for Training and Evaluation

5.2 Extending to Long Text Embeddings

Figure 4: Illustration of the personalized passkey retrieval task adapted from Mohtashami and Jaggi [26]. The prefix and suffix fillers are repeats of “The grass is green. The sky is blue. The sun is yellow. Here we go. There and back again.” In addition, each document has a unique person name and a random passkey inserted at a random position. The task is to retrieve the document that contains the given person’s passkey from 100 candidates.
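The caption above fully specifies how an example is built, so a short sketch can make it concrete. The helper below is illustrative only: the filler sentence and the pool of 100 candidates come from the caption, while the name format, passkey length, and insertion logic are assumptions rather than the authors' generation code.

```python
import random

FILLER = ("The grass is green. The sky is blue. The sun is yellow. "
          "Here we go. There and back again. ")

def make_passkey_document(person, num_filler=50):
    """Return (document, passkey) for one candidate document."""
    passkey = f"{random.randint(0, 99999):05d}"            # assumed 5-digit passkey
    statement = f"{person}'s passkey is {passkey}. "
    sentences = [FILLER] * num_filler
    # Insert the passkey statement at a random position among the fillers.
    sentences.insert(random.randint(0, num_filler), statement)
    return "".join(sentences), passkey

# Build 100 candidate documents; the query asks for one person's passkey.
corpus = [make_passkey_document(f"person_{i}")[0] for i in range(100)]
query = "What is person_42's passkey?"
```

Varying `num_filler` is what stretches the documents to the different input context lengths evaluated in Figure 5.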

Figure 5: Accuracy of personalized passkey retrieval as a function of input context length. For each context length, we randomly generate 50 queries and compute the top-1 accuracy.
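The protocol described in the caption (50 random queries per context length, top-1 accuracy over 100 candidates) can be sketched as a simple loop. Here `embed` and `make_example` are assumed stand-ins for an embedding model and a generator like the one above, not code from the paper.

```python
import numpy as np

def passkey_top1_accuracy(embed, make_example, context_lengths,
                          n_queries=50, n_docs=100):
    """Top-1 retrieval accuracy for each input context length."""
    results = {}
    for length in context_lengths:
        hits = 0
        for _ in range(n_queries):
            # make_example returns the query, the candidate documents,
            # and the index of the document holding the correct passkey.
            query, docs, gold_idx = make_example(length, n_docs)
            q = embed([query])          # shape (1, d), assumed L2-normalized
            d = embed(docs)             # shape (n_docs, d), assumed L2-normalized
            scores = (q @ d.T).ravel()  # cosine similarity via dot product
            hits += int(np.argmax(scores) == gold_idx)
        results[length] = hits / n_queries
    return results
```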


Table 4: Results on the MTEB benchmark with various hyperparameters. The first row corresponds to the default setting, which employs last-token pooling, LoRA rank 16, and natural language instructions. Unless otherwise stated, all models are trained on the synthetic and MS-MARCO passage ranking data.

5.3 Analysis of Training Hyperparameters

Table 4 presents the results under different configurations. We notice that the Mistral-7B initialization holds an advantage over LLaMA-2 7B, in line with the findings of the Mistral-7B technical report [19]. The choice of pooling type and LoRA rank does not affect overall performance substantially, so we adhere to the default setting despite the marginal superiority of LoRA rank 8. The way instructions are added, on the other hand, has a considerable impact on performance. We conjecture that natural language instructions better inform the model about the embedding task at hand, enabling it to generate more discriminative embeddings. Our framework also provides a way to customize the behavior of text embeddings through instructions without fine-tuning the model or rebuilding the document index.
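As a concrete illustration of the two choices discussed above, the sketch below shows last-token pooling over a decoder-only model and a natural-language instruction prepended to the query only; because documents are embedded without the instruction, changing the instruction never requires rebuilding the document index. The model name and the `Instruct:/Query:` template follow the publicly released checkpoint associated with this line of work, but the snippet is a minimal sketch rather than the authors' implementation (it assumes right padding and omits details such as explicitly appending an EOS token).

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-mistral-7b-instruct")
model = AutoModel.from_pretrained("intfloat/e5-mistral-7b-instruct")
model.eval()
if tokenizer.pad_token is None:              # ensure batching works
    tokenizer.pad_token = tokenizer.eos_token

def encode(texts, max_length=512):
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state           # (B, T, d)
    # Last-token pooling: hidden state of the final non-padding token.
    last = batch["attention_mask"].sum(dim=1) - 1
    emb = hidden[torch.arange(hidden.size(0)), last]
    return F.normalize(emb, dim=-1)

# The natural-language instruction is added to the query side only.
task = "Given a web search query, retrieve relevant passages that answer the query"
query_emb = encode([f"Instruct: {task}\nQuery: how long does sourdough take to proof"])
doc_emb = encode(["Sourdough typically proofs for 4 to 12 hours depending on temperature."])
similarity = (query_emb @ doc_emb.T).item()
```

Swapping in a different instruction (e.g., one tailored to semantic similarity rather than retrieval) changes only the query-side text, which is what makes instruction-based customization cheap at inference time.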


:::info This paper is available on arXiv under the CC0 1.0 DEED license.

:::
