GPT-4 Prompts for Computing Summarization and Dialogue Win Rates

This article provides the GPT-4 prompts used to evaluate summarization and dialogue performance. The prompts instruct GPT-4 to compare two responses and determine which is more helpful or effective.


This content originally appeared on HackerNoon and was authored by Writings, Papers and Blogs on Text Models

:::info Authors:

(1) Rafael Rafailov, Stanford University (equal contribution; more junior authors listed earlier);

(2) Archit Sharma, Stanford University (equal contribution; more junior authors listed earlier);

(3) Eric Mitchell, Stanford University (equal contribution; more junior authors listed earlier);

(4) Stefano Ermon, CZ Biohub;

(5) Christopher D. Manning, Stanford University;

(6) Chelsea Finn, Stanford University.

:::

Abstract and 1. Introduction

2 Related Work

3 Preliminaries

4 Direct Preference Optimization

5 Theoretical Analysis of DPO

6 Experiments

7 Discussion, Acknowledgements, and References

Author Contributions

A Mathematical Derivations

A.1 Deriving the Optimum of the KL-Constrained Reward Maximization Objective

A.2 Deriving the DPO Objective Under the Bradley-Terry Model

A.3 Deriving the DPO Objective Under the Plackett-Luce Model

A.4 Deriving the Gradient of the DPO Objective and A.5 Proof of Lemma 1 and 2

A.6 Proof of Theorem 1

B DPO Implementation Details and Hyperparameters

C Further Details on the Experimental Set-Up and C.1 IMDb Sentiment Experiment and Baseline Details

C.2 GPT-4 prompts for computing summarization and dialogue win rates

C.3 Unlikelihood baseline

D Additional Empirical Results

D.1 Performance of Best of N baseline for Various N and D.2 Sample Responses and GPT-4 Judgments

D.3 Human study details

C.2 GPT-4 prompts for computing summarization and dialogue win rates

A key component of our experimental setup is GPT-4 win rate judgments. In this section, we include the prompts used to generate win rates for the summarization and dialogue experiments. We use gpt-4-0314 for all our experiments. The order of summaries or responses is randomly chosen for every evaluation.
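The random ordering described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: `win_rate`, the `judge` callback, and the tuple layout are hypothetical names chosen for the example, and the judge is assumed to return "A" or "B" once the model's reply has been parsed.

```python
import random
from typing import Callable, Sequence, Tuple


def win_rate(
    pairs: Sequence[Tuple[str, str, str]],
    judge: Callable[[str, str, str], str],
    seed: int = 0,
) -> float:
    """Fraction of comparisons in which the policy output is preferred.

    Each item in `pairs` is (context, policy_output, baseline_output).
    `judge(context, option_a, option_b)` is a hypothetical callback that
    returns "A" or "B" (for example, by querying GPT-4 and parsing its reply).
    """
    if not pairs:
        raise ValueError("need at least one comparison")
    rng = random.Random(seed)
    wins = 0
    for context, policy_out, baseline_out in pairs:
        # Randomly decide which output is presented as option "A".
        policy_is_a = rng.random() < 0.5
        if policy_is_a:
            option_a, option_b = policy_out, baseline_out
        else:
            option_a, option_b = baseline_out, policy_out
        verdict = judge(context, option_a, option_b)
        if verdict not in ("A", "B"):
            raise ValueError(f"unexpected verdict: {verdict!r}")
        # Map the positional verdict back to the method that produced it.
        if (verdict == "A") == policy_is_a:
            wins += 1
    return wins / len(pairs)
```

Because the verdict is mapped back to the method after shuffling, a judge that always answers "A" would score close to 50% rather than favoring one method.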

Summarization GPT-4 win rate prompt (S).

Which of the following summaries does a better job of summarizing the most important points in the given forum post?

Post:

Summary A:

Summary B:

FIRST provide a one-sentence comparison of the two summaries, explaining which you prefer and why. SECOND, on a new line, state only "A" or "B" to indicate your choice. Your response should use the format:

Comparison:

Preferred: <"A" or "B">

Summarization GPT-4 win rate prompt (C).

Which of the following summaries does a better job of summarizing the most important points in the given forum post, without including unimportant or irrelevant details? A good summary is both precise and concise.

Post:

Summary A:

Summary B:

FIRST provide a one-sentence comparison of the two summaries, explaining which you prefer and why. SECOND, on a new line, state only "A" or "B" to indicate your choice. Your response should use the format:

Comparison:

Preferred: <"A" or "B">
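Since both summarization prompts ask for the verdict on its own line, the reply can be turned into a label with a small parser. The sketch below is one possible approach, not the authors' code: `parse_preference` is a hypothetical helper, and it assumes the model follows the requested format, returning `None` otherwise so that the caller can decide how to handle malformed replies.

```python
import re
from typing import Optional


def parse_preference(reply: str, label: str = "Preferred") -> Optional[str]:
    """Extract "A" or "B" from a line such as `Preferred: "B"`.

    `label` is the field name requested by the prompt ("Preferred" for the
    summarization prompts). Returns None if no such line is found.
    """
    # Accept optional straight or curly quotes and angle brackets around the letter.
    pattern = rf'^\s*{re.escape(label)}:\s*<?["“”]?([AB])["“”]?>?\s*$'
    match = re.search(pattern, reply, flags=re.MULTILINE)
    return match.group(1) if match else None


reply = (
    "Comparison: Summary B covers the main point more concisely.\n"
    'Preferred: "B"'
)
print(parse_preference(reply))
```

The same helper could be reused for a prompt with a different field name by passing another `label`.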

Dialogue GPT-4 win rate prompt.

For the following query to a chatbot, which response is more helpful?

Query:

Response A:

Response B:

FIRST provide a one-sentence comparison of the two responses and explain which you feel is more helpful. SECOND, on a new line, state only "A" or "B" to indicate which response is more helpful. Your response should use the format:

Comparison:

More helpful: <"A" or "B">
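Filling the dialogue prompt for a single comparison might look like the sketch below. The template wording is taken from the prompt above; where the query and responses are placed relative to their labels, and the names `DIALOGUE_TEMPLATE` and `build_dialogue_prompt`, are assumptions made for illustration.

```python
# Template text follows the dialogue prompt above; the position of the
# {query}, {response_a}, and {response_b} slots is an assumption.
DIALOGUE_TEMPLATE = """For the following query to a chatbot, which response is more helpful?

Query: {query}

Response A: {response_a}

Response B: {response_b}

FIRST provide a one-sentence comparison of the two responses and explain which you feel is more helpful. SECOND, on a new line, state only "A" or "B" to indicate which response is more helpful. Your response should use the format:
Comparison:
More helpful: <"A" or "B">"""


def build_dialogue_prompt(query: str, response_a: str, response_b: str) -> str:
    """Return the judge prompt for one query and an ordered pair of responses."""
    return DIALOGUE_TEMPLATE.format(
        query=query, response_a=response_a, response_b=response_b
    )


prompt = build_dialogue_prompt(
    "How do I sort a list in Python?",
    "Use the built-in sorted() function.",
    "I am not sure.",
)
print(prompt)
```

The caller is expected to have already randomized which response is passed as `response_a`.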


:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

:::
