Deductive Verification with Natural Programs: Case Studies

We present examples of deductive verification using Natural Program formats, showcasing how ChatGPT identifies ungrounded information and logical errors, and highlighting cases where the model struggles with premise numbers and grounded values.


This content originally appeared on HackerNoon and was authored by Cosmological thinking: time, space and universal causation

:::info Authors:

(1) Zhan Ling, UC San Diego (equal contribution);

(2) Yunhao Fang, UC San Diego (equal contribution);

(3) Xuanlin Li, UC San Diego;

(4) Zhiao Huang, UC San Diego;

(5) Mingu Lee, Qualcomm AI Research;

(6) Roland Memisevic, Qualcomm AI Research;

(7) Hao Su, UC San Diego.

:::

Abstract and Introduction

Related work

Motivation and Problem Formulation

Deductively Verifiable Chain-of-Thought Reasoning

Experiments

Limitations

Conclusion, Acknowledgements and References

A Deductive Verification with Vicuna Models

B More Discussion on Improvements of Deductive Verification Accuracy Versus Improvements on Final Answer Correctness

C More Details on Answer Extraction

D Prompts

E More Deductive Verification Examples

E More Deductive Verification Examples

In this section, we present more deductive verification examples using our Natural Program-based approach on single reasoning steps.

In Tab. 18, we demonstrate that the language model (ChatGPT) identifies not only ungrounded information but also logical errors within the given solutions.

In Tab. 19, we illustrate a failure case where the language model does not detect ungrounded premise numbers, mistakenly assuming that these numbers can be derived from grounded ones.

Lastly, in Tab. 20, we illustrate that the language model is sometimes unable to correctly identify grounded numbers.

Table 12: Two-shot prompt for direct reasoning chain verification without Natural Program format.

Table 13: One-shot Natural Program prompt for reasoning chain generation on math word problems.

Table 14: One-shot Natural Program prompt for reasoning chain generation on math word problems with multiple choice.

Table 15: Two-shot Natural Program prompt for reasoning chain generation on the Date dataset.

Table 16: One-shot Natural Program prompt for reasoning chain generation on the Last Letters dataset.

Table 17: One-shot prompt for deductive verification of a single reasoning step, following our Natural Program format and step-by-step reasoning chain decomposition.

Table 18: Success case: our deductive verification approach discovers both ungrounded information and reasoning mistakes.

Table 19: Failure case: our deductive verification process fails to detect ungrounded information in the reasoning step. The number 240 in the reasoning step is ungrounded, but the model states that it can be calculated from grounded numbers.

Table 20: Failure case: our deductive verification process sometimes treats grounded information as if it were ungrounded. The number 120 is provided in the given information, but the model states that it is ungrounded.
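To make the notion of a grounded number concrete, the following is a minimal sketch of a purely literal groundedness check. It is not the verification procedure used in this paper, which prompts a language model. The premises, the reasoning step, and the function names `extract_numbers` and `ungrounded_numbers` are all hypothetical and chosen only for illustration. The sketch treats a number as grounded if it appears verbatim in some premise, and it ignores values that directly follow an `=` sign, on the assumption that those are computed results rather than inputs.

```python
import re

# Matches integers and simple decimals such as "12" or "3.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")
# Matches a computed result such as "= 60" so it can be ignored.
RESULT = re.compile(r"=\s*\d+(?:\.\d+)?")


def extract_numbers(text: str) -> set[str]:
    """Return the set of numeric literals that appear in the text."""
    return set(NUMBER.findall(text))


def ungrounded_numbers(step: str, premises: list[str]) -> set[str]:
    """Return input numbers in `step` that appear in none of the premises.

    This is a literal string match only; it cannot judge whether a
    number could legitimately be derived from the premises.
    """
    grounded: set[str] = set()
    for premise in premises:
        grounded |= extract_numbers(premise)
    # Drop values that follow "=", assumed to be results of the step.
    inputs_only = RESULT.sub("", step)
    return extract_numbers(inputs_only) - grounded


# Hypothetical premises and reasoning step (not taken from the tables).
premises = [
    "A box holds 12 pencils.",
    "The teacher buys 5 boxes.",
]
step = "The teacher has 5 * 12 = 60 pencils, plus 30 from last year."

print(sorted(ungrounded_numbers(step, premises)))  # ['30']
```

A literal check like this would not make the mistake described for Tab. 20, since a number that appears in the given information always matches. It also cannot decide whether a value is derivable from grounded ones, which is the judgment the model gets wrong in Tab. 19.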


:::info This paper is available on arXiv under a CC BY 4.0 DEED license.

:::
