Deliberative Prompting Guide

What is Deliberative Prompting?

Deliberative prompting is a particular kind of chain-of-thought prompting.

In a chain-of-thought prompt, the LLM is instructed to 'think' about its task before giving a solution.

In a deliberative prompt, the LLM is instructed to 'think' about its task with the aim of eliciting arguments in favour of an answer, or reasons for and against alternative solutions.

The OED defines deliberation as "[the] action of thinking carefully about something, esp. in order to reach a decision; careful consideration; an act or instance of this."

💡

Deliberative prompting is a term coined by the Logikon AI Team to delineate argumentative chain-of-thought prompts and workflows from those that don't involve argumentation or reason-giving.

Here is an example of a chain-of-thought prompt that does not involve argumentation:

Chain-of-thought prompt (without deliberation)
Translate to German: "Fantastic news! I have been accepted to the University of H."
Proceed step by step: Translate each sentence individually. Then, combine the 
translations into a single text.

A deliberative prompt for the same task might look like this:

Deliberative prompt
Translate to German: "Fantastic news! I have been accepted to the University of H."
Proceed step by step: Generate three alternative drafts. Discuss the pros and cons
of each draft. Then, weigh the reasons carefully and choose the best draft.

Design Patterns for Deliberative LLM Workflows

This section presents a number of design patterns for deliberative LLM workflows. These patterns are mainly inspired by papers from the chain-of-thought literature.1

The collection is not meant to be complete. Rather, the diverse patterns illustrate the vast variety of natural-language deliberation modes available to LLM app developers. They are starting points and building blocks for much richer and more sophisticated workflows.

Which deliberation workflow works best for you depends very much on your specific problem. We recommend that you systematically try out different workflows.

⚠️

Caveat. In this section, prompt wording has been optimized for readability, not for effectiveness.

Naive LLM Decision Making

Let us first consider the simplest possible LLM workflow for decision making, which uses the LLM as an oracle and does not involve any deliberation.

The following plain prompt implements this workflow for a simple Q/A problem from the MMLU sub-dataset on business ethics (LangChain and LMQL implementations are analogous):

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
<|assistant|>
...
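
A minimal Python sketch of this naive workflow: a single chat-completion call, no deliberation. It assumes the OpenAI Python client and a stand-in model name; any chat-completion API (or a LangChain/LMQL pipeline) works analogously.

Python sketch
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

QUESTION = """What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs"""

# a single oracle call: the first reply is taken as the answer
response = client.chat.completions.create(
    model="gpt-4o-mini",  # an assumption; substitute your model of choice
    messages=[{"role": "user", "content": QUESTION}],
)
print(response.choices[0].message.content)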

Deliberate-then-Decide

The basic deliberative LLM workflow inserts a reasoning step before the decision:

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think step by step.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
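
A minimal sketch of this two-turn workflow under the same OpenAI-client assumption: the reasoning reply is appended to the message history, so the decision turn is conditioned on it.

Python sketch
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def ask(messages, model="gpt-4o-mini"):
    # send the conversation so far; return the assistant's reply text
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above

# turn 1: deliberation
history = [{"role": "user", "content": QUESTION + "\nLet's think step by step."}]
reasoning = ask(history)

# turn 2: decision, conditioned on the reasoning trace
history += [
    {"role": "assistant", "content": reasoning},
    {"role": "user", "content": "So, the correct answer is (A, B, C, or D)?"},
]
print(ask(history))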

🔀 Variations:

  • Decoding strategy (beam search, contrastive decoding).
  • Reasoning trace length.
  • Two-step deliberation: brainstorm-consolidate.
  • ...

📖 Wei et al. 2022, Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv

Decide-and-Explain

Instead of thinking about a decision before it is made (ex ante), the agent in this workflow explains a decision that has been made (ex post).

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Just answer with A, B, C or D.
<|assistant|>
...
<|user|>
Please explain why the answer is correct.
<|assistant|>
...

Explore then Synthesize

Explore diverse reasoning paths before synthesizing them.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think step by step.
<|assistant|>
...
<|user|>
Independently of your previous reply, eliminate as many options
as possible.
<|assistant|>
...
<|user|>
Independently of your previous reply, verify the most plausible
answer.
<|assistant|>
...
<|user|>
Synthesize these alternative considerations into a single
reasoning path that solves the problem.
<|assistant|>
...
<|user|>
Given the reasoning synthesis, the correct answer is?
<|assistant|>
...
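
The exploration phase can also be diversified in code. The sketch below (same ask helper assumption as before) enforces independence by running each probe in a fresh context, one of the variations listed below; only the synthesis call sees all paths.

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above

PROBES = [
    "Let's think step by step.",
    "Eliminate as many options as possible.",
    "Verify the most plausible answer.",
]

# exploration: each probe runs in a fresh context, so the paths are independent
paths = [ask([{"role": "user", "content": f"{QUESTION}\n{probe}"}]) for probe in PROBES]

# synthesis: a single call that sees all exploration paths
print(ask([{"role": "user", "content":
    QUESTION + "\n\nAlternative considerations:\n\n" + "\n---\n".join(paths)
    + "\n\nSynthesize these alternative considerations into a single reasoning "
    "path. Given that synthesis, the correct answer is?"}]))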

🔀 Variations:

  • Diversity in exploration phase through different LLMs, decoding parameters, or prompts.
  • Pick top-k paths and synthesize those only.
  • Hide/show exploration in decision step.
  • ...

📖 Lin et al. 2023, Ask One More Time: Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios. arXiv

Self-Critique

Criticize and revise the reasoning trace.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think carefully about these alternatives.
<|assistant|>
...
<|user|>
Carefully check the reasoning for redundancies, ambiguities, and
inconsistencies. Revise the reasoning as necessary.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...

🔀 Variations:

  • Iterate self-correction steps.
  • Assemble diverse critiques.
  • Allow agent to refute critique.
  • Compare and choose between original and revised reasoning trace.
  • Show/hide revision history in decision step.
  • ...

📖 Pan et al. 2023, Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies. arXiv

Verify and Revise

Check whether the reasoning trace passes sanity checks; revise it otherwise.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think carefully about these alternatives.
<|assistant|>
...
<|user|>
Carefully check the reasoning for redundancies, ambiguities, and
inconsistencies. If necessary, revise the reasoning.
Iterate until the reasoning is concise, clear and consistent.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
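
The "iterate until" instruction can also be enforced by the calling program rather than by the prompt. A sketch under the same assumptions: loop a check-and-revise call until the model signals that the reasoning passes, with a hard cap on rounds (the PASS convention is an illustrative choice, not part of the pattern itself).

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above

history = [{"role": "user", "content":
    QUESTION + "\nLet's think carefully about these alternatives."}]
reasoning = ask(history)

for _ in range(3):  # hard cap on revision rounds
    verdict = ask([{"role": "user", "content":
        "Reasoning to check:\n" + reasoning + "\n\n"
        "Check the reasoning for redundancies, ambiguities, and inconsistencies. "
        "If it is concise, clear and consistent, reply with exactly PASS; "
        "otherwise, output a revised version of the reasoning."}])
    if verdict.strip() == "PASS":
        break
    reasoning = verdict  # adopt the revision and check it again

# decision step, conditioned on the (revised) reasoning
print(ask([
    {"role": "user", "content": QUESTION},
    {"role": "assistant", "content": reasoning},
    {"role": "user", "content": "So, the correct answer is (A, B, C, or D)?"},
]))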

🔀 Variations:

  • More detailed self-assessment to inform checks and revisions.
  • Further verify options: use, revise, start anew.
  • Produce and verify alternative reasoning traces in parallel.
  • Different criteria for assessing reasoning traces, external feedback, ...
  • ...

📖 Shridhar et al. 2023, The ART of LLM Refinement: Ask, Refine, and Trust. arXiv

Prior Reflection

Characterize the problem in general terms before solving it.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Reflect about this problem and characterize it in general terms.
<|assistant|>
...
<|user|>
Now, think through the alternative answers step by step.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...

🔀 Variations:

  • Multi-step problem characterization (e.g., revise, explore-and-synthesize).
  • Characterize reasoning trace before taking decision.
  • Show / hide meta characterization in decision step.
  • ...

📖 Wang et al. 2023, Metacognitive Prompting Improves Understanding in Large Language Models. arXiv | Zou et al. 2023, Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models. arXiv

Plan and Solve

Let the agent devise a reasoning strategy before the deliberation phase.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Devise a reliable strategy for solving this problem through reasoning.
<|assistant|>
...
<|user|>
Now, solve the problem step by step.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...

🔀 Variations:

  • Devise alternative plans (choose, implement in parallel).
  • Assess reasoning and revise plan if necessary.
  • Insert reflection (meta-characterization) before planning step.
  • Allow plans to cover self-check and revision steps.
  • ...

📖 Wang et al. 2023, Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models. ACL | Lau et al. 2024, Critical Thinking Web: G03 Solving Problems. website

Divide and Conquer

Identify and solve smaller subproblems before addressing the main problem.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Divide the problem into (up to four) subproblems.
<|assistant|>
...
<|user|>
Think about and solve the subproblems independently from each other.
<|assistant|>
...
<|user|>
Given the solutions of the subproblems, think through the main problem.
<|assistant|>
...
<|user|>
So, all in all, the correct answer is?
<|assistant|>
...
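
A sketch of the surrounding scaffolding, under the same assumptions as before. Asking for one subproblem per line is an illustrative convention that makes the reply easy to split programmatically.

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above

# step 1: elicit subproblems, one per line, so they are easy to split
reply = ask([{"role": "user", "content":
    QUESTION + "\nDivide the problem into (up to four) subproblems. "
    "List them one per line, without any further text."}])
subproblems = [line.strip() for line in reply.splitlines() if line.strip()]

# step 2: solve each subproblem in a fresh context, independently of the others
solutions = [ask([{"role": "user", "content": sub}]) for sub in subproblems]

# step 3: think through the main problem, given the subproblem solutions
solved = "\n\n".join(f"{s}\n{a}" for s, a in zip(subproblems, solutions))
print(ask([{"role": "user", "content":
    QUESTION + "\n\nSolved subproblems:\n" + solved
    + "\n\nGiven these solutions, think through the main problem. "
    "So, all in all, the correct answer is?"}]))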

🔀 Variations:

  • Sort subproblems and solve them consecutively (rather than in parallel).
  • Show/hide subproblems and their solutions in decision step.
  • Implement fallback if main problem cannot be divided into subproblems.
  • ...

📖 Zhou et al. 2022, Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. arXiv

Self-Instruction

Auto-generate reasoning demonstrations.

Instead of selecting suitable problems from a given database, the following implementation generates demonstration problems on the fly.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
First of all, devise three similar (yet simpler) problems.
<|assistant|>
...
<|user|>
For each simple problem: Provide an effective argumentation that
proceeds step by step and yields the correct answer.
<|assistant|>
...
<|user|>
Given the demonstrations, think through the main problem.
<|assistant|>
...
<|user|>
So, all in all, the correct answer is?
<|assistant|>
...

🔀 Variations:

  • Assess and revise demonstrations (possibly given feedback about reasoning quality and accuracy).
  • Filter self-generated demonstrations (e.g., reasoning quality, confidence score).
  • Insert reflection (meta-characterization) before generating demonstrations.
  • Use more powerful models or more reliable decoding methods (e.g., beam + large n) for demonstrations.
  • ...

📖 Zhang et al. 2022, Automatic Chain of Thought Prompting in Large Language Models. arXiv

Trial and Error

Mark failed attempts and add them to the history, which allows the model to learn from its mistakes. Technically, this is a form of in-context reinforcement learning.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think step by step.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
<|user|>
No, that answer is incorrect.
Please think through the problem again.
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
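
When ground truth (or some other validator) is available, e.g. in a benchmark or test harness, the corrective feedback turn can be generated programmatically. A sketch with a hypothetical is_correct checker; failed attempts remain in the history so the model can learn from them in context.

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above

def is_correct(answer: str) -> bool:
    # hypothetical checker; in a benchmark setting, compare against the gold label
    return answer.strip().upper().startswith("A")

history = [{"role": "user", "content": QUESTION + "\nLet's think step by step."}]
answer = ""
for _ in range(3):  # cap the number of attempts
    reasoning = ask(history)
    history += [{"role": "assistant", "content": reasoning},
                {"role": "user", "content": "So, the correct answer is (A, B, C, or D)?"}]
    answer = ask(history)
    history += [{"role": "assistant", "content": answer}]
    if is_correct(answer):
        break
    # mark the failed attempt; it stays in the history
    history += [{"role": "user", "content":
        "No, that answer is incorrect. Please think through the problem again."}]
print(answer)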

🔀 Variations:

  • Collect and use qualitative feedback on reasoning trace for failed attempts.
  • Generate and revise plans (rather than reasoning traces only) in view of mistakes.
  • Adapt decoding strategy and prompts as a function of failed attempts.
  • ...

📖 Shinn et al. 2023, Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv

Debating

Multiple agents share and revise their reasoning until a consensus is reached.

The following snippet demonstrates how to implement a single LLM-based debating agent.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think carefully about these alternatives.
<|assistant|>
...
<|user|>
Your peers disagree. Read carefully their reasoning: {paste_peer_reasoning}
Improve and revise your own reasoning in the light of their arguments. 
<|assistant|>
...
<|user|>
Your peers agree with you. So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
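
A sketch of the surrounding multi-agent loop (same client assumption): three agents reason in separate contexts, and each round every agent revises its reasoning in the light of its peers' traces. For simplicity, the loop requires consensus on the answer rather than on the reasoning, which is one of the variations listed below.

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above
N_AGENTS, MAX_ROUNDS = 3, 4

def answer_of(trace):
    # elicit a one-letter verdict from a reasoning trace
    return ask([{"role": "user", "content": QUESTION},
                {"role": "assistant", "content": trace},
                {"role": "user", "content": "So, the correct answer is "
                 "(A, B, C, or D)? Reply with a single letter."}]).strip()[:1]

# opening statements: each agent reasons in its own context
traces = [ask([{"role": "user", "content":
    QUESTION + "\nLet's think carefully about these alternatives."}])
    for _ in range(N_AGENTS)]

for _ in range(MAX_ROUNDS):
    answers = [answer_of(t) for t in traces]
    if len(set(answers)) == 1:  # consensus on the answer reached
        break
    # each agent revises its reasoning in the light of its peers' reasoning
    traces = [ask([{"role": "user", "content":
        QUESTION + "\n\nYour reasoning so far:\n" + traces[i]
        + "\n\nYour peers disagree. Read carefully their reasoning:\n"
        + "\n---\n".join(t for j, t in enumerate(traces) if j != i)
        + "\nImprove and revise your own reasoning in the light of their arguments."}])
        for i in range(N_AGENTS)]

print(answers)  # answers from the last completed round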

🔀 Variations:

  • Require consensus on the answer (instead of consensus on the reasoning).
  • Require majority agreement (instead of full consensus).
  • Don't hide peer reasoning when taking (final) decision.
  • Use diverse agents (different base LLM, internal reasoning workflow, prompts, etc.).
  • ...

📖 Du et al. 2023, Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv

Peer Review

Multiple agents share reasoning, give feedback to each other, and revise their thinking.

The following snippet demonstrates how to implement a single LLM-based peer-reviewing agent.

<|user|>
What is the correct way to complete the following sentence?
Better access to certain markets, differentiation of products,
and the sale of pollution-control technology are ways in which
better environmental performance can	
A Increase Revenue
B Increase Costs
C Decrease Revenue
D Decrease Costs
Let's think carefully about these alternatives.
<|assistant|>
...
<|user|>
OK, now read carefully your peers' reasoning: {paste_peer_reasoning}
Provide a critical review about their reasoning. Give constructive feedback. 
<|assistant|>
...
<|user|>
Thanks, this has been your reasoning: {paste_own_reasoning}
Your peers' feedback is: {paste_peer_feedback}
Revise and improve your reasoning as appropriate. 
<|assistant|>
...
<|user|>
So, the correct answer is (A, B, C, or D)?
<|assistant|>
...
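
A sketch of one full review round with three agents, under the same assumptions. A fixed round-robin matching (agent i reviews agent i+1) stands in for the more sophisticated review regimes listed below.

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the MMLU problem from above
N = 3

# each agent reasons in its own context
traces = [ask([{"role": "user", "content":
    QUESTION + "\nLet's think carefully about these alternatives."}]) for _ in range(N)]

# round-robin reviews: agent i reviews the reasoning of agent (i + 1) % N
reviews = [ask([{"role": "user", "content":
    QUESTION + "\n\nRead carefully your peer's reasoning:\n" + traces[(i + 1) % N]
    + "\n\nProvide a critical review of their reasoning. Give constructive feedback."}])
    for i in range(N)]

# each agent revises its own reasoning, given the feedback it received
revised = [ask([{"role": "user", "content":
    QUESTION + "\n\nThis has been your reasoning:\n" + traces[i]
    + "\n\nYour peer's feedback is:\n" + reviews[(i - 1) % N]
    + "\n\nRevise and improve your reasoning as appropriate."}]) for i in range(N)]

# final decision of, e.g., the first agent
print(ask([{"role": "user", "content": QUESTION},
           {"role": "assistant", "content": revised[0]},
           {"role": "user", "content": "So, the correct answer is (A, B, C, or D)?"}]))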

🔀 Variations:

  • Different closure conditions (partial or full consensus, positive feedback, inability to revise).
  • Different peer review regimes (review all, random match, match disagreeing agents, etc.).
  • Allow agents to assess and reject feedback.
  • Don't hide peer review history when revising / deciding.
  • Use diverse agents (different base LLM, internal reasoning workflow, prompts, etc.).
  • ...

📖 Xu et al. 2023, Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration. arXiv

Combinations 🔁

The above deliberation patterns can be combined in various ways.

Subchains. You may plug a given pattern as a subchain into another pattern. For example, each parallel deliberation path in divide and conquer may be implemented as a self-instruction chain. Or the deliberation phase in verify and revise may be preceded by a prior reflection step. The patterns can, in principle, be nested over and over again.

Higher-order reasoning. The deliberation workflows involve different kinds of decisions: Does the reasoning trace need to be revised? Should the peer feedback be rejected? Is the revised reasoning trace better than the previous one(s)? Which of the alternative plans is most promising? As failed attempts pile up, is the problem statement missing crucial information? These are all "procedural" questions about how to answer the original question, hence "higher-order" questions and decisions. Some of these higher-order decisions are explicit in the above design patterns; others can easily be added.

Now, deliberative prompting can be used to improve the LLM's decision-making about these higher-order questions. For example, the "check" function in verify and revise may itself be implemented as a deliberative workflow, where the LLM-based agent reasons about its previously generated reasoning trace. Or, plan and solve may actually generate multiple plans, the merits of which are then deliberated about, before the best plan is selected and executed. Even the entire workflow pattern itself may be deliberated about, and replaced by another one if, e.g., the agent has so far failed to come up with a satisfying answer.

Multi-agent deliberation. All monologic deliberation workflows can be adapted for multi-agent conversation (similar to debate or peer review), which allows us to combine arbitrary design patterns via diverse conversational agents that engage in group deliberation.
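
Programmatically, such combinations come almost for free if every pattern is written as a function that maps a problem and an incoming reasoning trace to an extended (or revised) trace: nesting and chaining then reduce to function composition. A minimal sketch, with the same ask helper assumption as in the snippets above:

Python sketch
from typing import Callable

from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

# a deliberation step maps (problem, trace so far) to an extended or revised trace
Step = Callable[[str, str], str]

def reflect(problem: str, trace: str) -> str:
    return trace + "\n\n" + ask([{"role": "user", "content":
        problem + "\nReflect about this problem and characterize it in general terms."}])

def deliberate(problem: str, trace: str) -> str:
    return trace + "\n\n" + ask([{"role": "user", "content":
        problem + "\n\nNotes so far:\n" + trace + "\n\nLet's think step by step."}])

def verify_and_revise(problem: str, trace: str) -> str:
    return ask([{"role": "user", "content":
        problem + "\n\nCheck the following reasoning for redundancies, ambiguities, "
        "and inconsistencies, and output a revised version:\n" + trace}])

def chain(*steps: Step) -> Step:
    # compose steps into a workflow; a chain is itself a step, so chains nest
    def run(problem: str, trace: str = "") -> str:
        for step in steps:
            trace = step(problem, trace)
        return trace
    return run

# e.g., prior reflection feeding into a deliberate-then-verify subchain
workflow = chain(reflect, chain(deliberate, verify_and_revise))
print(workflow("...the decision problem..."))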

Prompt Templates for LLM Deliberation

Effective prompts are clear and specific. You can achieve this by a precise description of

  • the goal the agent is supposed to reach, and/or
  • the procedure the agent is supposed to follow.

When instructing agents by prescribing only a goal, you trust 🤞 that they choose and implement effective means to this end on their own. (In this regard, instructing LLMs is not so different from instructing humans.)

We'll use goal-oriented and procedural elements in composing deliberative prompt templates below.

⚠️

Caveat. The prompts in this section are carefully crafted in view of theoretical (a priori) considerations, but have not been empirically tested in systematic ways. So: Test them thoroughly for your particular use case, and adapt them as necessary.

Argumentation Instructions

Prompts with argumentation instructions can be used to generate reasoning traces that precede and prepare a decision.

A general, mainly goal-oriented prompt for eliciting argumentation:

Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Unfold an argumentation that 
- considers all relevant aspects of the problem, 
- is precise, clear and well-structured, 
- is comprehensive while avoiding redundancies,
- is logically valid and relies on plausible premises, 
- stays faithful to the problem description,
- helps to identify the answer/solution to the problem, and
- provides a justification of that very answer.
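
The {{...}} placeholders in this and the following templates can be read as Jinja2 syntax, so rendering them in Python is a one-liner (the template text below is abbreviated for brevity):

Python sketch
from jinja2 import Template

# abbreviated; paste the full instruction text from above
template = Template(
    "Task: Reason carefully about a decision problem with the aim of finding "
    "and justifying the correct answer / choice.\n\n"
    "Read carefully the following problem:\n{{problem}}\n\n"
    "Unfold an argumentation that helps to identify and justify the answer."
)

prompt = template.render(problem="...the decision problem...")
print(prompt)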

Let us now have a look at a couple of procedural prompts that contain methodological instructions about how to reason.

The following prompts instruct the agent to follow a specific, yet domain-neutral reasoning or argumentation strategy.

Maximum support
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Argue in favour of each alternative independently.
2. Determine the strongest argumentation.
3. Choose the corresponding alternative.

Eliminate alternatives
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Try to eliminate each alternative given the problem description (and your \
implicit background knowledge, which you should make explicit whenever you rely on it).
2. Determine for each alternative if it has been successfully refuted.
3. From all non-refuted alternatives, choose the most plausible one.

Proof
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Brainstorm the problem and develop a first hunch about what is the \
correct answer.
2. Try to prove, given problem description and background knowledge, that \
this answer is correct. 
3. State your proof step by step, such that it is composed of logically \
correct inferences; make all assumptions explicit.

Reductio ad absurdum
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Brainstorm the problem and develop a first hunch about what is the \
correct answer.
2. Assume, for the sake of the argument, that this answer is incorrect.
3. Try to derive an obviously false consequence from this assumption, or \
a statement that contradicts the problem description.

Pros and Cons
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Clarify the logical relations between the alternatives (contradiction, \
entailment, independence).
2. Assemble all pros and cons that pertain to the decision and map them \
to the alternatives, taking into account the latter's logical relations.
3. Weigh the pros and cons and choose the alternative with the highest \
net score.

Analogy
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Devise (recall or construct) an analogous decision situation for which \
you know the correct answer.
2. Verify that the analogous situation resembles the original problem *with \
respect to all relevant aspects*. If not, revise or re-devise the analogy.
3. Determine which option in the original problem corresponds to the correct \
answer in the analogous situation. This gives you the correct answer to the \
original problem.

Argumentum ex concessis
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Brainstorm the problem and develop a first hunch about what is the \
correct answer.
2. Depict and describe the mindset of someone who disagrees with your \
hypothesis.
3. Argue in favour of your hypothesis from the perspective of this \
opponent, using only assumptions and premises that are acceptable to \
them.

Such topic-neutral reasoning strategies are complemented by domain-specific strategies. The following procedural prompts, for example, are derived from argumentation schemes in the domain of (personal or public) practical reasoning. (They don't work, e.g., for mathematical problems or descriptive Q/A.)

Consequentialist Argumentation
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
Proceed as follows:
1. Clarify the goal(s) of the decision problem or the criteria used to judge \
the consequences of the alternatives.
2. Clarify the relations between the alternatives. Which of these are actually \
mutually exclusive? Which are compatible?
3. Reorganize the alternatives into mutually exclusive and collectively \
exhaustive policies.
4. For each policy, determine the (certain, likely, and possible) consequences \
of its implementation.
5. Evaluate the collective consequences of each policy with respect to the \
goal(s) or criteria (from step 1).
6. Choose the policy with the best consequences.

Rule-based Argumentation
Task: Reason carefully about a decision problem with the aim of finding \
and justifying the correct answer / choice.
 
Read carefully the following problem:
{{problem}}
 
The following constitution (set of principles) provides the normative \
framework against which the decision ought to be made:
{{rules}}
 
Proceed as follows:
1. Apply each principle from the constitution to the decision problem and \
determine its implications (e.g., which alternatives are ruled out, which \
are obligatory).
2. Determine the set of alternatives that are compatible with the constitution.
3. If no alternatives are compatible with all principles (i.e., the situation \
is a dilemma), determine the conflicting principles and resolve the dilemma: \
Drop relatively weak principles from the constitution until at least one \
alternative is compatible, while keeping as many strong principles as \
possible.
4. Determine the set of alternatives prescribed by the constitution.
5. If there is no such alternative, choose the most plausible permissible \
alternative.
6. If some prescribed alternatives are mutually exclusive, resolve the dilemma \
in analogy to step 3.
7. Choose all prescribed alternatives.

Explanation Instructions

Recall a general recommendation for effective prompts:

Be very specific about the instruction and task you want the model to perform. The more descriptive and detailed the prompt is, the better the results. (Prompt Engineering Guide)

With respect to explanation instructions, this means being clear about what should be explained (the so-called explanandum) in the first place.

Let's assume your model is supposed to explain a decision, say, an emergency brake. (This could be a decision the model itself has made in previous workflow steps, or a decision made by a human user, or a decision made by another system, etc.)

Now, explaining that decision can mean two quite different things:2

  1. Explain why the decision has been made.
  2. Explain why the decision is correct.

But what exactly is the difference?

Indeed, many things one can cite in favour of, e.g., an emergency brake seem to explain both why the decision has been made and why it is correct, like: "the car was driving at high speed", "the car in front of me suddenly stopped", "the car was about to hit a pedestrian", etc.

To see that there is a difference, consider:

  • You can explain why an incorrect decision has been made. A good explanation of this kind will contain reasons that are not reasons for why the decision is correct. For example, "the emergency brake is the fall-back decision", "the driver mistook the shadow for an obstacle", "the driver had been attending an emergency braking training", "the driver was distracted by a phone call", etc.
  • You can explain why a decision that has been taken was actually wrong (with hindsight). A good explanation of this kind will typically cite reasons against the decision that were ignored by the agent. For example, "the street was icy".
  • Related to the previous point, you can explain why a decision is correct, even though it has not been taken. Such an explanation, too, will typically contain reasons for the decision that were ignored by the agent. For example, "both neighboring lanes were clear", "EPS makes it possible to change lanes safely at high speed", "the guidelines proscribe emergency braking in situations like this", etc.
  • But even if the correct decision has been taken, an explanation for why it has been taken doesn't necessarily yield an explanation for why it is correct. For example, "the driver followed a simple heuristic" or "the driver was trained to perform emergency braking" are good explanations for why the agent acted as she did, but they cannot explain why this was the right thing to do.

So, be clear about what you want the model to explain. And say so explicitly in the prompt instructions.

Here's a simple prompt template urging a model to explain why a decision has been made:

Task: Explain why a given decision has been made.
 
Consider the following decision problem:
{{problem}}
 
The agent has taken the following decision:
{{decision}}
 
The context (considerations, background information, etc.) against which this decision \
has been made is:
{{context}}
 
Explain why the agent has taken the above decision.
 
* Cite coherent reasons that render the decision intelligible.
* Make sure that the reasons you cite (could) have been available to the agent.
* Take into account the context in which the decision has been made.
* Consider the decision alternatives and verify that the reasons you cite explain why \
the agent didn't choose any of the alternatives. 

Inserting the reasoning traces that led to the decision into the context may help the model to come up with a faithful explanation.

The following prompt template instructs the model to explain why a given decision is correct.

Task: Explain why a given decision is correct.
 
Consider the following decision problem:
{{problem}}
 
The agent takes the following decision:
{{decision}}
 
The context (considerations, background information, etc.) against which this decision \
is made is:
{{context}}
 
Explain why the decision is correct.
 
* Cite coherent reasons that justify the decision.
* Make sure to give strong reasons.
* Take into account the context in which the decision has been made.
* Consider the decision alternatives and justify that those alternatives are wrong. 

Explaining why a decision is correct amounts to rationally justifying it, and is structurally equivalent to the argumentation tasks. The various argumentation strategies and prompt templates discussed above apply here as well.

Explaining why a decision has been made, in contrast, systematically differs from justifying it. There are two paradigmatic patterns of such explanations:

Psychological Explanation
  • adopts an intentional stance towards the agent
  • explains the decision in terms of the agent's beliefs and intentions
  • requires theory of mind
  • example: the system mistook the shadows for obstacles and formed the belief that an emergency brake is the only way to prevent a fatal collision

Causal Explanation
  • conceives the agent as a causal information processing system
  • explains the decision in terms of main causes and causal mechanisms
  • requires causal understanding
  • example: the peculiar shadows on the street plus clouded sensors triggered the shut-down mechanism and caused the emergency brake

Example prompt for a psychological explanation:

Psychological Explanation
Task: Explain why a given decision has been made.
 
Consider the following decision problem:
{{problem}}
 
The agent has taken the following decision:
{{decision}}
 
The context (considerations, background information, etc.) against which this decision \
has been made is:
{{context}}
 
Explain why the agent has taken the above decision in terms of the agent's beliefs and \
intentions. For the purposes of this explanation, adopt an intentional stance towards \
the agent and assume the agent holds beliefs and pursues goals.
 
* Cite the considerations, beliefs and intentions that led the agent to take the above decision.
* Make sure that the considerations you cite (could) have been available to the agent.
* When inferring and reconstructing the agent's beliefs and intentions, take into account \
  the broader context in which the decision has been made.
* Consider the decision alternatives and verify that the reasons you cite explain why \
  the agent didn't choose any of the alternatives. 

Example prompt for a causal explanation:

Causal Explanation
Task: Explain why a given decision has been made.
 
Consider the following decision problem:
{{problem}}
 
The agent has taken the following decision:
{{decision}}
 
The context (boundary conditions, background information, sensor logs, etc.) against \ 
which this decision has been made is:
{{context}}
 
Explain, in causal terms, why the agent has taken the above decision. For the purposes \
of this explanation, conceive of the agent as a causal information processing system.
 
* Cite the main causes and mechanisms that triggered the agent to adopt the above decision.
* Make sure that the causes and mechanisms you cite are consistent with your understanding \
  of the agent's information processing and decision making mechanisms.
* Consider the decision alternatives and verify that the causal reasons you cite explain why \
  the agent didn't choose any of the alternatives. 

Critical Review Instructions

Critical review instructions ask the model to (self-)check a given reasoning trace for flaws and errors. They are explicitly contained in design patterns like self-critique, verify and revise, debating, and peer review.

A generic prompt template for critical review that asks the model to identify flaws relative to a list of given quality criteria:

Critical Review
Task: Critically review a given argumentation.
 
The argumentation to be reviewed is:
{{reasoning}}
 
The context (e.g., original problem statement, background information, etc.) against \
which the argumentation has been set forth is:
{{context}}
 
Assess carefully the argumentation with respect to the following quality criteria and norms:
{{criteria}}
 
* Identify and cite all flaws and errors in the argumentation.
* For each flaw or error, state explicitly the criterion or norm which is violated.

We may also ask the model to suggest improvements for a given argumentation:

Constructive Feedback
Task: Critically assess and suggest improvements of a given argumentation.
 
The argumentation to be reviewed and improved is:
{{reasoning}}
 
The context (e.g., original problem statement, background information, etc.) against \
which the argumentation has been set forth is:
{{context}}
 
Assess carefully the argumentation with respect to the following quality criteria and norms:
{{criteria}}
 
* Identify and cite all flaws and errors in the argumentation.
* For each flaw or error, state explicitly the criterion or norm which is violated.
* For each flaw or error, suggest at least one improvement that would fix the problem.

The following may serve as criteria and norms (specified in the {{criteria}} field) for assessing the reasoning trace; a sketch of how to assemble this field follows the list:

  • Goals from original argumentation instruction. The model may check whether the original goals contained in the argumentation instruction used to generate the reasoning trace have been met. For example, if the original instruction was to "unfold an argumentation that is precise, clear and well-structured", the model may check whether the argumentation is indeed precise, clear and well-structured.
  • Generic quality criteria. The model may check whether the argumentation satisfies generic quality criteria for argumentation. These include, for example, whether the argumentation relies on plausible premises, whether its inference steps are logically correct, whether it actually addresses the original problem description, and whether it is clearly presented.
  • Domain-specific quality criteria. The model may check whether the argumentation satisfies domain-specific quality criteria. For example, if the argumentation is about a decision problem, the model may check whether the argumentation considers all available decision alternatives; if it assesses the plausibility of an empirical hypothesis, it may verify that all relevant observations are taken into account (total evidence).
  • Use-case specific quality criteria. The model may check whether the argumentation satisfies use-case specific quality criteria. For example, an organization may have its own guidelines, and the model may check whether a given reasoning trace is consistent with these.
  • Absence of typical mistakes. The model may check whether a reasoning trace contains typical mistakes. What is a typical mistake should be informed by careful monitoring and testing of the corresponding LLM pipeline.
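
As indicated above, the {{criteria}} field can simply be assembled from such a checklist. A sketch (the criteria strings are illustrative, and the template text is abbreviated):

Python sketch
from jinja2 import Template

# illustrative generic quality criteria (cf. the list above)
CRITERIA = [
    "relies on plausible premises",
    "all inference steps are logically correct",
    "addresses the original problem description",
    "considers all available decision alternatives",
    "is clearly presented, without redundancies",
]

# abbreviated version of the Critical Review template from above
review_template = Template(
    "Task: Critically review a given argumentation.\n\n"
    "The argumentation to be reviewed is:\n{{reasoning}}\n\n"
    "The context against which the argumentation has been set forth is:\n{{context}}\n\n"
    "Assess carefully the argumentation with respect to the following quality "
    "criteria and norms:\n{{criteria}}\n\n"
    "* Identify and cite all flaws and errors in the argumentation.\n"
    "* For each flaw or error, state explicitly the criterion or norm which is violated."
)

prompt = review_template.render(
    reasoning="...a previously generated reasoning trace...",
    context="...the original problem statement...",
    criteria="\n".join(f"- {c}" for c in CRITERIA),
)
print(prompt)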

Reflection Instructions

In the context of deliberative prompting, the purpose of prior reflection steps is to facilitate (increase the effectiveness and quality of) the subsequent deliberation.

Prior reflection may improve reasoning in various ways: for example, by activating relevant expert knowledge, by clarifying the problem statement, by identifying the information required to justify an answer, or by flagging typical pitfalls before the reasoning begins.

The following prompt template asks the model to reflect on a given decision problem and to provide detailed procedural instructions for doing so:

Prior Reflection
Task: Comprehensive Reflection, Characterization and Clarification of a Decision Problem.
 
Consider the following decision problem:
{{problem}}
 
The context of this decision is:
{{context}}
 
Your task is to reflect on the decision problem in order to facilitate subsequent deliberation. \
Do so by addressing the questions and following the detailed instructions below. 
 
Step 1: Which kind of expert is most suited to solve the problem (e.g., a medical doctor, a \
lawyer, a mechanic, a psychologist, etc.)? Assume the role of such an expert when answering all \
the following questions. 
 
Step 2: What kind of problem are we dealing with? Describe the problem to-be-solved in general \
terms and state its domain.
 
Step 3: What exactly is the problem (question / decision)? Clarify the problem by re-formulating \
it in your own words and in a maximally transparent and fully explicit way.
 
Step 4: What does it take to come up with a good answer or solution? Sketch strategies and \
procedures that might be used to solve problems of this kind.
 
Step 5: How can we justify an answer to this problem? Characterize the kind of reasoning that is \
appropriate for answering this problem.
 
Step 6: What do we need to know to justify our answer? And what do we need to know to exclude \
alternative answers? List and describe the information required to justify an answer to this \
problem.
 
Step 7: Do we possess the information identified in step 6, and if not, what would it take to \
acquire it? Check whether the information identified in step 6 is available. If not, describe \
how it could be acquired.
 
Step 8: What are relevant types of considerations we should definitely take into account? \
Identify topics, aspects, general principles, criteria, etc. that are relevant for answering \
the problem and should be addressed when reasoning about it.
 
Step 9: What are different perspectives and who are further experts we may consult to inform \
and improve our decision making? List those additional experts and diverse perspectives that may \
contribute to solving the problem.
 
Step 10: What are typical pitfalls and mistakes in reasoning about this kind of problem? State \
typical mistakes and describe how to avoid them. 

Deliberation Planning Instructions

The plan and solve workflow pattern asks the agent to plan the deliberation before actually engaging in it. Such planning can be done in a strategy-agnostic or strategy-specific way.

A strategy-agnostic planning instruction leaves the choice of a deliberation strategy to the agent. It may proceed in two steps: strategy selection and detailed implementation.

The first, high-level planning step involves choosing one or several deliberation frameworks or argumentation strategies, such as, e.g., SWOT analysis, decision making by pros and cons, cost-benefit analysis, case-based reasoning, or method of elimination.

Strategy-agnostic high-level planning
Task: Identify the most suitable deliberation framework and reasoning strategy for solving a \
given decision problem.
 
Consider the following decision problem:
{{problem}}
 
The context of this decision is:
{{context}}
 
Your task is to identify the most suitable deliberation framework and reasoning strategy for \
solving a given decision problem. Do so by 
 
1. Enumerating promising candidate strategies and frameworks
2. Assessing the suitability of each candidate strategy and framework
3. Selecting the most suitable strategy and framework

In the second, detailed planning step, the agent is supposed to apply the previously chosen strategy to the problem at hand and to devise a step-by-step deliberation plan.

Strategy-agnostic detailed planning
Task: Devise a step-by-step plan for thinking clearly about and solving a decision problem.
 
Consider the following decision problem:
{{problem}}
 
The context of this decision is:
{{context}}
 
The deliberation framework and reasoning strategy to pursue in order to solve this problem is:
{{strategy}}
 
Your task is to devise a step-by-step plan for thinking clearly about and solving the problem. \
Do so by 
 
1. Spelling out the abstract reasoning strategy in more detail
2. Devising (not executing) a step-by-step plan which applies the strategy to the problem at hand

A strategy-specific planning instruction asks a model to apply a given deliberation framework, or mix thereof, to the problem under consideration. Along these lines, the following prompt template instructs the model to detail a plan that combines SWOT and pros and cons reasoning.

Strategy-specific detailed planning
Task: Devise a step-by-step plan for thinking clearly about and solving a decision problem.
 
The decision problem:
{{problem}}
 
The context of this decision is:
{{context}}
 
Reasoning frameworks to use:
- A SWOT analysis consists in identifying
    a) the internal strengths and weaknesses of an organization
    b) the external opportunities and threats that the organization faces
- Decision making by pros and cons proceeds by
    a) identifying the pros and cons of each alternative
    b) weighing the pros and cons and choosing the alternative with the highest net score
 
Devise (but do not execute) a step-by-step plan for solving the above decision problem \
by combining a SWOT analysis with decision making by pros and cons.

Reason-Responsive Decision Making

Good deliberation and high-quality reasoning about a problem are ineffective and useless unless they are fully and properly taken into account when making a decision or giving an answer to the question at hand. Rather than merely pasting reasoning traces into a prompt, it may hence be advisable to explicitly instruct the decision-making model to read the reasoning trace carefully and to give its answer as a function of the deliberation:

Reason-Responsive Decision Making
Task: Take a decision and answer a problem in accordance with a given line of reasoning.
 
The decision problem:
{{problem}}
 
The context of this decision is:
{{context}}
 
Read carefully the following deliberation about the decision problem:
{{reasoning}}
 
Take a decision in accordance with the above deliberation. In doing so, pay attention to the \
following more specific instructions:
* Choose the most strongly supported alternative.
* Take into account the total evidence and all relevant considerations mentioned in the \
  deliberation. Give a concise and comprehensive, possibly tabular overview in case the \
  deliberation is long or opaque.
* Stay faithful to the deliberation and base your answer ONLY on considerations contained in it.
* The deliberation may feature diverse reasons of different weight (importance). If so, weigh \
  the reasons when choosing the best-confirmed alternative.
* The deliberation may contain merely exploratory or hypothetical reasoning. If so, discount \
  these reasons when choosing the best-confirmed alternative.
* The deliberation may, as part of a dialectic discussion, prominently mention alternative \
  answers or solutions, only to reject them later on. Don't mistake these rejected \
  alternatives for the best-confirmed one.
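
In a pipeline, the decision step is then typically a separate call that receives the deliberation through this template, and it may well be sent to a different (e.g., smaller) model than the one that deliberated. A sketch, abbreviating the template text:

Python sketch
from openai import OpenAI

client = OpenAI()

def ask(messages, model="gpt-4o-mini"):
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

QUESTION = "..."  # the decision problem

# step 1: deliberation (this could itself be any of the workflows above)
deliberation = ask([{"role": "user", "content": QUESTION + "\nLet's think step by step."}])

# step 2: reason-responsive decision, via an abbreviated version of the template above
print(ask(
    [{"role": "user", "content":
      "Task: Take a decision and answer a problem in accordance with a given "
      "line of reasoning.\n\nThe decision problem:\n" + QUESTION
      + "\n\nRead carefully the following deliberation about the decision problem:\n"
      + deliberation
      + "\n\nTake a decision in accordance with the above deliberation. Choose the "
      "most strongly supported alternative and stay faithful to the deliberation."}],
    model="gpt-4o-mini",  # the decision model may differ from the deliberation model
))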

Multi-Agent Deliberation 🚧

Design principles, patterns, and network structures for setting up multi-agent systems for collective deliberation.

  • Network structures and conversational patterns
  • Protocols
  • Diversity
  • Division of labour
  • Moderation

🚧 To be expanded.

Paradigmatic Use Cases for Deliberative Prompting 🚧

  • Decision Support and ADM
  • Question Answering
  • Translation
  • Information Retrieval and RAG
  • Chatbots and Dialogue Systems
  • Constitutional AI / RLAIF
  • Autonomous Mechanical Agents
  • Multi-Modal Information Processing
  • Customer Relationship Management

🚧 To be expanded.


Footnotes

  1. For a comprehensive list of chain-of-thought papers, see our Awesome Deliberative Prompting list.

  2. This contrast is reminiscent of Bernard Williams' famous distinction between internal and external reasons. (See Williams, B. (1981). Internal and external reasons. In Moral Luck, pp. 101–113. Cambridge University Press.) Another way to describe this distinction is in terms of theoretical and practical reason: The first kind of explanation gives practical reasons for the adoption and communication of an answer, while the second kind gives theoretical reasons for the answer itself.