[Summary] Ada-R1: Hybrid CoT via Bi-Level Adaptive Reasoning Optimization

TL;DR Chain-of-Thought (CoT) enables large language models (LLMs) to solve complex tasks by generating intermediate reasoning steps. The Ada-R1 approach fine-tunes a model to adaptively prefer Short-CoT over Long-CoT based on problem complexity, training it to minimize reasoning length while preserving accuracy. This approach reduces average reasoning length by over 50%, substantially lowering inference cost while maintaining accuracy across five mathematical reasoning benchmarks. Background: CoT prompting decomposes complex tasks into intermediate reasoning steps....

May 1, 2025 · 2 min · 372 words
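To make the bi-level idea concrete, here is a minimal sketch of how the two preference signals might be built for DPO-style training: a group-level pair that picks between the Short-CoT and Long-CoT styles per problem, and an instance-level pair that favors shorter correct traces within a style. The `Sample` type and both pairing rules are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of bi-level preference-pair construction in the
# spirit of Ada-R1; names and pairing rules are assumptions, not the
# paper's code.
from dataclasses import dataclass

@dataclass
class Sample:
    trace: str       # reasoning steps plus final answer
    correct: bool    # final answer matches the reference
    length: int      # token count of the reasoning trace

def best(samples: list[Sample]) -> Sample:
    """Shortest correct sample, falling back to the shortest overall."""
    correct = [s for s in samples if s.correct]
    pool = correct if correct else samples
    return min(pool, key=lambda s: s.length)

def group_level_pair(short: list[Sample], long: list[Sample]) -> tuple[Sample, Sample]:
    """Group level: prefer the Short-CoT style whenever it matches
    Long-CoT accuracy on this problem; otherwise prefer Long-CoT."""
    def acc(xs: list[Sample]) -> float:
        return sum(s.correct for s in xs) / len(xs)
    if acc(short) >= acc(long):
        return best(short), best(long)   # (chosen, rejected)
    return best(long), best(short)

def instance_level_pair(samples: list[Sample]) -> tuple[Sample, Sample] | None:
    """Instance level: among correct traces of one style, prefer the
    shortest over the longest, shrinking length at fixed accuracy."""
    correct = sorted((s for s in samples if s.correct), key=lambda s: s.length)
    if len(correct) < 2:
        return None
    return correct[0], correct[-1]
```

Under these assumptions, the group-level pairs teach the model which reasoning style to select per problem, while the instance-level pairs push the selected style toward concise traces without sacrificing correctness.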

[Summary] ReAct: Synergizing Reasoning and Acting in Language Models

TL;DR Large Language Models (LLMs) often suffer from hallucinations. Two common mitigation strategies are Chain of Thought (CoT), where the LLM is prompted to show its step-by-step reasoning, and Act, where the LLM uses external tools to ground its answers in reliable databases. However, CoT relies on the model’s internal representations, limiting its ability to reason reactively or update its knowledge. ReAct is a prompting method that combines CoT with action plan generation using external tools....

January 17, 2025 · 1 min · 203 words
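As a rough illustration of the pattern, here is a sketch of the Thought → Action → Observation loop that ReAct-style prompting induces. The `llm` and `search` callables are hypothetical placeholders, not an API from the paper, and the action format is one common convention rather than a fixed specification.

```python
# Minimal ReAct-style loop, assuming a text-completion function
# llm(prompt) -> str and a lookup tool search(query) -> str; both are
# placeholders supplied by the caller.
import re

def react(question: str, llm, search, max_steps: int = 5) -> str:
    prompt = (
        "Answer the question by interleaving Thought, Action, and Observation steps.\n"
        "Actions: Search[query] looks up a query; Finish[answer] returns the answer.\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        step = llm(prompt)             # model emits "Thought: ...\nAction: ..."
        prompt += step + "\n"
        action = re.search(r"Action: (\w+)\[(.*)\]", step)
        if action is None:
            continue                   # no parsable action; let the model retry
        name, arg = action.groups()
        if name == "Finish":
            return arg                 # model decided it has the answer
        if name == "Search":
            # Ground the next reasoning step in an external observation.
            prompt += f"Observation: {search(arg)}\n"
    return "No answer within step budget."
```

Each Observation is appended back into the prompt, which is what lets the model revise its plan against external evidence instead of relying solely on its internal representations.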