Illusion of Thinking Exploration Tool



A Gradio web application for exploring the experiments described in Apple's paper "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models", using language models hosted locally with Ollama. The project tests and evaluates the reasoning capabilities of language models on well-defined problem-solving tasks.

The paper evaluates models on four types of puzzles, each with varying difficulty levels:

Towers of Hanoi - The classic disk-moving puzzle with three pegs
Checker Jumping - A one-dimensional board puzzle where checkers must swap positions
River Crossing - A constraint-satisfaction puzzle involving actors and agents crossing a river
Blocks World - A planning puzzle requiring rearrangement of stacked blocks

Each puzzle:

Has configurable difficulty levels (n=1 to n=10)
Provides structured system prompts to guide the language model
Automatically evaluates the model's solution for correctness
Supports real-time interaction through a web interface via Gradio

Install Ollama from https://ollama.ai
Pull at least one model (recommended models for reasoning tasks):
Verify Ollama is running:
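
One way to make this check from Python is to ask the local Ollama server which models it has available. The sketch below assumes Ollama's default address, http://localhost:11434, and its `/api/tags` endpoint; the helper names `model_names` and `list_local_models` are illustrative and not part of this project.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; change if configured differently


def model_names(tags_payload):
    """Extract model names from the decoded JSON body of GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Return the names of locally available models; raises OSError if unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Calling `list_local_models()` returns a list of names when the server is up, and raises an `OSError` subclass when nothing is listening on that address.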

Clone the repository:

git clone <repository-url>
cd illusion-of-thinking
Install dependencies (using uv - recommended):

Or using pip:

pip install -r requirements.txt

Ensure Ollama is running with at least one model available
Launch the Gradio interface:

using uv

or if using a virtual environment
Open your browser to the displayed URL (typically http://127.0.0.1:7860)

Left Panel (Chat Interface)

Chatbot Window: Displays the conversation between system prompts and model responses
Model Dropdown: Select which Ollama model to use for solving puzzles
Options: Advanced JSON configuration for model parameters (temperature, top_p, etc.)
Clear Button: Reset the conversation history
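
The Options field takes a JSON object of model parameters. As a rough sketch of how such input could be validated before it is passed on, assuming an empty field means "use the model defaults" (the function name `parse_options` is hypothetical, and the app's actual handling may differ):

```python
import json


def parse_options(text):
    """Parse the Options field into a dict; blank input means no overrides."""
    if not text.strip():
        return {}
    options = json.loads(text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(options, dict):
        raise ValueError("options must be a JSON object, e.g. {\"temperature\": 0.2}")
    return options


# Example input a user might type into the Options box.
print(parse_options('{"temperature": 0.2, "top_p": 0.9}'))
```

Lower `temperature` values generally make runs more repeatable, which can help when comparing results across difficulty levels.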

Right Panel (Puzzle Configuration)

Puzzle Dropdown: Choose from Towers of Hanoi, Checker Jumping, River Crossing, or Blocks World
Difficulty Slider: Set complexity level (n=1 for easiest, n=10 for hardest)
Solve Button: Start the puzzle-solving process
System Tab: View/edit the system prompt that guides the model
User Tab: View/edit the specific puzzle instance description

Select a model (e.g., "qwen3:8b")
Choose "Towers of Hanoi" puzzle
Set difficulty to 3
Click "Solve"
Watch as the model attempts to solve the puzzle
View the automatic evaluation of the solution
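
For reference, a correct answer at difficulty 3 needs at least 2^3 - 1 = 7 moves. The sketch below generates one shortest solution with the standard recursive strategy; the `(source_peg, target_peg)` move format and 0-based peg numbering are assumptions for illustration and may not match the format the app's system prompt asks the model to use.

```python
def hanoi_moves(n, src=0, dst=2, spare=1):
    """Yield (source_peg, target_peg) pairs that move n disks from src to dst."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the spare peg.
    yield from hanoi_moves(n - 1, src, spare, dst)
    # Move the largest remaining disk to its destination.
    yield (src, dst)
    # Bring the n-1 smaller disks back on top of it.
    yield from hanoi_moves(n - 1, spare, dst, src)


solution = list(hanoi_moves(3))
print(len(solution))  # 7
print(solution)
```

The minimum move count doubles (plus one) with each added disk, so n=10 already requires 1023 moves, which is why higher difficulty settings produce much longer model outputs.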

To add a new puzzle:

Create a new class inheriting from Puzzle in puzzles.py
Implement required methods: parse_solution(), play(), move(), user_prompt()
Define NAME, SYSTEM_PROMPT, and other class attributes
Add the puzzle to the puzzles dictionary in main.py
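
The steps above might look roughly like the following. This is only a sketch: the real `Puzzle` base class in puzzles.py is not shown here, so the stand-in base class, all method signatures, the constructor argument `n`, and the shape of the `puzzles` dictionary are assumptions. The toy `CountUp` puzzle is invented purely for illustration.

```python
import re
from abc import ABC, abstractmethod


class Puzzle(ABC):
    """Hypothetical stand-in for the base class in puzzles.py."""
    NAME = ""
    SYSTEM_PROMPT = ""

    def __init__(self, n):
        self.n = n  # difficulty level

    @abstractmethod
    def user_prompt(self): ...

    @abstractmethod
    def parse_solution(self, text): ...

    @abstractmethod
    def move(self, step): ...

    @abstractmethod
    def play(self, steps): ...


class CountUp(Puzzle):
    """Toy puzzle: reach exactly n from 0 using steps of 1 or 2."""
    NAME = "Count Up"
    SYSTEM_PROMPT = "Answer with a list such as: moves = [1, 2, 1]"

    def __init__(self, n):
        super().__init__(n)
        self.total = 0

    def user_prompt(self):
        return f"Reach exactly {self.n} starting from 0 using steps of 1 or 2."

    def parse_solution(self, text):
        # Pull the first "moves = [...]" list out of the model's reply.
        match = re.search(r"moves\s*=\s*\[([^\]]*)\]", text)
        if match is None:
            raise ValueError("no moves list found in the response")
        return [int(tok) for tok in match.group(1).split(",") if tok.strip()]

    def move(self, step):
        # Apply one step, rejecting anything the rules do not allow.
        if step not in (1, 2):
            raise ValueError(f"illegal step: {step}")
        self.total += step

    def play(self, steps):
        # Replay the whole solution from the start and report success.
        self.total = 0
        for step in steps:
            self.move(step)
        return self.total == self.n


# Assumed shape of the registry in main.py: display name mapped to class.
puzzles = {CountUp.NAME: CountUp}
```

Keeping parsing (`parse_solution`) separate from rule enforcement (`move`, `play`) means a malformed reply and an illegal move can be reported as different kinds of failure.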
