# Novelty-Driven LLM Agent Loop

An autonomous LLM agent that generates tasks in a while loop, using **novelty assessment as the termination condition** to help the agent "jump out" of its training data distribution (semantic gravity).
## Concept

Traditional LLM-based idea generation tends to produce outputs clustered around high-probability regions of the training distribution. This "semantic gravity" limits creative exploration.

This module implements a novel approach: use **novelty scores** to dynamically control when the agent should stop. Instead of a fixed iteration count, the agent continues until it finds something truly novel (a "breakthrough").

```
Seed Problem → Expert Sample → Task Generation → Novelty Assessment → Continue/Stop
```
## Research Foundation

This work builds on established research:

- **Novelty Search** (Lehman & Stanley): Reward novelty, not objectives
- **Curiosity-driven Exploration** (Pathak et al.): Intrinsic motivation via prediction error
- **Quality-Diversity** (MAP-Elites): Maintain diverse high-quality solutions
- **Open-ended Learning**: Endless innovation through novelty pressure
The unique contribution is using **novelty as a termination condition** rather than just a reward signal.
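To make this concrete, the control flow can be sketched as below. This is a minimal illustration, not the module's actual code: `generate_task` and `novelty_of` are hypothetical stand-ins for the LLM call and the embedding-based novelty scorer.

```python
import random

# Illustrative expert pool; the real module samples from curated occupation data.
EXPERTS = ["marine biologist", "choreographer", "chef", "architect"]

def novelty_driven_loop(seed_problem, generate_task, novelty_of,
                        threshold=0.4, max_iterations=20):
    """Run until a task's novelty exceeds the threshold (a 'breakthrough')."""
    history = []
    for i in range(max_iterations):
        expert = random.choice(EXPERTS)              # 1. perturb via an expert perspective
        task = generate_task(seed_problem, expert)   # 2. LLM task generation
        novelty = novelty_of(task, history)          # 3. novelty vs. previous outputs
        history.append(task)
        if novelty > threshold:                      # 4. novelty as termination condition
            return {"task": task, "expert": expert,
                    "novelty": novelty, "iterations": i + 1}
    return {"task": None, "history": history}        # no breakthrough within budget
```

Note how the stopping rule is driven by the novelty signal itself rather than by a fixed iteration count; `max_iterations` is only a safety cap.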
## Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│ Novelty-Driven Task Generation Loop │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ │
│ │ Seed │ "Design a better bicycle" │
│ │ Problem │ │
│ └────┬─────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ WHILE novelty < threshold AND iterations < max: │ │
│ │ │ │
│ │ 1. Sample random expert (curated occupations) │ │
│ │ e.g., "marine biologist", "choreographer" │ │
│ │ │ │
│ │ 2. Generate task from expert perspective │ │
│ │ "What task would a {expert} assign to improve │ │
│ │ {seed_problem}?" │ │
│ │ │ │
│ │ 3. Embed task, compute novelty vs. centroid │ │
│ │ │ │
│ │ 4. If novelty > threshold → STOP (breakthrough!) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────┐ │
│ │ Output: │ Novel task that "jumped out" of typical space │
│ │ Task │ + trajectory of exploration │
│ └──────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
```
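Step 3 of the loop computes novelty against the centroid of everything generated so far. The sketch below is an illustrative reconstruction under the assumption that novelty is the cosine distance from that centroid; it is not the module's exact implementation.

```python
import numpy as np

def centroid_novelty(new_embedding, previous_embeddings):
    """Cosine distance from the centroid of previous outputs (higher = more novel)."""
    if not previous_embeddings:
        return 1.0  # nothing to compare against yet: maximally novel by convention
    centroid = np.mean(previous_embeddings, axis=0)
    a = np.asarray(new_embedding, dtype=float)
    cosine_sim = float(a @ centroid / (np.linalg.norm(a) * np.linalg.norm(centroid)))
    return 1.0 - cosine_sim
```

Because the centroid shifts as new tasks are added, the novelty reference is dynamic: ideas that were novel early on pull the centroid toward them and stop counting as novel later.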
## Installation

The module uses the existing project infrastructure. Ensure you have:

1. **Ollama** running with the required models:

   ```bash
   ollama pull qwen3:8b
   ollama pull qwen3-embedding:4b
   ```

2. **Python dependencies** (from the project root):

   ```bash
   cd backend
   source venv/bin/activate
   pip install httpx numpy
   ```
## Quick Start

### Basic Usage

```bash
cd experiments/novelty_loop
python demo.py "Improve urban transportation"
```
### Example Output

```
Iteration 1
Expert: Architect (Architecture & Design)
Task: Design multi-modal transit hubs that integrate pedestrian, cycling, and public transport seamlessly
Novelty: [████████░░░░░░░░░░░░] 0.1234

Iteration 2
Expert: Chef (Culinary)
Task: Create food delivery route optimization algorithms inspired by kitchen workflow efficiency
Novelty: [███████████░░░░░░░░░] 0.1823

Iteration 3
Expert: Marine Biologist (Science)
Task: Study fish schooling behavior to develop organic traffic flow algorithms
Novelty: [██████████████░░░░░░] 0.3521

Iteration 4
Expert: Choreographer (Performing Arts)
Task: Design pedestrian movement as urban dance, creating rhythmic crossing patterns
Novelty: [████████████████████] 0.5234
★ BREAKTHROUGH! ★
```
## Termination Strategies

### 1. Seek Breakthrough (Default)

Stop when novelty exceeds the threshold. Finds the first truly novel task.

```bash
python demo.py "Your problem" --strategy breakthrough --threshold 0.4
```
### 2. Exhaust Frontier

Continue while novelty stays high; stop when the average novelty drops. Explores more thoroughly.

```bash
python demo.py "Your problem" --strategy exhaust --exhaust-threshold 0.15
```
### 3. Coverage Target

Continue until N distinct conceptual clusters are covered. Ensures diversity.

```bash
python demo.py "Your problem" --strategy coverage --clusters 5
```
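All three strategies share the same generation loop and differ only in when they stop. The predicates below are a hypothetical sketch of the three stop conditions over the novelty trajectory (the `window` size for the exhaust average is an assumed parameter, not taken from the module):

```python
def stop_breakthrough(novelties, threshold=0.4):
    """Stop as soon as the latest task exceeds the novelty threshold."""
    return bool(novelties) and novelties[-1] > threshold

def stop_exhaust(novelties, exhaust_threshold=0.15, window=3):
    """Stop once the recent average novelty falls below the exhaust threshold."""
    if len(novelties) < window:
        return False
    return sum(novelties[-window:]) / window < exhaust_threshold

def stop_coverage(cluster_ids, clusters=5):
    """Stop once tasks have landed in N distinct conceptual clusters."""
    return len(set(cluster_ids)) >= clusters
```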
## API Usage

```python
import asyncio
from experiments.novelty_loop.agent import NoveltyDrivenTaskAgent

async def main():
    agent = NoveltyDrivenTaskAgent(
        novelty_threshold=0.4,
        max_iterations=20,
        language="en",
    )

    result = await agent.run("Design a better bicycle")

    print(f"Found breakthrough: {result.breakthrough_task.task}")
    print(f"Novelty score: {result.breakthrough_task.novelty_score}")
    print(f"From expert: {result.breakthrough_task.expert}")

    await agent.close()

asyncio.run(main())
```
## Novelty Metrics

The `novelty_metrics.py` module provides:

- **Centroid Distance**: Primary novelty metric; how far from the average of all previous outputs
- **Min Distance**: Distance to the nearest neighbor (detects duplicates)
- **Jump Detection**: Identifies significant semantic shifts between consecutive outputs
- **Trajectory Tracking**: Cumulative novelty, jump ratio, etc.

```python
from experiments.novelty_loop.novelty_metrics import NoveltyMetrics

metrics = NoveltyMetrics(similarity_threshold=0.7)

# Add embeddings one by one
for embedding in embeddings:
    novelty = metrics.compute_novelty(embedding)
    metrics.add_embedding(embedding, novelty)
    print(f"Novelty: {novelty.score:.4f}, Is Jump: {novelty.is_jump}")

# Get trajectory stats
print(f"Mean novelty: {metrics.trajectory.mean_novelty}")
print(f"Max novelty: {metrics.trajectory.max_novelty}")
print(f"Jump ratio: {metrics.trajectory.jump_ratio}")
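Jump detection can be read as flagging a new output whose similarity to the immediately preceding one falls below `similarity_threshold`. The following is a minimal sketch under that assumption, using cosine similarity; it is not necessarily the module's exact rule:

```python
import numpy as np

def is_jump(prev_embedding, new_embedding, similarity_threshold=0.7):
    """Flag a semantic 'jump' when consecutive outputs are dissimilar enough."""
    a = np.asarray(prev_embedding, dtype=float)
    b = np.asarray(new_embedding, dtype=float)
    cosine_sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cosine_sim < similarity_threshold
```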
## CLI Options

```
positional arguments:
  seed_problem          The seed problem or challenge to explore

options:
  --strategy {breakthrough,exhaust,coverage}
                        Termination strategy (default: breakthrough)
  --threshold, -t       Novelty threshold for breakthrough (default: 0.4)
  --max-iter, -m        Maximum iterations (default: 20)
  --language, -l {en,zh}
                        Language for prompts and experts (default: en)
  --model               LLM model for task generation (default: qwen3:8b)
  --embedding-model     Embedding model (default: qwen3-embedding:4b)
  --temperature         LLM temperature (default: 0.7)
  --output, -o          Save results to a JSON file
  --quiet, -q           Suppress iteration output
  --verbose, -v         Enable verbose logging
```
## File Structure

```
experiments/novelty_loop/
├── README.md            # This file
├── agent.py             # Core NoveltyDrivenTaskAgent and variants
├── novelty_metrics.py   # Novelty computation utilities
└── demo.py              # Interactive CLI demo
```
## Design Decisions

| Question | Decision | Rationale |
|----------|----------|-----------|
| Output Type | **Tasks** | Self-generated sub-goals for autonomous problem decomposition |
| Termination | **Seek Breakthrough** | Stop when novelty exceeds the threshold; find a truly novel task |
| Perturbation | **Expert Perspectives** | Experts have task-oriented knowledge; more natural than abstract domains |
| Novelty Reference | **Centroid** | Dynamic; adapts as exploration progresses |
## Connection to Main Project

This module integrates with the main novelty-seeking project:

- Uses the same **curated occupation data** (`backend/app/data/curated_occupations_*.json`)
- Uses the same **embedding model** (qwen3-embedding:4b)
- Builds on the **AUT flexibility analysis** metrics for novelty computation
- Can use the **DDC domain data** for alternative perturbation strategies
## Future Work

1. **Hybrid Perturbation**: Combine expert and domain perspectives
2. **Contrastive Prompting**: Explicitly ask for outputs unlike recent ones
3. **Semantic Steering**: Guide generation away from the centroid direction
4. **Multi-Agent Exploration**: Parallel agents with different strategies
5. **Quality-Diversity Archive**: Maintain diverse high-quality solutions
## References

- Lehman, J., & Stanley, K. O. (2011). Abandoning objectives: Evolution through the search for novelty alone.
- Pathak, D., et al. (2017). Curiosity-driven exploration by self-supervised prediction.
- Mouret, J.-B., & Clune, J. (2015). Illuminating search spaces by mapping elites.
- arXiv:2405.00899. Characterising Creative Process in Humans and LLMs.