# Novelty-Driven LLM Agent Loop

An autonomous LLM agent that generates tasks in a while loop, using **novelty assessment as the termination condition** to help the agent "jump out" of its training data distribution (semantic gravity).

## Concept

Traditional LLM-based idea generation tends to produce outputs clustered around high-probability regions of the training distribution. This "semantic gravity" limits creative exploration.

This module implements a novel approach: use **novelty scores** to dynamically control when the agent should stop. Instead of a fixed iteration count, the agent continues until it finds something truly novel (a "breakthrough").

```
Seed Problem → Expert Sample → Task Generation → Novelty Assessment → Continue/Stop
```

## Research Foundation

This work builds on established research:

- **Novelty Search** (Lehman & Stanley): Reward novelty, not objectives
- **Curiosity-driven Exploration** (Pathak et al.): Intrinsic motivation via prediction error
- **Quality-Diversity** (MAP-Elites): Maintain diverse high-quality solutions
- **Open-ended Learning**: Endless innovation through novelty pressure

The unique contribution is using **novelty as a termination condition** rather than just a reward signal.

## Architecture

```
┌────────────────────────────────────────────────────────────────┐
│               Novelty-Driven Task Generation Loop              │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   ┌──────────┐                                                 │
│   │   Seed   │  "Design a better bicycle"                      │
│   │  Problem │                                                 │
│   └────┬─────┘                                                 │
│        │                                                       │
│        ▼                                                       │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │  WHILE novelty < threshold AND iterations < max:        │  │
│   │                                                          │  │
│   │   1. Sample random expert (curated occupations)         │  │
│   │      e.g., "marine biologist", "choreographer"          │  │
│   │                                                          │  │
│   │   2. Generate task from expert perspective              │  │
│   │      "What task would a {expert} assign to improve      │  │
│   │       {seed_problem}?"                                   │  │
│   │                                                          │  │
│   │   3. Embed task, compute novelty vs. centroid           │  │
│   │                                                          │  │
│   │   4. If novelty > threshold → STOP (breakthrough!)      │  │
│   │                                                          │  │
│   └─────────────────────────────────────────────────────────┘  │
│        │                                                       │
│        ▼                                                       │
│   ┌──────────┐                                                 │
│   │ Output:  │  Novel task that "jumped out" of typical space  │
│   │   Task   │  + trajectory of exploration                    │
│   └──────────┘                                                 │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
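The sketch below is a minimal, self-contained rendering of this loop, to make the termination logic concrete. It is **not** the actual `agent.py` implementation: `sample_expert`, `generate_task`, and `embed` are caller-supplied stand-ins for the occupation sampler, the LLM call, and the embedding-model call, and novelty is assumed here to be cosine distance from the centroid of previous task embeddings (the exact measure used in `novelty_metrics.py` may differ).

```python
# Illustrative sketch of the novelty-as-termination loop (not agent.py itself).
from typing import Callable, List, Optional, Tuple

import numpy as np


def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def novelty_loop(
    seed_problem: str,
    sample_expert: Callable[[], str],
    generate_task: Callable[[str, str], str],
    embed: Callable[[str], np.ndarray],
    threshold: float = 0.4,
    max_iterations: int = 20,
) -> Tuple[Optional[str], List[Tuple[str, str, float]]]:
    embeddings: List[np.ndarray] = []
    trajectory: List[Tuple[str, str, float]] = []

    for _ in range(max_iterations):
        expert = sample_expert()                    # 1. sample a random expert
        task = generate_task(seed_problem, expert)  # 2. task from that expert's perspective
        vec = embed(task)                           # 3. embed the task

        if embeddings:
            centroid = np.mean(embeddings, axis=0)
            novelty = cosine_distance(vec, centroid)
        else:
            novelty = 0.0                           # no reference set yet on the first pass

        embeddings.append(vec)
        trajectory.append((expert, task, novelty))

        if novelty > threshold:                     # 4. breakthrough → stop
            return task, trajectory

    return None, trajectory                         # budget exhausted without a breakthrough
```

The real agent performs these steps asynchronously against Ollama (see `agent.py`), but its control flow follows this shape.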
## Installation

The module uses the existing project infrastructure. Ensure you have:

1. **Ollama** running with the required models:

   ```bash
   ollama pull qwen3:8b
   ollama pull qwen3-embedding:4b
   ```

2. **Python dependencies** (from the project root):

   ```bash
   cd backend
   source venv/bin/activate
   pip install httpx numpy
   ```

## Quick Start

### Basic Usage

```bash
cd experiments/novelty_loop
python demo.py "Improve urban transportation"
```

### Example Output

```
Iteration 1
  Expert:  Architect (Architecture & Design)
  Task:    Design multi-modal transit hubs that integrate pedestrian, cycling, and public transport seamlessly
  Novelty: [████████░░░░░░░░░░░░] 0.1234

Iteration 2
  Expert:  Chef (Culinary)
  Task:    Create food delivery route optimization algorithms inspired by kitchen workflow efficiency
  Novelty: [███████████░░░░░░░░░] 0.1823

Iteration 3
  Expert:  Marine Biologist (Science)
  Task:    Study fish schooling behavior to develop organic traffic flow algorithms
  Novelty: [██████████████░░░░░░] 0.3521

Iteration 4
  Expert:  Choreographer (Performing Arts)
  Task:    Design pedestrian movement as urban dance, creating rhythmic crossing patterns
  Novelty: [████████████████████] 0.5234  ★ BREAKTHROUGH! ★
```

## Termination Strategies

### 1. Seek Breakthrough (Default)

Stop as soon as novelty exceeds the threshold. Finds the first truly novel task.

```bash
python demo.py "Your problem" --strategy breakthrough --threshold 0.4
```

### 2. Exhaust Frontier

Continue while novelty is high; stop when the average novelty drops below the exhaust threshold. Explores more thoroughly.

```bash
python demo.py "Your problem" --strategy exhaust --exhaust-threshold 0.15
```

### 3. Coverage Target

Continue until N distinct conceptual clusters are covered. Ensures diversity.

```bash
python demo.py "Your problem" --strategy coverage --clusters 5
```

## API Usage

```python
import asyncio

from experiments.novelty_loop.agent import NoveltyDrivenTaskAgent


async def main():
    agent = NoveltyDrivenTaskAgent(
        novelty_threshold=0.4,
        max_iterations=20,
        language="en"
    )

    result = await agent.run("Design a better bicycle")

    print(f"Found breakthrough: {result.breakthrough_task.task}")
    print(f"Novelty score: {result.breakthrough_task.novelty_score}")
    print(f"From expert: {result.breakthrough_task.expert}")

    await agent.close()


asyncio.run(main())
```

## Novelty Metrics

The `novelty_metrics.py` module provides:

- **Centroid Distance**: The primary novelty metric; how far an output is from the average of all previous outputs
- **Min Distance**: Distance to the nearest neighbor (detects duplicates)
- **Jump Detection**: Identifies significant semantic shifts between consecutive outputs
- **Trajectory Tracking**: Cumulative novelty, jump ratio, etc.

```python
from experiments.novelty_loop.novelty_metrics import NoveltyMetrics

metrics = NoveltyMetrics(similarity_threshold=0.7)

# Add embeddings one by one
for embedding in embeddings:
    novelty = metrics.compute_novelty(embedding)
    metrics.add_embedding(embedding, novelty)
    print(f"Novelty: {novelty.score:.4f}, Is Jump: {novelty.is_jump}")

# Get trajectory stats
print(f"Mean novelty: {metrics.trajectory.mean_novelty}")
print(f"Max novelty: {metrics.trajectory.max_novelty}")
print(f"Jump ratio: {metrics.trajectory.jump_ratio}")
```
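For intuition, here is one way the first three quantities above can be computed from raw embeddings with NumPy. The function names and the use of cosine distance are illustrative assumptions, not the module's actual API (use the `NoveltyMetrics` class shown above for that).

```python
# Illustrative definitions of the metrics described above (not novelty_metrics.py's API).
from typing import List

import numpy as np


def _cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def centroid_novelty(new_vec: np.ndarray, history: List[np.ndarray]) -> float:
    """Centroid Distance: cosine distance from the mean of all previous embeddings."""
    centroid = np.mean(history, axis=0)
    return 1.0 - _cosine_sim(new_vec, centroid)


def min_distance(new_vec: np.ndarray, history: List[np.ndarray]) -> float:
    """Min Distance: distance to the nearest previous embedding (values near 0 suggest a duplicate)."""
    return 1.0 - max(_cosine_sim(new_vec, h) for h in history)


def is_jump(prev_vec: np.ndarray, new_vec: np.ndarray, similarity_threshold: float = 0.7) -> bool:
    """Jump Detection: flag consecutive outputs whose similarity falls below the threshold."""
    return _cosine_sim(prev_vec, new_vec) < similarity_threshold
```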
## CLI Options

```
positional arguments:
  seed_problem              The seed problem or challenge to explore

options:
  --strategy {breakthrough,exhaust,coverage}
                            Termination strategy (default: breakthrough)
  --threshold, -t           Novelty threshold for breakthrough (default: 0.4)
  --max-iter, -m            Maximum iterations (default: 20)
  --language, -l {en,zh}    Language for prompts and experts (default: en)
  --model                   LLM model for task generation (default: qwen3:8b)
  --embedding-model         Embedding model (default: qwen3-embedding:4b)
  --temperature             LLM temperature (default: 0.7)
  --output, -o              Save results to JSON file
  --quiet, -q               Suppress iteration output
  --verbose, -v             Enable verbose logging
```

## File Structure

```
experiments/novelty_loop/
├── README.md            # This file
├── agent.py             # Core NoveltyDrivenTaskAgent and variants
├── novelty_metrics.py   # Novelty computation utilities
└── demo.py              # Interactive CLI demo
```

## Design Decisions

| Question | Decision | Rationale |
|----------|----------|-----------|
| Output Type | **Tasks** | Self-generated sub-goals for autonomous problem decomposition |
| Termination | **Seek Breakthrough** | Stop when novelty exceeds threshold; find a truly novel task |
| Perturbation | **Expert Perspectives** | Experts have task-oriented knowledge; more natural than abstract domains |
| Novelty Reference | **Centroid** | Dynamic; adapts as exploration progresses |

## Connection to Main Project

This module integrates with the main novelty-seeking project:

- Uses the same **curated occupation data** (`backend/app/data/curated_occupations_*.json`)
- Uses the same **embedding model** (qwen3-embedding:4b)
- Builds on the **AUT flexibility analysis** metrics for novelty computation
- Can use **DDC domain data** for alternative perturbation strategies

## Future Work

1. **Hybrid Perturbation**: Combine expert + domain perspectives
2. **Contrastive Prompting**: Explicitly ask for outputs unlike recent ones
3. **Semantic Steering**: Guide generation away from the centroid direction
4. **Multi-Agent Exploration**: Parallel agents with different strategies
5. **Quality-Diversity Archive**: Maintain diverse high-quality solutions

## References

- Lehman, J., & Stanley, K. O. (2011). Abandoning objectives: Evolution through the search for novelty alone.
- Pathak, D., et al. (2017). Curiosity-driven exploration by self-supervised prediction.
- Mouret, J.-B., & Clune, J. (2015). Illuminating search spaces by mapping elites.
- arXiv:2405.00899. Characterising Creative Process in Humans and LLMs.