# Paper Outline: Expert-Augmented LLM Ideation

## Suggested Titles

1. **"Breaking Semantic Gravity: Expert-Augmented LLM Ideation for Enhanced Creativity"**
2. "Beyond Interpolation: Multi-Expert Perspectives for Combinatorial Innovation"
3. "Escaping the Relevance Trap: Structured Expert Frameworks for Creative AI"
4. "From Crowd to Expert: Simulating Diverse Perspectives for LLM-Based Ideation"

---

## Abstract (Draft)

Large Language Models (LLMs) are increasingly used for creative ideation, yet they exhibit a phenomenon we term "semantic gravity": the tendency to generate outputs clustered around high-probability regions of their training distribution. This clustering limits the novelty and diversity of generated ideas. We propose a multi-expert transformation framework that systematically activates diverse semantic regions by conditioning LLM generation on simulated expert perspectives. Our system decomposes concepts into structured attributes, generates ideas through multiple domain-expert viewpoints, and applies semantic deduplication so that the retained ideas are genuinely distinct. In experiments comparing multi-expert generation against direct LLM generation and single-expert baselines, we show that our approach produces ideas with [X]% higher semantic diversity and [Y]% lower patent overlap. We contribute a theoretical framework explaining LLM creativity limitations and an open-source system for innovation ideation.

---

## 1. Introduction

### 1.1 The Promise and Problem of LLM Creativity
- LLMs widely adopted for creative tasks
- Initial enthusiasm: infinite idea generation
- Emerging concern: quality and diversity issues

### 1.2 The Semantic Gravity Problem
- Define the phenomenon
- Why it occurs (statistical learning, mode collapse)
- Why it matters (innovation requires novelty)

### 1.3 Our Solution: Expert-Augmented Ideation
- Brief overview of the approach
- Key insight: expert perspectives as semantic "escape velocity"
- Contributions preview

### 1.4 Paper Organization
- Roadmap for the rest of the paper

---

## 2. Related Work

### 2.1 Theoretical Foundations
- Semantic distance and creativity (Mednick, 1962)
- Conceptual blending theory (Fauconnier & Turner)
- Design fixation (Jansson & Smith)
- Constraint-based creativity

### 2.2 LLM Limitations in Creative Generation
- Design fixation from AI (CHI 2024)
- Dual mechanisms: inspiration vs. fixation
- Bias and pattern perpetuation

### 2.3 Persona-Based Prompting
- PersonaFlow (2024)
- BILLY persona vectors (2025)
- Quantifying persona effects (ACL 2024)

### 2.4 Creativity Support Tools
- Wisdom of crowds approaches
- Human-AI collaboration in ideation
- Evaluation methods (CAT, semantic distance)

### 2.5 Positioning Our Work
- Gap: No end-to-end system combining structured decomposition + multi-expert transformation + deduplication
- Distinction from PersonaFlow: product innovation focus, attribute structure

---

## 3. System Design

### 3.1 Overview
- Pipeline diagram
- Design rationale

### 3.2 Attribute Decomposition
- Category analysis (dynamic vs. fixed)
- Attribute generation per category
- DAG relationship mapping
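
A minimal sketch of how the decomposition output could be represented, assuming attributes are grouped by category and linked by "depends on" edges. The class name `AttributeGraph`, the category names, and the example attributes are hypothetical and not taken from the system itself.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class AttributeGraph:
    """Attributes grouped by category, plus directed 'depends on' edges."""

    categories: dict[str, list[str]] = field(default_factory=dict)
    # Maps each attribute to the set of attributes it depends on.
    depends_on: dict[str, set[str]] = field(default_factory=dict)

    def add_attribute(self, category: str, name: str) -> None:
        self.categories.setdefault(category, []).append(name)
        self.depends_on.setdefault(name, set())

    def add_edge(self, source: str, target: str) -> None:
        """Record that `target` depends on `source`."""
        for name in (source, target):
            if name not in self.depends_on:
                raise KeyError(f"unknown attribute: {name}")
        self.depends_on[target].add(source)

    def ordered(self) -> list[str]:
        """Return attributes in dependency order.

        graphlib raises CycleError if the edges do not form a DAG.
        """
        return list(TopologicalSorter(self.depends_on).static_order())


# Hypothetical decomposition of a "folding bicycle" query.
graph = AttributeGraph()
graph.add_attribute("materials", "lightweight frame")
graph.add_attribute("functions", "portable")
graph.add_attribute("usages", "commuting")
graph.add_edge("lightweight frame", "portable")
graph.add_edge("portable", "commuting")
order = graph.ordered()
```

Using a topological sort here doubles as a validity check: any cycle introduced during relationship mapping surfaces as an error rather than silently producing a non-DAG structure.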

### 3.3 Expert Team Generation
- Expert sources: LLM-generated, curated, external databases
- Diversity optimization strategies
- Domain coverage considerations

### 3.4 Expert Transformation
- Conditioning mechanism
- Keyword generation
- Description generation
- Parallel processing for efficiency
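
The conditioning and parallel-processing steps above could be sketched as follows. This is an illustration under assumptions, not the system's implementation: the prompt wording, `build_prompt`, `stub_llm`, and `transform` are hypothetical, the model call is replaced by an offline stub, and keyword and description generation are collapsed into a single call for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def build_prompt(expert: str, attribute: str, n_ideas: int) -> str:
    """Condition generation on one expert's viewpoint (hypothetical wording)."""
    return (
        f"You are a {expert}. From your professional perspective, "
        f"propose {n_ideas} ideas that transform the attribute '{attribute}'."
    )


def stub_llm(prompt: str) -> str:
    """Placeholder for a real model call; echoes the prompt so the sketch runs offline."""
    return f"[stub response to: {prompt}]"


def transform(
    experts: list[str],
    attribute: str,
    n_ideas: int,
    llm: Callable[[str], str] = stub_llm,
    max_workers: int = 4,
) -> dict[str, str]:
    """Run one conditioned call per expert in parallel, keyed by expert."""
    prompts = [build_prompt(e, attribute, n_ideas) for e in experts]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each expert with its reply.
        responses = list(pool.map(llm, prompts))
    return dict(zip(experts, responses))


results = transform(["marine biologist", "accountant"], "portable", n_ideas=5)
```

Threads suit this step because each call is expected to be I/O-bound (waiting on a model API); swapping `stub_llm` for a real client function is the only change needed to try it against a live model.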

### 3.5 Semantic Deduplication
- Embedding-based approach
- LLM-based approach
- Threshold selection
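
For the embedding-based approach, one simple option is a greedy filter over cosine similarity. The sketch below assumes embeddings are already computed as plain lists of floats; the function names, the toy vectors, and the 0.85 threshold are illustrative assumptions, and the actual threshold is exactly what the "Threshold selection" item would need to determine.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(embeddings: list[list[float]], threshold: float = 0.85) -> list[int]:
    """Greedy pass: keep an idea only if its similarity to every
    already-kept idea is below `threshold`. Returns kept indices."""
    kept: list[int] = []
    for i, vec in enumerate(embeddings):
        if all(cosine_similarity(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy 2-D "embeddings": the second vector is a near-duplicate of the first.
toy = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
kept = deduplicate(toy, threshold=0.85)
```

The greedy pass is order-dependent (the first idea in a near-duplicate group survives), which may matter if ideas from different experts are interleaved.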

### 3.6 Novelty Validation
- Patent search integration
- Overlap scoring

---

## 4. Experiments

### 4.1 Research Questions
- RQ1: Does multi-expert generation increase semantic diversity?
- RQ2: Does multi-expert generation reduce patent overlap?
- RQ3: What is the optimal number of experts?
- RQ4: How do expert sources affect output quality?

### 4.2 Experimental Setup

#### 4.2.1 Dataset
- N concepts/queries for ideation
- Selection criteria (diverse domains, complexity levels)

#### 4.2.2 Conditions
| Condition | Description |
|-----------|-------------|
| Baseline | Direct LLM: "Generate 20 creative ideas for X" |
| Single-Expert | 1 expert × 20 ideas |
| Multi-Expert-4 | 4 experts × 5 ideas each |
| Multi-Expert-8 | 8 experts × 2-3 ideas each (20 total) |
| Random-Perspective | 4 random words as "perspectives" |
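
To keep the total idea count equal across conditions, the per-source allocation can be computed rather than set by hand. The sketch below assumes a total of 20 ideas (from the baseline prompt) and assumes the Random-Perspective condition splits that total across its 4 words; `allocate` and `CONDITIONS` are hypothetical names.

```python
def allocate(total_ideas: int, n_sources: int) -> list[int]:
    """Split `total_ideas` as evenly as possible across `n_sources`."""
    base, extra = divmod(total_ideas, n_sources)
    # The first `extra` sources receive one additional idea.
    return [base + 1 if i < extra else base for i in range(n_sources)]


TOTAL_IDEAS = 20  # taken from the baseline prompt in the table above

# Number of sources per condition; the baseline is treated as a
# single unconditioned source (an assumption of this sketch).
CONDITIONS = {
    "Baseline": 1,
    "Single-Expert": 1,
    "Multi-Expert-4": 4,
    "Multi-Expert-8": 8,
    "Random-Perspective": 4,
}

plan = {name: allocate(TOTAL_IDEAS, n) for name, n in CONDITIONS.items()}
```

Under these assumptions, Multi-Expert-8 resolves to four experts producing 3 ideas and four producing 2, which is one way to read "2-3 ideas each" while holding the total fixed.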

#### 4.2.3 Controls
- Same LLM (specify model version)
- Same temperature settings
- Same total idea count per condition

### 4.3 Metrics

#### 4.3.1 Semantic Diversity
- Mean pairwise cosine distance between embeddings
- Cluster distribution analysis
- Silhouette score for idea clustering
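
The first metric above, mean pairwise cosine distance, can be computed directly from idea embeddings. This is a minimal stdlib sketch that assumes embeddings are plain lists of floats and that distance is defined as one minus cosine similarity; the function names are illustrative.

```python
import math
from itertools import combinations


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def mean_pairwise_distance(embeddings: list[list[float]]) -> float:
    """Average cosine distance over all unordered pairs of embeddings."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


# Toy example: two identical ideas and one orthogonal idea.
diversity = mean_pairwise_distance([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

Because the number of pairs grows quadratically with the idea count, the same-total-count control matters here: conditions with different idea counts would not be directly comparable on this metric.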

#### 4.3.2 Novelty
- Patent overlap rate
- Semantic distance from query centroid
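
Both novelty metrics can be sketched in a few lines. The outline does not define them precisely, so this block makes two assumptions: the "query centroid" is the mean of one or more query embeddings, and an idea counts as overlapping when its best-matching patent score meets a chosen threshold. The function names, the threshold, and the score scale are all hypothetical.

```python
import math


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def mean_distance_from_centroid(
    idea_vecs: list[list[float]], query_vecs: list[list[float]]
) -> float:
    """Average cosine distance of each idea from the query centroid."""
    center = centroid(query_vecs)
    return sum(cosine_distance(v, center) for v in idea_vecs) / len(idea_vecs)


def patent_overlap_rate(best_match_scores: list[float], threshold: float) -> float:
    """Fraction of ideas whose best patent match scores at or above `threshold`."""
    hits = sum(score >= threshold for score in best_match_scores)
    return hits / len(best_match_scores)


# Toy inputs: four ideas with hypothetical best-match similarity scores.
overlap = patent_overlap_rate([0.9, 0.2, 0.7, 0.1], threshold=0.7)
```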

#### 4.3.3 Quality (Human Evaluation)
- Novelty rating (1-7 Likert)
- Usefulness rating (1-7 Likert)
- Creativity rating (1-7 Likert)
- Interrater reliability (Cronbach's alpha)
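
For the reliability item, Cronbach's alpha can be computed by treating each rater as an "item" and each idea as an observation. The sketch below uses the standard formula with sample variances; the function name and the toy ratings are illustrative and do not come from any collected data.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater: list[list[float]]) -> float:
    """Cronbach's alpha with raters as items and ideas as observations.

    alpha = k / (k - 1) * (1 - sum(per-rater variances) / variance(summed scores))
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("need at least two raters")
    n_ideas = len(ratings_by_rater[0])
    if any(len(r) != n_ideas for r in ratings_by_rater):
        raise ValueError("every rater must rate the same ideas")
    rater_variance_sum = sum(variance(r) for r in ratings_by_rater)
    # Sum the raters' scores for each idea, then take the variance of those sums.
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when summed scores have zero variance")
    return (k / (k - 1)) * (1 - rater_variance_sum / total_variance)


# Toy example: two raters scoring four ideas on the 1-7 scale.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

Alpha is computed per rating dimension, so novelty, usefulness, and creativity ratings would each get their own value.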

### 4.4 Procedure
- Idea generation process
- Evaluation process
- Statistical analysis methods

---

## 5. Results

### 5.1 Semantic Diversity (RQ1)
- Quantitative results
- Visualization (t-SNE/UMAP of idea embeddings)
- Statistical significance tests

### 5.2 Patent Novelty (RQ2)
- Overlap rates by condition
- Examples of high-novelty ideas

### 5.3 Expert Count Analysis (RQ3)
- Diversity vs. expert count curve
- Diminishing returns analysis
- Optimal expert count recommendation

### 5.4 Expert Source Comparison (RQ4)
- LLM-generated vs. curated vs. external-database experts, with the Random-Perspective condition as a reference
- Unconventionality metrics

### 5.5 Human Evaluation Results
- Rating distributions
- Condition comparisons
- Correlation with automatic metrics

---

## 6. Discussion

### 6.1 Interpreting the Results
- Why multi-expert works
- The role of structured decomposition
- Deduplication importance

### 6.2 Theoretical Implications
- Semantic gravity as a framework for LLM creativity
- Expert perspectives as productive constraints
- Inner crowd wisdom

### 6.3 Practical Implications
- When to use multi-expert approach
- Expert selection strategies
- Integration with existing workflows

### 6.4 Limitations
- LLM-specific results may not generalize
- Patent overlap as proxy for true novelty
- Human evaluation subjectivity
- Single-language experiments

### 6.5 Future Work
- Cross-cultural creativity
- Domain-specific expert optimization
- Real-world deployment studies
- Integration with other creativity techniques

---

## 7. Conclusion

- Summary of contributions
- Key takeaways
- Broader impact

---

## Appendices

### A. Prompt Templates
- Expert generation prompts
- Keyword generation prompts
- Description generation prompts

### B. Full Experimental Results
- Complete data tables
- Additional visualizations

### C. Expert Source Details
- Curated occupation list
- DBpedia/Wikidata query details

### D. Human Evaluation Protocol
- Instructions for raters
- Example ratings
- Training materials

---

## Target Venues

### Tier 1 (Recommended)
1. **CHI** - ACM Conference on Human Factors in Computing Systems
   - Strong fit: creativity support tools, human-AI collaboration
   - Deadline: typically September

2. **CSCW** - ACM Conference on Computer-Supported Cooperative Work and Social Computing
   - Good fit: collaborative ideation, crowd wisdom
   - Deadline: typically April/January

3. **Creativity & Cognition** - ACM Conference on Creativity & Cognition
   - Perfect fit: computational creativity focus
   - Smaller but specialized venue

### Tier 2 (Alternative)
4. **DIS** - ACM Designing Interactive Systems
   - Good fit: design ideation tools

5. **UIST** - ACM Symposium on User Interface Software and Technology
   - If system/interaction focus emphasized

6. **ICCC** - International Conference on Computational Creativity
   - Specialized computational creativity venue

### Journal Options
1. **International Journal of Human-Computer Studies (IJHCS)**
2. **ACM Transactions on Computer-Human Interaction (TOCHI)**
3. **Design Studies**
4. **Creativity Research Journal**

---

## Timeline Checklist

- [ ] Finalize experimental design
- [ ] Collect/select query dataset
- [ ] Run all experimental conditions
- [ ] Compute automatic metrics
- [ ] Design human evaluation study
- [ ] Recruit evaluators
- [ ] Conduct human evaluation
- [ ] Statistical analysis
- [ ] Write first draft
- [ ] Internal review
- [ ] Revision
- [ ] Submit