chore: save local changes
38
research/README.md
Normal file
@@ -0,0 +1,38 @@
# Research: Expert-Augmented LLM Ideation

This folder contains research materials for the academic paper on the novelty-seeking system.

## Files

| File | Description |
|------|-------------|
| `literature_review.md` | Comprehensive literature review covering semantic distance theory, conceptual blending, design fixation, LLM limitations, and related work |
| `references.md` | 55+ academic references with links to papers |
| `theoretical_framework.md` | The "Semantic Gravity" theoretical model and testable hypotheses |
| `paper_outline.md` | Complete paper structure, experimental design, and target venues |

## Key Theoretical Contribution

**"Semantic Gravity"**: LLMs exhibit a tendency to generate outputs clustered around high-probability regions of their training distribution, limiting creative novelty. Expert perspectives provide "escape velocity" to break free from this gravity.

## Core Hypotheses

1. **H1**: Multi-expert generation → higher semantic diversity
2. **H2**: Multi-expert generation → lower patent overlap (higher novelty)
3. **H3**: Diversity increases with expert count (diminishing returns ~4-6)
4. **H4**: Expert source affects unconventionality of ideas

## Target Venues

- **CHI** (ACM Conference on Human Factors in Computing Systems)
- **CSCW** (ACM Conference on Computer-Supported Cooperative Work)
- **Creativity & Cognition** (ACM Conference)
- **IJHCS** (International Journal of Human-Computer Studies)

## Next Steps

1. Design concrete experiment protocol
2. Add measurement code to existing system
3. Collect experimental data
4. Conduct human evaluation
5. Write and submit paper
555
research/experimental_protocol.md
Normal file
@@ -0,0 +1,555 @@
# Experimental Protocol: Expert-Augmented LLM Ideation

## Executive Summary

This document outlines a comprehensive experimental design to test the hypothesis that multi-expert LLM-based ideation produces more diverse and novel ideas than direct LLM generation.

---

## 1. Research Questions

| ID | Research Question |
|----|-------------------|
| **RQ1** | Does multi-expert generation produce higher semantic diversity than direct LLM generation? |
| **RQ2** | Does multi-expert generation produce ideas with lower patent overlap (higher novelty)? |
| **RQ3** | What is the optimal number of experts for maximizing diversity? |
| **RQ4** | How do different expert sources (LLM vs Curated vs DBpedia) affect idea quality? |
| **RQ5** | Does structured attribute decomposition enhance the multi-expert effect? |

---

## 2. Experimental Design Overview

### 2.1 Design Type
**Mixed Design**: Between-subjects for main conditions × Within-subjects for queries

### 2.2 Variables

#### Independent Variables (Manipulated)

| Variable | Levels | Your System Parameter |
|----------|--------|----------------------|
| **Generation Method** | 5 levels (see conditions) | Condition-dependent |
| **Expert Count** | 1, 2, 4, 6, 8 | `expert_count` |
| **Expert Source** | LLM, Curated, DBpedia | `expert_source` |
| **Attribute Structure** | With/Without decomposition | Pipeline inclusion |

#### Dependent Variables (Measured)

| Variable | Measurement Method |
|----------|-------------------|
| **Semantic Diversity** | Mean pairwise cosine distance (embeddings) |
| **Cluster Spread** | Number of clusters, silhouette score |
| **Patent Novelty** | 1 - (ideas with patent match / total ideas) |
| **Semantic Distance** | Distance from query centroid |
| **Human Novelty Rating** | 7-point Likert scale |
| **Human Usefulness Rating** | 7-point Likert scale |
| **Human Creativity Rating** | 7-point Likert scale |

#### Control Variables (Held Constant)

| Variable | Fixed Value |
|----------|-------------|
| LLM Model | Qwen3:8b (or specify) |
| Temperature | 0.7 |
| Total Ideas per Query | 20 |
| Keywords per Expert | 1 |
| Deduplication | Disabled for raw comparison |
| Language | English (for patent search) |

---

## 3. Experimental Conditions

### 3.1 Main Study: Generation Method Comparison

| Condition | Description | Implementation |
|-----------|-------------|----------------|
| **C1: Direct** | Direct LLM generation | Prompt: "Generate 20 creative ideas for [query]" |
| **C2: Single-Expert** | 1 expert × 20 ideas | `expert_count=1`, `keywords_per_expert=20` |
| **C3: Multi-Expert-4** | 4 experts × 5 ideas each | `expert_count=4`, `keywords_per_expert=5` |
| **C4: Multi-Expert-8** | 8 experts × 2-3 ideas each | `expert_count=8`, `keywords_per_expert=2-3` |
| **C5: Random-Perspective** | 4 random words as "perspectives" | Custom prompt with random nouns |

### 3.2 Expert Count Study

| Condition | Expert Count | Ideas per Expert |
|-----------|--------------|------------------|
| **E1** | 1 | 20 |
| **E2** | 2 | 10 |
| **E4** | 4 | 5 |
| **E6** | 6 | 3-4 |
| **E8** | 8 | 2-3 |
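
The "3-4" and "2-3" entries above follow from splitting a fixed budget of 20 ideas as evenly as possible. A minimal sketch of that split is shown below; the function name `ideas_per_expert` is hypothetical and not part of the existing system, and which experts receive the extra idea is an arbitrary choice made here for illustration.

```python
from typing import List


def ideas_per_expert(total_ideas: int, expert_count: int) -> List[int]:
    """Split total_ideas across experts as evenly as possible (hypothetical helper)."""
    if expert_count < 1:
        raise ValueError("expert_count must be at least 1")
    base, extra = divmod(total_ideas, expert_count)
    # The first `extra` experts each produce one additional idea
    return [base + 1 if i < extra else base for i in range(expert_count)]


for n in (1, 2, 4, 6, 8):
    print(n, ideas_per_expert(20, n))
```

With 6 experts this yields two experts producing 4 ideas and four producing 3; with 8 experts, four produce 3 ideas and four produce 2, so every condition still totals 20 ideas.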

### 3.3 Expert Source Study

| Condition | Source | Implementation |
|-----------|--------|----------------|
| **S-LLM** | LLM-generated | `expert_source=ExpertSource.LLM` |
| **S-Curated** | Curated 210 occupations | `expert_source=ExpertSource.CURATED` |
| **S-DBpedia** | DBpedia 2164 occupations | `expert_source=ExpertSource.DBPEDIA` |
| **S-Random** | Random word "experts" | Custom implementation |

---

## 4. Query Dataset

### 4.1 Design Principles
- **Diversity**: Cover multiple domains (consumer products, technology, services, abstract concepts)
- **Complexity Variation**: Simple objects to complex systems
- **Familiarity Variation**: Common items to specialized equipment
- **Cultural Neutrality**: Concepts understandable across cultures

### 4.2 Query Set (30 Queries)

#### Category A: Everyday Objects (10)
| ID | Query | Complexity |
|----|-------|------------|
| A1 | Chair | Low |
| A2 | Umbrella | Low |
| A3 | Backpack | Low |
| A4 | Coffee mug | Low |
| A5 | Bicycle | Medium |
| A6 | Refrigerator | Medium |
| A7 | Smartphone | Medium |
| A8 | Running shoes | Medium |
| A9 | Kitchen knife | Low |
| A10 | Desk lamp | Low |

#### Category B: Technology & Tools (10)
| ID | Query | Complexity |
|----|-------|------------|
| B1 | Solar panel | Medium |
| B2 | Electric vehicle | High |
| B3 | 3D printer | High |
| B4 | Drone | Medium |
| B5 | Smart thermostat | Medium |
| B6 | Noise-canceling headphones | Medium |
| B7 | Water purifier | Medium |
| B8 | Wind turbine | High |
| B9 | Robotic vacuum | Medium |
| B10 | Wearable fitness tracker | Medium |

#### Category C: Services & Systems (10)
| ID | Query | Complexity |
|----|-------|------------|
| C1 | Food delivery service | Medium |
| C2 | Online education platform | High |
| C3 | Healthcare appointment system | High |
| C4 | Public transportation | High |
| C5 | Hotel booking system | Medium |
| C6 | Personal finance app | Medium |
| C7 | Grocery shopping experience | Medium |
| C8 | Parking solution | Medium |
| C9 | Elderly care service | High |
| C10 | Waste management system | High |

### 4.3 Sample Size Justification

Based on [CHI meta-study on effect sizes](https://dl.acm.org/doi/10.1145/3706598.3713671):

- **Queries**: 30 (crossed with conditions)
- **Expected effect size**: d = 0.5 (medium)
- **Power target**: 80%
- **For automatic metrics**: 30 queries × 5 conditions × 20 ideas = 3,000 ideas
- **For human evaluation**: Subset of 10 queries × 3 conditions × 20 ideas = 600 ideas
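
The counts above, and a rough check of the power target, can be reproduced with the standard library. This is only a sketch under stated assumptions: the helper `required_pairs` is hypothetical, it uses the normal approximation for a two-sided paired comparison at α = 0.05 (the protocol does not state α), and it treats each query as one paired observation between two conditions.

```python
from math import ceil
from statistics import NormalDist


def required_pairs(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size for a two-sided paired comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the power target
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


automatic_ideas = 30 * 5 * 20    # queries x conditions x ideas per query
human_eval_ideas = 10 * 3 * 20   # query subset x condition subset x ideas per query

print(automatic_ideas, human_eval_ideas)  # 3000 600
print(required_pairs(0.5))                # 32
```

Under these assumptions, d = 0.5 at 80% power calls for roughly 32 paired observations, which is close to the planned 30 queries. An exact t-based calculation, or a design with repeated runs per query, would shift this figure.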

---

## 5. Automatic Metrics Collection

### 5.1 Semantic Diversity Metrics

#### 5.1.1 Mean Pairwise Distance (Primary)
```python
from typing import List, Tuple

import numpy as np


def compute_mean_pairwise_distance(ideas: List[str], embedding_model: str) -> Tuple[float, float]:
    """
    Compute the mean and standard deviation of cosine distance over all idea pairs.
    Higher mean = more diverse.
    """
    # get_embeddings is the project's embedding helper (defined elsewhere)
    embeddings = np.asarray(get_embeddings(ideas, model=embedding_model), dtype=float)
    n = len(embeddings)
    if n < 2:
        raise ValueError("Need at least two ideas to compute pairwise distances")

    # Normalize rows so that a dot product equals cosine similarity
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    distances = []
    for i in range(n):
        for j in range(i + 1, n):
            dist = 1 - float(unit[i] @ unit[j])
            distances.append(dist)
    return float(np.mean(distances)), float(np.std(distances))
```

#### 5.1.2 Cluster Analysis
```python
from typing import List

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def compute_cluster_metrics(ideas: List[str], embedding_model: str) -> dict:
    """
    Analyze idea clustering patterns.
    """
    embeddings = np.asarray(get_embeddings(ideas, model=embedding_model))

    # Find optimal k using silhouette score (k must stay below the number of ideas)
    silhouette_scores = []
    for k in range(2, min(len(ideas), 10)):
        # Fixed seed and n_init keep the clustering reproducible across runs
        kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
        labels = kmeans.fit_predict(embeddings)
        score = silhouette_score(embeddings, labels)
        silhouette_scores.append((k, score))

    if not silhouette_scores:
        raise ValueError("Need at least three ideas to search over k")

    best_k, best_score = max(silhouette_scores, key=lambda x: x[1])

    return {
        'optimal_clusters': best_k,
        'silhouette_score': best_score,
        # compute_cluster_sizes is a project helper (defined elsewhere)
        'cluster_distribution': compute_cluster_sizes(embeddings, best_k)
    }
```

#### 5.1.3 Semantic Distance from Query
```python
from typing import List

import numpy as np


def compute_query_distance(query: str, ideas: List[str], embedding_model: str) -> dict:
    """
    Measure how far ideas are from the original query.
    Higher = more novel/distant.
    """
    # get_embedding / get_embeddings are the project's embedding helpers
    query_emb = np.asarray(get_embedding(query, model=embedding_model), dtype=float)
    idea_embs = np.asarray(get_embeddings(ideas, model=embedding_model), dtype=float)

    # Cosine similarity of every idea to the query, computed in one pass
    sims = (idea_embs @ query_emb) / (
        np.linalg.norm(idea_embs, axis=1) * np.linalg.norm(query_emb)
    )
    distances = 1 - sims

    return {
        'mean_distance': float(np.mean(distances)),
        'max_distance': float(np.max(distances)),
        'min_distance': float(np.min(distances)),
        'std_distance': float(np.std(distances))
    }
```

### 5.2 Patent Novelty Metrics

#### 5.2.1 Patent Overlap Rate
```python
from typing import List


def compute_patent_novelty(ideas: List[str], query: str) -> dict:
    """
    Search patents for each idea and compute overlap rate.
    Uses existing patent_search_service.
    """
    if not ideas:
        raise ValueError("ideas must be non-empty")

    # `query` is not used in the search itself; each idea is searched on its own
    matches = 0
    match_details = []

    for idea in ideas:
        result = patent_search_service.search(idea)
        if result.has_match:
            matches += 1
            match_details.append({
                'idea': idea,
                'patent': result.best_match
            })

    return {
        'novelty_rate': 1 - (matches / len(ideas)),
        'match_count': matches,
        'total_ideas': len(ideas),
        'match_details': match_details
    }
```

### 5.3 Metrics Summary Table

| Metric | Formula | Interpretation |
|--------|---------|----------------|
| **Mean Pairwise Distance** | avg(1 - cos_sim(i, j)) for all pairs | Higher = more diverse |
| **Silhouette Score** | Cluster cohesion vs separation | Higher = clearer clusters |
| **Optimal Cluster Count** | argmax(silhouette) | More clusters = more themes |
| **Query Distance** | 1 - cos_sim(query, idea) | Higher = farther from original |
| **Patent Novelty Rate** | 1 - (matches / total) | Higher = more novel |
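
To sanity-check the formulas in the table without an embedding model, the following dependency-free sketch evaluates them on toy vectors. The function names and the toy data are illustrative assumptions; real runs would use embeddings from the project's embedding helper.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm


def mean_pairwise_distance(vectors: List[Sequence[float]]) -> float:
    """Average cosine distance over all unordered pairs."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


def patent_novelty_rate(has_match: Sequence[bool]) -> float:
    """1 - (matches / total), as in the summary table."""
    return 1 - sum(has_match) / len(has_match)


# Toy "embeddings": two identical ideas and one orthogonal idea
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(mean_pairwise_distance(toy))                       # two of three pairs are orthogonal
print(patent_novelty_rate([True, False, False, False]))  # 0.75
```

Two of the three toy pairs are orthogonal (distance 1) and one pair is identical (distance 0), so the mean pairwise distance comes out at 2/3.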

---

## 6. Human Evaluation Protocol

### 6.1 Participants

#### 6.1.1 Recruitment
- **Platform**: Prolific, MTurk, or domain experts
- **Sample Size**: 60 evaluators (20 per condition group)
- **Criteria**:
  - Native English speakers
  - Bachelor's degree or higher
  - Attention check pass rate > 80%

#### 6.1.2 Compensation
- $15/hour equivalent
- ~30 minutes per session
- Bonus for high-quality ratings

### 6.2 Rating Scales

#### 6.2.1 Novelty (7-point Likert)
```
How novel/surprising is this idea?
1 = Not at all novel (very common/obvious)
4 = Moderately novel
7 = Extremely novel (never seen before)
```

#### 6.2.2 Usefulness (7-point Likert)
```
How useful/practical is this idea?
1 = Not at all useful (impractical)
4 = Moderately useful
7 = Extremely useful (highly practical)
```

#### 6.2.3 Creativity (7-point Likert)
```
How creative is this idea overall?
1 = Not at all creative
4 = Moderately creative
7 = Extremely creative
```

### 6.3 Procedure

1. **Introduction** (5 min)
   - Study purpose (without revealing hypotheses)
   - Rating scale explanation
   - Practice with 3 example ideas

2. **Training** (5 min)
   - Rate 5 calibration ideas with feedback
   - Discuss edge cases

3. **Main Evaluation** (20 min)
   - Rate 30 ideas (randomized order)
   - 3 attention check items embedded
   - Break after 15 ideas

4. **Debriefing** (2 min)
   - Demographics
   - Open-ended feedback

### 6.4 Quality Control

| Check | Threshold | Action |
|-------|-----------|--------|
| Attention checks | < 2/3 correct | Exclude |
| Completion time | < 10 min | Flag for review |
| Variance in ratings | All same score | Exclude |
| Inter-rater reliability | Cronbach's α < 0.7 | Review ratings |
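
The per-session checks in the table could be applied mechanically along the following lines. This is a sketch, not the project's screening code: the `Session` record layout and `screen_session` are hypothetical, and "< 2/3 correct" is read here as fewer than 2 of the 3 embedded attention checks answered correctly. The Cronbach's α check is a group-level statistic and is not covered by this per-session function.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    """One evaluator's session (hypothetical record layout)."""
    evaluator_id: str
    attention_correct: int  # attention checks answered correctly, out of 3
    minutes: float          # total completion time
    ratings: List[int]      # 7-point Likert scores, excluding attention checks


def screen_session(session: Session) -> str:
    """Return 'exclude', 'flag', or 'keep' for one session."""
    if session.attention_correct < 2:
        return "exclude"              # failed the attention-check threshold
    if len(set(session.ratings)) <= 1:
        return "exclude"              # straight-lining: every rating identical
    if session.minutes < 10:
        return "flag"                 # suspiciously fast, review manually
    return "keep"


print(screen_session(Session("e01", 3, 24.0, [2, 5, 6, 3])))  # keep
print(screen_session(Session("e02", 1, 25.0, [2, 5, 6, 3])))  # exclude
print(screen_session(Session("e03", 3, 8.5, [2, 5, 6, 3])))   # flag
```

Exclusion rules are evaluated before the time flag so that a fast session that also fails an exclusion criterion is dropped rather than queued for review.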

### 6.5 Analysis Plan

#### 6.5.1 Reliability
- Cronbach's alpha for each scale
- ICC (Intraclass Correlation) for inter-rater agreement
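
For reference, Cronbach's alpha can be computed from a ratings matrix with the standard formula α = k/(k-1) · (1 - Σ item variances / variance of row totals). The sketch below assumes a complete matrix in which each row is an idea and each column is a rater (treating raters as "items", as is common for consensual assessment); the function name and layout are illustrative, not part of the existing codebase.

```python
from statistics import variance
from typing import List


def cronbach_alpha(ratings: List[List[float]]) -> float:
    """ratings[i][j] is the score that rater j gave to idea i (no missing values)."""
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("Need at least two raters and two ideas")
    # Sample variance of each rater's column
    item_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the per-idea total score
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Three raters in perfect agreement on three ideas
print(cronbach_alpha([[1, 1, 1], [4, 4, 4], [7, 7, 7]]))
```

Perfect agreement gives α = 1 (up to floating-point rounding), and values fall toward 0 as raters disagree. In the planned study each idea is rated by only a subset of evaluators, so the matrix would need to be assembled per rating block before applying this formula.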

#### 6.5.2 Main Analysis
- Mixed-effects ANOVA: Condition × Query
- Post-hoc: Tukey HSD for pairwise comparisons
- Effect sizes: Cohen's d
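
As a small worked example of the effect-size step, Cohen's d for two independent groups is the mean difference divided by the pooled standard deviation. The helper below is a sketch with a hypothetical name; for within-query comparisons a paired variant (mean of differences over their standard deviation) may be more appropriate.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy diversity scores: means 4 and 3, both with sample variance 4
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

In the toy data the groups differ by 1 unit with a pooled standard deviation of 2, giving d = 0.5, the "medium" effect size the protocol targets.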

#### 6.5.3 Correlation with Automatic Metrics
- Pearson correlation: Human ratings vs semantic diversity
- Regression: Predict human ratings from automatic metrics

---

## 7. Experimental Procedure

### 7.1 Phase 1: Idea Generation

```
For each query Q in QuerySet:
    For each condition C in Conditions:

        If C == "Direct":
            ideas = direct_llm_generation(Q, n=20)

        Elif C == "Single-Expert":
            expert = generate_expert(Q, n=1)
            ideas = expert_transformation(Q, expert, ideas_per_expert=20)

        Elif C == "Multi-Expert-4":
            experts = generate_experts(Q, n=4)
            ideas = expert_transformation(Q, experts, ideas_per_expert=5)

        Elif C == "Multi-Expert-8":
            experts = generate_experts(Q, n=8)
            ideas = expert_transformation(Q, experts, ideas_per_expert=2-3)

        Elif C == "Random-Perspective":
            perspectives = random.sample(RANDOM_WORDS, 4)
            ideas = perspective_generation(Q, perspectives, ideas_per=5)

        Store(Q, C, ideas)
```

### 7.2 Phase 2: Automatic Metrics

```
For each (Q, C, ideas) in Results:
    metrics = {
        'diversity': compute_mean_pairwise_distance(ideas),
        'clusters': compute_cluster_metrics(ideas),
        'query_distance': compute_query_distance(Q, ideas),
        'patent_novelty': compute_patent_novelty(ideas, Q)
    }
    Store(Q, C, metrics)
```

### 7.3 Phase 3: Human Evaluation

```
# Sample selection
selected_queries = random.sample(QuerySet, 10)
selected_conditions = ["Direct", "Multi-Expert-4", "Multi-Expert-8"]

# Create evaluation set
evaluation_items = []
For each Q in selected_queries:
    For each C in selected_conditions:
        ideas = Get(Q, C)
        For each idea in ideas:
            evaluation_items.append((Q, C, idea))

# Randomize and assign to evaluators
random.shuffle(evaluation_items)
assignments = assign_to_evaluators(evaluation_items, n_evaluators=60)

# Collect ratings
ratings = collect_human_ratings(assignments)
```
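
One possible shape for `assign_to_evaluators` is sketched below. It assumes each idea should be rated by 3 different evaluators, a figure inferred from 60 evaluators × 30 ideas ÷ 600 ideas rather than stated in the protocol; the `ratings_per_item` and `seed` parameters are hypothetical additions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Item = Tuple[object, object, object]  # (query, condition, idea)


def assign_to_evaluators(items: Sequence[Item], n_evaluators: int,
                         ratings_per_item: int = 3, seed: int = 0) -> Dict[int, List[Item]]:
    """Give every item to `ratings_per_item` distinct evaluators with balanced load."""
    if ratings_per_item > n_evaluators:
        raise ValueError("Cannot assign an item to more evaluators than exist")
    rng = random.Random(seed)
    order = list(range(len(items)))
    rng.shuffle(order)  # randomize which items land with which evaluator

    assignments: Dict[int, List[Item]] = defaultdict(list)
    for position, item_index in enumerate(order):
        # Consecutive evaluator slots are distinct, so no evaluator sees an item twice
        for rep in range(ratings_per_item):
            assignments[(position + rep) % n_evaluators].append(items[item_index])

    # Shuffle each evaluator's batch so presentation order is randomized as well
    for batch in assignments.values():
        rng.shuffle(batch)
    return dict(assignments)


items = [(q, c, i) for q in range(10) for c in range(3) for i in range(20)]
batches = assign_to_evaluators(items, n_evaluators=60)
print(len(batches), {len(b) for b in batches.values()})  # 60 {30}
```

With 600 items and 60 evaluators, this round-robin scheme gives every evaluator exactly 30 ideas, matching the main-evaluation workload in Section 6.3. It does not stratify by condition, so per-evaluator condition balance is only approximate.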

### 7.4 Phase 4: Analysis

```
# Automatic metrics analysis
Run ANOVA: diversity ~ condition + query + condition:query
Run post-hoc: Tukey HSD for condition pairs
Compute effect sizes

# Human ratings analysis
Check reliability: Cronbach's alpha, ICC
Run mixed-effects model: rating ~ condition + (1|evaluator) + (1|query)
Compute correlations: human vs automatic metrics

# Visualization
Plot: Diversity by condition (box plots)
Plot: t-SNE of idea embeddings colored by condition
Plot: Expert count vs diversity curve
```

---

## 8. Implementation Checklist

### 8.1 Code to Implement

- [ ] `experiments/generate_ideas.py` - Idea generation for all conditions
- [ ] `experiments/compute_metrics.py` - Automatic metric computation
- [ ] `experiments/export_for_evaluation.py` - Prepare human evaluation set
- [ ] `experiments/analyze_results.py` - Statistical analysis
- [ ] `experiments/visualize.py` - Generate figures

### 8.2 Data Files to Create

- [ ] `data/queries.json` - 30 queries with metadata
- [ ] `data/random_words.json` - Random perspective words
- [ ] `data/generated_ideas/` - Raw idea outputs
- [ ] `data/metrics/` - Computed metric results
- [ ] `data/human_ratings/` - Collected ratings

### 8.3 Analysis Outputs

- [ ] `results/diversity_by_condition.csv`
- [ ] `results/patent_novelty_by_condition.csv`
- [ ] `results/human_ratings_summary.csv`
- [ ] `results/statistical_tests.txt`
- [ ] `figures/` - All visualizations

---

## 9. Expected Results & Hypotheses

### 9.1 Primary Hypotheses

| Hypothesis | Prediction | Metric |
|------------|------------|--------|
| **H1** | Multi-Expert-4 > Single-Expert > Direct | Semantic diversity |
| **H2** | Multi-Expert-8 ≈ Multi-Expert-4 (diminishing returns) | Semantic diversity |
| **H3** | Multi-Expert > Direct | Patent novelty rate |
| **H4** | LLM experts > Curated > DBpedia | Unconventionality |
| **H5** | With attributes > Without attributes | Overall diversity |

### 9.2 Expected Effect Sizes

Based on related work:
- Diversity increase: d = 0.5-0.8 (medium to large)
- Patent novelty increase: 20-40% improvement
- Human creativity rating: d = 0.3-0.5 (small to medium)

### 9.3 Potential Confounds

| Confound | Mitigation |
|----------|-----------|
| Query difficulty | Crossed design (all queries × all conditions) |
| LLM variability | Multiple runs, fixed seed where possible |
| Evaluator bias | Randomized presentation, blinding |
| Order effects | Counterbalancing in human evaluation |

---

## 10. Timeline

| Week | Activity |
|------|----------|
| 1-2 | Implement idea generation scripts |
| 3 | Generate all ideas (5 conditions × 30 queries) |
| 4 | Compute automatic metrics |
| 5 | Design and pilot human evaluation |
| 6-7 | Run human evaluation (60 participants) |
| 8 | Analyze results |
| 9-10 | Write paper |
| 11 | Internal review |
| 12 | Submit |

---

## 11. Appendix: Direct Generation Prompt

For baseline condition C1 (Direct LLM generation):

```
You are a creative innovation consultant. Generate 20 unique and creative ideas
for improving or reimagining a [QUERY].

Requirements:
- Each idea should be distinct and novel
- Ideas should range from incremental improvements to radical innovations
- Consider different aspects: materials, functions, user experiences, contexts
- Provide a brief (15-30 word) description for each idea

Output format:
1. [Idea keyword]: [Description]
2. [Idea keyword]: [Description]
...
20. [Idea keyword]: [Description]
```
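
Because the prompt fixes a numbered `keyword: description` output format, the response can be parsed line by line. The sketch below shows one way to fill the `[QUERY]` placeholder and extract the ideas; `fill_prompt`, `parse_ideas`, and the sample response are illustrative assumptions, and real model output may need more forgiving parsing.

```python
import re
from typing import List, Tuple

# Matches lines such as "3. Modular legs: Legs that snap on and off for transport."
IDEA_LINE = re.compile(r"^\s*\d+\.\s*(.+?):\s*(.+)$")


def fill_prompt(template: str, query: str) -> str:
    """Substitute the query into the prompt template."""
    return template.replace("[QUERY]", query)


def parse_ideas(response: str) -> List[Tuple[str, str]]:
    """Extract (keyword, description) pairs; lines that do not match are skipped."""
    ideas = []
    for line in response.splitlines():
        match = IDEA_LINE.match(line)
        if match:
            keyword, description = match.groups()
            ideas.append((keyword.strip(), description.strip()))
    return ideas


sample = (
    "1. Modular legs: Legs that snap on and off for transport.\n"
    "2. Posture coach: Seat sensors that nudge the sitter upright.\n"
    "Thanks!"
)
print(fill_prompt("ideas for improving or reimagining a [QUERY].", "chair"))
print(parse_ideas(sample))
```

Checking that `len(parse_ideas(response)) == 20` before storing a run is a cheap way to catch truncated or malformed generations.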

---

## 12. Appendix: Random Perspective Words

For condition C5 (Random-Perspective), sample from:

```json
[
  "ocean", "mountain", "forest", "desert", "cave",
  "microscope", "telescope", "kaleidoscope", "prism", "lens",
  "butterfly", "elephant", "octopus", "eagle", "ant",
  "sunrise", "thunderstorm", "rainbow", "fog", "aurora",
  "clockwork", "origami", "mosaic", "symphony", "ballet",
  "ancient", "futuristic", "organic", "crystalline", "liquid",
  "whisper", "explosion", "rhythm", "silence", "echo"
]
```

This tests whether ANY perspective shift helps, or if EXPERT perspectives specifically matter.
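
A seeded sampler keeps the C5 condition reproducible across runs. The sketch below draws 4 distinct words from the list above; the function name, the `seed` parameter, and the prompt wording are assumptions for illustration, not the system's actual prompt.

```python
import random
from typing import List, Optional

RANDOM_WORDS = [
    "ocean", "mountain", "forest", "desert", "cave",
    "microscope", "telescope", "kaleidoscope", "prism", "lens",
    "butterfly", "elephant", "octopus", "eagle", "ant",
    "sunrise", "thunderstorm", "rainbow", "fog", "aurora",
    "clockwork", "origami", "mosaic", "symphony", "ballet",
    "ancient", "futuristic", "organic", "crystalline", "liquid",
    "whisper", "explosion", "rhythm", "silence", "echo",
]


def sample_perspectives(k: int = 4, seed: Optional[int] = None) -> List[str]:
    """Draw k distinct perspective words; a fixed seed makes the draw repeatable."""
    return random.Random(seed).sample(RANDOM_WORDS, k)


def perspective_prompt(query: str, perspective: str, n_ideas: int = 5) -> str:
    """Hypothetical prompt wording for one random perspective."""
    return (f"Generate {n_ideas} creative ideas for improving or reimagining "
            f"a {query}, inspired by the concept of '{perspective}'.")


words = sample_perspectives(seed=42)
for word in words:
    print(perspective_prompt("chair", word))
```

Using a per-query seed (for example, the query's index) would give each query its own fixed set of perspectives while keeping the whole run repeatable.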
209
research/literature_review.md
Normal file
@@ -0,0 +1,209 @@
# Literature Review: Expert-Augmented LLM Ideation

## 1. Core Directly-Related Work

### 1.1 Wisdom of Crowds via Role Assumption
**Bringing the Wisdom of the Crowd to an Individual by Having the Individual Assume Different Roles** (ACM C&C 2017)

Groups of people tend to generate more diverse ideas than individuals because each group member brings a different perspective. This study showed it's possible to help individuals think more like a group by asking them to approach a problem from different perspectives. In an experiment with 54 crowd workers, participants who assumed different expert roles came up with more creative ideas than those who did not.

**Gap for our work**: This was human-based role-playing. Our system automates this with LLM-powered expert perspectives.

### 1.2 PersonaFlow: LLM-Simulated Expert Perspectives
**PersonaFlow: Designing LLM-Simulated Expert Perspectives for Enhanced Research Ideation** (2024)

PersonaFlow provides multiple perspectives by using LLMs to simulate domain-specific experts. User studies showed it increased the perceived relevance and creativity of ideated research directions and promoted users' critical thinking activities without increasing perceived cognitive load.

**Gap for our work**: PersonaFlow focuses on research ideation. Our system applies to product/innovation ideation with structured attribute decomposition.

### 1.3 PopBlends: Conceptual Blending with LLMs
**PopBlends: Strategies for Conceptual Blending with Large Language Models** (CHI 2023)

PopBlends automatically suggests conceptual blends using both traditional knowledge extraction and LLMs. Studies showed people found twice as many blend suggestions with the system, with half the mental demand.

**Gap for our work**: We structure blending through expert domain knowledge rather than direct concept pairing.

### 1.4 BILLY: Persona Vector Merging
**BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation** (2025)

Proposes fusing persona vectors in activation space to steer LLM output towards multiple perspectives simultaneously, requiring only a single additive operation during inference.

**Gap for our work**: We use sequential multi-expert generation rather than vector fusion, allowing more explicit control and interpretability.

---

## 2. Theoretical Foundations

### 2.1 Semantic Distance Theory

**Core Insight** (Mednick, 1962): Creative thinking involves connecting weakly related, remote concepts in semantic memory. The farther one "moves away" from a conventional idea, the more creative the new idea will likely be.

**Key Research**:
- Semantic distance plays an important role in the creative process
- A more "flexible" semantic memory structure (higher connectivity, shorter distances) facilitates creative idea generation
- Quantitative measures using LSA and semantic networks can objectively examine creative output
- Divergent Semantic Integration (DSI) correlates strongly with human creativity ratings (72% variance explained)

**Application to Our Work**: Expert perspectives force semantic "jumps" to distant domains that LLMs wouldn't naturally traverse.

```
Without Expert: "Chair" → furniture, sitting, comfort (short semantic distance)
With Expert: "Chair" + Marine Biologist → pressure, buoyancy, coral (long semantic distance)
```

### 2.2 Conceptual Blending Theory

**Core Insight** (Fauconnier & Turner, 2002): Creative products emerge from blending elements of two input spaces into a novel integrated space.

**Key Research**:
- Blending process: (1) find connecting concept between inputs, (2) map elements that can be blended
- Generative AI demonstrates ability to blend and integrate concepts (bisociation)
- Trisociation (three-concept blending) is being used for AI-augmented idea generation
- Conceptual blending provides terminology for describing creative products

**Limitation**: Blending theory doesn't explain where inputs originate - the "inspiration problem."

**Application to Our Work**: Each expert provides a distinct "input space" enabling systematic multi-space blending. Our attribute decomposition provides structured inputs for blending.

### 2.3 Design Fixation

**Core Insight** (Jansson & Smith, 1991): Design fixation is "blind adherence to a set of ideas or concepts limiting the output of conceptual design."

**Key Research**:
- Fixation results from categorical knowledge organization around prototypes
- Accessing prototypes requires less cognitive effort than processing exemplars
- Diverse teams, model-making, and facilitation help prevent fixation
- Reflecting on prior fixation episodes is the most effective form of prevention

**Neural Evidence**: fMRI studies show distinct patterns during fixated vs. creative ideation.

**Application to Our Work**: LLMs exhibit "semantic fixation" on high-probability outputs. Expert perspectives break this by forcing activation of non-prototype knowledge.

### 2.4 Constraint-Based Creativity

**Core Insight**: Paradoxically, constraints can enhance creativity by pushing beyond the path of least resistance.

**Key Research**:
- Constraints push people to search for more distant ideas in semantic memory
- Extreme constraints may require different types of creative problem-solving
- Not all constraints promote creativity for all individuals/tasks
- A "constraint-leveraging mindset" can be developed through experience

**Application to Our Work**: An expert role acts as a productive constraint that expands rather than limits the creative space. The expert perspective forces exploration of non-obvious solution spaces.

---

## 3. LLM Limitations in Creative Generation

### 3.1 Design Fixation from AI
**The Effects of Generative AI on Design Fixation and Divergent Thinking** (CHI 2024)

Key finding: AI exposure during ideation leads to HIGHER fixation. Participants who used AI produced:
- Fewer ideas
- Less variety
- Lower originality

compared to baseline (no AI assistance).

### 3.2 Dual Mechanisms: Inspiration vs. Fixation
**Inspiration Booster or Creative Fixation?** (Nature Humanities & Social Sciences, 2025)

- LLMs help in **simple** creative tasks (inspiration stimulation)
- LLMs **hurt** in **complex** creative tasks (creative fixation)

**Application to Our Work**: Our structured decomposition manages complexity, while multi-expert approach maintains inspiration benefits.

### 3.3 Statistical Pattern Perpetuation
**Bias and Fairness in Large Language Models: A Survey** (MIT Press, 2024)

LLMs learn, perpetuate, and amplify patterns from training data. This applies to creative outputs - LLMs generate what is statistically common/expected.

### 3.4 Generalization Bias
**Generalization Bias in LLM Summarization** (Royal Society, 2025)

LLMs' overgeneralization tendency produces outputs that lack sufficient empirical support. This suggests a bias toward "safe" middle-ground outputs rather than novel extremes.

---

## 4. Role-Playing and Perspective-Taking

### 4.1 Creativity Enhancement
Research on tabletop role-playing games (TTRPGs) demonstrates:
- Significant positive impact on creativity potential through divergent thinking
- TTRPG players exhibit significantly higher creativity than non-players
- Perspective-taking is closely linked to empathy and cognitive flexibility

### 4.2 Therapeutic and Educational Applications
- Role-playing develops perspective-taking, storytelling, creativity, and self-expression
- Physiological, emotional, and mental well-being from play enables creative ideation
- Play signals psychological safety, which is essential for creativity

### 4.3 Design Research Applications
- Role-playing stimulates creativity by exploring alternative solutions
- Offers safe environment to explore failure modes and challenge assumptions
- Well-suited for early-stage ideation and empathy-critical moments

---

## 5. Creativity Support Tools (CSTs)

### 5.1 Current State
- CSTs primarily support **divergent** thinking
- **Convergent** thinking often neglected
- Ideal CST should offer tailored support for both

### 5.2 AI as Creative Partner
- Collaborative ideation systems expose users to different ideas
- Competing theories on when/whether such exposure helps
- Tool-mediated expert activity view: computers as "mediating artifacts people act through"

### 5.3 Evaluation Methods
**Consensual Assessment Technique (CAT)**:
- Pool of experts independently evaluate artifacts
- Creative if high evaluations + high interrater reliability (Cronbach's alpha > 0.7)

**Semantic Distance Measures**:
- SemDis platform for automated creativity assessment
- Overcomes labor cost and subjectivity of human rating
- Uses NLP to quantify semantic relatedness
|
||||
|
||||
---
|
||||
|
||||
## 6. Our Theoretical Contribution
|
||||
|
||||
### The "Semantic Gravity" Problem
|
||||
|
||||
```
|
||||
Direct LLM Generation:
|
||||
P(idea | query)
|
||||
→ Samples from high-probability region
|
||||
→ Ideas cluster around training distribution modes
|
||||
→ "Semantic gravity" pulls toward conventional associations
|
||||
```
|
||||
|
||||
### Expert Transformation Solution
|
||||
|
||||
```
|
||||
Conditioned Generation:
|
||||
P(idea | query, expert)
|
||||
→ Expert perspective activates distant semantic regions
|
||||
→ Forces conceptual blending across domains
|
||||
→ Breaks design fixation through productive constraints
|
||||
```
|
||||
|
||||
### Multi-Expert Aggregation
|
||||
|
||||
```
|
||||
Diverse Experts → Semantic Coverage
|
||||
→ "Inner crowd" wisdom without actual crowd
|
||||
→ Systematic exploration of idea space
|
||||
→ Deduplication ensures non-redundant novelty
|
||||
```
|
||||
|
||||
### Theoretical Model
|
||||
|
||||
1. **Attribute Decomposition**: Structures the problem space (categories, attributes)
|
||||
2. **Expert Perspectives**: Forces semantic jumps to distant domains
|
||||
3. **Multi-Expert Aggregation**: Achieves crowd-like diversity individually
|
||||
4. **Deduplication**: Ensures generated ideas are truly distinct
|
||||
5. **Patent Validation**: Grounds novelty in real-world uniqueness
|
||||
288
research/paper_outline.md
Normal file
@@ -0,0 +1,288 @@
|
||||
# Paper Outline: Expert-Augmented LLM Ideation
|
||||
|
||||
## Suggested Titles
|
||||
|
||||
1. **"Breaking Semantic Gravity: Expert-Augmented LLM Ideation for Enhanced Creativity"**
|
||||
2. "Beyond Interpolation: Multi-Expert Perspectives for Combinatorial Innovation"
|
||||
3. "Escaping the Relevance Trap: Structured Expert Frameworks for Creative AI"
|
||||
4. "From Crowd to Expert: Simulating Diverse Perspectives for LLM-Based Ideation"
|
||||
|
||||
---
|
||||
|
||||
## Abstract (Draft)
|
||||
|
||||
Large Language Models (LLMs) are increasingly used for creative ideation, yet they exhibit a phenomenon we term "semantic gravity": the tendency to generate outputs clustered around high-probability regions of their training distribution. This limits the novelty and diversity of generated ideas. We propose a multi-expert transformation framework that systematically activates diverse semantic regions by conditioning LLM generation on simulated expert perspectives. Our system decomposes concepts into structured attributes, generates ideas through multiple domain-expert viewpoints, and employs semantic deduplication to ensure genuine diversity. Through experiments comparing multi-expert generation against direct LLM generation and single-expert baselines, we demonstrate that our approach produces ideas with [X]% higher semantic diversity and [Y]% lower patent overlap. We contribute a theoretical framework explaining LLM creativity limitations and an open-source system for innovation ideation.
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction
|
||||
|
||||
### 1.1 The Promise and Problem of LLM Creativity
|
||||
- LLMs widely adopted for creative tasks
|
||||
- Initial enthusiasm: infinite idea generation
|
||||
- Emerging concern: quality and diversity issues
|
||||
|
||||
### 1.2 The Semantic Gravity Problem
|
||||
- Define the phenomenon
|
||||
- Why it occurs (statistical learning, mode collapse)
|
||||
- Why it matters (innovation requires novelty)
|
||||
|
||||
### 1.3 Our Solution: Expert-Augmented Ideation
|
||||
- Brief overview of the approach
|
||||
- Key insight: expert perspectives as semantic "escape velocity"
|
||||
- Contributions preview
|
||||
|
||||
### 1.4 Paper Organization
|
||||
- Roadmap for the rest of the paper
|
||||
|
||||
---
|
||||
|
||||
## 2. Related Work
|
||||
|
||||
### 2.1 Theoretical Foundations
|
||||
- Semantic distance and creativity (Mednick, 1962)
|
||||
- Conceptual blending theory (Fauconnier & Turner)
|
||||
- Design fixation (Jansson & Smith)
|
||||
- Constraint-based creativity
|
||||
|
||||
### 2.2 LLM Limitations in Creative Generation
|
||||
- Design fixation from AI (CHI 2024)
|
||||
- Dual mechanisms: inspiration vs. fixation
|
||||
- Bias and pattern perpetuation
|
||||
|
||||
### 2.3 Persona-Based Prompting
|
||||
- PersonaFlow (2024)
|
||||
- BILLY persona vectors (2025)
|
||||
- Quantifying persona effects (ACL 2024)
|
||||
|
||||
### 2.4 Creativity Support Tools
|
||||
- Wisdom of crowds approaches
|
||||
- Human-AI collaboration in ideation
|
||||
- Evaluation methods (CAT, semantic distance)
|
||||
|
||||
### 2.5 Positioning Our Work
|
||||
- Gap: no existing end-to-end system combines structured decomposition, multi-expert transformation, and deduplication
|
||||
- Distinction from PersonaFlow: product innovation focus, attribute structure
|
||||
|
||||
---
|
||||
|
||||
## 3. System Design
|
||||
|
||||
### 3.1 Overview
|
||||
- Pipeline diagram
|
||||
- Design rationale
|
||||
|
||||
### 3.2 Attribute Decomposition
|
||||
- Category analysis (dynamic vs. fixed)
|
||||
- Attribute generation per category
|
||||
- DAG relationship mapping
|
||||
|
||||
### 3.3 Expert Team Generation
|
||||
- Expert sources: LLM-generated, curated, external databases
|
||||
- Diversity optimization strategies
|
||||
- Domain coverage considerations
|
||||
|
||||
### 3.4 Expert Transformation
|
||||
- Conditioning mechanism
|
||||
- Keyword generation
|
||||
- Description generation
|
||||
- Parallel processing for efficiency
|
||||
|
||||
### 3.5 Semantic Deduplication
|
||||
- Embedding-based approach
|
||||
- LLM-based approach
|
||||
- Threshold selection
|
||||
|
||||
### 3.6 Novelty Validation
|
||||
- Patent search integration
|
||||
- Overlap scoring
|
||||
|
||||
---
|
||||
|
||||
## 4. Experiments
|
||||
|
||||
### 4.1 Research Questions
|
||||
- RQ1: Does multi-expert generation increase semantic diversity?
|
||||
- RQ2: Does multi-expert generation reduce patent overlap?
|
||||
- RQ3: What is the optimal number of experts?
|
||||
- RQ4: How do expert sources affect output quality?
|
||||
|
||||
### 4.2 Experimental Setup
|
||||
|
||||
#### 4.2.1 Dataset
|
||||
- N concepts/queries for ideation
|
||||
- Selection criteria (diverse domains, complexity levels)
|
||||
|
||||
#### 4.2.2 Conditions
|
||||
| Condition | Description |
|-----------|-------------|
| Baseline | Direct LLM: "Generate 20 creative ideas for X" |
| Single-Expert | 1 expert × 20 ideas |
| Multi-Expert-4 | 4 experts × 5 ideas each |
| Multi-Expert-8 | 8 experts × 2-3 ideas each (20 total) |
| Random-Perspective | 4 random words as "perspectives" |
|
||||
|
||||
#### 4.2.3 Controls
|
||||
- Same LLM model (specify version)
|
||||
- Same temperature settings
|
||||
- Same total idea count per condition
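
Holding the total idea count fixed means the per-expert quota has to be derived from the number of experts. The sketch below shows one way to do that split; the helper name `ideas_per_expert` is hypothetical and the even-split rule is an assumption, since the outline does not say how uneven remainders are distributed.

```python
def ideas_per_expert(total_ideas, n_experts):
    """Split a fixed idea budget across experts as evenly as possible.

    The first `total_ideas % n_experts` experts receive one extra idea,
    so the quotas always sum to `total_ideas`.
    """
    if n_experts < 1:
        raise ValueError("need at least one expert")
    base, extra = divmod(total_ideas, n_experts)
    return [base + 1 if i < extra else base for i in range(n_experts)]
```

With a budget of 20, eight experts would get quotas of 3, 3, 3, 3, 2, 2, 2, 2, which matches the "2-3 ideas each" description of the Multi-Expert-8 condition.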
|
||||
|
||||
### 4.3 Metrics
|
||||
|
||||
#### 4.3.1 Semantic Diversity
|
||||
- Mean pairwise cosine distance between embeddings
|
||||
- Cluster distribution analysis
|
||||
- Silhouette score for idea clustering
|
||||
|
||||
#### 4.3.2 Novelty
|
||||
- Patent overlap rate
|
||||
- Semantic distance from query centroid
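
The centroid-distance metric is not defined further in this outline. One possible reading, sketched below, averages a set of reference embeddings (for example, embeddings of the query or of baseline ideas) into a centroid and reports the mean cosine distance of each generated idea from it. The function names and the choice of reference set are assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def mean_distance_from_centroid(idea_vectors, reference_vectors):
    """Average cosine distance of each idea from the reference centroid."""
    center = centroid(reference_vectors)
    distances = [cosine_distance(vec, center) for vec in idea_vectors]
    return sum(distances) / len(distances)
```

Larger values would indicate that a condition's ideas sit farther from the conventional region around the query.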
|
||||
|
||||
#### 4.3.3 Quality (Human Evaluation)
|
||||
- Novelty rating (1-7 Likert)
|
||||
- Usefulness rating (1-7 Likert)
|
||||
- Creativity rating (1-7 Likert)
|
||||
- Interrater reliability (Cronbach's alpha)
|
||||
|
||||
### 4.4 Procedure
|
||||
- Idea generation process
|
||||
- Evaluation process
|
||||
- Statistical analysis methods
|
||||
|
||||
---
|
||||
|
||||
## 5. Results
|
||||
|
||||
### 5.1 Semantic Diversity (RQ1)
|
||||
- Quantitative results
|
||||
- Visualization (t-SNE/UMAP of idea embeddings)
|
||||
- Statistical significance tests
|
||||
|
||||
### 5.2 Patent Novelty (RQ2)
|
||||
- Overlap rates by condition
|
||||
- Examples of high-novelty ideas
|
||||
|
||||
### 5.3 Expert Count Analysis (RQ3)
|
||||
- Diversity vs. expert count curve
|
||||
- Diminishing returns analysis
|
||||
- Optimal expert count recommendation
|
||||
|
||||
### 5.4 Expert Source Comparison (RQ4)
|
||||
- LLM-generated vs. curated vs. random
|
||||
- Unconventionality metrics
|
||||
|
||||
### 5.5 Human Evaluation Results
|
||||
- Rating distributions
|
||||
- Condition comparisons
|
||||
- Correlation with automatic metrics
|
||||
|
||||
---
|
||||
|
||||
## 6. Discussion
|
||||
|
||||
### 6.1 Interpreting the Results
|
||||
- Why multi-expert works
|
||||
- The role of structured decomposition
|
||||
- Deduplication importance
|
||||
|
||||
### 6.2 Theoretical Implications
|
||||
- Semantic gravity as framework for LLM creativity
|
||||
- Expert perspectives as productive constraints
|
||||
- Inner crowd wisdom
|
||||
|
||||
### 6.3 Practical Implications
|
||||
- When to use multi-expert approach
|
||||
- Expert selection strategies
|
||||
- Integration with existing workflows
|
||||
|
||||
### 6.4 Limitations
|
||||
- LLM-specific results may not generalize
|
||||
- Patent overlap as proxy for true novelty
|
||||
- Human evaluation subjectivity
|
||||
- Single-language experiments
|
||||
|
||||
### 6.5 Future Work
|
||||
- Cross-cultural creativity
|
||||
- Domain-specific expert optimization
|
||||
- Real-world deployment studies
|
||||
- Integration with other creativity techniques
|
||||
|
||||
---
|
||||
|
||||
## 7. Conclusion
|
||||
|
||||
- Summary of contributions
|
||||
- Key takeaways
|
||||
- Broader impact
|
||||
|
||||
---
|
||||
|
||||
## Appendices
|
||||
|
||||
### A. Prompt Templates
|
||||
- Expert generation prompts
|
||||
- Keyword generation prompts
|
||||
- Description generation prompts
|
||||
|
||||
### B. Full Experimental Results
|
||||
- Complete data tables
|
||||
- Additional visualizations
|
||||
|
||||
### C. Expert Source Details
|
||||
- Curated occupation list
|
||||
- DBpedia/Wikidata query details
|
||||
|
||||
### D. Human Evaluation Protocol
|
||||
- Instructions for raters
|
||||
- Example ratings
|
||||
- Training materials
|
||||
|
||||
---
|
||||
|
||||
## Target Venues
|
||||
|
||||
### Tier 1 (Recommended)
|
||||
1. **CHI** - ACM Conference on Human Factors in Computing Systems
|
||||
- Strong fit: creativity support tools, human-AI collaboration
|
||||
- Deadline: typically September
|
||||
|
||||
2. **CSCW** - ACM Conference on Computer-Supported Cooperative Work
|
||||
- Good fit: collaborative ideation, crowd wisdom
|
||||
- Deadline: typically April/January
|
||||
|
||||
3. **Creativity & Cognition** - ACM Conference
|
||||
- Perfect fit: computational creativity focus
|
||||
- Smaller but specialized venue
|
||||
|
||||
### Tier 2 (Alternative)
|
||||
4. **DIS** - ACM Designing Interactive Systems
|
||||
- Good fit: design ideation tools
|
||||
|
||||
5. **UIST** - ACM Symposium on User Interface Software and Technology
|
||||
- If system/interaction focus emphasized
|
||||
|
||||
6. **ICCC** - International Conference on Computational Creativity
|
||||
- Specialized computational creativity venue
|
||||
|
||||
### Journal Options
|
||||
1. **International Journal of Human-Computer Studies (IJHCS)**
|
||||
2. **ACM Transactions on Computer-Human Interaction (TOCHI)**
|
||||
3. **Design Studies**
|
||||
4. **Creativity Research Journal**
|
||||
|
||||
---
|
||||
|
||||
## Timeline Checklist
|
||||
|
||||
- [ ] Finalize experimental design
|
||||
- [ ] Collect/select query dataset
|
||||
- [ ] Run all experimental conditions
|
||||
- [ ] Compute automatic metrics
|
||||
- [ ] Design human evaluation study
|
||||
- [ ] Recruit evaluators
|
||||
- [ ] Conduct human evaluation
|
||||
- [ ] Statistical analysis
|
||||
- [ ] Write first draft
|
||||
- [ ] Internal review
|
||||
- [ ] Revision
|
||||
- [ ] Submit
|
||||
208
research/references.md
Normal file
@@ -0,0 +1,208 @@
|
||||
# References
|
||||
|
||||
## Core Related Work
|
||||
|
||||
1. **Siangliulue, P., Arnold, K. C., Gajos, K. Z., & Dow, S. P.** (2017). Bringing the Wisdom of the Crowd to an Individual by Having the Individual Assume Different Roles. *Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition (C&C '17)*, 131-141.
|
||||
- https://dl.acm.org/doi/10.1145/3059454.3059467
|
||||
|
||||
2. **Liu, Y., Sharma, A., et al.** (2024). PersonaFlow: Designing LLM-Simulated Expert Perspectives for Enhanced Research Ideation. *arXiv preprint*.
|
||||
- https://arxiv.org/html/2409.12538v1
|
||||
- https://www.semanticscholar.org/paper/PersonaFlow:-Designing-LLM-Simulated-Expert-for-Liu-Sharma/eb0c224be9191e39452f20b2cbb886b5ecc4f57b
|
||||
|
||||
3. **Wang, S., Petridis, S., Kwon, T., Ma, X., & Chilton, L. B.** (2023). PopBlends: Strategies for Conceptual Blending with Large Language Models. *Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems*.
|
||||
- https://dl.acm.org/doi/10.1145/3544548.3580948
|
||||
|
||||
4. **BILLY Authors** (2025). BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation. *arXiv preprint*.
|
||||
- https://arxiv.org/html/2510.10157v1
|
||||
|
||||
---
|
||||
|
||||
## Semantic Distance & Creative Cognition
|
||||
|
||||
5. **Mednick, S. A.** (1962). The associative basis of the creative process. *Psychological Review, 69*(3), 220-232.
|
||||
- (Classic foundational paper)
|
||||
|
||||
6. **Kenett, Y. N., & Faust, M.** (2019). Going the Extra Creative Mile: The Role of Semantic Distance in Creativity – Theory, Research, and Measurement. *The Cambridge Handbook of the Neuroscience of Creativity*.
|
||||
- https://www.cambridge.org/core/books/abs/cambridge-handbook-of-the-neuroscience-of-creativity/going-the-extra-creative-mile-the-role-of-semantic-distance-in-creativity-theory-research-and-measurement/3AD9143E69A463F85F2D8CC8940425CA
|
||||
|
||||
7. **Beaty, R. E., & Johnson, D. R.** (2021). Automating creativity assessment with SemDis: An open platform for computing semantic distance. *Behavior Research Methods, 53*, 757-780.
|
||||
- https://link.springer.com/article/10.3758/s13428-020-01453-w
|
||||
|
||||
8. **What can quantitative measures of semantic distance tell us about creativity?** (2018). *Current Opinion in Behavioral Sciences*.
|
||||
- https://www.sciencedirect.com/science/article/abs/pii/S2352154618301098
|
||||
|
||||
9. **Semantic Memory and Creativity: The Costs and Benefits of Semantic Memory Structure in Generating Original Ideas** (2023). *PMC*.
|
||||
- https://pmc.ncbi.nlm.nih.gov/articles/PMC10128864/
|
||||
|
||||
10. **The Role of Semantic Associations as a Metacognitive Cue in Creative Idea Generation** (2023). *PMC*.
|
||||
- https://pmc.ncbi.nlm.nih.gov/articles/PMC10141130/
|
||||
|
||||
---
|
||||
|
||||
## Conceptual Blending Theory
|
||||
|
||||
11. **Fauconnier, G., & Turner, M.** (2002). *The Way We Think: Conceptual Blending and the Mind's Hidden Complexities*. Basic Books.
|
||||
|
||||
12. **Conceptual Blending** - Wikipedia Overview
|
||||
- https://en.wikipedia.org/wiki/Conceptual_blending
|
||||
|
||||
13. **Pereira, F. C.** (2007). *Creativity and Artificial Intelligence: A Conceptual Blending Approach*. Mouton de Gruyter.
|
||||
- https://dl.acm.org/doi/10.5555/1557446
|
||||
- https://www.researchgate.net/publication/332711522_Creativity_and_Artificial_Intelligence_A_Conceptual_Blending_Approach
|
||||
|
||||
14. **Confalonieri, R., et al.** (2018). A computational framework for conceptual blending. *Artificial Intelligence, 256*, 105-129.
|
||||
- https://www.sciencedirect.com/science/article/pii/S000437021730142X
|
||||
|
||||
15. **Trisociation with AI for Creative Idea Generation** (2025). *California Management Review*.
|
||||
- https://cmr.berkeley.edu/2025/01/trisociation-with-ai-for-creative-idea-generation/
|
||||
|
||||
---
|
||||
|
||||
## Design Fixation & Constraint-Based Creativity
|
||||
|
||||
16. **Jansson, D. G., & Smith, S. M.** (1991). Design fixation. *Design Studies, 12*(1), 3-11.
|
||||
- (Classic foundational paper)
|
||||
|
||||
17. **Design Fixation: A Cognitive Model**. *Design Society*.
|
||||
- https://www.designsociety.org/download-publication/25504/design_fixation_a_cognitive_model
|
||||
|
||||
18. **Crilly, N.** (2019). Research Design Fixation. *Cambridge Repository*.
|
||||
- https://www.repository.cam.ac.uk/bitstreams/2c002015-8771-4694-ad48-0e4b52008bdf/download
|
||||
|
||||
19. **Using fMRI to deepen our understanding of design fixation** (2020). *Design Science, Cambridge Core*.
|
||||
- https://www.cambridge.org/core/journals/design-science/article/using-fmri-to-deepen-our-understanding-of-design-fixation/2DD81FEE8ED682F6DFF415BF2948EFA6
|
||||
|
||||
20. **Acar, O. A., Tarakci, M., & van Knippenberg, D.** (2019). Creativity and Innovation Under Constraints: A Cross-Disciplinary Integrative Review. *Journal of Management, 45*(1), 96-121.
|
||||
- https://journals.sagepub.com/doi/full/10.1177/0149206318805832
|
||||
|
||||
21. **Cromwell, J. R.** (2024). How combinations of constraint affect creativity: A new typology of creative problem solving in organizations. *Organizational Psychology Review*.
|
||||
- https://journals.sagepub.com/doi/10.1177/20413866231202031
|
||||
|
||||
22. **Creativity from constraints: Theory and applications to education** (2022). *Thinking Skills and Creativity*.
|
||||
- https://www.sciencedirect.com/science/article/abs/pii/S1871187122001870
|
||||
|
||||
---
|
||||
|
||||
## LLM Limitations in Creative Generation
|
||||
|
||||
23. **Wadinambiarachchi, S., et al.** (2024). The Effects of Generative AI on Design Fixation and Divergent Thinking. *Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems*.
|
||||
- https://dl.acm.org/doi/full/10.1145/3613904.3642919
|
||||
- https://arxiv.org/html/2403.11164v1
|
||||
|
||||
24. **Inspiration booster or creative fixation? The dual mechanisms of LLMs in shaping individual creativity in tasks of different complexity** (2025). *Humanities and Social Sciences Communications (Nature)*.
|
||||
- https://www.nature.com/articles/s41599-025-05867-9
|
||||
|
||||
25. **Gallegos, I. O., et al.** (2024). Bias and Fairness in Large Language Models: A Survey. *Computational Linguistics, 50*(3), 1097-1179. MIT Press.
|
||||
- https://direct.mit.edu/coli/article/50/3/1097/121961/Bias-and-Fairness-in-Large-Language-Models-A
|
||||
|
||||
26. **Generalization bias in large language model summarization of scientific research** (2025). *Royal Society Open Science, 12*(4).
|
||||
- https://royalsocietypublishing.org/rsos/article/12/4/241776/235656/Generalization-bias-in-large-language-model
|
||||
|
||||
27. **LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models** (2025). *arXiv*.
|
||||
- https://arxiv.org/html/2505.19240v1
|
||||
|
||||
---
|
||||
|
||||
## Persona Prompting & Multi-Agent Systems
|
||||
|
||||
28. **Quantifying the Persona Effect in LLM Simulations** (2024). *ACL 2024*.
|
||||
- https://aclanthology.org/2024.acl-long.554.pdf
|
||||
- https://www.emergentmind.com/topics/persona-effect-in-llm-simulations
|
||||
|
||||
29. **Two Tales of Persona in LLMs: A Survey of Role-Playing** (2024). *EMNLP Findings*.
|
||||
- https://aclanthology.org/2024.findings-emnlp.969.pdf
|
||||
|
||||
30. **LLM Generated Persona is a Promise with a Catch** (2024). *Semantic Scholar*.
|
||||
- https://www.semanticscholar.org/paper/LLM-Generated-Persona-is-a-Promise-with-a-Catch-Li-Chen/3ea29481ec11d1568fde727d236f71e44e4e2ad0
|
||||
|
||||
31. **Using AI for User Representation: An Analysis of 83 Persona Prompts** (2025). *arXiv*.
|
||||
- https://arxiv.org/html/2508.13047v1
|
||||
|
||||
32. **Scaffolding Creativity: How Divergent and Convergent Personas Shape AI-Assisted Ideation** (2025). *arXiv*.
|
||||
- https://arxiv.org/pdf/2510.26490
|
||||
|
||||
---
|
||||
|
||||
## Role-Playing & Perspective-Taking
|
||||
|
||||
33. **Chung, T. S.** (2013). Table-top role playing game and creativity. *Thinking Skills and Creativity, 8*, 56-71.
|
||||
- https://www.researchgate.net/publication/257701334_Table-top_role_playing_game_and_creativity
|
||||
|
||||
34. **The effect of tabletop role-playing games on the creative potential and emotional creativity of Taiwanese college students** (2015). *Thinking Skills and Creativity*.
|
||||
- https://www.researchgate.net/publication/284013184_The_effect_of_tabletop_role-playing_games_on_the_creative_potential_and_emotional_creativity_of_Taiwanese_college_students
|
||||
|
||||
35. **Psychology and Role-Playing Games** (2019). *ResearchGate*.
|
||||
- https://www.researchgate.net/publication/331758159_Psychology_and_Role-Playing_Games
|
||||
|
||||
36. **Role Playing and Perspective Taking: An Educational Point of View** (2020). *ResearchGate*.
|
||||
- https://www.researchgate.net/publication/346610467_Role_Playing_and_Perspective_Taking_An_Educational_Point_of_View
|
||||
|
||||
---
|
||||
|
||||
## Creativity Support Tools & Evaluation
|
||||
|
||||
37. **Jordanous, A.** (2018). Evaluating Computational Creativity: An Interdisciplinary Tutorial. *ACM Computing Surveys, 51*(2), Article 28.
|
||||
- https://dl.acm.org/doi/10.1145/3167476
|
||||
|
||||
38. **Evaluating Creativity in Computational Co-Creative Systems** (2018). *ResearchGate*.
|
||||
- https://www.researchgate.net/publication/326646917_Evaluating_Creativity_in_Computational_Co-Creative_Systems
|
||||
|
||||
39. **The Intersection of Users, Roles, Interactions, and Technologies in Creativity Support Tools** (2021). *DIS '21*.
|
||||
- https://dl.acm.org/doi/10.1145/3461778.3462050
|
||||
|
||||
40. **What Counts as 'Creative' Work? Articulating Four Epistemic Positions in Creativity-Oriented HCI Research** (2024). *CHI '24*.
|
||||
- https://dl.acm.org/doi/10.1145/3613904.3642854
|
||||
|
||||
41. **Colton, S., & Wiggins, G. A.** (2012). Computational Creativity: The Final Frontier? *ECAI 2012*.
|
||||
- https://link.springer.com/article/10.1007/s00354-020-00116-w
|
||||
|
||||
---
|
||||
|
||||
## AI-Augmented Design & Ideation
|
||||
|
||||
42. **The effect of AI-based inspiration on human design ideation** (2023). *CoDesign*.
|
||||
- https://www.tandfonline.com/doi/full/10.1080/21650349.2023.2167124
|
||||
|
||||
43. **A Hybrid Prototype Method Combining Physical Models and Generative AI to Support Creativity in Conceptual Design** (2024). *ACM TOCHI*.
|
||||
- https://dl.acm.org/doi/10.1145/3689433
|
||||
|
||||
44. **Artificial intelligence for design education: A conceptual approach to enhance students' divergent and convergent thinking** (2025). *IJTDE*.
|
||||
- https://link.springer.com/article/10.1007/s10798-025-09964-3
|
||||
|
||||
45. **The Ideation Compass: Supporting interdisciplinary creative dialogues with real time visualization** (2022). *CoDesign*.
|
||||
- https://www.tandfonline.com/doi/full/10.1080/21650349.2022.2142674
|
||||
|
||||
46. **Guiding data-driven design ideation by knowledge distance** (2021). *Knowledge-Based Systems*.
|
||||
- https://www.sciencedirect.com/science/article/abs/pii/S0950705121001362
|
||||
|
||||
---
|
||||
|
||||
## CHI/CSCW Related Papers
|
||||
|
||||
47. **Chan, J., Dang, S., & Dow, S. P.** (2016). Improving Crowd Innovation with Expert Facilitation. *CSCW '16*.
|
||||
|
||||
48. **Koch, J., et al.** (2020). ImageSense: An Intelligent Collaborative Ideation Tool to Support Diverse Human-Computer Partnerships. *CSCW '20*.
|
||||
|
||||
49. **Yu, L., Kittur, A., & Kraut, R. E.** (2014). Distributed Analogical Idea Generation: Inventing with Crowds. *CHI '14*.
|
||||
|
||||
50. **Crowdboard** (2017). *C&C '17*.
|
||||
- https://dl.acm.org/doi/10.1145/3059454.3059477
|
||||
|
||||
51. **Collaborative Creativity** (2011). *CHI '11*.
|
||||
- https://dl.acm.org/doi/10.1145/1978942.1979214
|
||||
|
||||
52. **Beyond Automation: How UI/UX Designers Perceive AI as a Creative Partner in the Divergent Thinking Stages** (2025). *arXiv*.
|
||||
- https://arxiv.org/html/2501.18778
|
||||
|
||||
---
|
||||
|
||||
## Additional Resources
|
||||
|
||||
53. **Automatic Scoring of Metaphor Creativity with Large Language Models** (2024). *Creativity Research Journal*.
|
||||
- https://www.tandfonline.com/doi/full/10.1080/10400419.2024.2326343
|
||||
|
||||
54. **Wisdom of Crowds** - Surowiecki, J. (2004). *The Wisdom of Crowds*. Doubleday.
|
||||
- https://en.wikipedia.org/wiki/The_Wisdom_of_Crowds
|
||||
|
||||
55. **Research: When Used Correctly, LLMs Can Unlock More Creative Ideas** (2025). *Harvard Business Review*.
|
||||
- https://hbr.org/2025/12/research-when-used-correctly-llms-can-unlock-more-creative-ideas
|
||||
280
research/theoretical_framework.md
Normal file
@@ -0,0 +1,280 @@
|
||||
# Theoretical Framework: Expert-Augmented LLM Ideation
|
||||
|
||||
## The Core Problem: LLM "Semantic Gravity"
|
||||
|
||||
### What is Semantic Gravity?
|
||||
|
||||
When LLMs generate creative ideas directly, they exhibit a phenomenon we term "semantic gravity": the tendency to generate outputs that cluster around high-probability regions of their training distribution.
|
||||
|
||||
```
|
||||
Direct LLM Generation:
|
||||
Input: "Generate creative ideas for a chair"
|
||||
|
||||
LLM Process:
|
||||
P(idea | "chair") → samples from training distribution
|
||||
|
||||
Result:
|
||||
- "Ergonomic office chair" (high probability)
|
||||
- "Foldable portable chair" (high probability)
|
||||
- "Eco-friendly bamboo chair" (moderate probability)
|
||||
|
||||
Problem:
|
||||
→ Ideas cluster in predictable semantic neighborhoods
|
||||
→ Limited exploration of distant conceptual spaces
|
||||
→ "Creative" outputs are interpolations, not extrapolations
|
||||
```
|
||||
|
||||
### Why Does This Happen?
|
||||
|
||||
1. **Statistical Pattern Learning**: LLMs learn co-occurrence patterns from training data
|
||||
2. **Mode Collapse**: When asked to be "creative," LLMs sample from the distribution of "creative ideas" they've seen
|
||||
3. **Relevance Trap**: Strong associations dominate weak ones (chair→furniture >> chair→marine biology)
|
||||
4. **Prototype Bias**: Outputs gravitate toward category prototypes, not edge cases
|
||||
|
||||
---
|
||||
|
||||
## The Solution: Expert Perspective Transformation
|
||||
|
||||
### Theoretical Basis
|
||||
|
||||
Our approach draws from three key theoretical foundations:
|
||||
|
||||
#### 1. Semantic Distance Theory (Mednick, 1962)
|
||||
|
||||
> "Creative thinking involves connecting weakly related, remote concepts in semantic memory."
|
||||
|
||||
**Key insight**: Creativity correlates with semantic distance. On this view, the farther the conceptual "jump" between the concepts being connected, the more creative the result tends to be.
|
||||
|
||||
**Our application**: Expert perspectives force semantic jumps that LLMs wouldn't naturally make.
|
||||
|
||||
```
|
||||
Without Expert:
|
||||
"Chair" → furniture, sitting, comfort, design
|
||||
Semantic distance: SHORT
|
||||
|
||||
With Marine Biologist Expert:
|
||||
"Chair" → underwater pressure, coral structure, buoyancy, bioluminescence
|
||||
Semantic distance: LONG
|
||||
|
||||
Result: Novel ideas like "pressure-adaptive seating" or "coral-inspired structural support"
|
||||
```
|
||||
|
||||
#### 2. Conceptual Blending Theory (Fauconnier & Turner, 2002)
|
||||
|
||||
> "Creative products emerge from blending elements of two input spaces into a novel integrated space."
|
||||
|
||||
**The blending process**:
|
||||
1. Input Space 1: The target concept (e.g., "chair")
|
||||
2. Input Space 2: The expert's domain knowledge (e.g., marine biology)
|
||||
3. Generic Space: Abstract structure shared by both
|
||||
4. Blended Space: Novel integration of elements from both inputs
|
||||
|
||||
**Our application**: Each expert provides a distinct input space for systematic blending.
|
||||
|
||||
```
┌─────────────────┐     ┌─────────────────┐
│     Input 1     │     │     Input 2     │
│     "Chair"     │     │ Marine Biology  │
│  - support      │     │  - pressure     │
│  - sitting      │     │  - buoyancy     │
│  - comfort      │     │  - adaptation   │
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
          ┌─────────────────────┐
          │    Blended Space    │
          │  Novel Chair Ideas  │
          │  - pressure-adapt   │
          │  - buoyant support  │
          │  - bio-adaptive     │
          └─────────────────────┘
```
|
||||
|
||||
#### 3. Design Fixation Breaking (Jansson & Smith, 1991)
|
||||
|
||||
> "Design fixation is blind adherence to initial ideas, limiting creative output."
|
||||
|
||||
**Fixation occurs because**:
|
||||
- Knowledge is organized around category prototypes
|
||||
- Prototypes require less cognitive effort to access
|
||||
- Initial examples anchor subsequent ideation
|
||||
|
||||
**Our application**: Expert perspectives act as "defixation triggers" by activating non-prototype knowledge.
|
||||
|
||||
```
|
||||
Without Intervention:
|
||||
Prototype: "standard four-legged chair"
|
||||
Fixation: Variations on four-legged design
|
||||
|
||||
With Expert Intervention:
|
||||
Archaeologist: "Ancient people sat differently..."
|
||||
Dance Therapist: "Seating affects movement expression..."
|
||||
|
||||
Fixation Broken: Entirely new seating paradigms explored
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## The Multi-Expert Aggregation Model
|
||||
|
||||
### From "Wisdom of Crowds" to "Inner Crowd"
|
||||
|
||||
Prior work on crowd ideation suggests that groups generate more diverse ideas because each member brings a different perspective. Our system simulates this "crowd wisdom" through multiple expert personas:
|
||||
|
||||
```
|
||||
Traditional Crowd:
|
||||
Person 1 → Ideas from perspective 1
|
||||
Person 2 → Ideas from perspective 2
|
||||
Person 3 → Ideas from perspective 3
|
||||
Aggregation → Diverse idea pool
|
||||
|
||||
Our "Inner Crowd":
|
||||
LLM + Expert 1 Persona → Ideas from perspective 1
|
||||
LLM + Expert 2 Persona → Ideas from perspective 2
|
||||
LLM + Expert 3 Persona → Ideas from perspective 3
|
||||
Aggregation → Diverse idea pool (simulated crowd)
|
||||
```
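
The "inner crowd" loop above can be expressed as a small aggregation function. This is a sketch only: `generate` is a hypothetical callable standing in for whatever LLM call the system makes, and the dictionary layout of the pooled ideas is an assumption chosen for readability.

```python
def inner_crowd(query, experts, generate, ideas_per_expert):
    """Pool ideas produced under each expert persona.

    generate(query, expert, n) is assumed to return a list of n idea
    strings conditioned on the given expert persona. It is injected so
    the aggregation logic stays independent of any particular LLM client.
    """
    pool = []
    for expert in experts:
        for idea in generate(query, expert, ideas_per_expert):
            # Keep the originating persona so later stages can trace ideas.
            pool.append({"expert": expert, "idea": idea})
    return pool
```

Passing the generator in as a parameter also makes the aggregation easy to test with a stub before any model is wired in.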
|
||||
|
||||
### Why Multiple Experts Work
|
||||
|
||||
1. **Coverage**: Different experts activate different semantic regions
|
||||
2. **Redundancy Reduction**: Deduplication removes overlapping ideas
|
||||
3. **Diversity by Design**: Expert selection can be optimized for maximum diversity
|
||||
4. **Diminishing Returns**: Beyond roughly 4-6 experts, marginal diversity gains are expected to decrease (tested as H3)
|
||||
|
||||
---
|
||||
|
||||
## The Complete Pipeline
|
||||
|
||||
### Stage 1: Attribute Decomposition
|
||||
|
||||
**Purpose**: Structure the problem space before creative exploration
|
||||
|
||||
```
|
||||
Input: "Innovative chair design"
|
||||
|
||||
Output:
|
||||
Categories: [Material, Function, Usage, User Group]
|
||||
|
||||
Material: [wood, metal, fabric, composite]
|
||||
Function: [support, comfort, mobility, storage]
|
||||
Usage: [office, home, outdoor, medical]
|
||||
User Group: [children, elderly, professionals, athletes]
|
||||
```
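
The Stage 1 output shown above maps naturally onto a small data structure. The sketch below is one way to hold it; the class name `Decomposition` and its method are illustrative assumptions, not the system's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class Decomposition:
    """Structured problem space: a query plus attributes grouped by category."""

    query: str
    categories: dict[str, list[str]] = field(default_factory=dict)

    def attribute_pairs(self):
        """Flatten to (category, attribute) pairs for downstream stages."""
        return [
            (category, attribute)
            for category, attributes in self.categories.items()
            for attribute in attributes
        ]


chair = Decomposition(
    query="Innovative chair design",
    categories={
        "Material": ["wood", "metal", "fabric", "composite"],
        "Function": ["support", "comfort", "mobility", "storage"],
        "Usage": ["office", "home", "outdoor", "medical"],
        "User Group": ["children", "elderly", "professionals", "athletes"],
    },
)
```

Flattening to pairs gives Stage 3 a simple list to iterate over when each attribute is combined with each expert.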
|
||||
|
||||
**Theoretical basis**: Structured decomposition prevents premature fixation on holistic solutions.
|
||||
|
||||
### Stage 2: Expert Team Generation
|
||||
|
||||
**Purpose**: Assemble diverse perspectives for maximum semantic coverage
|
||||
|
||||
```
|
||||
Strategies:
|
||||
1. LLM-Generated: Query-specific, prioritizes unconventional experts
|
||||
2. Curated: Pre-selected high-quality occupations
|
||||
3. External Sources: DBpedia, Wikidata for broad coverage
|
||||
|
||||
Diversity Optimization:
|
||||
- Domain spread (arts, science, trades, services)
|
||||
- Expertise level variation
|
||||
- Cultural/geographic diversity
|
||||
```
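
Domain spread can be approximated with a simple round-robin over domains. The sketch below assumes each candidate expert arrives already labeled with a domain; the function name `pick_diverse_experts` and the round-robin rule are assumptions for illustration, and the framework itself does not prescribe a selection algorithm.

```python
from collections import defaultdict
from itertools import zip_longest


def pick_diverse_experts(candidates, k):
    """Pick up to k experts, covering every domain before repeating one.

    candidates is a list of (expert_name, domain) tuples. Domains are
    visited in order of first appearance.
    """
    by_domain = defaultdict(list)
    for name, domain in candidates:
        by_domain[domain].append(name)

    picked = []
    # Each round takes at most one expert from every domain.
    for round_of_names in zip_longest(*by_domain.values()):
        for name in round_of_names:
            if name is not None and len(picked) < k:
                picked.append(name)
    return picked
```

A fuller version might also balance expertise level and cultural or geographic background, the other two diversity criteria listed above.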
|
||||
|
||||
### Stage 3: Expert Transformation

**Purpose**: Apply each expert's perspective to each attribute

```
For each (attribute, expert) pair:

Input: "Chair comfort" + "Marine Biologist"

LLM Prompt:
"As a marine biologist, how might you reimagine
chair comfort using principles from your field?"

Output: Keywords + Descriptions
- "Pressure-distributed seating inspired by deep-sea fish"
- "Buoyancy-assisted support reducing pressure points"
```

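
The loop over (attribute, expert) pairs can be sketched as a prompt builder. Only the prompt wording comes from the example above; the function `build_prompts` is hypothetical, and the actual LLM call is deliberately left out because the client API is not specified here.

```python
from itertools import product


def build_prompts(attributes, experts):
    """Return one prompt per (attribute, expert) pair, keyed by the pair."""
    prompts = {}
    for attribute, expert in product(attributes, experts):
        # Pick "a" or "an" so the sentence reads naturally for any occupation.
        article = "an" if expert[0].lower() in "aeiou" else "a"
        prompts[(attribute, expert)] = (
            f"As {article} {expert.lower()}, how might you reimagine "
            f"{attribute.lower()} using principles from your field?"
        )
    return prompts


prompts = build_prompts(
    ["Chair comfort", "Chair mobility"],
    ["Marine Biologist", "Architect"],
)
print(prompts[("Chair comfort", "Marine Biologist")])
```

The number of prompts grows as attributes times experts, so the 16 attributes of the chair example combined with 5 experts would already mean 80 LLM calls.
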
### Stage 4: Deduplication

**Purpose**: Ensure idea set is truly diverse, not just numerous

```
Methods:
1. Embedding-based: Fast cosine similarity clustering
2. LLM-based: Semantic pairwise comparison (more accurate)

Output:
- Unique ideas grouped by similarity
- Representative idea selected from each cluster
- Diversity metrics computed
```

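
The embedding-based method can be illustrated with a greedy threshold clustering over cosine similarity. This is one simple variant, assumed for illustration rather than taken from the system: the function `deduplicate`, the `0.85` threshold, and the two-dimensional toy vectors are all placeholders, and the first idea in each cluster is treated as its representative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(ideas, embeddings, threshold=0.85):
    """Greedy clustering: an idea joins the first cluster whose representative it
    resembles (similarity >= threshold); otherwise it starts a new cluster."""
    clusters = []  # lists of indices; the first index is the representative
    for i, emb in enumerate(embeddings):
        for cluster in clusters:
            if cosine_similarity(emb, embeddings[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return [[ideas[i] for i in cluster] for cluster in clusters]


ideas = ["buoyant seat", "floating seat", "foldable frame"]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy vectors, not model output
clusters = deduplicate(ideas, embeddings)
representatives = [cluster[0] for cluster in clusters]
print(representatives)
```

Greedy assignment depends on input order, which is part of why the LLM-based pairwise comparison is listed as the more accurate option.
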
### Stage 5: Novelty Validation

**Purpose**: Ground novelty in real-world uniqueness

```
Process:
- Search patent databases for similar concepts
- Compute overlap scores
- Flag ideas with high existing coverage

Output:
- Novelty score per idea
- Patent overlap rate for idea set
```

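
The two outputs above (a per-idea novelty score and a set-level overlap rate) can be sketched once some overlap measure is chosen. The example below assumes a simple token-set Jaccard overlap against already-retrieved patent texts; the patent search itself, the `0.5` threshold, and the names `jaccard` and `validate_novelty` are assumptions, and the patent string is a placeholder rather than a real record.

```python
def jaccard(a, b):
    """Token-set overlap between two texts, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def validate_novelty(ideas, patent_texts, threshold=0.5):
    """Score each idea as 1 - (highest overlap with any patent text) and report
    the share of ideas whose overlap reaches the threshold."""
    scores = {}
    flagged = 0
    for idea in ideas:
        overlap = max((jaccard(idea, p) for p in patent_texts), default=0.0)
        scores[idea] = 1.0 - overlap
        if overlap >= threshold:
            flagged += 1
    overlap_rate = flagged / len(ideas) if ideas else 0.0
    return scores, overlap_rate


patents = ["adjustable lumbar support for office chair"]  # placeholder text
ideas = ["adjustable lumbar support for chair", "buoyancy assisted seating"]
scores, overlap_rate = validate_novelty(ideas, patents)
print(scores, overlap_rate)
```
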
---

## Testable Hypotheses

### H1: Semantic Diversity
> Multi-expert generation produces higher semantic diversity than single-expert or direct generation.

**Measurement**: Mean pairwise cosine distance between idea embeddings

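
The H1 measurement can be computed directly from a list of idea embeddings. The sketch below uses plain Python and toy two-dimensional vectors; the function names are illustrative, and real embeddings would come from whichever embedding model the experiment adopts.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def mean_pairwise_distance(embeddings):
    """Average cosine distance over all unordered pairs of embeddings."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


clustered = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # identical ideas
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]    # ideas pointing in different directions
print(mean_pairwise_distance(clustered), mean_pairwise_distance(spread))
```

A set of identical ideas scores zero, while the spread set scores higher, which is the direction H1 predicts for multi-expert output relative to direct generation.
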
### H2: Novelty
> Ideas from multi-expert generation have lower patent overlap than direct generation.

**Measurement**: Percentage of ideas with existing patent matches

### H3: Expert Count Effect
> Semantic diversity increases with expert count, with diminishing returns beyond 4-6 experts.

**Measurement**: Diversity vs. expert count curve

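
A diversity vs. expert count curve can be built by pooling ideas from the first k experts and recomputing diversity for each k. The following sketch assumes mean pairwise cosine distance as the diversity metric and a fixed expert order; `diversity_curve` is a hypothetical name, and the vectors are toy values chosen for illustration, not experimental results.

```python
import math
from itertools import combinations


def mean_pairwise_distance(embeddings):
    """Average cosine distance over all unordered pairs (0.0 if fewer than 2)."""
    def distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / norm

    pairs = list(combinations(embeddings, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def diversity_curve(embeddings_by_expert):
    """Diversity of the pooled ideas from the first k experts, for k = 1..n."""
    pooled, curve = [], []
    for expert_embeddings in embeddings_by_expert:
        pooled.extend(expert_embeddings)
        curve.append(mean_pairwise_distance(pooled))
    return curve


# Toy idea embeddings per expert (illustrative only).
by_expert = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
    [[-1.0, 0.0]],
]
curve = diversity_curve(by_expert)
gains = [later - earlier for earlier, later in zip(curve, curve[1:])]
print(curve, gains)
```

Because the curve depends on the order in which experts are added, a real analysis would likely average over several random orderings before looking for the point where marginal gains flatten.
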
### H4: Expert Source Effect
> LLM-generated experts produce more unconventional ideas than curated/database experts.

**Measurement**: Semantic distance from query centroid

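
The H4 measurement can be sketched by grouping ideas by expert source and averaging their distance from the query. The example assumes that "query centroid" means a single embedding of the query text and that cosine distance is the metric; both are interpretations, and the function name, source labels, and vectors are illustrative.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def mean_distance_by_source(query_embedding, ideas):
    """Average distance from the query per expert source.

    `ideas` is a list of (source, embedding) tuples.
    """
    distances = defaultdict(list)
    for source, embedding in ideas:
        distances[source].append(cosine_distance(query_embedding, embedding))
    return {source: sum(d) / len(d) for source, d in distances.items()}


query = [1.0, 0.0]  # toy query embedding
ideas = [
    ("llm", [0.0, 1.0]),
    ("llm", [1.0, 1.0]),
    ("curated", [1.0, 0.0]),
]
result = mean_distance_by_source(query, ideas)
print(result)
```

A larger mean distance for one source would be read as that source producing ideas further from the conventional reading of the query, which is how H4 operationalizes "unconventional".
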
### H5: Fixation Breaking
> The multi-expert approach produces more ideas outside the top-3 semantic clusters than direct generation.

**Measurement**: Cluster distribution analysis

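
Given a cluster label for each idea, the cluster distribution analysis reduces to the share of ideas that fall outside the three largest clusters. The sketch below assumes that "top-3" means the three most populated clusters within each condition; the clustering step itself is not shown, and the labels are toy values rather than data.

```python
from collections import Counter


def share_outside_top_clusters(cluster_labels, top_n=3):
    """Fraction of ideas that do not belong to the top_n largest clusters."""
    if not cluster_labels:
        return 0.0
    counts = Counter(cluster_labels)
    in_top = sum(size for _, size in counts.most_common(top_n))
    return (len(cluster_labels) - in_top) / len(cluster_labels)


# Toy cluster assignments for 10 ideas per condition (illustrative only).
direct = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
multi_expert = ["a", "a", "b", "b", "c", "c", "d", "d", "e", "f"]
print(share_outside_top_clusters(direct), share_outside_top_clusters(multi_expert))
```
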

---

## Expected Contributions

1. **Theoretical**: Formalization of "semantic gravity" as a limitation on LLM creativity
2. **Methodological**: Expert-augmented ideation pipeline with evaluation framework
3. **Empirical**: Quantitative evidence for multi-expert creativity enhancement
4. **Practical**: Open-source system for innovation ideation

---

## Positioning Against Related Work

| Approach | Limitation | Our Advantage |
|----------|------------|---------------|
| Direct LLM generation | Semantic gravity, fixation | Expert-forced semantic jumps |
| Human brainstorming | Cognitive fatigue, social dynamics | Tireless LLM generation |
| PersonaFlow (2024) | Research-focused, no attribute structure | Product innovation, structured decomposition |
| PopBlends (2023) | Two-concept blending only | Multi-expert, multi-attribute blending |
| BILLY (2025) | Vector fusion less interpretable | Sequential generation, explicit control |