feat: Enhance patent search and update research documentation

- Improve patent search service with expanded functionality
- Update PatentSearchPanel UI component
- Add new research_report.md
- Update experimental protocol, literature review, paper outline, and theoretical framework

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

## Abstract (Draft)

Large Language Models (LLMs) are increasingly used for creative ideation, yet they exhibit a phenomenon we term "semantic gravity" - the tendency to generate outputs clustered around high-probability regions of their training distribution. This limits the novelty and diversity of generated ideas. We investigate two complementary strategies to overcome this limitation: (1) **attribute decomposition**, which structures the problem space before creative exploration, and (2) **expert perspective transformation**, which conditions LLM generation on simulated domain-expert viewpoints. Through a 2×2 factorial experiment comparing Direct generation, Expert-Only, Attribute-Only, and Full Pipeline (both factors combined), we demonstrate that each factor independently improves semantic diversity, with the combination producing super-additive effects. Our Full Pipeline achieves [X]% higher semantic diversity and [Y]% lower patent overlap compared to direct generation. We contribute a theoretical framework explaining LLM creativity limitations and an open-source system for innovation ideation.

---

- Evaluation methods (CAT, semantic distance)

### 2.5 Positioning Our Work

- Gap: no end-to-end system combining structured decomposition + multi-expert transformation + deduplication
- Distinction from PersonaFlow: product innovation focus, attribute structure

**Key distinction from PersonaFlow (closest related work)**:

```
PersonaFlow:  Query → Experts → Ideas                       (no problem structure)
Our approach: Query → Attributes → (Attributes × Experts) → Ideas
```

- PersonaFlow applies experts to the whole query; we apply experts to decomposed attributes
- PersonaFlow cannot isolate what helps; our 2×2 factorial design tests each factor
- We hypothesize attribute decomposition **amplifies** expert effectiveness (interaction effect)
- PersonaFlow showed experts help; we test whether **structuring the problem first** makes experts more effective
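
The structural difference can be sketched in a few lines of Python. The stub below stands in for the LLM call, and all names are illustrative, not the system's actual API:

```python
from itertools import product

def generate_idea(context: str) -> str:
    """Stub standing in for an LLM generation call (hypothetical)."""
    return f"idea<{context}>"

def personaflow_style(query: str, experts: list[str]) -> list[str]:
    # Experts condition on the whole query: one context per expert.
    return [generate_idea(f"{expert} | {query}") for expert in experts]

def attribute_crossed(query: str, attributes: list[str],
                      experts: list[str]) -> list[str]:
    # Experts are crossed with decomposed attributes:
    # len(attributes) * len(experts) distinct generation contexts.
    return [generate_idea(f"{expert} | {attr} (attribute of {query})")
            for attr, expert in product(attributes, experts)]

experts = ["accountant", "marine biologist"]
attrs = ["wood", "legs", "backrest"]
print(len(personaflow_style("chair", experts)))         # 2 contexts
print(len(attribute_crossed("chair", attrs, experts)))  # 6 contexts
```

The crossing is what the factorial design later isolates: the same experts produce more distinct generation contexts once the query is decomposed.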

---

## 4. Experiments

### 4.1 Research Questions

- RQ1: Does attribute decomposition improve semantic diversity?
- RQ2: Does expert perspective transformation improve semantic diversity?
- RQ3: Is there an interaction effect between the two factors?
- RQ4: Which combination produces the highest patent novelty?
- RQ5: How do expert sources (LLM vs curated vs external) affect quality?
- RQ6: What is the hallucination/nonsense rate of context-free keyword generation?

### 4.1.1 Design Note: Context-Free Keyword Generation

Our system intentionally excludes the original query during keyword generation:

- Stage 1: the expert sees the attribute only (e.g., "wood" + "accountant"), NOT the query ("chair")
- Stage 2: the expert applies the keyword to the original query, with context
- Rationale: maximize semantic distance for novelty
- Risk: some ideas may be too distant (nonsense/hallucination)
- RQ6 investigates this tradeoff
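
A minimal sketch of the two stages, with hypothetical prompt templates (the system's actual prompts may differ), makes the information flow concrete:

```python
# Hypothetical prompt templates illustrating the two-stage, context-free
# design; template wording is an assumption, not the system's actual prompts.
def stage1_keyword_prompt(attribute: str, expert: str) -> str:
    # Stage 1: the expert sees only the attribute; the query is withheld.
    return (f"You are a {expert}. Given the aspect '{attribute}', "
            f"name one concept from your field that relates to it.")

def stage2_idea_prompt(query: str, keyword: str, expert: str) -> str:
    # Stage 2: the stage-1 keyword is applied to the original query.
    return (f"You are a {expert}. Combine the concept '{keyword}' "
            f"with the product '{query}' into one novel product idea.")

p1 = stage1_keyword_prompt("wood", "accountant")
p2 = stage2_idea_prompt("chair", "depreciation", "accountant")
assert "chair" not in p1   # the query never reaches stage 1
assert "chair" in p2       # stage 2 restores the context
```

Withholding the query in stage 1 is exactly what RQ6 stress-tests: the keyword is chosen without knowing what it will later be applied to.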

### 4.2 Experimental Setup

#### 4.2.1 Dataset

- 30 queries for ideation (see experimental_protocol.md)
- Selection criteria: diverse domains, complexity levels
- Categories: everyday objects, technology/tools, services/systems

#### 4.2.2 Conditions (2×2 Factorial Design)

| Condition | Attributes | Experts | Description |
|-----------|------------|---------|-------------|
| **C1: Direct** | ❌ | ❌ | Baseline: "Generate 20 creative ideas for [query]" |
| **C2: Expert-Only** | ❌ | ✅ | Expert personas generate for the whole query |
| **C3: Attribute-Only** | ✅ | ❌ | Decompose the query, direct generation per attribute |
| **C4: Full Pipeline** | ✅ | ✅ | Decompose the query, experts generate per attribute |
| **C5: Random-Perspective** | ❌ | (random) | Control: 4 random words as "perspectives" |

#### 4.2.3 Controls

- Same LLM model (specify version)
- Same temperature settings
- Same total idea count per condition (20 ideas)

### 4.3 Metrics

- Novelty rating (1-7 Likert)
- Usefulness rating (1-7 Likert)
- Creativity rating (1-7 Likert)
- **Relevance rating (1-7 Likert) - for RQ6**
- Interrater reliability (Cronbach's alpha)
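
For reference, Cronbach's alpha can be computed directly from the rating matrix. The sketch below uses NumPy with made-up toy scores:

```python
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Cronbach's alpha for an (n_rated_items x n_raters) score matrix.

    alpha = k/(k-1) * (1 - sum of per-rater variances / variance of totals),
    where k is the number of raters (treated as the 'items' of the scale).
    """
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]
    rater_vars = ratings.var(axis=0, ddof=1).sum()
    total_var = ratings.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - rater_vars / total_var)

# Toy data: 3 raters scoring 4 ideas on a 1-7 Likert scale (illustrative only).
scores = np.array([[6, 5, 6],
                   [2, 3, 2],
                   [4, 4, 5],
                   [7, 6, 7]])
print(round(cronbach_alpha(scores), 3))  # 0.966 for this toy matrix
```

Values above roughly 0.7-0.8 are conventionally read as acceptable interrater reliability.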

#### 4.3.4 Nonsense/Hallucination Analysis (RQ6) - Three Methods

| Method | Metric | Purpose |
|--------|--------|---------|
| Automatic | Semantic distance threshold (>0.85) | Fast screening |
| LLM-as-Judge | GPT-4 relevance score (1-3) | Scalable evaluation |
| Human | Relevance rating (1-7 Likert) | Gold standard validation |

Triangulate all three methods to validate findings.
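
The automatic screening row can be sketched as follows, assuming idea and query embeddings are available; the vectors below are synthetic and the embedding model is left unspecified:

```python
import numpy as np

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def flag_nonsense(query_emb: np.ndarray, idea_embs: np.ndarray,
                  threshold: float = 0.85) -> np.ndarray:
    """Flag ideas whose semantic distance from the query exceeds the threshold."""
    dists = np.array([cosine_distance(query_emb, e) for e in idea_embs])
    return dists > threshold

rng = np.random.default_rng(0)
query = rng.normal(size=8)
near = query + 0.1 * rng.normal(size=8)  # small perturbation of the query
far = -query                             # diametrically opposed direction
flags = flag_nonsense(query, np.stack([near, far]))
print(flags)  # [False  True]: only the distant idea is flagged
```

Flagged ideas would then pass to the LLM-as-judge and human stages rather than being discarded outright, since high distance is exactly what the novelty objective rewards.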

### 4.4 Procedure

- Idea generation process
- Evaluation process

## 5. Results

### 5.1 Main Effect of Attribute Decomposition (RQ1)

- Compare: (Attribute-Only + Full Pipeline) vs (Direct + Expert-Only)
- Quantitative results
- Visualization (t-SNE/UMAP of idea embeddings)
- Statistical significance (ANOVA main effect)
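
One common way to operationalize semantic diversity (an assumption; the outline does not fix the formula) is the mean pairwise cosine distance over idea embeddings:

```python
import numpy as np

def mean_pairwise_distance(embs: np.ndarray) -> float:
    """Mean cosine distance over all unordered pairs of idea embeddings."""
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = embs @ embs.T                  # pairwise cosine similarities
    iu = np.triu_indices(len(embs), k=1)  # each unordered pair once
    return float(np.mean(1.0 - sims[iu]))

# Illustrative check with made-up 2-D "embeddings": vectors spread across
# directions score higher than vectors clustered around one direction.
clustered = np.array([[1.0, 0.01], [1.0, -0.01], [1.0, 0.02]])
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]])
assert mean_pairwise_distance(spread) > mean_pairwise_distance(clustered)
```

The same per-condition scores feed the ANOVA: each query contributes one diversity value per condition.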

### 5.2 Main Effect of Expert Perspectives (RQ2)

- Compare: (Expert-Only + Full Pipeline) vs (Direct + Attribute-Only)
- Quantitative results
- Statistical significance (ANOVA main effect)

### 5.3 Interaction Effect (RQ3)

- 2×2 interaction analysis
- Visualization: interaction plot
- Evidence for super-additive vs additive effects
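
The planned contrasts reduce to simple arithmetic on the four cell means. The numbers below are made up purely to illustrate the super-additive pattern the hypothesis predicts:

```python
# Illustrative (made-up) cell means of a diversity score; the real analysis
# would run a two-way ANOVA over per-query scores, not bare cell means.
cells = {
    ("no_attr", "no_expert"): 0.40,  # C1 Direct
    ("no_attr", "expert"):    0.50,  # C2 Expert-Only
    ("attr", "no_expert"):    0.48,  # C3 Attribute-Only
    ("attr", "expert"):       0.70,  # C4 Full Pipeline
}

# Main effect of attributes: average over the Experts factor.
attr_effect = ((cells[("attr", "no_expert")] + cells[("attr", "expert")]) / 2
               - (cells[("no_attr", "no_expert")] + cells[("no_attr", "expert")]) / 2)
# Main effect of experts: average over the Attributes factor.
expert_effect = ((cells[("no_attr", "expert")] + cells[("attr", "expert")]) / 2
                 - (cells[("no_attr", "no_expert")] + cells[("attr", "no_expert")]) / 2)
# Interaction: does adding experts help more when attributes are present?
interaction = ((cells[("attr", "expert")] - cells[("attr", "no_expert")])
               - (cells[("no_attr", "expert")] - cells[("no_attr", "no_expert")]))

print(round(attr_effect, 2), round(expert_effect, 2), round(interaction, 2))
# 0.14 0.16 0.12 -> a positive interaction term is the super-additive signature
```

A positive interaction term is the quantity RQ3 tests for significance; an additive model would drive it to zero.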

### 5.4 Patent Novelty (RQ4)

- Overlap rates by condition
- Full Pipeline vs other conditions
- Examples of high-novelty ideas

### 5.5 Expert Source Comparison (RQ5)

- LLM-generated vs curated vs external
- Unconventionality metrics
- Analysis restricted to conditions with experts (Expert factor = With)

### 5.6 Control Condition Analysis

- Expert-Only vs Random-Perspective
- Validates that expert knowledge matters

### 5.7 Hallucination/Nonsense Analysis (RQ6)

- Nonsense rate by condition (LLM-as-judge)
- Semantic distance threshold analysis
- Novelty-usefulness tradeoff visualization
- Is the context-free design worth the hallucination cost?

### 5.8 Human Evaluation Results

- Rating distributions by condition
- 2×2 pattern in human judgments
- Correlation with automatic metrics

---

## 6. Discussion

### 6.1 Interpreting the Results

- Why each factor contributes independently
- The interaction: why attributes amplify expert effectiveness
- Theoretical explanation via conceptual blending

### 6.2 Theoretical Implications

- Semantic gravity as a framework for LLM creativity
- Expert perspectives as productive constraints
- Inner crowd wisdom
- Two complementary escape mechanisms
- Structured decomposition as "scaffolding" for creative exploration

### 6.3 Practical Implications

- When to use the multi-expert approach