feat: Add curated expert occupations with local data sources

- Add curated occupations seed files (210 entries in zh/en) with specific domains
- Add DBpedia occupations data (2164 entries) for external source option
- Refactor expert_source_service to read from local JSON files
- Improve keyword generation prompts to leverage expert domain context
- Add architecture analysis documentation (ARCHITECTURE_ANALYSIS.md)
- Fix expert source selection bug (proper handling of empty custom_experts)
- Update frontend to support curated/dbpedia/wikidata expert sources

Key changes:
- backend/app/data/: Local occupation data files
- backend/app/services/expert_source_service.py: Simplified local file reading
- backend/app/prompts/expert_transformation_prompt.py: Better domain-aware prompts
- Removed expert_cache.py (no longer needed with local files)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 16:34:35 +08:00
parent 8777e27cbb
commit 5571076406
15 changed files with 9970 additions and 380 deletions
--- a/ARCHITECTURE_ANALYSIS.md
+++ b/ARCHITECTURE_ANALYSIS.md
@@ -0,0 +1,277 @@
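+# novelty-seeking: System Flow and Coupling Analysis
+
+> Generated: 2025-12-04
+
+## 1. System Architecture Overview
+
+novelty-seeking is a guided innovative-thinking system built from three core agents:
+- **Attribute Agent**: maps a query to attribute nodes
+- **Transformation Agent**: transforms attributes into new keywords
+- **Expert Transformation Agent**: transforms attributes from multiple expert perspectives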
+
+---
+
+## 2. End-to-End Data Flow
+
+```
+┌──────────────────────────────────────────────────────────────────────────────────────────────┐
+│                                       Attribute Agent                                        │
+├──────────────────────────────────────────────────────────────────────────────────────────────┤
+│  User query (e.g. "bicycle")                                                                 │
+│       ↓                                                                                      │
+│  Step 0: Category analysis (driven by category_mode)                                         │
+│       → Output: CategoryDefinition[] (e.g. material/function/usage/user group)               │
+│       ↓                                                                                      │
+│  Step 1: Attribute list generation                                                           │
+│       → Output: {material: [steel, wood, carbon fiber], function: [carrying, storage], ...}  │
+│       ↓                                                                                      │
+│  Step 2: Relationship generation (DAG edges)                                                 │
+│       → Output: AttributeDAG (nodes + edges)                                                 │
+└──────────────────────────────────────────────────────────────────────────────────────────────┘
+                                              ↓ (high coupling)
+┌──────────────────────────────────────────────────────────────────────────────────────────────┐
+│                                 Expert Transformation Agent                                  │
+├──────────────────────────────────────────────────────────────────────────────────────────────┤
+│  Input: Query + Category + Attributes (from the Attribute Agent)                             │
+│       ↓                                                                                      │
+│  Step 0: Expert team generation                                                              │
+│       → Set by expert_source: llm / curated / dbpedia / wikidata                             │
+│       → Output: ExpertProfile[] (e.g. accountant/psychologist/ecologist)                     │
+│       ↓                                                                                      │
+│  Step 1: Expert-perspective keyword generation (for each attribute)                          │
+│       → Output: ExpertKeyword[] (keyword + source expert + source attribute)                 │
+│       ↓                                                                                      │
+│  Step 2: Description generation (for each keyword)                                           │
+│       → Output: ExpertTransformationDescription[]                                            │
+└──────────────────────────────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## 3. Attribute Agent Detailed Flow
+
+### 3.1 Flow Structure
+
+```
+User query (Query)
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 0: Category analysis (set by Category Mode)           │
+├────────────────────────────────────────────────────────────┤
+│ Input: query, suggested_category_count                     │
+│ Processing:                                                │
+│ - FIXED_ONLY: use the 4 fixed categories                   │
+│ - FIXED_PLUS_CUSTOM: fixed + user-defined                  │
+│ - FIXED_PLUS_DYNAMIC: fixed + LLM-recommended              │
+│ - CUSTOM_ONLY: LLM-recommended only                        │
+│ - DYNAMIC_AUTO: purely LLM-recommended (default)           │
+│ Output: Step0Result (recommended categories)               │
+└────────────────────────────────────────────────────────────┘
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 1: Attribute list generation                          │
+├────────────────────────────────────────────────────────────┤
+│ Input: query, final_categories                             │
+│ LLM processing:                                            │
+│ - Analyze the query's attributes in each category          │
+│ - Generate 3-5 attributes per category                     │
+│ Output: DynamicStep1Result                                 │
+└────────────────────────────────────────────────────────────┘
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 2: Relationship mapping (relationships → DAG)         │
+├────────────────────────────────────────────────────────────┤
+│ Input: query, categories, attributes                       │
+│ LLM processing:                                            │
+│ - Analyze causal links between adjacent categories         │
+│ - Generate (source, target) relationship pairs             │
+│ Output: AttributeDAG                                       │
+└────────────────────────────────────────────────────────────┘
+```
+
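+### 3.2 Key Input Variables
+
+| Variable | Source | Scope | Purpose |
+|------|------|--------|------|
+| `query` | User input | All of Steps 0-2 | Determines the object being analyzed |
+| `category_mode` | User selection | Step 0 | Determines which categories are used |
+| `suggested_category_count` | User setting | Step 0 | Number of categories the LLM recommends |
+| `temperature` | User setting | Steps 0-2 | Controls the diversity of LLM output |
+| `model` | User selection | Steps 0-2 | Selects which LLM model is used |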
+
+---
+
+## 4. Expert Transformation Agent Detailed Flow
+
+### 4.1 Flow Structure
+
+```
+Attribute list (attributes from the Attribute Agent)
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 0: Expert team generation                             │
+├────────────────────────────────────────────────────────────┤
+│ Determined by:                                             │
+│ - expert_source = 'llm' → LLM-generated                    │
+│ - expert_source ∈ ['curated', 'dbpedia',                   │
+│   'wikidata'] → random pick from local files               │
+│ - custom_experts provided → combined with LLM              │
+│                                                            │
+│ Output: ExpertProfile[]                                    │
+│       [{id, name, domain, perspective}]                    │
+└────────────────────────────────────────────────────────────┘
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 1: Expert-perspective keyword generation              │
+├────────────────────────────────────────────────────────────┤
+│ Loop: for each attribute in attributes:                    │
+│   LLM generates keywords_per_expert                        │
+│   keywords for each expert                                 │
+│                                                            │
+│ Output: ExpertKeyword[]                                    │
+│       [{keyword, expert_id, expert_name,                   │
+│         source_attribute}]                                 │
+└────────────────────────────────────────────────────────────┘
+    ↓
+┌────────────────────────────────────────────────────────────┐
+│ Step 2: Description generation                             │
+├────────────────────────────────────────────────────────────┤
+│ Loop: for each expert_keyword:                             │
+│   LLM writes a 15-30 character innovative-use description  │
+│                                                            │
+│ Output: ExpertTransformationDescription[]                  │
+└────────────────────────────────────────────────────────────┘
+```
+
+### 4.2 Key Input Variables
+
+| Variable | Source | Scope | Purpose |
+|------|------|--------|------|
+| `expert_source` | User selection | Step 0 | Determines the expert source (llm/curated/dbpedia/wikidata) |
+| `expert_count` | User setting | Step 0 | Number of experts (2-8) |
+| `keywords_per_expert` | User setting | Step 1 | Keywords per expert per attribute (1-3) |
+| `custom_experts` | User input | Step 0 | Expert names specified by the user |
+| `temperature` | User setting | Steps 0-2 | Controls diversity |
+
+### 4.3 Keyword Count Formula
+
+```
+Total keywords = len(attributes) × expert_count × keywords_per_expert
+
+Worked example:
+├─ 3 attributes (carrying, storage, display)
+├─ 3 experts (accountant, psychologist, ecologist)
+├─ 1 keyword per expert per attribute
+└─ = 3 × 3 × 1 = 9 keywords
+```
+
+---
+
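+## 5. Factors Affecting Keyword Generation
+
+| Stage | Variable | Impact | Notes |
+|------|---------|---------|------|
+| **Attribute generation** | `query` | Very high | Sets the semantic basis of the LLM's analysis |
+| | `category_mode` | High | Determines the category dimensions |
+| | `temperature` | Medium | Higher values give more diverse output |
+| | `model` | Medium | Different models have different knowledge bases |
+| **Expert generation** | `expert_source` | High | Determines expert source and quality |
+| | `expert_count` | Medium | 2-8 experts |
+| | `custom_experts` | Medium | Combined with the LLM |
+| **Keyword generation** | `experts[].domain` | Very high | Directly determines the keyword perspective |
+| | `keywords_per_expert` | Low | Controls quantity |
+| | `source_attribute` | High | Determines the starting point of the reasoning |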
+
+---
+
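+## 6. Coupling Analysis
+
+### 6.1 High-Coupling Connections ⚠️
+
+| Connection | Coupling | Cause | Risk |
+|------|--------|------|------|
+| Attribute → Expert Transform | High | Expert depends on Attribute output | Structural changes must be made on both sides |
+| Expert generation → Keyword generation | High | domain directly shapes the keywords | Poor domain quality → irrelevant keywords |
+| Prompt → LLM output structure | High | The prompt defines the JSON format | Changing the prompt requires changing the schema |
+
+### 6.2 Low-Coupling Connections ✓
+
+| Connection | Coupling | Cause | Benefit |
+|------|--------|------|------|
+| curated/dbpedia/wikidata | Low | Independent local files | Can be updated separately |
+| SSE communication format | Low | Standardized and decoupled | Backward compatible |
+| useAttribute/useExpertTransformation | Low | Independent hooks | Can be reused separately |
+
+### 6.3 Coupling Matrix
+
+|  | Attribute | Transformation | Expert Transform |
+|----|-----------|---------------|----|
+| **Attribute** | - | Low | High |
+| **Transformation** | Low | - | Low |
+| **Expert Transform** | High | Low | - |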
+
+---
+
+## 7. Expert Source Comparison
+
+| Source | File | Entries | Domain quality | Characteristics |
+|------|------|------|------------|------|
+| `llm` | - | Dynamic | High | The LLM generates experts relevant to the query |
+| `curated` | curated_occupations_zh/en.json | 210 | High | Curated occupations with specific domains |
+| `dbpedia` | dbpedia_occupations_en.json | 2164 | Low | Every domain is "Professional Field" |
+| `wikidata` | - | - | - | Local version not implemented |
+
+---
+
+## 8. Decision Trace Example
+
+```
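+Query: "bicycle"
+    ↓
+Category Mode: DYNAMIC_AUTO
+    → LLM suggests [material, function, usage, user group]
+    ↓
+Expert Source: "curated"
+    → Randomly picks [surgeon (medicine and health), software engineer (information technology), head chef (food and service)]
+    ↓
+Attribute "carrying" + Expert "surgeon"
+    → LLM reasoning: carrying seen from a medical perspective
+    → Keyword: "organ transport", "emergency logistics"
+    ↓
+Description generation:
+    → "From an emergency-medicine perspective, a bicycle can be adapted into an emergency medical transport vehicle..."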
+```
+
+---
+
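+## 9. Suggested Improvements
+
+| Issue | Current state | Suggested improvement |
+|------|------|---------|
+| domain quality | DBpedia domains are all generic values | ✅ Curated occupations created |
+| Repeated Expert computation | Regenerated for every category | Consider making Experts global |
+| Single temperature | One value for the whole pipeline | Could be set per Step |
+| No caching | Re-analyzed on every run | Add an Attribute cache layer |
+| Language support | Mainly Chinese | ✅ English version created |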
+
+---
+
+## 10. Key Files
+
+### Backend
+- `app/routers/analyze.py` - Attribute Agent routes
+- `app/routers/expert_transformation.py` - Expert Transformation routes
+- `app/prompts/step_prompts.py` - Attribute Agent prompts
+- `app/prompts/expert_transformation_prompt.py` - Expert Transformation prompts
+- `app/services/expert_source_service.py` - Expert source service
+- `app/services/llm_service.py` - LLM invocation service
+- `app/data/curated_occupations_zh.json` - Curated occupations (Chinese)
+- `app/data/curated_occupations_en.json` - Curated occupations (English)
+- `app/data/dbpedia_occupations_en.json` - DBpedia occupations
+
+### Frontend
+- `src/App.tsx` - Main state management
+- `src/hooks/useAttribute.ts` - Attribute Agent hook
+- `src/hooks/useExpertTransformation.ts` - Expert Transformation hook
+- `src/components/TransformationInputPanel.tsx` - Transformation control panel
+- `src/types/index.ts` - Type definitions
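
The keyword-count formula in section 4.3 of the added document, together with the random expert selection it describes for the `curated` source, can be sketched in Python. This is only an illustration and not the code in `expert_source_service.py`: `pick_experts` and `keyword_slots` are hypothetical helpers, the occupation records are assumed to be dicts with `name` and `domain` keys, and the data is passed in memory instead of being read from the JSON files.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExpertProfile:
    # Field names follow the ExpertProfile shape listed in section 4.1.
    id: str
    name: str
    domain: str
    perspective: str = ""


def pick_experts(occupations, expert_count, seed=None):
    """Randomly pick up to expert_count occupations (hypothetical helper).

    Assumes each record is a dict with "name" and "domain" keys.
    """
    rng = random.Random(seed)
    # Never ask for more experts than the list contains.
    k = min(expert_count, len(occupations))
    chosen = rng.sample(occupations, k=k)
    return [
        ExpertProfile(id=f"expert-{i}", name=o["name"], domain=o["domain"])
        for i, o in enumerate(chosen)
    ]


def keyword_slots(attributes, experts, keywords_per_expert):
    """List every (attribute, expert name, slot index) that Step 1 would fill."""
    return [
        (attribute, expert.name, slot)
        for attribute, expert in product(attributes, experts)
        for slot in range(keywords_per_expert)
    ]
```

With 3 attributes, 3 experts, and `keywords_per_expert=1`, `keyword_slots` returns 9 entries, which matches the worked example in section 4.3 (3 × 3 × 1 = 9).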