gbanyan 5571076406 feat: Add curated expert occupations with local data sources

- Add curated occupations seed files (210 entries in zh/en) with specific domains
- Add DBpedia occupations data (2164 entries) for external source option
- Refactor expert_source_service to read from local JSON files
- Improve keyword generation prompts to leverage expert domain context
- Add architecture analysis documentation (ARCHITECTURE_ANALYSIS.md)
- Fix expert source selection bug (proper handling of empty custom_experts)
- Update frontend to support curated/dbpedia/wikidata expert sources

Key changes:
- backend/app/data/: Local occupation data files
- backend/app/services/expert_source_service.py: Simplified local file reading
- backend/app/prompts/expert_transformation_prompt.py: Better domain-aware prompts
- Removed expert_cache.py (no longer needed with local files)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-12-04 16:34:35 +08:00

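novelty-seeking System Flow and Coupling Analysis

Generated: 2025-12-04

1. System Architecture Overview

novelty-seeking is an innovative-thinking guidance system built from three core agents:

Attribute Agent: maps a query to attribute nodes
Transformation Agent: transforms attributes into new keywords
Expert Transformation Agent: transforms attributes from multiple expert perspectives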

2. Complete Data Flow

┌─────────────────────────────────────────────────────────────────────┐
│                           Attribute Agent                           │
├─────────────────────────────────────────────────────────────────────┤
│  User query input (e.g. "bicycle")                                  │
│       ↓                                                             │
│  Step 0: Category analysis (determined by category_mode)            │
│       → Output: CategoryDefinition[]                                │
│         (e.g. Material/Function/Usage/User Group)                   │
│       ↓                                                             │
│  Step 1: Attribute list generation                                  │
│       → Output: {Material: [steel, wood, carbon fiber],             │
│                  Function: [transport, storage], ...}               │
│       ↓                                                             │
│  Step 2: Relationship generation (DAG edges)                        │
│       → Output: AttributeDAG (nodes + edges)                        │
└─────────────────────────────────────────────────────────────────────┘
                              ↓ (high coupling)
┌─────────────────────────────────────────────────────────────────────┐
│                     Expert Transformation Agent                     │
├─────────────────────────────────────────────────────────────────────┤
│  Input: Query + Category + Attributes (from Attribute Agent)        │
│       ↓                                                             │
│  Step 0: Expert team generation                                     │
│       → expert_source selects: llm / curated / dbpedia / wikidata   │
│       → Output: ExpertProfile[]                                     │
│         (e.g. accountant / psychologist / ecologist)                │
│       ↓                                                             │
│  Step 1: Expert-perspective keyword generation (per attribute)      │
│       → Output: ExpertKeyword[]                                     │
│         (keyword + source expert + source attribute)                │
│       ↓                                                             │
│  Step 2: Description generation (per keyword)                       │
│       → Output: ExpertTransformationDescription[]                   │
└─────────────────────────────────────────────────────────────────────┘
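
The hand-off between the two agents can be pictured with plain data classes. The sketch below is illustrative only: the field names follow the `ExpertProfile[]` and `ExpertKeyword[]` outputs listed in section 4.1, while the `str` types, the sample values, and the helper `keywords_for_expert` are assumptions and not taken from the codebase.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExpertProfile:
    # Field names follow the ExpertProfile[] output in section 4.1; str types are assumed.
    id: str
    name: str
    domain: str
    perspective: str


@dataclass
class ExpertKeyword:
    # Field names follow the ExpertKeyword[] output in section 4.1; str types are assumed.
    keyword: str
    expert_id: str
    expert_name: str
    source_attribute: str


def keywords_for_expert(keywords: List[ExpertKeyword], expert: ExpertProfile) -> List[str]:
    """Hypothetical helper: return the keyword strings attributed to one expert."""
    return [k.keyword for k in keywords if k.expert_id == expert.id]


# Sample values are made up for illustration.
surgeon = ExpertProfile(id="expert-0", name="Surgeon", domain="Healthcare",
                        perspective="clinical practice")
keywords = [
    ExpertKeyword("organ transport", "expert-0", "Surgeon", "transport"),
    ExpertKeyword("carbon accounting", "expert-1", "Accountant", "transport"),
]
print(keywords_for_expert(keywords, surgeon))
```

Because every `ExpertKeyword` carries both `expert_id` and `source_attribute`, any change to the attribute structure upstream reaches this record, which is one way to read the "high coupling" arrow in the diagram.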

3. Attribute Agent: Detailed Flow

3.1 Flow Structure

User query
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 0: Category analysis (set by category_mode)    │
├─────────────────────────────────────────────────────┤
│ Input: query, suggested_category_count              │
│ Processing:                                         │
│ - FIXED_ONLY: use the 4 fixed categories            │
│ - FIXED_PLUS_CUSTOM: fixed + user-defined           │
│ - FIXED_PLUS_DYNAMIC: fixed + LLM-recommended       │
│ - CUSTOM_ONLY: LLM-recommended only                 │
│ - DYNAMIC_AUTO: fully LLM-recommended (default)     │
│ Output: Step0Result (recommended categories)        │
└─────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 1: Attribute list generation (Attributes)      │
├─────────────────────────────────────────────────────┤
│ Input: query, final_categories                      │
│ LLM processing:                                     │
│ - Analyze the query's attributes in each category   │
│ - Generate 3-5 attributes per category              │
│ Output: DynamicStep1Result                          │
└─────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 2: Relationship mapping (Relationships → DAG)  │
├─────────────────────────────────────────────────────┤
│ Input: query, categories, attributes                │
│ LLM processing:                                     │
│ - Analyze causal links between adjacent categories  │
│ - Generate (source, target) relationship pairs      │
│ Output: AttributeDAG                                │
└─────────────────────────────────────────────────────┘
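
Step 2 restricts relationships to adjacent categories. One way to picture that constraint is a filter over the `(source, target)` pairs the LLM returns. This is a minimal sketch under stated assumptions: "adjacent" is taken to mean consecutive positions in the category list, attribute names are assumed unique across categories, and `filter_adjacent_edges` is a hypothetical name that does not appear in the codebase.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def filter_adjacent_edges(
    categories: List[str],
    attributes: Dict[str, List[str]],
    pairs: List[Edge],
) -> List[Edge]:
    """Keep only pairs that go from category i to category i + 1 (assumed rule)."""
    # Map each attribute to the position of its category in the ordered list.
    index_of = {
        attr: i
        for i, category in enumerate(categories)
        for attr in attributes.get(category, [])
    }
    edges = []
    for source, target in pairs:
        if source not in index_of or target not in index_of:
            continue  # drop pairs that mention unknown attributes
        if index_of[target] - index_of[source] == 1:
            edges.append((source, target))
    return edges


categories = ["Material", "Function", "Usage"]
attributes = {
    "Material": ["steel", "wood"],
    "Function": ["transport", "storage"],
    "Usage": ["commuting"],
}
proposed = [
    ("steel", "transport"),      # Material -> Function: kept
    ("steel", "commuting"),      # skips a category: dropped
    ("transport", "commuting"),  # Function -> Usage: kept
    ("storage", "steel"),        # points backwards: dropped
    ("ghost", "transport"),      # unknown attribute: dropped
]
print(filter_adjacent_edges(categories, attributes, proposed))
```

Edges that only point forward by one category can never form a cycle, so a graph built this way is a DAG by construction.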

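3.2 Key Input Variables

| Variable | Source | Scope | Purpose |
|---|---|---|---|
| `query` | User input | Steps 0-2 (all) | Determines the object being analyzed |
| `category_mode` | User selection | Step 0 | Determines which categories are used |
| `suggested_category_count` | User setting | Step 0 | Number of categories the LLM recommends |
| `temperature` | User setting | Steps 0-2 | Controls the diversity of LLM output |
| `model` | User selection | Steps 0-2 | Selects the LLM model |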

4. Expert Transformation Agent: Detailed Flow

4.1 Flow Structure

Attribute list (attributes from Attribute Agent)
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 0: Expert team generation (Expert Generation)  │
├─────────────────────────────────────────────────────┤
│ Deciding factors:                                   │
│ - expert_source = 'llm' → generated by the LLM      │
│ - expert_source ∈ ['curated', 'dbpedia',            │
│   'wikidata'] → random pick from local files        │
│ - custom_experts provided → combined with LLM       │
│                                                     │
│ Output: ExpertProfile[]                             │
│         [{id, name, domain, perspective}]           │
└─────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 1: Expert keyword generation (Keywords)        │
├─────────────────────────────────────────────────────┤
│ Loop: for each attribute in attributes:             │
│   LLM generates keywords_per_expert keywords        │
│   for each expert                                   │
│                                                     │
│ Output: ExpertKeyword[]                             │
│         [{keyword, expert_id, expert_name,          │
│           source_attribute}]                        │
└─────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────┐
│ Step 2: Description generation (Descriptions)       │
├─────────────────────────────────────────────────────┤
│ Loop: for each expert_keyword:                      │
│   LLM generates a 15-30 character description       │
│   of an innovative application                      │
│                                                     │
│ Output: ExpertTransformationDescription[]           │
└─────────────────────────────────────────────────────┘
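
The Step 0 branching can be sketched as a small strategy function plus a random pick from an already-loaded occupation list. This is not the code in `expert_source_service.py`. The record keys `name` and `domain`, the generated `id` format, the function names, and the rule that non-empty `custom_experts` takes precedence over `expert_source` are all assumptions made for illustration. Treating an empty or whitespace-only `custom_experts` list as "not provided" is likewise one plausible reading, not a confirmed behavior.

```python
import random
from typing import Dict, List, Optional

LOCAL_SOURCES = {"curated", "dbpedia", "wikidata"}


def choose_strategy(expert_source: str, custom_experts: Optional[List[str]]) -> str:
    """Return 'custom+llm', 'local', or 'llm' (labels are hypothetical)."""
    # Ignore None, empty lists, and blank names so they do not trigger the custom path.
    names = [n.strip() for n in (custom_experts or []) if n.strip()]
    if names:
        return "custom+llm"
    if expert_source in LOCAL_SOURCES:
        return "local"
    return "llm"


def pick_local_experts(
    occupations: List[Dict[str, str]],
    expert_count: int,
    rng: Optional[random.Random] = None,
) -> List[Dict[str, str]]:
    """Randomly pick up to expert_count occupations without replacement."""
    if rng is None:
        rng = random.Random()
    count = min(expert_count, len(occupations))
    chosen = rng.sample(occupations, count)
    return [
        {"id": f"expert-{i}", "name": o["name"], "domain": o.get("domain", "")}
        for i, o in enumerate(chosen)
    ]


# In practice the list would come from json.load() on one of the local data files.
sample = [
    {"name": "Surgeon", "domain": "Healthcare"},
    {"name": "Software Engineer", "domain": "Information Technology"},
    {"name": "Head Chef", "domain": "Food and Service"},
]
print(choose_strategy("curated", []))
print(pick_local_experts(sample, 2, random.Random(0)))
```

Passing a seeded `random.Random` keeps a run reproducible, which helps when comparing keyword output across expert sources.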

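4.2 Key Input Variables

| Variable | Source | Scope | Purpose |
|---|---|---|---|
| `expert_source` | User selection | Step 0 | Selects the expert source (llm/curated/dbpedia/wikidata) |
| `expert_count` | User setting | Step 0 | Number of experts (2-8) |
| `keywords_per_expert` | User setting | Step 1 | Keywords per expert per attribute (1-3) |
| `custom_experts` | User input | Step 0 | Expert names specified by the user |
| `temperature` | User setting | Steps 0-2 | Controls diversity |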
4.3 Keyword Count Formula

Total keywords = len(attributes) × expert_count × keywords_per_expert

Worked example:
├─ 3 attributes (transport, storage, display)
├─ 3 experts (accountant, psychologist, ecologist)
├─ 1 keyword per expert
└─ = 3 × 3 × 1 = 9 keywords
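
The formula is a plain product, so the (attribute, expert) combinations can also be enumerated directly. A minimal sketch using the worked example's values; the function name `total_keywords` is hypothetical.

```python
from itertools import product
from typing import List


def total_keywords(attributes: List[str], expert_count: int, keywords_per_expert: int) -> int:
    """Total keywords = len(attributes) x expert_count x keywords_per_expert."""
    return len(attributes) * expert_count * keywords_per_expert


attributes = ["transport", "storage", "display"]
experts = ["accountant", "psychologist", "ecologist"]
keywords_per_expert = 1

# Each (attribute, expert) pair yields keywords_per_expert keywords.
pairs = list(product(attributes, experts))

print(len(pairs) * keywords_per_expert)                             # 9
print(total_keywords(attributes, len(experts), keywords_per_expert))  # 9
```

Since Step 2 writes one description per keyword, the same product also sets the number of descriptions, so raising any of the three factors multiplies the downstream work.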

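5. Factors Affecting Keyword Generation

| Stage | Variable | Impact | Notes |
|---|---|---|---|
| Attribute generation | `query` | Very high | Sets the semantic basis for the LLM's analysis |
| | `category_mode` | High | Determines the category dimensions |
| | `temperature` | Medium | Higher values give more diverse output |
| | `model` | Medium | Different models have different knowledge bases |
| Expert generation | `expert_source` | High | Determines expert source and quality |
| | `expert_count` | Medium | 2-8 experts |
| | `custom_experts` | Medium | Combined with the LLM |
| Keyword generation | `experts[].domain` | Very high | Directly determines the keyword perspective |
| | `keywords_per_expert` | Low | Controls quantity |
| | `source_attribute` | High | Sets the starting point for reasoning |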

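6. Coupling Analysis

6.1 High-Coupling Connections ⚠️

| Connection | Coupling | Cause | Risk |
|---|---|---|---|
| Attribute → Expert Transform | High | Expert depends on Attribute output | Structural changes must be made on both sides |
| Expert generation → Keyword generation | High | `domain` directly shapes the keywords | Poor domain quality → irrelevant keywords |
| Prompt → LLM output structure | High | The prompt defines the JSON format | Changing the prompt requires changing the schema |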

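6.2 Low-Coupling Connections ✓

| Connection | Coupling | Cause | Benefit |
|---|---|---|---|
| curated/dbpedia/wikidata | Low | Independent local files | Can be updated separately |
| SSE communication format | Low | Standardized and decoupled | Backward compatible |
| useAttribute/useExpertTransformation | Low | Independent hooks | Can be reused separately |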

6.3 Coupling Matrix

| | Attribute | Transformation | Expert Transform |
|---|---|---|---|
| Attribute | - | Low | High |
| Transformation | Low | - | Low |
| Expert Transform | High | Low | - |

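7. Expert Source Comparison

| Source | File | Entries | Domain quality | Notes |
|---|---|---|---|---|
| `llm` | - | Dynamic | High | The LLM generates experts relevant to the query |
| `curated` | curated_occupations_zh/en.json | 210 | High | Curated occupations with specific domains |
| `dbpedia` | dbpedia_occupations_en.json | 2164 | Low | Every domain is "Professional Field" |
| `wikidata` | - | - | - | Local data not implemented |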
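
The "Domain quality" column can be approximated with a quick diversity check on a loaded occupation list. The sketch below assumes each record has a `domain` key, which is not confirmed by this document, and the function name `domain_diversity` is hypothetical. The two sample lists are made up to mimic the pattern described in the table; they are not excerpts from the data files.

```python
from typing import Dict, List


def domain_diversity(occupations: List[Dict[str, str]]) -> float:
    """Return distinct domains divided by total entries (0.0 for an empty list)."""
    if not occupations:
        return 0.0
    distinct = {o.get("domain", "") for o in occupations}
    return len(distinct) / len(occupations)


# Made-up samples: one generic domain repeated, versus one specific domain per entry.
generic = [{"name": f"Occupation {i}", "domain": "Professional Field"} for i in range(4)]
specific = [
    {"name": "Surgeon", "domain": "Healthcare"},
    {"name": "Software Engineer", "domain": "Information Technology"},
    {"name": "Head Chef", "domain": "Food and Service"},
]

print(domain_diversity(generic))   # 0.25
print(domain_diversity(specific))  # 1.0
```

A score near zero means the `domain` field carries almost no information, which matters because section 5 rates `experts[].domain` as having very high impact on the keyword perspective.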

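8. Decision Trace Example

Query: "bicycle"
    ↓
Category Mode: DYNAMIC_AUTO
    → LLM suggests [Material, Function, Usage, User Group]
    ↓
Expert Source: "curated"
    → Randomly selects [Surgeon (Healthcare), Software Engineer (Information Technology), Head Chef (Food and Service)]
    ↓
Attribute "transport" + Expert "Surgeon"
    → LLM reasoning: transport viewed from a medical perspective
    → Keywords: "organ transport", "emergency logistics"
    ↓
Description generation:
    → "From an emergency-medicine perspective, a bicycle could be adapted into an emergency medical transport vehicle..."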

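9. Suggested Improvements

| Issue | Current state | Suggested improvement |
|---|---|---|
| Domain quality | DBpedia domains are all generic values | ✅ Curated occupations added |
| Redundant expert computation | Experts are regenerated for each category | Consider making experts global |
| Single temperature | One value for the whole pipeline | Allow separate settings per step |
| No caching | Every run re-analyzes from scratch | Add an Attribute cache layer |
| Language support | Mainly Chinese | ✅ English version added |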
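
The suggested Attribute cache layer could be as small as an in-memory map keyed by the inputs that shape the analysis. The following is a sketch and not a proposal for the final design. The class `AttributeCache`, the key shape `(query, category_mode, model)`, and the stand-in `fake_analyze` function are all hypothetical.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]  # assumed key: (query, category_mode, model)


class AttributeCache:
    """Hypothetical in-memory cache for Attribute Agent results."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, dict] = {}

    def get_or_compute(self, key: CacheKey, compute: Callable[[], dict]) -> dict:
        # Run the expensive analysis only the first time a key is seen.
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]


calls = []


def fake_analyze() -> dict:
    """Stand-in for the LLM-backed analysis; records how often it runs."""
    calls.append(1)
    return {"Function": ["transport", "storage"]}


cache = AttributeCache()
key = ("bicycle", "DYNAMIC_AUTO", "some-model")
first = cache.get_or_compute(key, fake_analyze)
second = cache.get_or_compute(key, fake_analyze)
print(len(calls), first is second)
```

One trade-off to weigh: with a non-zero `temperature`, repeated runs normally produce varied output, so returning a cached result gives up some diversity in exchange for speed. Including `temperature` in the key, or letting the user bypass the cache, are possible ways to handle that.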

10. Key Files

Backend

app/routers/analyze.py - Attribute Agent routes
app/routers/expert_transformation.py - Expert Transformation routes
app/prompts/step_prompts.py - Attribute Agent prompts
app/prompts/expert_transformation_prompt.py - Expert Transformation prompts
app/services/expert_source_service.py - Expert source service
app/services/llm_service.py - LLM invocation service
app/data/curated_occupations_zh.json - Curated occupations (Chinese)
app/data/curated_occupations_en.json - Curated occupations (English)
app/data/dbpedia_occupations_en.json - DBpedia occupations

Frontend

src/App.tsx - Main state management
src/hooks/useAttribute.ts - Attribute Agent hook
src/hooks/useExpertTransformation.ts - Expert Transformation hook
src/components/TransformationInputPanel.tsx - Transformation control panel
src/types/index.ts - Type definitions
