feat: Add Deduplication Agent with embedding and LLM methods

Implement a new Deduplication Agent that identifies and groups similar
transformation descriptions. Supports two deduplication methods:
- Embedding: Fast vector similarity comparison using cosine similarity
- LLM: Accurate pairwise semantic comparison (slower but more precise)
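
The embedding method above can be illustrated with a minimal sketch. It is not the code in embedding_service: the function names, the greedy "compare with the group's first member" strategy, and the 0.85 default threshold are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_threshold(
    vectors: Sequence[Sequence[float]], threshold: float = 0.85
) -> List[List[int]]:
    """Greedy grouping (hypothetical strategy): each vector joins the first
    group whose representative (its first member) is at least `threshold`
    similar; otherwise it starts a new group. Returns groups of indices."""
    groups: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine_similarity(vectors[group[0]], vec) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


if __name__ == "__main__":
    # Toy 2-D "embeddings": the first two point in nearly the same direction.
    toy = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    print(group_by_threshold(toy, threshold=0.85))
```

Raising the threshold yields more, smaller groups; lowering it merges more descriptions together.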

Backend changes:
- Add deduplication router with /deduplicate endpoint
- Add embedding_service for vector-based similarity
- Add llm_deduplication_service for LLM-based comparison
- Improve expert_transformation error handling and progress reporting
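
As a rough sketch of why the LLM-based comparison is slower, the grouping below asks a judge about pairs of descriptions and merges those it considers equivalent. The `same_meaning` callable is a hypothetical stand-in for a real LLM call, and the union-find grouping is an assumption, not necessarily what llm_deduplication_service does.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def group_pairwise(
    descriptions: Sequence[str],
    same_meaning: Callable[[str, str], bool],
) -> List[List[int]]:
    """Group indices of descriptions that the judge considers equivalent.

    Up to n*(n-1)/2 judge calls are made, which is why a pairwise LLM
    method is expected to be slower than comparing embeddings.
    """
    parent = list(range(len(descriptions)))

    def find(i: int) -> int:
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(descriptions)), 2):
        # Skip the (expensive) judge call when both are already grouped.
        if find(i) != find(j) and same_meaning(descriptions[i], descriptions[j]):
            parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(descriptions)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


if __name__ == "__main__":
    # Stub judge for demonstration only; a real one would prompt an LLM.
    def stub(a: str, b: str) -> bool:
        return a.lower() == b.lower()

    print(group_pairwise(["solar roof", "Solar Roof", "smart lock"], stub))
```

Because merged pairs are skipped, the number of judge calls drops as groups form, but the worst case remains quadratic in the number of descriptions.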

Frontend changes:
- Add DeduplicationPanel with interactive group visualization
- Add useDeduplication hook for state management
- Integrate deduplication tab in main App
- Add threshold slider and method selector in sidebar

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 20:26:17 +08:00
parent 5571076406
commit bc281b8e0a
18 changed files with 1397 additions and 25 deletions

@@ -90,16 +90,15 @@ def get_single_description_prompt(
 ) -> str:
     """Step 2: Generate a description for a single keyword"""
     # If the domain is the generic one, use only the profession name
-    domain_text = f"{expert_domain}" if expert_domain and expert_domain != "Professional Field" else ""
+    domain_text = f"{expert_domain}領域" if expert_domain and expert_domain != "Professional Field" else ""
     return f"""/no_think
-物件:「{query}」
-專家:{expert_name}{domain_text}
+你是一位{expert_name}{domain_text}
+任務:為「{query}」生成一段創新應用描述。
 關鍵字:{keyword}
-你是一位{expert_name}。從你的專業視角生成一段創新應用描述15-30字,說明如何將「{keyword}」的概念應用到「{query}」上。
+從你的專業視角,說明如何將「{keyword}」的概念應用到「{query}」上。描述要具體、有創意15-30字。
 描述要體現{expert_name}的專業思維和獨特觀點。
-回傳 JSON
-{{"description": "應用描述"}}"""
+只回傳 JSON不要其他文字
+{{"description": "你的創新應用描述"}}"""
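
To make the effect of the `domain_text` change concrete, here is a small sketch that isolates the conditional and the persona line as they appear in the hunk above. The helper name `build_persona_line` and its signature are hypothetical; the real function's parameter list is not visible in this diff.

```python
from typing import Optional


def build_persona_line(expert_name: str, expert_domain: Optional[str]) -> str:
    """Render the opening persona line of the updated prompt.

    Mirrors the new conditional: the suffix "領域" ("field") is appended to
    the domain, and the generic placeholder "Professional Field" (or an
    empty/None domain) contributes nothing.
    """
    domain_text = (
        f"{expert_domain}領域"
        if expert_domain and expert_domain != "Professional Field"
        else ""
    )
    # "你是一位" means "You are a/an"; the line is copied as shown in the diff.
    return f"你是一位{expert_name}{domain_text}"


if __name__ == "__main__":
    print(build_persona_line("建築師", "結構工程"))            # specific domain
    print(build_persona_line("建築師", "Professional Field"))  # generic domain
```

With a generic or missing domain the prompt falls back to the profession name alone, which matches the intent stated in the code comment.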