feat: Add Deduplication Agent with embedding and LLM methods

Implement a new Deduplication Agent that identifies and groups similar
transformation descriptions. Supports two deduplication methods:
- Embedding: Fast comparison of embedding vectors using cosine similarity
- LLM: Pairwise semantic comparison by an LLM (slower but more precise)
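
As a rough illustration of the embedding method, the sketch below groups vectors whose cosine similarity to a group's first member meets a threshold. The greedy grouping strategy, the function names `cosine_similarity` and `group_by_similarity`, the default threshold, and the toy vectors are all assumptions made for illustration; the actual `embedding_service` may group items differently.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_similarity(
    vectors: Sequence[Sequence[float]], threshold: float = 0.85
) -> List[List[int]]:
    """Greedily assign each vector to the first group whose representative
    (its first member) is at least `threshold` similar; otherwise start a
    new group. Returns groups as lists of indices into `vectors`.
    Hypothetical strategy, not taken from the commit."""
    groups: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for group in groups:
            representative = vectors[group[0]]
            if cosine_similarity(vec, representative) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was similar enough.
            groups.append([i])
    return groups


# Toy 2-D "embeddings": the first two point in nearly the same direction.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
groups = group_by_similarity(toy_vectors, threshold=0.85)
print(groups)
```

Raising the threshold makes grouping stricter, which is presumably the role of the threshold slider mentioned under the frontend changes.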

Backend changes:
- Add deduplication router with /deduplicate endpoint
- Add embedding_service for vector-based similarity
- Add llm_deduplication_service for LLM-based comparison
- Improve expert_transformation error handling and progress reporting

Frontend changes:
- Add DeduplicationPanel with interactive group visualization
- Add useDeduplication hook for state management
- Integrate deduplication tab in main App
- Add threshold slider and method selector in sidebar

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 20:26:17 +08:00
parent 5571076406
commit bc281b8e0a
18 changed files with 1397 additions and 25 deletions


@@ -3,14 +3,18 @@ from contextlib import asynccontextmanager
 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
-from .routers import attributes, transformation, expert_transformation
+from .routers import attributes, transformation, expert_transformation, deduplication
 from .services.llm_service import ollama_provider
+from .services.embedding_service import embedding_service
+from .services.llm_deduplication_service import llm_deduplication_service
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     yield
     await ollama_provider.close()
+    await embedding_service.close()
+    await llm_deduplication_service.close()
 app = FastAPI(
@@ -31,6 +35,7 @@ app.add_middleware(
 app.include_router(attributes.router)
 app.include_router(transformation.router)
 app.include_router(expert_transformation.router)
+app.include_router(deduplication.router)
 @app.get("/")