feat: Add experiments framework and novelty-driven agent loop

- Add complete experiments directory with pilot study infrastructure
  - 5 experimental conditions (direct, expert-only, attribute-only, full-pipeline, random-perspective)
  - Human assessment tool with React frontend and FastAPI backend
  - AUT flexibility analysis with jump signal detection
  - Result visualization and metrics computation

- Add novelty-driven agent loop module (experiments/novelty_loop/)
  - NoveltyDrivenTaskAgent with expert perspective perturbation
  - Three termination strategies: breakthrough, exhaust, coverage
  - Interactive CLI demo with colored output
  - Embedding-based novelty scoring

- Add DDC knowledge domain classification data (en/zh)
- Add CLAUDE.md project documentation
- Update research report with experiment findings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 10:16:21 +08:00
parent 26a56a2a07
commit 43c025e060
81 changed files with 18766 additions and 2 deletions


@@ -0,0 +1,314 @@
# Human Assessment Web Interface
A standalone web application for human assessment of generated ideas using Torrance-inspired creativity metrics.
## Overview
This tool enables blind evaluation of creative ideas generated by the novelty-seeking experiment. Raters assess ideas on four dimensions without knowing which experimental condition produced each idea, ensuring unbiased evaluation.
## Quick Start
```bash
cd experiments/assessment
# 1. Prepare assessment data (if not already done)
python3 prepare_data.py
# 2. Start the system
./start.sh
# 3. Open browser
open http://localhost:5174
```
## Directory Structure
```
assessment/
├── backend/
│   ├── app.py                 # FastAPI backend API
│   ├── database.py            # SQLite database operations
│   ├── models.py              # Pydantic models & dimension definitions
│   └── requirements.txt       # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── components/        # React UI components
│   │   ├── hooks/             # React state management
│   │   ├── services/          # API client
│   │   └── types/             # TypeScript definitions
│   └── package.json
├── data/
│   └── assessment_items.json  # Prepared ideas for rating
├── results/
│   └── ratings.db             # SQLite database with ratings
├── prepare_data.py            # Data preparation script
├── analyze_ratings.py         # Inter-rater reliability analysis
├── start.sh                   # Start both servers
├── stop.sh                    # Stop all services
└── README.md                  # This file
```
## Data Preparation
### List Available Experiment Files
```bash
python3 prepare_data.py --list
```
Output:
```
Available experiment files (most recent first):
experiment_20260119_165650_deduped.json (1571.3 KB)
experiment_20260119_163040_deduped.json (156.4 KB)
```
### Prepare Assessment Data
```bash
# Use all ideas (not recommended for human assessment)
python3 prepare_data.py
# RECOMMENDED: Stratified sampling - 4 ideas per condition per query
# Results in ~200 ideas (5 conditions × 4 ideas × 10 queries)
python3 prepare_data.py --per-condition 4
# Alternative: Sample 150 ideas total (proportionally across queries)
python3 prepare_data.py --sample 150
# Limit per query (20 ideas max per query)
python3 prepare_data.py --per-query 20
# Combined: 4 per condition, max 15 per query
python3 prepare_data.py --per-condition 4 --per-query 15
# Specify a different experiment file
python3 prepare_data.py experiment_20260119_163040_deduped.json --per-condition 4
```
### Sampling Options
| Option | Description | Example |
|--------|-------------|---------|
| `--per-condition N` | Max N ideas per condition per query (stratified) | `--per-condition 4` → ~200 ideas |
| `--per-query N` | Max N ideas per query | `--per-query 20` |
| `--sample N` | Total N ideas (proportionally distributed) | `--sample 150` |
| `--seed N` | Random seed for reproducibility | `--seed 42` (default) |
**Recommendation**: Use `--per-condition 4` for balanced assessment across conditions.
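Under the hood, `--per-condition` performs stratified sampling per query. The sketch below is only illustrative (the actual logic lives in `prepare_data.py`, not shown here): group a query's ideas by condition, keep at most N per condition, and shuffle with the fixed seed so every run yields the same sample.
```python
import random
from collections import defaultdict

def stratified_sample(ideas: list[dict], per_condition: int = 4, seed: int = 42) -> list[dict]:
    """Illustrative sketch of per-condition capping (hypothetical helper)."""
    rng = random.Random(seed)
    by_condition: dict[str, list[dict]] = defaultdict(list)
    for idea in ideas:  # each idea dict is assumed to carry its condition label
        by_condition[idea["condition"]].append(idea)
    sampled: list[dict] = []
    for _, group in sorted(by_condition.items()):
        rng.shuffle(group)
        sampled.extend(group[:per_condition])
    rng.shuffle(sampled)  # present ideas in a randomized order
    return sampled
```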
The script:
1. Loads the deduped experiment results
2. Extracts all unique ideas with hidden metadata (condition, expert, keyword)
3. Assigns stable IDs to each idea
4. Shuffles ideas within each query (reproducible with seed=42)
5. Outputs `data/assessment_items.json`
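The resulting file can be sanity-checked directly; the field names below are taken from the loaders in `app.py` and `analyze_ratings.py`:
```python
import json
from pathlib import Path

data = json.loads(Path("data/assessment_items.json").read_text(encoding="utf-8"))
print(data["experiment_id"], data["total_ideas"], data["query_count"], data["conditions"])

for query in data["queries"]:
    first = query["ideas"][0]
    # Raters only ever see idea_id and text; the _hidden block
    # (condition, expert_name, keyword) is reserved for export and analysis.
    print(query["query_id"], query["query_text"], query["idea_count"])
    print("  ", first["idea_id"], first["_hidden"]["condition"])
```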
## Assessment Dimensions
Raters evaluate each idea on four dimensions using a 1-5 Likert scale:
### Originality
*How unexpected or surprising is this idea?*
| Score | Description |
|-------|-------------|
| 1 | Very common/obvious idea anyone would suggest |
| 2 | Somewhat common, slight variation on expected ideas |
| 3 | Moderately original, some unexpected elements |
| 4 | Quite original, notably different approach |
| 5 | Highly unexpected, truly novel concept |
### Elaboration
*How detailed and well-developed is this idea?*
| Score | Description |
|-------|-------------|
| 1 | Vague, minimal detail, just a concept |
| 2 | Basic idea with little specificity |
| 3 | Moderately detailed, some specifics provided |
| 4 | Well-developed with clear implementation hints |
| 5 | Highly specific, thoroughly developed concept |
### Coherence
*Does this idea make logical sense and relate to the query object?*
| Score | Description |
|-------|-------------|
| 1 | Nonsensical, irrelevant, or incomprehensible |
| 2 | Mostly unclear, weak connection to query |
| 3 | Partially coherent, some logical gaps |
| 4 | Mostly coherent with minor issues |
| 5 | Fully coherent, clearly relates to query |
### Usefulness
*Could this idea have practical value or inspire real innovation?*
| Score | Description |
|-------|-------------|
| 1 | No practical value whatsoever |
| 2 | Minimal usefulness, highly impractical |
| 3 | Some potential value with major limitations |
| 4 | Useful idea with realistic applications |
| 5 | Highly useful, clear practical value |
## Running the System
### Start
```bash
./start.sh
```
This will:
1. Check for `data/assessment_items.json` (runs `prepare_data.py` if missing)
2. Install frontend dependencies if needed
3. Start backend API on port 8002
4. Start frontend dev server on port 5174
### Stop
```bash
./stop.sh
```
Or press `Ctrl+C` in the terminal running `start.sh`.
### Manual Start (Development)
```bash
# Terminal 1: Backend
cd backend
../../../backend/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8002 --reload
# Terminal 2: Frontend
cd frontend
npm run dev
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/health` | GET | Health check |
| `/api/info` | GET | Experiment info (total ideas, queries, conditions) |
| `/api/dimensions` | GET | Dimension definitions for UI |
| `/api/raters` | GET | List all raters |
| `/api/raters` | POST | Register/login rater |
| `/api/queries` | GET | List all queries |
| `/api/queries/{id}` | GET | Get query with all ideas |
| `/api/queries/{id}/unrated?rater_id=X` | GET | Get unrated ideas for rater |
| `/api/ratings` | POST | Submit a rating |
| `/api/progress/{rater_id}` | GET | Get rater's progress |
| `/api/statistics` | GET | Overall statistics |
| `/api/export` | GET | Export all ratings with metadata |
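The React frontend is the intended client, but the API can also be exercised directly. A minimal sketch with `requests`, assuming the backend is already running on port 8002:
```python
import requests

BASE = "http://localhost:8002"

# Register (or log back in as) a rater
requests.post(f"{BASE}/api/raters", json={"rater_id": "rater_01"}).raise_for_status()

# Fetch the ideas this rater has not yet scored for the first query
queries = requests.get(f"{BASE}/api/queries").json()
unrated = requests.get(
    f"{BASE}/api/queries/{queries[0]['query_id']}/unrated",
    params={"rater_id": "rater_01"},
).json()

# Submit a rating for the first unrated idea
# (all four dimensions are required unless skipped is true)
if unrated["ideas"]:
    idea = unrated["ideas"][0]
    resp = requests.post(f"{BASE}/api/ratings", json={
        "rater_id": "rater_01",
        "idea_id": idea["idea_id"],
        "query_id": unrated["query_id"],
        "originality": 3, "elaboration": 4, "coherence": 5, "usefulness": 2,
        "skipped": False,
    })
    print(resp.json())
```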
## Analysis
After collecting ratings from multiple raters:
```bash
python3 analyze_ratings.py
```
This calculates:
- **Krippendorff's alpha**: Inter-rater reliability for ordinal data
- **ICC(2,1)**: Intraclass Correlation Coefficient with 95% CI
- **Mean ratings per condition**: Compare experimental conditions
- **Kruskal-Wallis test**: Statistical significance between conditions
Output is saved to `results/analysis_results.json`.
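The exported JSON is plain data and can be consumed by downstream scripts or notebooks; the keys below match what `analyze_ratings.py` writes:
```python
import json
from pathlib import Path

results = json.loads(Path("results/analysis_results.json").read_text(encoding="utf-8"))

# condition_stats[condition][dimension] -> {"mean": ..., "std": ..., "n": ...}
for condition, dims in results["condition_stats"].items():
    summary = ", ".join(f"{dim}={s['mean']:.2f}" for dim, s in dims.items())
    print(f"{condition}: {summary}")
```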
## Database Schema
SQLite database (`results/ratings.db`):
```sql
-- Raters
CREATE TABLE raters (
    rater_id TEXT PRIMARY KEY,
    name TEXT,
    created_at TIMESTAMP
);

-- Ratings
CREATE TABLE ratings (
    id INTEGER PRIMARY KEY,
    rater_id TEXT,
    idea_id TEXT,
    query_id TEXT,
    originality INTEGER CHECK(originality BETWEEN 1 AND 5),
    elaboration INTEGER CHECK(elaboration BETWEEN 1 AND 5),
    coherence INTEGER CHECK(coherence BETWEEN 1 AND 5),
    usefulness INTEGER CHECK(usefulness BETWEEN 1 AND 5),
    skipped INTEGER DEFAULT 0,
    timestamp TIMESTAMP,
    UNIQUE(rater_id, idea_id)
);

-- Progress tracking
CREATE TABLE progress (
    rater_id TEXT,
    query_id TEXT,
    completed_count INTEGER,
    total_count INTEGER,
    PRIMARY KEY (rater_id, query_id)
);
```
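For quick ad-hoc checks outside of `analyze_ratings.py`, the database can also be queried directly. A small sketch:
```python
import sqlite3

conn = sqlite3.connect("results/ratings.db")
conn.row_factory = sqlite3.Row

# Mean score per dimension for each rater, ignoring skipped items
rows = conn.execute("""
    SELECT rater_id,
           COUNT(*)         AS n,
           AVG(originality) AS originality,
           AVG(elaboration) AS elaboration,
           AVG(coherence)   AS coherence,
           AVG(usefulness)  AS usefulness
    FROM ratings
    WHERE skipped = 0
    GROUP BY rater_id
""").fetchall()

for row in rows:
    print(dict(row))
conn.close()
```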
## Blind Assessment Design
To ensure unbiased evaluation:
1. **Randomization**: Ideas are shuffled within each query using a fixed seed (42) for reproducibility
2. **Hidden metadata**: Condition, expert name, and keywords are stored but not shown to raters
3. **Consistent ordering**: All raters see the same randomized order
4. **Context provided**: Only the query text is shown (e.g., "Chair", "Bicycle")
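In code terms, the split works roughly like the sketch below (illustrative only; the real shuffle happens once in `prepare_data.py` and the real filtering in the `/api/queries` endpoints):
```python
import random

def to_rater_view(query: dict, seed: int = 42) -> tuple[list[dict], dict]:
    """Illustrative: what raters receive vs. what stays hidden (hypothetical helper)."""
    ideas = list(query["ideas"])
    random.Random(seed).shuffle(ideas)  # same fixed order for every rater
    visible = [{"idea_id": i["idea_id"], "text": i["text"]} for i in ideas]
    hidden = {i["idea_id"]: i["_hidden"] for i in ideas}  # condition / expert_name / keyword
    return visible, hidden
```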
## Workflow for Raters
1. **Login**: Enter a unique rater ID
2. **Instructions**: Read dimension definitions (shown before first rating)
3. **Rate ideas**: For each idea:
- Read the idea text
- Rate all 4 dimensions (1-5)
- Click "Submit & Next" or "Skip"
4. **Progress**: Track completion per query and overall
5. **Completion**: Summary shown when all ideas are rated
## Troubleshooting
### Backend won't start
```bash
# Check if port 8002 is in use
lsof -i :8002
# Check backend logs
cat /tmp/assessment_backend.log
```
### Frontend won't start
```bash
# Reinstall dependencies
cd frontend
rm -rf node_modules
npm install
```
### Reset database
```bash
rm results/ratings.db
# Database is auto-created on next backend start
```
### Regenerate assessment data
```bash
rm data/assessment_items.json
python3 prepare_data.py
```
## Tech Stack
- **Backend**: Python 3.11+, FastAPI, SQLite, Pydantic
- **Frontend**: React 19, TypeScript, Vite, Ant Design 6.0
- **Analysis**: NumPy, SciPy (for statistical tests)


@@ -0,0 +1,356 @@
#!/usr/bin/env python3
"""
Analyze assessment ratings for inter-rater reliability and condition comparisons.
This script:
1. Loads ratings from the SQLite database
2. Joins with hidden metadata (condition, expert)
3. Calculates inter-rater reliability metrics
4. Computes mean ratings per dimension per condition
5. Performs statistical comparisons between conditions
"""
import json
import sqlite3
from collections import defaultdict
from datetime import datetime
from pathlib import Path
from typing import Any
import numpy as np
from scipy import stats
# Paths
RESULTS_DIR = Path(__file__).parent / 'results'
DATA_DIR = Path(__file__).parent / 'data'
DB_PATH = RESULTS_DIR / 'ratings.db'
ASSESSMENT_DATA_PATH = DATA_DIR / 'assessment_items.json'
def load_assessment_data() -> dict[str, Any]:
"""Load the assessment items data with hidden metadata."""
with open(ASSESSMENT_DATA_PATH, 'r', encoding='utf-8') as f:
return json.load(f)
def load_ratings_from_db() -> list[dict[str, Any]]:
"""Load all ratings from the SQLite database."""
if not DB_PATH.exists():
print(f"Database not found at {DB_PATH}")
return []
conn = sqlite3.connect(DB_PATH)
conn.row_factory = sqlite3.Row
cursor = conn.cursor()
cursor.execute('''
SELECT r.*, rat.name as rater_name
FROM ratings r
LEFT JOIN raters rat ON r.rater_id = rat.rater_id
WHERE r.skipped = 0
''')
ratings = [dict(row) for row in cursor.fetchall()]
conn.close()
return ratings
def build_idea_lookup(assessment_data: dict[str, Any]) -> dict[str, dict[str, Any]]:
"""Build a lookup table from idea_id to metadata."""
lookup = {}
for query in assessment_data['queries']:
for idea in query['ideas']:
lookup[idea['idea_id']] = {
'text': idea['text'],
'query_id': query['query_id'],
'query_text': query['query_text'],
**idea['_hidden']
}
return lookup
def calculate_krippendorff_alpha(ratings_matrix: np.ndarray) -> float:
"""
Calculate Krippendorff's alpha for ordinal data.
Args:
ratings_matrix: 2D array where rows are items and columns are raters.
NaN values indicate missing ratings.
Returns:
Krippendorff's alpha coefficient
"""
# Drop items with no ratings at all; items rated by fewer than 2 raters are skipped below
valid_items = ~np.all(np.isnan(ratings_matrix), axis=1)
ratings_matrix = ratings_matrix[valid_items]
if ratings_matrix.shape[0] < 2:
return np.nan
n_items, n_raters = ratings_matrix.shape
# Observed disagreement
observed_disagreement = 0
n_pairs = 0
for i in range(n_items):
values = ratings_matrix[i, ~np.isnan(ratings_matrix[i])]
if len(values) < 2:
continue
# Ordinal distance: squared difference
for j in range(len(values)):
for k in range(j + 1, len(values)):
observed_disagreement += (values[j] - values[k]) ** 2
n_pairs += 1
if n_pairs == 0:
return np.nan
observed_disagreement /= n_pairs
# Expected disagreement (based on marginal distribution)
all_values = ratings_matrix[~np.isnan(ratings_matrix)]
if len(all_values) < 2:
return np.nan
expected_disagreement = 0
n_total_pairs = 0
for i in range(len(all_values)):
for j in range(i + 1, len(all_values)):
expected_disagreement += (all_values[i] - all_values[j]) ** 2
n_total_pairs += 1
if n_total_pairs == 0:
return np.nan
expected_disagreement /= n_total_pairs
if expected_disagreement == 0:
return 1.0
alpha = 1 - (observed_disagreement / expected_disagreement)
return alpha
def calculate_icc(ratings_matrix: np.ndarray) -> tuple[float, float, float]:
"""
Calculate Intraclass Correlation Coefficient (ICC(2,1)).
Args:
ratings_matrix: 2D array where rows are items and columns are raters.
Returns:
Tuple of (ICC, lower_bound, upper_bound)
"""
# Remove rows with any NaN
valid_rows = ~np.any(np.isnan(ratings_matrix), axis=1)
ratings_matrix = ratings_matrix[valid_rows]
if ratings_matrix.shape[0] < 2 or ratings_matrix.shape[1] < 2:
return np.nan, np.nan, np.nan
n, k = ratings_matrix.shape
# Grand mean
grand_mean = np.mean(ratings_matrix)
# Row means (item means)
row_means = np.mean(ratings_matrix, axis=1)
# Column means (rater means)
col_means = np.mean(ratings_matrix, axis=0)
# Sum of squares
ss_total = np.sum((ratings_matrix - grand_mean) ** 2)
ss_rows = k * np.sum((row_means - grand_mean) ** 2)
ss_cols = n * np.sum((col_means - grand_mean) ** 2)
ss_error = ss_total - ss_rows - ss_cols
# Mean squares
ms_rows = ss_rows / (n - 1) if n > 1 else 0
ms_cols = ss_cols / (k - 1) if k > 1 else 0
ms_error = ss_error / ((n - 1) * (k - 1)) if (n > 1 and k > 1) else 0
# ICC(2,1) - two-way random, absolute agreement, single rater
# Guard against a zero denominator before computing ICC(2,1)
denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
if denominator == 0:
return np.nan, np.nan, np.nan
icc = (ms_rows - ms_error) / denominator
# Confidence interval (approximate)
# Using F distribution
df1 = n - 1
df2 = (n - 1) * (k - 1)
if ms_error == 0:
return icc, np.nan, np.nan
f_value = ms_rows / ms_error
f_lower = f_value / stats.f.ppf(0.975, df1, df2)
f_upper = f_value / stats.f.ppf(0.025, df1, df2)
icc_lower = (f_lower - 1) / (f_lower + k - 1)
icc_upper = (f_upper - 1) / (f_upper + k - 1)
return icc, icc_lower, icc_upper
def analyze_ratings():
"""Main analysis function."""
print("=" * 60)
print("CREATIVE IDEA ASSESSMENT ANALYSIS")
print("=" * 60)
print()
# Load data
assessment_data = load_assessment_data()
ratings = load_ratings_from_db()
idea_lookup = build_idea_lookup(assessment_data)
if not ratings:
print("No ratings found in database.")
return
print(f"Loaded {len(ratings)} ratings from database")
print(f"Experiment ID: {assessment_data['experiment_id']}")
print()
# Get unique raters
raters = list(set(r['rater_id'] for r in ratings))
print(f"Raters: {raters}")
print()
# Join ratings with metadata
enriched_ratings = []
for r in ratings:
idea_meta = idea_lookup.get(r['idea_id'], {})
enriched_ratings.append({
**r,
'condition': idea_meta.get('condition', 'unknown'),
'expert_name': idea_meta.get('expert_name', ''),
'keyword': idea_meta.get('keyword', ''),
'query_text': idea_meta.get('query_text', ''),
'idea_text': idea_meta.get('text', '')
})
# Dimensions
dimensions = ['originality', 'elaboration', 'coherence', 'usefulness']
# ================================
# Inter-rater reliability
# ================================
print("-" * 60)
print("INTER-RATER RELIABILITY")
print("-" * 60)
print()
if len(raters) >= 2:
# Build ratings matrix per dimension
idea_ids = list(set(r['idea_id'] for r in enriched_ratings))
for dim in dimensions:
# Create matrix: rows = ideas, cols = raters
matrix = np.full((len(idea_ids), len(raters)), np.nan)
idea_to_idx = {idea: idx for idx, idea in enumerate(idea_ids)}
rater_to_idx = {rater: idx for idx, rater in enumerate(raters)}
for r in enriched_ratings:
if r[dim] is not None:
i = idea_to_idx[r['idea_id']]
j = rater_to_idx[r['rater_id']]
matrix[i, j] = r[dim]
# Calculate metrics
alpha = calculate_krippendorff_alpha(matrix)
icc, icc_low, icc_high = calculate_icc(matrix)
print(f"{dim.upper()}:")
print(f" Krippendorff's alpha: {alpha:.3f}")
print(f" ICC(2,1): {icc:.3f} (95% CI: {icc_low:.3f} - {icc_high:.3f})")
print()
else:
print("Need at least 2 raters for inter-rater reliability analysis.")
print()
# ================================
# Condition comparisons
# ================================
print("-" * 60)
print("MEAN RATINGS BY CONDITION")
print("-" * 60)
print()
# Group ratings by condition
condition_ratings: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
for r in enriched_ratings:
condition = r['condition']
for dim in dimensions:
if r[dim] is not None:
condition_ratings[condition][dim].append(r[dim])
# Calculate means and print
condition_stats = {}
for condition in sorted(condition_ratings.keys()):
print(f"\n{condition}:")
condition_stats[condition] = {}
for dim in dimensions:
values = condition_ratings[condition][dim]
if values:
mean = np.mean(values)
std = np.std(values)
n = len(values)
condition_stats[condition][dim] = {'mean': mean, 'std': std, 'n': n}
print(f" {dim}: {mean:.2f} (SD={std:.2f}, n={n})")
else:
print(f" {dim}: no data")
# ================================
# Statistical comparisons
# ================================
print()
print("-" * 60)
print("STATISTICAL COMPARISONS (Kruskal-Wallis)")
print("-" * 60)
print()
conditions = sorted(condition_ratings.keys())
if len(conditions) >= 2:
for dim in dimensions:
groups = [condition_ratings[c][dim] for c in conditions if condition_ratings[c][dim]]
if len(groups) >= 2:
h_stat, p_value = stats.kruskal(*groups)
sig = "*" if p_value < 0.05 else ""
print(f"{dim}: H={h_stat:.2f}, p={p_value:.4f} {sig}")
else:
print(f"{dim}: insufficient data for comparison")
else:
print("Need at least 2 conditions with data for statistical comparison.")
# ================================
# Export results
# ================================
output = {
'analysis_timestamp': datetime.utcnow().isoformat(),
'experiment_id': assessment_data['experiment_id'],
'total_ratings': len(ratings),
'raters': raters,
'rater_count': len(raters),
'condition_stats': condition_stats,
'enriched_ratings': enriched_ratings
}
output_path = RESULTS_DIR / 'analysis_results.json'
with open(output_path, 'w', encoding='utf-8') as f:
json.dump(output, f, ensure_ascii=False, indent=2, default=str)
print()
print("-" * 60)
print(f"Results exported to: {output_path}")
print("=" * 60)
if __name__ == '__main__':
analyze_ratings()


@@ -0,0 +1 @@
"""Assessment backend package."""


@@ -0,0 +1,374 @@
"""
FastAPI backend for human assessment of creative ideas.
"""
import json
from datetime import datetime
from pathlib import Path
from typing import Any
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
try:
from . import database as db
from .models import (
DIMENSION_DEFINITIONS,
ExportData,
ExportRating,
IdeaForRating,
Progress,
QueryInfo,
QueryWithIdeas,
Rater,
RaterCreate,
RaterProgress,
Rating,
RatingSubmit,
Statistics,
)
except ImportError:
import database as db
from models import (
DIMENSION_DEFINITIONS,
ExportData,
ExportRating,
IdeaForRating,
Progress,
QueryInfo,
QueryWithIdeas,
Rater,
RaterCreate,
RaterProgress,
Rating,
RatingSubmit,
Statistics,
)
# Load assessment data
DATA_PATH = Path(__file__).parent.parent / 'data' / 'assessment_items.json'
def load_assessment_data() -> dict[str, Any]:
"""Load the assessment items data."""
if not DATA_PATH.exists():
raise RuntimeError(f"Assessment data not found at {DATA_PATH}. Run prepare_data.py first.")
with open(DATA_PATH, 'r', encoding='utf-8') as f:
return json.load(f)
# Initialize FastAPI app
app = FastAPI(
title="Creative Idea Assessment API",
description="API for human assessment of creative ideas using Torrance-inspired metrics",
version="1.0.0"
)
# CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Cache for assessment data
_assessment_data: dict[str, Any] | None = None
def get_assessment_data() -> dict[str, Any]:
"""Get cached assessment data."""
global _assessment_data
if _assessment_data is None:
_assessment_data = load_assessment_data()
return _assessment_data
# Rater endpoints
@app.get("/api/raters", response_model=list[Rater])
def list_raters() -> list[dict[str, Any]]:
"""List all registered raters."""
return db.list_raters()
@app.post("/api/raters", response_model=Rater)
def create_or_get_rater(rater_data: RaterCreate) -> dict[str, Any]:
"""Register a new rater or get existing one."""
return db.create_rater(rater_data.rater_id, rater_data.name)
@app.get("/api/raters/{rater_id}", response_model=Rater)
def get_rater(rater_id: str) -> dict[str, Any]:
"""Get a specific rater."""
rater = db.get_rater(rater_id)
if not rater:
raise HTTPException(status_code=404, detail="Rater not found")
return rater
# Query endpoints
@app.get("/api/queries", response_model=list[QueryInfo])
def list_queries() -> list[dict[str, Any]]:
"""List all queries available for assessment."""
data = get_assessment_data()
return [
{
'query_id': q['query_id'],
'query_text': q['query_text'],
'category': q.get('category', ''),
'idea_count': q['idea_count']
}
for q in data['queries']
]
@app.get("/api/queries/{query_id}", response_model=QueryWithIdeas)
def get_query_with_ideas(query_id: str) -> dict[str, Any]:
"""Get a query with all its ideas for rating (without hidden metadata)."""
data = get_assessment_data()
for query in data['queries']:
if query['query_id'] == query_id:
ideas = [
IdeaForRating(
idea_id=idea['idea_id'],
text=idea['text'],
index=idx
)
for idx, idea in enumerate(query['ideas'])
]
return QueryWithIdeas(
query_id=query['query_id'],
query_text=query['query_text'],
category=query.get('category', ''),
ideas=ideas,
total_count=len(ideas)
)
raise HTTPException(status_code=404, detail="Query not found")
@app.get("/api/queries/{query_id}/unrated", response_model=QueryWithIdeas)
def get_unrated_ideas(query_id: str, rater_id: str) -> dict[str, Any]:
"""Get unrated ideas for a query by a specific rater."""
data = get_assessment_data()
for query in data['queries']:
if query['query_id'] == query_id:
# Get already rated idea IDs
rated_ids = db.get_rated_idea_ids(rater_id, query_id)
# Filter to unrated ideas
unrated_ideas = [
IdeaForRating(
idea_id=idea['idea_id'],
text=idea['text'],
index=idx
)
for idx, idea in enumerate(query['ideas'])
if idea['idea_id'] not in rated_ids
]
return QueryWithIdeas(
query_id=query['query_id'],
query_text=query['query_text'],
category=query.get('category', ''),
ideas=unrated_ideas,
total_count=query['idea_count']
)
raise HTTPException(status_code=404, detail="Query not found")
# Rating endpoints
@app.post("/api/ratings", response_model=dict[str, Any])
def submit_rating(rating: RatingSubmit) -> dict[str, Any]:
"""Submit a rating for an idea."""
# Validate that rater exists
rater = db.get_rater(rating.rater_id)
if not rater:
raise HTTPException(status_code=404, detail="Rater not found. Please register first.")
# Validate idea exists
data = get_assessment_data()
idea_found = False
for query in data['queries']:
for idea in query['ideas']:
if idea['idea_id'] == rating.idea_id:
idea_found = True
break
if idea_found:
break
if not idea_found:
raise HTTPException(status_code=404, detail="Idea not found")
# If not skipped, require all ratings
if not rating.skipped:
if rating.originality is None or rating.elaboration is None or rating.coherence is None or rating.usefulness is None:
raise HTTPException(
status_code=400,
detail="All dimensions must be rated unless skipping"
)
# Save rating
return db.save_rating(
rater_id=rating.rater_id,
idea_id=rating.idea_id,
query_id=rating.query_id,
originality=rating.originality,
elaboration=rating.elaboration,
coherence=rating.coherence,
usefulness=rating.usefulness,
skipped=rating.skipped
)
@app.get("/api/ratings/{rater_id}/{idea_id}", response_model=Rating | None)
def get_rating(rater_id: str, idea_id: str) -> dict[str, Any] | None:
"""Get a specific rating."""
return db.get_rating(rater_id, idea_id)
@app.get("/api/ratings/rater/{rater_id}", response_model=list[Rating])
def get_ratings_by_rater(rater_id: str) -> list[dict[str, Any]]:
"""Get all ratings by a rater."""
return db.get_ratings_by_rater(rater_id)
# Progress endpoints
@app.get("/api/progress/{rater_id}", response_model=RaterProgress)
def get_rater_progress(rater_id: str) -> RaterProgress:
"""Get complete progress for a rater."""
rater = db.get_rater(rater_id)
if not rater:
raise HTTPException(status_code=404, detail="Rater not found")
data = get_assessment_data()
# Get rated idea counts per query
ratings = db.get_ratings_by_rater(rater_id)
ratings_per_query: dict[str, int] = {}
for r in ratings:
qid = r['query_id']
ratings_per_query[qid] = ratings_per_query.get(qid, 0) + 1
# Build progress list
query_progress = []
total_completed = 0
total_ideas = 0
for query in data['queries']:
qid = query['query_id']
completed = ratings_per_query.get(qid, 0)
total = query['idea_count']
query_progress.append(Progress(
rater_id=rater_id,
query_id=qid,
completed_count=completed,
total_count=total
))
total_completed += completed
total_ideas += total
percentage = (total_completed / total_ideas * 100) if total_ideas > 0 else 0
return RaterProgress(
rater_id=rater_id,
queries=query_progress,
total_completed=total_completed,
total_ideas=total_ideas,
percentage=round(percentage, 1)
)
# Statistics endpoint
@app.get("/api/statistics", response_model=Statistics)
def get_statistics() -> Statistics:
"""Get overall assessment statistics."""
stats = db.get_statistics()
return Statistics(**stats)
# Dimension definitions endpoint
@app.get("/api/dimensions")
def get_dimensions() -> dict[str, Any]:
"""Get dimension definitions for the UI."""
return DIMENSION_DEFINITIONS
# Export endpoint
@app.get("/api/export", response_model=ExportData)
def export_ratings() -> ExportData:
"""Export all ratings with hidden metadata for analysis."""
data = get_assessment_data()
all_ratings = db.get_all_ratings()
# Build idea lookup with hidden metadata
idea_lookup: dict[str, dict[str, Any]] = {}
query_lookup: dict[str, str] = {}
for query in data['queries']:
query_lookup[query['query_id']] = query['query_text']
for idea in query['ideas']:
idea_lookup[idea['idea_id']] = {
'text': idea['text'],
'condition': idea['_hidden']['condition'],
'expert_name': idea['_hidden']['expert_name'],
'keyword': idea['_hidden']['keyword']
}
# Build export ratings
export_ratings = []
for r in all_ratings:
idea_data = idea_lookup.get(r['idea_id'], {})
export_ratings.append(ExportRating(
rater_id=r['rater_id'],
idea_id=r['idea_id'],
query_id=r['query_id'],
query_text=query_lookup.get(r['query_id'], ''),
idea_text=idea_data.get('text', ''),
originality=r['originality'],
elaboration=r['elaboration'],
coherence=r['coherence'],
usefulness=r['usefulness'],
skipped=bool(r['skipped']),
condition=idea_data.get('condition', ''),
expert_name=idea_data.get('expert_name', ''),
keyword=idea_data.get('keyword', ''),
timestamp=r['timestamp']
))
return ExportData(
experiment_id=data['experiment_id'],
export_timestamp=datetime.utcnow(),
rater_count=len(db.list_raters()),
rating_count=len(export_ratings),
ratings=export_ratings
)
# Health check
@app.get("/api/health")
def health_check() -> dict[str, str]:
"""Health check endpoint."""
return {"status": "healthy"}
# Info endpoint
@app.get("/api/info")
def get_info() -> dict[str, Any]:
"""Get assessment session info."""
data = get_assessment_data()
return {
'experiment_id': data['experiment_id'],
'total_ideas': data['total_ideas'],
'query_count': data['query_count'],
'conditions': data['conditions'],
'randomization_seed': data['randomization_seed']
}


@@ -0,0 +1,309 @@
"""
SQLite database setup and operations for assessment ratings storage.
"""
import sqlite3
from contextlib import contextmanager
from datetime import datetime
from pathlib import Path
from typing import Any, Generator
# Database path
DB_PATH = Path(__file__).parent.parent / 'results' / 'ratings.db'
def get_db_path() -> Path:
"""Get the database path, ensuring directory exists."""
DB_PATH.parent.mkdir(parents=True, exist_ok=True)
return DB_PATH
@contextmanager
def get_connection() -> Generator[sqlite3.Connection, None, None]:
"""Get a database connection as a context manager."""
conn = sqlite3.connect(get_db_path())
conn.row_factory = sqlite3.Row
try:
yield conn
finally:
conn.close()
def init_db() -> None:
"""Initialize the database with required tables."""
with get_connection() as conn:
cursor = conn.cursor()
# Raters table
cursor.execute('''
CREATE TABLE IF NOT EXISTS raters (
rater_id TEXT PRIMARY KEY,
name TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
# Ratings table
cursor.execute('''
CREATE TABLE IF NOT EXISTS ratings (
id INTEGER PRIMARY KEY AUTOINCREMENT,
rater_id TEXT NOT NULL,
idea_id TEXT NOT NULL,
query_id TEXT NOT NULL,
originality INTEGER CHECK(originality BETWEEN 1 AND 5),
elaboration INTEGER CHECK(elaboration BETWEEN 1 AND 5),
coherence INTEGER CHECK(coherence BETWEEN 1 AND 5),
usefulness INTEGER CHECK(usefulness BETWEEN 1 AND 5),
skipped INTEGER DEFAULT 0,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (rater_id) REFERENCES raters(rater_id),
UNIQUE(rater_id, idea_id)
)
''')
# Progress table
cursor.execute('''
CREATE TABLE IF NOT EXISTS progress (
rater_id TEXT NOT NULL,
query_id TEXT NOT NULL,
completed_count INTEGER DEFAULT 0,
total_count INTEGER DEFAULT 0,
started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (rater_id, query_id),
FOREIGN KEY (rater_id) REFERENCES raters(rater_id)
)
''')
# Create indexes for common queries
cursor.execute('''
CREATE INDEX IF NOT EXISTS idx_ratings_rater
ON ratings(rater_id)
''')
cursor.execute('''
CREATE INDEX IF NOT EXISTS idx_ratings_idea
ON ratings(idea_id)
''')
conn.commit()
# Rater operations
def create_rater(rater_id: str, name: str | None = None) -> dict[str, Any]:
"""Create a new rater."""
with get_connection() as conn:
cursor = conn.cursor()
try:
cursor.execute(
'INSERT INTO raters (rater_id, name) VALUES (?, ?)',
(rater_id, name or rater_id)
)
conn.commit()
return {'rater_id': rater_id, 'name': name or rater_id, 'created': True}
except sqlite3.IntegrityError:
# Rater already exists
return get_rater(rater_id)
def get_rater(rater_id: str) -> dict[str, Any] | None:
"""Get a rater by ID."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT * FROM raters WHERE rater_id = ?', (rater_id,))
row = cursor.fetchone()
if row:
return dict(row)
return None
def list_raters() -> list[dict[str, Any]]:
"""List all raters."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT * FROM raters ORDER BY created_at')
return [dict(row) for row in cursor.fetchall()]
# Rating operations
def save_rating(
rater_id: str,
idea_id: str,
query_id: str,
originality: int | None,
elaboration: int | None,
coherence: int | None,
usefulness: int | None,
skipped: bool = False
) -> dict[str, Any]:
"""Save or update a rating."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('''
INSERT INTO ratings (rater_id, idea_id, query_id, originality, elaboration, coherence, usefulness, skipped, timestamp)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(rater_id, idea_id) DO UPDATE SET
originality = excluded.originality,
elaboration = excluded.elaboration,
coherence = excluded.coherence,
usefulness = excluded.usefulness,
skipped = excluded.skipped,
timestamp = excluded.timestamp
''', (rater_id, idea_id, query_id, originality, elaboration, coherence, usefulness, int(skipped), datetime.utcnow()))
conn.commit()
# Update progress
update_progress(rater_id, query_id)
return {
'rater_id': rater_id,
'idea_id': idea_id,
'saved': True
}
def get_rating(rater_id: str, idea_id: str) -> dict[str, Any] | None:
"""Get a specific rating."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT * FROM ratings WHERE rater_id = ? AND idea_id = ?',
(rater_id, idea_id)
)
row = cursor.fetchone()
if row:
return dict(row)
return None
def get_ratings_by_rater(rater_id: str) -> list[dict[str, Any]]:
"""Get all ratings by a rater."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT * FROM ratings WHERE rater_id = ? ORDER BY timestamp',
(rater_id,)
)
return [dict(row) for row in cursor.fetchall()]
def get_ratings_by_idea(idea_id: str) -> list[dict[str, Any]]:
"""Get all ratings for an idea."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT * FROM ratings WHERE idea_id = ? ORDER BY rater_id',
(idea_id,)
)
return [dict(row) for row in cursor.fetchall()]
def get_all_ratings() -> list[dict[str, Any]]:
"""Get all ratings."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT * FROM ratings ORDER BY timestamp')
return [dict(row) for row in cursor.fetchall()]
# Progress operations
def update_progress(rater_id: str, query_id: str) -> None:
"""Update progress for a rater on a query."""
with get_connection() as conn:
cursor = conn.cursor()
# Count completed ratings for this query
cursor.execute('''
SELECT COUNT(*) as count FROM ratings
WHERE rater_id = ? AND query_id = ?
''', (rater_id, query_id))
completed = cursor.fetchone()['count']
# Update or insert progress
cursor.execute('''
INSERT INTO progress (rater_id, query_id, completed_count, updated_at)
VALUES (?, ?, ?, ?)
ON CONFLICT(rater_id, query_id) DO UPDATE SET
completed_count = excluded.completed_count,
updated_at = excluded.updated_at
''', (rater_id, query_id, completed, datetime.utcnow()))
conn.commit()
def set_progress_total(rater_id: str, query_id: str, total: int) -> None:
"""Set the total count for a query's progress."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('''
INSERT INTO progress (rater_id, query_id, total_count, completed_count)
VALUES (?, ?, ?, 0)
ON CONFLICT(rater_id, query_id) DO UPDATE SET
total_count = excluded.total_count
''', (rater_id, query_id, total))
conn.commit()
def get_progress(rater_id: str) -> list[dict[str, Any]]:
"""Get progress for all queries for a rater."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT * FROM progress WHERE rater_id = ? ORDER BY query_id',
(rater_id,)
)
return [dict(row) for row in cursor.fetchall()]
def get_progress_for_query(rater_id: str, query_id: str) -> dict[str, Any] | None:
"""Get progress for a specific query."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT * FROM progress WHERE rater_id = ? AND query_id = ?',
(rater_id, query_id)
)
row = cursor.fetchone()
if row:
return dict(row)
return None
def get_rated_idea_ids(rater_id: str, query_id: str) -> set[str]:
"""Get the set of idea IDs already rated by a rater for a query."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute(
'SELECT idea_id FROM ratings WHERE rater_id = ? AND query_id = ?',
(rater_id, query_id)
)
return {row['idea_id'] for row in cursor.fetchall()}
# Statistics
def get_statistics() -> dict[str, Any]:
"""Get overall statistics."""
with get_connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) as count FROM raters')
rater_count = cursor.fetchone()['count']
cursor.execute('SELECT COUNT(*) as count FROM ratings WHERE skipped = 0')
rating_count = cursor.fetchone()['count']
cursor.execute('SELECT COUNT(*) as count FROM ratings WHERE skipped = 1')
skip_count = cursor.fetchone()['count']
cursor.execute('SELECT COUNT(DISTINCT idea_id) as count FROM ratings')
rated_ideas = cursor.fetchone()['count']
return {
'rater_count': rater_count,
'rating_count': rating_count,
'skip_count': skip_count,
'rated_ideas': rated_ideas
}
# Initialize on import
init_db()


@@ -0,0 +1,183 @@
"""
Pydantic models for the assessment API.
"""
from datetime import datetime
from pydantic import BaseModel, Field
# Request models
class RaterCreate(BaseModel):
"""Request to create or login as a rater."""
rater_id: str = Field(..., min_length=1, max_length=50, description="Unique rater identifier")
name: str | None = Field(None, max_length=100, description="Optional display name")
class RatingSubmit(BaseModel):
"""Request to submit a rating."""
rater_id: str = Field(..., description="Rater identifier")
idea_id: str = Field(..., description="Idea identifier")
query_id: str = Field(..., description="Query identifier")
originality: int | None = Field(None, ge=1, le=5, description="Originality score 1-5")
elaboration: int | None = Field(None, ge=1, le=5, description="Elaboration score 1-5")
coherence: int | None = Field(None, ge=1, le=5, description="Coherence score 1-5")
usefulness: int | None = Field(None, ge=1, le=5, description="Usefulness score 1-5")
skipped: bool = Field(False, description="Whether the idea was skipped")
# Response models
class Rater(BaseModel):
"""Rater information."""
rater_id: str
name: str | None
created_at: datetime | None = None
class Rating(BaseModel):
"""A single rating."""
id: int
rater_id: str
idea_id: str
query_id: str
originality: int | None
elaboration: int | None
coherence: int | None
usefulness: int | None
skipped: int
timestamp: datetime | None
class Progress(BaseModel):
"""Progress for a rater on a query."""
rater_id: str
query_id: str
completed_count: int
total_count: int
started_at: datetime | None = None
updated_at: datetime | None = None
class QueryInfo(BaseModel):
"""Information about a query."""
query_id: str
query_text: str
category: str
idea_count: int
class IdeaForRating(BaseModel):
"""An idea presented for rating (without hidden metadata)."""
idea_id: str
text: str
index: int # Position in the randomized list for this query
class QueryWithIdeas(BaseModel):
"""A query with its ideas for rating."""
query_id: str
query_text: str
category: str
ideas: list[IdeaForRating]
total_count: int
class Statistics(BaseModel):
"""Overall statistics."""
rater_count: int
rating_count: int
skip_count: int
rated_ideas: int
class RaterProgress(BaseModel):
"""Complete progress summary for a rater."""
rater_id: str
queries: list[Progress]
total_completed: int
total_ideas: int
percentage: float
# Export response models
class ExportRating(BaseModel):
"""Rating with hidden metadata for export."""
rater_id: str
idea_id: str
query_id: str
query_text: str
idea_text: str
originality: int | None
elaboration: int | None
coherence: int | None
usefulness: int | None
skipped: bool
condition: str
expert_name: str
keyword: str
timestamp: datetime | None
class ExportData(BaseModel):
"""Full export data structure."""
experiment_id: str
export_timestamp: datetime
rater_count: int
rating_count: int
ratings: list[ExportRating]
# Dimension definitions (for frontend)
DIMENSION_DEFINITIONS = {
"originality": {
"name": "Originality",
"question": "How unexpected or surprising is this idea? Would most people NOT think of this?",
"scale": {
1: "Very common/obvious idea anyone would suggest",
2: "Somewhat common, slight variation on expected ideas",
3: "Moderately original, some unexpected elements",
4: "Quite original, notably different approach",
5: "Highly unexpected, truly novel concept"
},
"low_label": "Common",
"high_label": "Unexpected"
},
"elaboration": {
"name": "Elaboration",
"question": "How detailed and well-developed is this idea?",
"scale": {
1: "Vague, minimal detail, just a concept",
2: "Basic idea with little specificity",
3: "Moderately detailed, some specifics provided",
4: "Well-developed with clear implementation hints",
5: "Highly specific, thoroughly developed concept"
},
"low_label": "Vague",
"high_label": "Detailed"
},
"coherence": {
"name": "Coherence",
"question": "Does this idea make logical sense and relate to the query object?",
"scale": {
1: "Nonsensical, irrelevant, or incomprehensible",
2: "Mostly unclear, weak connection to query",
3: "Partially coherent, some logical gaps",
4: "Mostly coherent with minor issues",
5: "Fully coherent, clearly relates to query"
},
"low_label": "Nonsense",
"high_label": "Coherent"
},
"usefulness": {
"name": "Usefulness",
"question": "Could this idea have practical value or inspire real innovation?",
"scale": {
1: "No practical value whatsoever",
2: "Minimal usefulness, highly impractical",
3: "Some potential value with major limitations",
4: "Useful idea with realistic applications",
5: "Highly useful, clear practical value"
},
"low_label": "Useless",
"high_label": "Useful"
}
}


@@ -0,0 +1,3 @@
fastapi>=0.109.0
uvicorn>=0.27.0
pydantic>=2.5.0

File diff suppressed because it is too large


@@ -0,0 +1,13 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Creative Idea Assessment</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>

File diff suppressed because it is too large


@@ -0,0 +1,32 @@
{
"name": "assessment-frontend",
"private": true,
"version": "1.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"preview": "vite preview"
},
"dependencies": {
"@ant-design/icons": "^6.1.0",
"antd": "^6.0.0",
"react": "^19.2.0",
"react-dom": "^19.2.0"
},
"devDependencies": {
"@eslint/js": "^9.39.1",
"@types/node": "^24.10.1",
"@types/react": "^19.2.5",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^5.1.1",
"eslint": "^9.39.1",
"eslint-plugin-react-hooks": "^7.0.1",
"eslint-plugin-react-refresh": "^0.4.24",
"globals": "^16.5.0",
"typescript": "~5.9.3",
"typescript-eslint": "^8.46.4",
"vite": "^7.2.4"
}
}


@@ -0,0 +1,109 @@
/**
* Main application component for the assessment interface.
*/
import { ConfigProvider, theme, Spin } from 'antd';
import { useAssessment } from './hooks/useAssessment';
import { RaterLogin } from './components/RaterLogin';
import { InstructionsPage } from './components/InstructionsPage';
import { AssessmentPage } from './components/AssessmentPage';
import { CompletionPage } from './components/CompletionPage';
function App() {
const assessment = useAssessment();
const renderContent = () => {
// Show loading spinner for initial load
if (assessment.loading && !assessment.rater) {
return (
<div style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
minHeight: '100vh'
}}>
<Spin size="large" />
</div>
);
}
switch (assessment.view) {
case 'login':
return (
<RaterLogin
onLogin={assessment.login}
loading={assessment.loading}
error={assessment.error}
/>
);
case 'instructions':
return (
<InstructionsPage
dimensions={assessment.dimensions}
onStart={assessment.startAssessment}
loading={assessment.loading}
/>
);
case 'assessment':
if (!assessment.rater || !assessment.currentQuery || !assessment.currentIdea || !assessment.dimensions) {
return (
<div style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
minHeight: '100vh'
}}>
<Spin size="large" tip="Loading..." />
</div>
);
}
return (
<AssessmentPage
raterId={assessment.rater.rater_id}
queryId={assessment.currentQuery.query_id}
queryText={assessment.currentQuery.query_text}
idea={assessment.currentIdea}
ideaIndex={assessment.currentIdeaIndex}
totalIdeas={assessment.currentQuery.total_count}
dimensions={assessment.dimensions}
progress={assessment.progress}
onNext={assessment.nextIdea}
onPrev={assessment.prevIdea}
onShowDefinitions={assessment.showInstructions}
onLogout={assessment.logout}
canGoPrev={assessment.currentIdeaIndex > 0}
/>
);
case 'completion':
return (
<CompletionPage
raterId={assessment.rater?.rater_id ?? ''}
progress={assessment.progress}
onLogout={assessment.logout}
/>
);
default:
return null;
}
};
return (
<ConfigProvider
theme={{
algorithm: theme.defaultAlgorithm,
token: {
colorPrimary: '#1677ff',
borderRadius: 6,
},
}}
>
{renderContent()}
</ConfigProvider>
);
}
export default App;


@@ -0,0 +1,199 @@
/**
* Main assessment page for rating ideas.
*/
import { Card, Button, Space, Alert, Typography } from 'antd';
import {
ArrowLeftOutlined,
ArrowRightOutlined,
ForwardOutlined,
BookOutlined,
LogoutOutlined
} from '@ant-design/icons';
import type { IdeaForRating, DimensionDefinitions, RaterProgress } from '../types';
import { useRatings } from '../hooks/useRatings';
import { IdeaCard } from './IdeaCard';
import { RatingSlider } from './RatingSlider';
import { ProgressBar } from './ProgressBar';
const { Text } = Typography;
interface AssessmentPageProps {
raterId: string;
queryId: string;
queryText: string;
idea: IdeaForRating;
ideaIndex: number;
totalIdeas: number;
dimensions: DimensionDefinitions;
progress: RaterProgress | null;
onNext: () => void;
onPrev: () => void;
onShowDefinitions: () => void;
onLogout: () => void;
canGoPrev: boolean;
}
export function AssessmentPage({
raterId,
queryId,
queryText,
idea,
ideaIndex,
totalIdeas,
dimensions,
progress,
onNext,
onPrev,
onShowDefinitions,
onLogout,
canGoPrev
}: AssessmentPageProps) {
const {
ratings,
setRating,
isComplete,
submit,
skip,
submitting,
error
} = useRatings({
raterId,
queryId,
ideaId: idea.idea_id,
onSuccess: onNext
});
const handleSubmit = async () => {
await submit();
};
const handleSkip = async () => {
await skip();
};
// Calculate query progress
const queryProgress = progress?.queries.find(q => q.query_id === queryId);
const queryCompleted = queryProgress?.completed_count ?? ideaIndex;
const queryTotal = totalIdeas;
return (
<div style={{ maxWidth: 800, margin: '0 auto', padding: 24 }}>
{/* Header with query info and overall progress */}
<Card size="small" style={{ marginBottom: 16 }}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: 8 }}>
<Text strong style={{ fontSize: 16 }}>Query: "{queryText}"</Text>
<Space>
<Button
icon={<BookOutlined />}
onClick={onShowDefinitions}
size="small"
>
Definitions
</Button>
<Button
icon={<LogoutOutlined />}
onClick={onLogout}
size="small"
danger
>
Exit
</Button>
</Space>
</div>
<ProgressBar
completed={queryCompleted}
total={queryTotal}
label="Query Progress"
/>
{progress && (
<div style={{ marginTop: 8 }}>
<ProgressBar
completed={progress.total_completed}
total={progress.total_ideas}
label="Overall Progress"
/>
</div>
)}
</Card>
{/* Error display */}
{error && (
<Alert
message={error}
type="error"
showIcon
closable
style={{ marginBottom: 16 }}
/>
)}
{/* Idea card */}
<IdeaCard
ideaNumber={ideaIndex + 1}
text={idea.text}
queryText={queryText}
/>
{/* Rating inputs */}
<Card style={{ marginBottom: 16 }}>
<RatingSlider
dimension={dimensions.originality}
value={ratings.originality}
onChange={(v) => setRating('originality', v)}
disabled={submitting}
/>
<RatingSlider
dimension={dimensions.elaboration}
value={ratings.elaboration}
onChange={(v) => setRating('elaboration', v)}
disabled={submitting}
/>
<RatingSlider
dimension={dimensions.coherence}
value={ratings.coherence}
onChange={(v) => setRating('coherence', v)}
disabled={submitting}
/>
<RatingSlider
dimension={dimensions.usefulness}
value={ratings.usefulness}
onChange={(v) => setRating('usefulness', v)}
disabled={submitting}
/>
</Card>
{/* Navigation buttons */}
<Card>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center' }}>
<Button
icon={<ArrowLeftOutlined />}
onClick={onPrev}
disabled={!canGoPrev || submitting}
>
Back
</Button>
<Space>
<Button
icon={<ForwardOutlined />}
onClick={handleSkip}
loading={submitting}
>
Skip
</Button>
<Button
type="primary"
icon={<ArrowRightOutlined />}
onClick={handleSubmit}
loading={submitting}
disabled={!isComplete()}
>
Submit & Next
</Button>
</Space>
</div>
</Card>
</div>
);
}


@@ -0,0 +1,105 @@
/**
* Completion page shown when all ideas have been rated.
*/
import { Card, Button, Typography, Space, Result, Statistic, Row, Col } from 'antd';
import { CheckCircleOutlined, BarChartOutlined, LogoutOutlined } from '@ant-design/icons';
import type { RaterProgress } from '../types';
const { Title, Text } = Typography;
interface CompletionPageProps {
raterId: string;
progress: RaterProgress | null;
onLogout: () => void;
}
export function CompletionPage({ raterId, progress, onLogout }: CompletionPageProps) {
const completed = progress?.total_completed ?? 0;
const total = progress?.total_ideas ?? 0;
const percentage = progress?.percentage ?? 0;
const isFullyComplete = completed >= total;
return (
<div style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
minHeight: '100vh',
padding: 24
}}>
<Card style={{ maxWidth: 600, width: '100%' }}>
<Result
status={isFullyComplete ? 'success' : 'info'}
icon={isFullyComplete ? <CheckCircleOutlined /> : <BarChartOutlined />}
title={isFullyComplete ? 'Assessment Complete!' : 'Session Summary'}
subTitle={
isFullyComplete
? 'Thank you for completing the assessment.'
: 'You have made progress on the assessment.'
}
extra={[
<Button
type="primary"
key="logout"
icon={<LogoutOutlined />}
onClick={onLogout}
>
Exit
</Button>
]}
>
<Row gutter={16} style={{ marginTop: 24 }}>
<Col span={8}>
<Statistic
title="Ideas Rated"
value={completed}
suffix={`/ ${total}`}
/>
</Col>
<Col span={8}>
<Statistic
title="Progress"
value={percentage}
suffix="%"
precision={1}
/>
</Col>
<Col span={8}>
<Statistic
title="Rater ID"
value={raterId}
valueStyle={{ fontSize: 16 }}
/>
</Col>
</Row>
{progress && progress.queries.length > 0 && (
<div style={{ marginTop: 24 }}>
<Title level={5}>Progress by Query</Title>
<Space direction="vertical" style={{ width: '100%' }}>
{progress.queries.map((q) => (
<div
key={q.query_id}
style={{
display: 'flex',
justifyContent: 'space-between',
padding: '4px 0'
}}
>
<Text>{q.query_id}</Text>
<Text type={q.completed_count >= q.total_count ? 'success' : 'secondary'}>
{q.completed_count} / {q.total_count}
{q.completed_count >= q.total_count && ' ✓'}
</Text>
</div>
))}
</Space>
</div>
)}
</Result>
</Card>
</div>
);
}


@@ -0,0 +1,36 @@
/**
* Card displaying a single idea for rating.
*/
import { Card, Typography, Tag } from 'antd';
const { Text, Paragraph } = Typography;
interface IdeaCardProps {
ideaNumber: number;
text: string;
queryText: string;
}
export function IdeaCard({ ideaNumber, text, queryText }: IdeaCardProps) {
return (
<Card
title={
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center' }}>
<Text strong>IDEA #{ideaNumber}</Text>
<Tag color="blue">Query: {queryText}</Tag>
</div>
}
style={{ marginBottom: 24 }}
>
<Paragraph style={{
fontSize: 16,
lineHeight: 1.8,
margin: 0,
padding: '8px 0'
}}>
"{text}"
</Paragraph>
</Card>
);
}


@@ -0,0 +1,134 @@
/**
* Instructions page showing dimension definitions.
*/
import { Fragment, useState } from 'react';
import { Card, Button, Typography, Space, Checkbox, Divider, Tag } from 'antd';
import { PlayCircleOutlined } from '@ant-design/icons';
import type { DimensionDefinitions } from '../types';
const { Title, Text, Paragraph } = Typography;
interface InstructionsPageProps {
dimensions: DimensionDefinitions | null;
onStart: () => void;
onBack?: () => void;
loading: boolean;
isReturning?: boolean;
}
export function InstructionsPage({
dimensions,
onStart,
onBack,
loading,
isReturning = false
}: InstructionsPageProps) {
const [acknowledged, setAcknowledged] = useState(isReturning);
if (!dimensions) {
return (
<div style={{ padding: 24, textAlign: 'center' }}>
<Text>Loading instructions...</Text>
</div>
);
}
const dimensionOrder = ['originality', 'elaboration', 'coherence', 'usefulness'] as const;
return (
<div style={{
maxWidth: 800,
margin: '0 auto',
padding: 24
}}>
<Card>
<Space direction="vertical" size="large" style={{ width: '100%' }}>
<div style={{ textAlign: 'center' }}>
<Title level={2}>Assessment Instructions</Title>
<Paragraph type="secondary">
You will rate creative ideas on 4 dimensions using a 1-5 scale.
Please read each definition carefully before beginning.
</Paragraph>
</div>
<Divider />
{dimensionOrder.map((key) => {
const dim = dimensions[key];
return (
<Card
key={key}
size="small"
title={
<Space>
<Tag color="blue">{dim.name}</Tag>
<Text type="secondary">{dim.question}</Text>
</Space>
}
style={{ marginBottom: 16 }}
>
<div style={{
display: 'grid',
gridTemplateColumns: 'auto 1fr',
gap: '8px 16px',
fontSize: 14
}}>
{([1, 2, 3, 4, 5] as const).map((score) => (
<Fragment key={score}>
<Tag color={score <= 2 ? 'red' : score === 3 ? 'orange' : 'green'}>
{score}
</Tag>
<Text>
{dim.scale[score]}
</Text>
</Fragment>
))}
</div>
<Divider style={{ margin: '12px 0' }} />
<div style={{ display: 'flex', justifyContent: 'space-between' }}>
<Text type="secondary">{dim.low_label}</Text>
<Text type="secondary">{dim.high_label}</Text>
</div>
</Card>
);
})}
<Divider />
<Space direction="vertical" style={{ width: '100%' }}>
{!isReturning && (
<Checkbox
checked={acknowledged}
onChange={(e) => setAcknowledged(e.target.checked)}
>
I have read and understood the instructions
</Checkbox>
)}
<Space style={{ width: '100%', justifyContent: 'center' }}>
{onBack && (
<Button onClick={onBack}>
Back to Assessment
</Button>
)}
<Button
type="primary"
size="large"
icon={<PlayCircleOutlined />}
onClick={onStart}
loading={loading}
disabled={!acknowledged}
>
{isReturning ? 'Continue Rating' : 'Begin Rating'}
</Button>
</Space>
</Space>
</Space>
</Card>
</div>
);
}


@@ -0,0 +1,39 @@
/**
* Progress bar component showing assessment progress.
*/
import { Progress, Typography, Space } from 'antd';
const { Text } = Typography;
interface ProgressBarProps {
completed: number;
total: number;
label?: string;
}
export function ProgressBar({ completed, total, label }: ProgressBarProps) {
const percentage = total > 0 ? Math.round((completed / total) * 100) : 0;
return (
<div style={{ width: '100%' }}>
{label && (
<Space style={{ marginBottom: 4, justifyContent: 'space-between', width: '100%' }}>
<Text type="secondary">{label}</Text>
<Text type="secondary">
{completed}/{total} ({percentage}%)
</Text>
</Space>
)}
<Progress
percent={percentage}
showInfo={!label}
status="active"
strokeColor={{
'0%': '#108ee9',
'100%': '#87d068',
}}
/>
</div>
);
}


@@ -0,0 +1,116 @@
/**
* Rater login component.
*/
import { useState, useEffect } from 'react';
import { Card, Input, Button, Typography, Space, List, Alert } from 'antd';
import { UserOutlined, LoginOutlined } from '@ant-design/icons';
import * as api from '../services/api';
import type { Rater } from '../types';
const { Title, Text } = Typography;
interface RaterLoginProps {
onLogin: (raterId: string, name?: string) => void;
loading: boolean;
error: string | null;
}
export function RaterLogin({ onLogin, loading, error }: RaterLoginProps) {
const [raterId, setRaterId] = useState('');
const [existingRaters, setExistingRaters] = useState<Rater[]>([]);
useEffect(() => {
api.listRaters()
.then(setExistingRaters)
.catch(console.error);
}, []);
const handleLogin = () => {
if (raterId.trim()) {
onLogin(raterId.trim());
}
};
const handleQuickLogin = (rater: Rater) => {
onLogin(rater.rater_id);
};
return (
<div style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
minHeight: '100vh',
padding: 24
}}>
<Card
style={{ width: 400, maxWidth: '100%' }}
styles={{ body: { padding: 32 } }}
>
<Space direction="vertical" size="large" style={{ width: '100%' }}>
<div style={{ textAlign: 'center' }}>
<Title level={3} style={{ marginBottom: 8 }}>
Creative Idea Assessment
</Title>
<Text type="secondary">
Enter your rater ID to begin
</Text>
</div>
{error && (
<Alert message={error} type="error" showIcon />
)}
<Input
size="large"
placeholder="Enter your rater ID"
prefix={<UserOutlined />}
value={raterId}
onChange={(e) => setRaterId(e.target.value)}
onPressEnter={handleLogin}
disabled={loading}
/>
<Button
type="primary"
size="large"
icon={<LoginOutlined />}
onClick={handleLogin}
loading={loading}
disabled={!raterId.trim()}
block
>
Start Assessment
</Button>
{existingRaters.length > 0 && (
<div>
<Text type="secondary" style={{ display: 'block', marginBottom: 8 }}>
Existing raters:
</Text>
<List
size="small"
bordered
dataSource={existingRaters}
renderItem={(rater) => (
<List.Item
style={{ cursor: 'pointer' }}
onClick={() => handleQuickLogin(rater)}
>
<Text code>{rater.rater_id}</Text>
{rater.name && rater.name !== rater.rater_id && (
<Text type="secondary" style={{ marginLeft: 8 }}>
({rater.name})
</Text>
)}
</List.Item>
)}
/>
</div>
)}
</Space>
</Card>
</div>
);
}


@@ -0,0 +1,74 @@
/**
* Rating input component with radio buttons for 1-5 scale.
*/
import { Radio, Typography, Space, Tooltip, Button } from 'antd';
import { QuestionCircleOutlined } from '@ant-design/icons';
import type { DimensionDefinition } from '../types';
const { Text } = Typography;
interface RatingSliderProps {
dimension: DimensionDefinition;
value: number | null;
onChange: (value: number | null) => void;
disabled?: boolean;
}
export function RatingSlider({ dimension, value, onChange, disabled }: RatingSliderProps) {
return (
<div style={{ marginBottom: 24 }}>
<div style={{ display: 'flex', alignItems: 'center', marginBottom: 8 }}>
<Text strong style={{ marginRight: 8 }}>
{dimension.name.toUpperCase()}
</Text>
<Tooltip
title={
<div>
<p style={{ marginBottom: 8 }}>{dimension.question}</p>
{([1, 2, 3, 4, 5] as const).map((score) => (
<div key={score} style={{ marginBottom: 4 }}>
<strong>{score}:</strong> {dimension.scale[score]}
</div>
))}
</div>
}
placement="right"
overlayStyle={{ maxWidth: 400 }}
>
<Button
type="text"
size="small"
icon={<QuestionCircleOutlined />}
style={{ padding: 0, height: 'auto' }}
/>
</Tooltip>
</div>
<div style={{ display: 'flex', alignItems: 'center', gap: 16 }}>
<Text type="secondary" style={{ minWidth: 80, textAlign: 'right' }}>
{dimension.low_label}
</Text>
<Radio.Group
value={value}
onChange={(e) => onChange(e.target.value)}
disabled={disabled}
style={{ flex: 1 }}
>
<Space size="large">
{[1, 2, 3, 4, 5].map((score) => (
<Radio key={score} value={score}>
{score}
</Radio>
))}
</Space>
</Radio.Group>
<Text type="secondary" style={{ minWidth: 80 }}>
{dimension.high_label}
</Text>
</div>
</div>
);
}

View File

@@ -0,0 +1,272 @@
/**
* Hook for managing the assessment session state.
*/
import { useState, useCallback, useEffect } from 'react';
import type {
AppView,
DimensionDefinitions,
QueryInfo,
QueryWithIdeas,
Rater,
RaterProgress,
} from '../types';
import * as api from '../services/api';
interface AssessmentState {
view: AppView;
rater: Rater | null;
queries: QueryInfo[];
currentQueryIndex: number;
currentQuery: QueryWithIdeas | null;
currentIdeaIndex: number;
progress: RaterProgress | null;
dimensions: DimensionDefinitions | null;
loading: boolean;
error: string | null;
}
const initialState: AssessmentState = {
view: 'login',
rater: null,
queries: [],
currentQueryIndex: 0,
currentQuery: null,
currentIdeaIndex: 0,
progress: null,
dimensions: null,
loading: false,
error: null,
};
export function useAssessment() {
const [state, setState] = useState<AssessmentState>(initialState);
// Load dimension definitions on mount
useEffect(() => {
api.getDimensionDefinitions()
.then((dimensions) => setState((s) => ({ ...s, dimensions })))
.catch((err) => console.error('Failed to load dimensions:', err));
}, []);
// Login as a rater
const login = useCallback(async (raterId: string, name?: string) => {
setState((s) => ({ ...s, loading: true, error: null }));
try {
const rater = await api.createOrGetRater({ rater_id: raterId, name });
const queries = await api.listQueries();
const progress = await api.getRaterProgress(raterId);
setState((s) => ({
...s,
rater,
queries,
progress,
view: 'instructions',
loading: false,
}));
} catch (err) {
setState((s) => ({
...s,
error: err instanceof Error ? err.message : 'Login failed',
loading: false,
}));
}
}, []);
// Start assessment (move from instructions to assessment)
const startAssessment = useCallback(async () => {
if (!state.rater || state.queries.length === 0) return;
setState((s) => ({ ...s, loading: true }));
try {
// Find first query with unrated ideas
let queryIndex = 0;
let queryData: QueryWithIdeas | null = null;
for (let i = 0; i < state.queries.length; i++) {
const unrated = await api.getUnratedIdeas(state.queries[i].query_id, state.rater.rater_id);
if (unrated.ideas.length > 0) {
queryIndex = i;
queryData = unrated;
break;
}
}
if (!queryData) {
// All done
setState((s) => ({
...s,
view: 'completion',
loading: false,
}));
return;
}
setState((s) => ({
...s,
view: 'assessment',
currentQueryIndex: queryIndex,
currentQuery: queryData,
currentIdeaIndex: 0,
loading: false,
}));
} catch (err) {
setState((s) => ({
...s,
error: err instanceof Error ? err.message : 'Failed to start assessment',
loading: false,
}));
}
}, [state.rater, state.queries]);
// Move to next idea
const nextIdea = useCallback(async () => {
if (!state.currentQuery || !state.rater) return;
const nextIndex = state.currentIdeaIndex + 1;
if (nextIndex < state.currentQuery.ideas.length) {
// More ideas in current query
setState((s) => ({ ...s, currentIdeaIndex: nextIndex }));
} else {
// Query complete, try to move to next query
const nextQueryIndex = state.currentQueryIndex + 1;
if (nextQueryIndex < state.queries.length) {
setState((s) => ({ ...s, loading: true }));
try {
const unrated = await api.getUnratedIdeas(
state.queries[nextQueryIndex].query_id,
state.rater.rater_id
);
if (unrated.ideas.length > 0) {
setState((s) => ({
...s,
currentQueryIndex: nextQueryIndex,
currentQuery: unrated,
currentIdeaIndex: 0,
loading: false,
}));
} else {
// Try to find next query with unrated ideas
for (let i = nextQueryIndex + 1; i < state.queries.length; i++) {
const nextUnrated = await api.getUnratedIdeas(
state.queries[i].query_id,
state.rater.rater_id
);
if (nextUnrated.ideas.length > 0) {
setState((s) => ({
...s,
currentQueryIndex: i,
currentQuery: nextUnrated,
currentIdeaIndex: 0,
loading: false,
}));
return;
}
}
// All queries complete
setState((s) => ({
...s,
view: 'completion',
loading: false,
}));
}
} catch (err) {
setState((s) => ({
...s,
error: err instanceof Error ? err.message : 'Failed to load next query',
loading: false,
}));
}
} else {
// All queries complete
setState((s) => ({ ...s, view: 'completion' }));
}
}
// Refresh progress
try {
const progress = await api.getRaterProgress(state.rater.rater_id);
setState((s) => ({ ...s, progress }));
} catch (err) {
console.error('Failed to refresh progress:', err);
}
}, [state.currentQuery, state.currentIdeaIndex, state.currentQueryIndex, state.queries, state.rater]);
// Move to previous idea
const prevIdea = useCallback(() => {
if (state.currentIdeaIndex > 0) {
setState((s) => ({ ...s, currentIdeaIndex: s.currentIdeaIndex - 1 }));
}
}, [state.currentIdeaIndex]);
// Jump to a specific query
const jumpToQuery = useCallback(async (queryIndex: number) => {
if (!state.rater || queryIndex < 0 || queryIndex >= state.queries.length) return;
setState((s) => ({ ...s, loading: true }));
try {
const queryData = await api.getQueryWithIdeas(state.queries[queryIndex].query_id);
setState((s) => ({
...s,
currentQueryIndex: queryIndex,
currentQuery: queryData,
currentIdeaIndex: 0,
view: 'assessment',
loading: false,
}));
} catch (err) {
setState((s) => ({
...s,
error: err instanceof Error ? err.message : 'Failed to load query',
loading: false,
}));
}
}, [state.rater, state.queries]);
// Refresh progress
const refreshProgress = useCallback(async () => {
if (!state.rater) return;
try {
const progress = await api.getRaterProgress(state.rater.rater_id);
setState((s) => ({ ...s, progress }));
} catch (err) {
console.error('Failed to refresh progress:', err);
}
}, [state.rater]);
  // Show the instructions view (with dimension definitions)
const showInstructions = useCallback(() => {
setState((s) => ({ ...s, view: 'instructions' }));
}, []);
// Return to assessment
const returnToAssessment = useCallback(() => {
setState((s) => ({ ...s, view: 'assessment' }));
}, []);
// Logout
const logout = useCallback(() => {
setState(initialState);
}, []);
// Get current idea
const currentIdea = state.currentQuery?.ideas[state.currentIdeaIndex] ?? null;
return {
...state,
currentIdea,
login,
startAssessment,
nextIdea,
prevIdea,
jumpToQuery,
refreshProgress,
showInstructions,
returnToAssessment,
logout,
};
}

View File

@@ -0,0 +1,133 @@
/**
* Hook for managing rating submission.
*/
import { useState, useCallback } from 'react';
import type { RatingState, DimensionKey } from '../types';
import * as api from '../services/api';
interface UseRatingsOptions {
raterId: string | null;
queryId: string | null;
ideaId: string | null;
onSuccess?: () => void;
}
export function useRatings({ raterId, queryId, ideaId, onSuccess }: UseRatingsOptions) {
const [ratings, setRatings] = useState<RatingState>({
originality: null,
elaboration: null,
coherence: null,
usefulness: null,
});
const [submitting, setSubmitting] = useState(false);
const [error, setError] = useState<string | null>(null);
// Set a single rating
const setRating = useCallback((dimension: DimensionKey, value: number | null) => {
setRatings((prev) => ({ ...prev, [dimension]: value }));
}, []);
// Reset all ratings
const resetRatings = useCallback(() => {
setRatings({
originality: null,
elaboration: null,
coherence: null,
usefulness: null,
});
setError(null);
}, []);
// Check if all ratings are set
const isComplete = useCallback(() => {
return (
ratings.originality !== null &&
ratings.elaboration !== null &&
ratings.coherence !== null &&
ratings.usefulness !== null
);
}, [ratings]);
// Submit rating
const submit = useCallback(async () => {
if (!raterId || !queryId || !ideaId) {
setError('Missing required information');
return false;
}
if (!isComplete()) {
setError('Please rate all dimensions');
return false;
}
setSubmitting(true);
setError(null);
try {
await api.submitRating({
rater_id: raterId,
idea_id: ideaId,
query_id: queryId,
originality: ratings.originality,
elaboration: ratings.elaboration,
coherence: ratings.coherence,
usefulness: ratings.usefulness,
skipped: false,
});
resetRatings();
onSuccess?.();
return true;
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to submit rating');
return false;
} finally {
setSubmitting(false);
}
}, [raterId, queryId, ideaId, ratings, isComplete, resetRatings, onSuccess]);
// Skip idea
const skip = useCallback(async () => {
if (!raterId || !queryId || !ideaId) {
setError('Missing required information');
return false;
}
setSubmitting(true);
setError(null);
try {
await api.submitRating({
rater_id: raterId,
idea_id: ideaId,
query_id: queryId,
originality: null,
elaboration: null,
coherence: null,
usefulness: null,
skipped: true,
});
resetRatings();
onSuccess?.();
return true;
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to skip idea');
return false;
} finally {
setSubmitting(false);
}
}, [raterId, queryId, ideaId, resetRatings, onSuccess]);
return {
ratings,
setRating,
resetRatings,
isComplete,
submit,
skip,
submitting,
error,
};
}

View File

@@ -0,0 +1,43 @@
:root {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
line-height: 1.5;
font-weight: 400;
color-scheme: light;
color: rgba(0, 0, 0, 0.88);
background-color: #f5f5f5;
font-synthesis: none;
text-rendering: optimizeLegibility;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
body {
margin: 0;
min-height: 100vh;
}
#root {
min-height: 100vh;
}
/* Custom scrollbar */
::-webkit-scrollbar {
width: 8px;
height: 8px;
}
::-webkit-scrollbar-track {
background: #f1f1f1;
border-radius: 4px;
}
::-webkit-scrollbar-thumb {
background: #c1c1c1;
border-radius: 4px;
}
::-webkit-scrollbar-thumb:hover {
background: #a8a8a8;
}

View File

@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App'
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
)

View File

@@ -0,0 +1,116 @@
/**
* API client for the assessment backend.
*/
import type {
DimensionDefinitions,
QueryInfo,
QueryWithIdeas,
Rater,
RaterCreate,
RaterProgress,
Rating,
RatingSubmit,
SessionInfo,
Statistics,
} from '../types';
const API_BASE = '/api';
async function fetchJson<T>(url: string, options?: RequestInit): Promise<T> {
  const response = await fetch(`${API_BASE}${url}`, {
    // Spread options first so a caller-supplied headers object cannot drop the JSON content type.
    ...options,
    headers: {
      'Content-Type': 'application/json',
      ...options?.headers,
    },
  });
if (!response.ok) {
const error = await response.json().catch(() => ({ detail: response.statusText }));
throw new Error(error.detail || 'API request failed');
}
return response.json();
}
// Rater API
export async function listRaters(): Promise<Rater[]> {
return fetchJson<Rater[]>('/raters');
}
export async function createOrGetRater(data: RaterCreate): Promise<Rater> {
return fetchJson<Rater>('/raters', {
method: 'POST',
body: JSON.stringify(data),
});
}
export async function getRater(raterId: string): Promise<Rater> {
return fetchJson<Rater>(`/raters/${encodeURIComponent(raterId)}`);
}
// Query API
export async function listQueries(): Promise<QueryInfo[]> {
return fetchJson<QueryInfo[]>('/queries');
}
export async function getQueryWithIdeas(queryId: string): Promise<QueryWithIdeas> {
return fetchJson<QueryWithIdeas>(`/queries/${encodeURIComponent(queryId)}`);
}
export async function getUnratedIdeas(queryId: string, raterId: string): Promise<QueryWithIdeas> {
return fetchJson<QueryWithIdeas>(
`/queries/${encodeURIComponent(queryId)}/unrated?rater_id=${encodeURIComponent(raterId)}`
);
}
// Rating API
export async function submitRating(rating: RatingSubmit): Promise<{ saved: boolean }> {
return fetchJson<{ saved: boolean }>('/ratings', {
method: 'POST',
body: JSON.stringify(rating),
});
}
export async function getRating(raterId: string, ideaId: string): Promise<Rating | null> {
try {
return await fetchJson<Rating>(`/ratings/${encodeURIComponent(raterId)}/${encodeURIComponent(ideaId)}`);
} catch {
return null;
}
}
export async function getRatingsByRater(raterId: string): Promise<Rating[]> {
return fetchJson<Rating[]>(`/ratings/rater/${encodeURIComponent(raterId)}`);
}
// Progress API
export async function getRaterProgress(raterId: string): Promise<RaterProgress> {
return fetchJson<RaterProgress>(`/progress/${encodeURIComponent(raterId)}`);
}
// Statistics API
export async function getStatistics(): Promise<Statistics> {
return fetchJson<Statistics>('/statistics');
}
// Dimension definitions API
export async function getDimensionDefinitions(): Promise<DimensionDefinitions> {
return fetchJson<DimensionDefinitions>('/dimensions');
}
// Session info API
export async function getSessionInfo(): Promise<SessionInfo> {
return fetchJson<SessionInfo>('/info');
}
// Health check
export async function healthCheck(): Promise<boolean> {
try {
await fetchJson<{ status: string }>('/health');
return true;
} catch {
return false;
}
}

View File

@@ -0,0 +1,142 @@
/**
* TypeScript types for the assessment frontend.
*/
// Rater types
export interface Rater {
rater_id: string;
name: string | null;
created_at?: string;
}
export interface RaterCreate {
rater_id: string;
name?: string;
}
// Query types
export interface QueryInfo {
query_id: string;
query_text: string;
category: string;
idea_count: number;
}
export interface IdeaForRating {
idea_id: string;
text: string;
index: number;
}
export interface QueryWithIdeas {
query_id: string;
query_text: string;
category: string;
ideas: IdeaForRating[];
total_count: number;
}
// Rating types
export interface RatingSubmit {
rater_id: string;
idea_id: string;
query_id: string;
originality: number | null;
elaboration: number | null;
coherence: number | null;
usefulness: number | null;
skipped: boolean;
}
export interface Rating {
id: number;
rater_id: string;
idea_id: string;
query_id: string;
originality: number | null;
elaboration: number | null;
coherence: number | null;
usefulness: number | null;
skipped: number;
timestamp: string | null;
}
// Progress types
export interface QueryProgress {
rater_id: string;
query_id: string;
completed_count: number;
total_count: number;
started_at?: string;
updated_at?: string;
}
export interface RaterProgress {
rater_id: string;
queries: QueryProgress[];
total_completed: number;
total_ideas: number;
percentage: number;
}
// Statistics types
export interface Statistics {
rater_count: number;
rating_count: number;
skip_count: number;
rated_ideas: number;
}
// Dimension definition types
export interface DimensionScale {
1: string;
2: string;
3: string;
4: string;
5: string;
}
export interface DimensionDefinition {
name: string;
question: string;
scale: DimensionScale;
low_label: string;
high_label: string;
}
export interface DimensionDefinitions {
originality: DimensionDefinition;
elaboration: DimensionDefinition;
coherence: DimensionDefinition;
usefulness: DimensionDefinition;
}
// Session info
export interface SessionInfo {
experiment_id: string;
total_ideas: number;
query_count: number;
conditions: string[];
randomization_seed: number;
}
// UI State types
export type AppView = 'login' | 'instructions' | 'assessment' | 'completion';
export interface RatingState {
originality: number | null;
elaboration: number | null;
coherence: number | null;
usefulness: number | null;
}
export const EMPTY_RATING_STATE: RatingState = {
originality: null,
elaboration: null,
coherence: null,
usefulness: null,
};
export type DimensionKey = keyof RatingState;
export const DIMENSION_KEYS: DimensionKey[] = ['originality', 'elaboration', 'coherence', 'usefulness'];

View File

@@ -0,0 +1,20 @@
{
"compilerOptions": {
"target": "ES2020",
"useDefineForClassFields": true,
"lib": ["ES2020", "DOM", "DOM.Iterable"],
"module": "ESNext",
"skipLibCheck": true,
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"isolatedModules": true,
"moduleDetection": "force",
"noEmit": true,
"jsx": "react-jsx",
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true
},
"include": ["src"]
}

View File

@@ -0,0 +1,16 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
export default defineConfig({
plugins: [react()],
server: {
host: '0.0.0.0',
port: 5174,
proxy: {
'/api': {
target: 'http://localhost:8002',
changeOrigin: true
}
}
},
})

View File

@@ -0,0 +1,375 @@
#!/usr/bin/env python3
"""
Prepare assessment data from experiment results.
Extracts unique ideas from deduped experiment results, assigns stable IDs,
and randomizes the order within each query for unbiased human assessment.
Usage:
python prepare_data.py # Use latest, all ideas
python prepare_data.py --sample 100 # Sample 100 ideas total
python prepare_data.py --per-query 10 # 10 ideas per query
python prepare_data.py --per-condition 5 # 5 ideas per condition per query
python prepare_data.py --list # List available files
"""
import argparse
import json
import random
from pathlib import Path
from typing import Any
def load_experiment_data(filepath: Path) -> dict[str, Any]:
"""Load experiment data from JSON file."""
with open(filepath, 'r', encoding='utf-8') as f:
return json.load(f)
def sample_ideas_stratified(
ideas: list[dict[str, Any]],
per_condition: int | None = None,
total_limit: int | None = None,
rng: random.Random | None = None
) -> list[dict[str, Any]]:
"""
Sample ideas with stratification by condition.
Args:
ideas: List of ideas with _hidden.condition metadata
per_condition: Max ideas per condition (stratified sampling)
total_limit: Max total ideas (after stratified sampling)
rng: Random number generator for reproducibility
Returns:
Sampled list of ideas
"""
if rng is None:
rng = random.Random()
if per_condition is None and total_limit is None:
return ideas
# Group by condition
by_condition: dict[str, list[dict[str, Any]]] = {}
for idea in ideas:
condition = idea['_hidden']['condition']
if condition not in by_condition:
by_condition[condition] = []
by_condition[condition].append(idea)
# Sample per condition
sampled = []
for condition, cond_ideas in by_condition.items():
rng.shuffle(cond_ideas)
if per_condition is not None:
cond_ideas = cond_ideas[:per_condition]
sampled.extend(cond_ideas)
# Apply total limit if specified
if total_limit is not None and len(sampled) > total_limit:
rng.shuffle(sampled)
sampled = sampled[:total_limit]
return sampled
def extract_ideas_from_condition(
query_id: str,
condition_name: str,
condition_data: dict[str, Any],
idea_counter: dict[str, int]
) -> list[dict[str, Any]]:
"""Extract ideas from a single condition with hidden metadata."""
ideas = []
dedup_data = condition_data.get('dedup', {})
unique_ideas_with_source = dedup_data.get('unique_ideas_with_source', [])
for item in unique_ideas_with_source:
idea_text = item.get('idea', '')
if not idea_text:
continue
# Generate stable idea ID
current_count = idea_counter.get(query_id, 0)
idea_id = f"{query_id}_I{current_count:03d}"
idea_counter[query_id] = current_count + 1
ideas.append({
'idea_id': idea_id,
'text': idea_text,
'_hidden': {
'condition': condition_name,
'expert_name': item.get('expert_name', ''),
'keyword': item.get('keyword', '')
}
})
return ideas
def prepare_assessment_data(
experiment_filepath: Path,
output_filepath: Path,
seed: int = 42,
sample_total: int | None = None,
per_query: int | None = None,
per_condition: int | None = None
) -> dict[str, Any]:
"""
Prepare assessment data from experiment results.
Args:
experiment_filepath: Path to deduped experiment JSON
output_filepath: Path to write assessment items JSON
seed: Random seed for reproducible shuffling
sample_total: Total number of ideas to sample (across all queries)
per_query: Maximum ideas per query
per_condition: Maximum ideas per condition per query (stratified)
Returns:
Assessment data structure
"""
rng = random.Random(seed)
# Load experiment data
data = load_experiment_data(experiment_filepath)
experiment_id = data.get('experiment_id', 'unknown')
conditions = data.get('conditions', [])
results = data.get('results', [])
print(f"Loading experiment: {experiment_id}")
print(f"Conditions: {conditions}")
print(f"Number of queries: {len(results)}")
# Show sampling config
if sample_total or per_query or per_condition:
print(f"Sampling config: total={sample_total}, per_query={per_query}, per_condition={per_condition}")
assessment_queries = []
total_ideas = 0
idea_counter: dict[str, int] = {}
for result in results:
query_id = result.get('query_id', '')
query_text = result.get('query', '')
category = result.get('category', '')
query_ideas = []
# Extract ideas from all conditions
conditions_data = result.get('conditions', {})
for condition_name, condition_data in conditions_data.items():
ideas = extract_ideas_from_condition(
query_id, condition_name, condition_data, idea_counter
)
query_ideas.extend(ideas)
# Apply stratified sampling if per_condition is specified
if per_condition is not None:
query_ideas = sample_ideas_stratified(
query_ideas,
per_condition=per_condition,
rng=rng
)
# Apply per-query limit
if per_query is not None and len(query_ideas) > per_query:
rng.shuffle(query_ideas)
query_ideas = query_ideas[:per_query]
# Shuffle ideas within this query
rng.shuffle(query_ideas)
assessment_queries.append({
'query_id': query_id,
'query_text': query_text,
'category': category,
'ideas': query_ideas,
'idea_count': len(query_ideas)
})
total_ideas += len(query_ideas)
print(f" Query '{query_text}' ({query_id}): {len(query_ideas)} ideas")
# Apply total sample limit across all queries (proportionally)
if sample_total is not None and total_ideas > sample_total:
print(f"\nApplying total sample limit: {sample_total} (from {total_ideas})")
# Calculate proportion to keep
keep_ratio = sample_total / total_ideas
new_total = 0
for query in assessment_queries:
n_keep = max(1, int(len(query['ideas']) * keep_ratio))
rng.shuffle(query['ideas'])
query['ideas'] = query['ideas'][:n_keep]
query['idea_count'] = len(query['ideas'])
new_total += len(query['ideas'])
total_ideas = new_total
# Build output structure
assessment_data = {
'experiment_id': experiment_id,
'queries': assessment_queries,
'total_ideas': total_ideas,
'query_count': len(assessment_queries),
'conditions': conditions,
'randomization_seed': seed,
'sampling': {
'sample_total': sample_total,
'per_query': per_query,
'per_condition': per_condition
},
'metadata': {
'source_file': str(experiment_filepath.name),
'prepared_for': 'human_assessment'
}
}
# Write output
output_filepath.parent.mkdir(parents=True, exist_ok=True)
with open(output_filepath, 'w', encoding='utf-8') as f:
json.dump(assessment_data, f, ensure_ascii=False, indent=2)
print(f"\nTotal ideas for assessment: {total_ideas}")
print(f"Output written to: {output_filepath}")
return assessment_data
def list_experiment_files(results_dir: Path) -> list[Path]:
"""List available deduped experiment files."""
return sorted(results_dir.glob('*_deduped.json'), key=lambda p: p.stat().st_mtime, reverse=True)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description='Prepare assessment data from experiment results.',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python prepare_data.py # Use latest, all ideas
python prepare_data.py --sample 100 # Sample 100 ideas total
python prepare_data.py --per-query 20 # Max 20 ideas per query
python prepare_data.py --per-condition 4 # 4 ideas per condition per query
python prepare_data.py --per-condition 4 --per-query 15 # Combined limits
python prepare_data.py --list # List available files
Recommended for human assessment:
# 5 conditions × 4 ideas × 10 queries = 200 ideas (balanced)
python prepare_data.py --per-condition 4
# Or limit total to ~150 ideas
python prepare_data.py --sample 150
"""
)
parser.add_argument(
'experiment_file',
nargs='?',
default=None,
help='Experiment file name (e.g., experiment_20260119_165650_deduped.json)'
)
parser.add_argument(
'--list', '-l',
action='store_true',
help='List available experiment files'
)
parser.add_argument(
'--sample',
type=int,
default=None,
metavar='N',
help='Total number of ideas to sample (proportionally across queries)'
)
parser.add_argument(
'--per-query',
type=int,
default=None,
metavar='N',
help='Maximum ideas per query'
)
parser.add_argument(
'--per-condition',
type=int,
default=None,
metavar='N',
help='Maximum ideas per condition per query (stratified sampling)'
)
parser.add_argument(
'--seed', '-s',
type=int,
default=42,
help='Random seed for shuffling (default: 42)'
)
args = parser.parse_args()
# Paths
base_dir = Path(__file__).parent.parent
results_dir = base_dir / 'results'
output_file = Path(__file__).parent / 'data' / 'assessment_items.json'
# List available files
available_files = list_experiment_files(results_dir)
if args.list:
print("Available experiment files (most recent first):")
for f in available_files:
size_kb = f.stat().st_size / 1024
print(f" {f.name} ({size_kb:.1f} KB)")
return
# Determine which file to use
if args.experiment_file:
experiment_file = results_dir / args.experiment_file
if not experiment_file.exists():
# Try without .json extension
experiment_file = results_dir / f"{args.experiment_file}.json"
else:
# Use the latest deduped file
if not available_files:
print("Error: No deduped experiment files found in results directory.")
return
experiment_file = available_files[0]
print(f"Using latest experiment file: {experiment_file.name}")
if not experiment_file.exists():
print(f"Error: Experiment file not found: {experiment_file}")
print("\nAvailable files:")
for f in available_files:
print(f" {f.name}")
return
prepare_assessment_data(
experiment_file,
output_file,
seed=args.seed,
sample_total=args.sample,
per_query=args.per_query,
per_condition=args.per_condition
)
# Verify output
    with open(output_file, 'r', encoding='utf-8') as f:
data = json.load(f)
print("\n--- Verification ---")
print(f"Queries: {data['query_count']}")
print(f"Total ideas: {data['total_ideas']}")
# Show distribution by condition (from hidden metadata)
condition_counts: dict[str, int] = {}
for query in data['queries']:
for idea in query['ideas']:
condition = idea['_hidden']['condition']
condition_counts[condition] = condition_counts.get(condition, 0) + 1
print("\nIdeas per condition:")
for condition, count in sorted(condition_counts.items()):
print(f" {condition}: {count}")
if __name__ == '__main__':
main()

Binary file not shown.

101
experiments/assessment/start.sh Executable file
View File

@@ -0,0 +1,101 @@
#!/bin/bash
# Human Assessment Web Interface Start Script
# This script starts both the backend API and frontend dev server
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
echo -e "${GREEN}================================${NC}"
echo -e "${GREEN}Creative Idea Assessment System${NC}"
echo -e "${GREEN}================================${NC}"
echo
# Find Python with FastAPI (use project venv or system)
VENV_PYTHON="$SCRIPT_DIR/../../backend/venv/bin/python"
if [ -x "$VENV_PYTHON" ]; then
PYTHON_CMD="$VENV_PYTHON"
UVICORN_CMD="$SCRIPT_DIR/../../backend/venv/bin/uvicorn"
else
PYTHON_CMD="python3"
UVICORN_CMD="uvicorn"
fi
# Check if assessment data exists
if [ ! -f "data/assessment_items.json" ]; then
echo -e "${YELLOW}Assessment data not found. Running prepare_data.py...${NC}"
$PYTHON_CMD prepare_data.py
echo
fi
# Check if node_modules exist in frontend
if [ ! -d "frontend/node_modules" ]; then
echo -e "${YELLOW}Installing frontend dependencies...${NC}"
cd frontend
npm install
cd ..
echo
fi
# Function to cleanup background processes on exit
cleanup() {
echo
echo -e "${YELLOW}Shutting down...${NC}"
kill $BACKEND_PID 2>/dev/null || true
kill $FRONTEND_PID 2>/dev/null || true
exit 0
}
trap cleanup SIGINT SIGTERM
# Start backend
echo -e "${GREEN}Starting backend API on port 8002...${NC}"
cd backend
$UVICORN_CMD app:app --host 0.0.0.0 --port 8002 --reload &
BACKEND_PID=$!
cd ..
# Wait for backend to start
echo "Waiting for backend to initialize..."
sleep 2
# Check if backend is running
if ! curl -s http://localhost:8002/api/health > /dev/null 2>&1; then
echo -e "${RED}Backend failed to start. Check for errors above.${NC}"
kill $BACKEND_PID 2>/dev/null || true
exit 1
fi
echo -e "${GREEN}Backend is running.${NC}"
echo
# Start frontend
echo -e "${GREEN}Starting frontend on port 5174...${NC}"
cd frontend
npm run dev &
FRONTEND_PID=$!
cd ..
# Wait for frontend to start
sleep 3
echo
echo -e "${GREEN}================================${NC}"
echo -e "${GREEN}Assessment system is running!${NC}"
echo -e "${GREEN}================================${NC}"
echo
echo -e "Backend API: ${YELLOW}http://localhost:8002${NC}"
echo -e "Frontend UI: ${YELLOW}http://localhost:5174${NC}"
echo
echo -e "Press Ctrl+C to stop all services"
echo
# Wait for any process to exit
wait

13
experiments/assessment/stop.sh Executable file
View File

@@ -0,0 +1,13 @@
#!/bin/bash
# Stop the assessment system
echo "Stopping assessment system..."
# Kill backend (uvicorn on port 8002)
pkill -f "uvicorn app:app.*8002" 2>/dev/null && echo "Backend stopped" || echo "Backend not running"
# Kill frontend (vite on port 5174)
pkill -f "vite.*5174" 2>/dev/null && echo "Frontend stopped" || echo "Frontend not running"
echo "Done"