pdf_signature_extraction/signature_analysis/21_expanded_validation.py
gbanyan 53125d11d9 Paper A v3.20.0: partner Jimmy 2026-04-27 review + DOCX rendering overhaul
Substantive content (addresses partner Jimmy's 2026-04-27 review of v3.19.1):

Must-fix items (6/6):
- §III-F SSIM/pixel rejection rewritten from first principles (design-level
  argument from luminance/contrast/structure local-window product, not the
  prior empirical 0.70 result)
- Table VI restructured by population × method; added missing Firm A
  logit-Gaussian-2 0.999 row; KDE marked undefined (unimodal), BD/McCrary
  marked bin-unstable (Appendix A)
- Tables IX / XI / §IV-F.3 dHash 5/8/15 inconsistency resolved: ≤8 demoted
  from "operational dual" to "calibration-fold-adjacent reference"; the
  actual classifier rule cos>0.95 AND dH≤15 = 92.46% added throughout
- New Fig. 4 (yearly per-firm best-match cosine, 5 lines, 2013-2023, Firm A
  on top); script 30_yearly_big4_comparison.py
- Tables XIV / XV extended with top-20% (94.8%) and top-30% (81.3%) brackets
- §III-K reframed P7.5 from "round-number lower-tail boundary" to operating
  point; new Table XII-B (cosine-FAR-capture tradeoff at 5 thresholds:
  0.9407 / 0.945 / 0.95 / 0.977 / 0.985)

Nice-to-have items (3/3):
- Table XII expanded to 6-cut classifier sensitivity grid (0.940-0.985)
- Defensive parentheticals (84,386 vs 85,042; 30,226 vs 30,222) moved to
  table notes; cut "invite reviewer skepticism" and "non-load-bearing"

Codex 3-pass verification cleanup:
- Stale 0.973/0.977/0.979 references unified on canonical 0.977 (Firm A
  Beta-2 forced-fit crossing from beta_mixture_results.json)
- dHash≤8 wording corrected to P95-adjacent (P95 = 9, ≤8 is the integer
  immediately below) instead of misleading "rounded down"
- Table XII-B prose corrected: per-segment qualification of "non-Firm-A
  capture falls faster" (true on 0.95→0.977 segment but contracts on
  0.977→0.985 segment); arithmetic now from exact counts

Within-year analyses removed:
- Within-year ranking robustness check (Class A) was added in nice-to-have
  pass but contradicts v3.14 A2-removal stance; removed from §IV-G.2 + the
  Appendix B provenance row
- Within-CPA future-work disclosures (Class B) removed from Discussion
  limitation #5 and Conclusion future-work paragraph; subsequent limitations
  renumbered Sixth → Fifth, Seventh → Sixth

DOCX rendering pipeline overhaul (paper/export_v3.py):

Critical fix - every v3 DOCX since v3.0 was shipping WITHOUT TABLES:
strip_comments() was wholesale-deleting HTML comments, but every numerical
table is wrapped in <!-- TABLE X: ... -->, so the table body was deleted
alongside the wrapper. Now unwraps TABLE comments (emit synthetic
__TABLE_CAPTION__: marker + table body) while still stripping non-TABLE
editorial comments. Result: 19 tables now render in the DOCX.
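A minimal sketch of the fixed unwrap logic (the function name `strip_comments` is from the commit; the exact wrapper grammar and regexes here are assumptions, not the shipped implementation):

```python
import re

# TABLE comments carry a caption and the table body inside the comment:
#   <!-- TABLE VI: caption text
#   | ... markdown table ... |
#   -->
# Unwrap these (synthetic __TABLE_CAPTION__: marker + body), then strip
# every remaining (non-TABLE, editorial) HTML comment.
TABLE_RE = re.compile(r'<!--\s*TABLE\s+([^:]+):\s*(.*?)\n(.*?)-->', re.DOTALL)

def strip_comments(md):
    def unwrap(m):
        label, caption, body = m.group(1), m.group(2), m.group(3)
        return (f'__TABLE_CAPTION__: Table {label.strip()}: '
                f'{caption.strip()}\n{body.strip()}\n')
    md = TABLE_RE.sub(unwrap, md)                            # keep tables
    return re.sub(r'<!--.*?-->', '', md, flags=re.DOTALL)    # drop the rest
```

The order matters: TABLE comments must be unwrapped before the generic comment-stripping pass, otherwise the original bug (table bodies deleted with their wrappers) reappears.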

Other rendering fixes:
- LaTeX → Unicode conversion (50+ token replacements: Greek alphabet, ≤≥,
  ×·≈, →↔⇒, etc.); \frac/\sqrt linearisation; TeX brace tricks ({=}, {,})
- Math-context-scoped sub/superscript via PUA sentinel characters:
  no more underscore-eating in identifiers like signature_analysis
- Display equations rendered via matplotlib mathtext to PNG (3 equations:
  cosine sim, mixture crossing, BD/McCrary Z statistic), embedded as
  numbered equation blocks (1), (2), (3); content-addressed cache at
  paper/equations/ (gitignored, regenerable)
- Manual numbered/bulleted list rendering with hanging indent (replaces
  python-docx style="List Number" which silently drops the number prefix
  when no numbering definition is bound)
- Markdown blockquote (> ...) defensively stripped
- Pandoc footnote ([^name]) markers no longer leak (inlined at source)
- Heading text cleaned of LaTeX residue + PUA sentinels
- File paths in body text (signature_analysis/X.py, reports/Y.json)
  trimmed to "(reproduction artifact in Appendix B)" pointers
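The LaTeX → Unicode pass above can be sketched as a token-table substitution; the table below is a small assumed subset of the ~50 replacements, not the shipped list:

```python
# Hypothetical subset of the LaTeX -> Unicode token table.
LATEX_TO_UNICODE = {
    r'\leq': '≤', r'\geq': '≥', r'\times': '×', r'\cdot': '·',
    r'\approx': '≈', r'\rightarrow': '→', r'\Rightarrow': '⇒',
    r'\alpha': 'α', r'\beta': 'β', r'\sigma': 'σ',
}

def latexify(text):
    # Replace longest tokens first so a token that is a prefix of another
    # (e.g. a hypothetical \le vs \leq) cannot clobber the longer match.
    for tok in sorted(LATEX_TO_UNICODE, key=len, reverse=True):
        text = text.replace(tok, LATEX_TO_UNICODE[tok])
    return text
```

Plain longest-first `str.replace` is enough here because the token set is fixed and backslash-delimited; a single compiled alternation regex would be the usual optimization if the table grows.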

New leak linter: paper/lint_paper_v3.py - two-pass markdown source +
rendered DOCX leak detector; auto-runs at end of export_v3.py.
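The shape of such a leak pass can be sketched as a pattern scan; the patterns below are illustrative assumptions, not the actual list in paper/lint_paper_v3.py:

```python
import re

# Hypothetical leak patterns: things that should never survive export.
LEAK_PATTERNS = [
    (r'<!--.*?-->', 'unstripped HTML comment'),
    (r'\[\^[\w-]+\]', 'pandoc footnote marker'),
    (r'[\uE000-\uF8FF]', 'PUA sentinel character'),
    (r'signature_analysis/\d+_\w+\.py', 'repo file path in body text'),
]

def lint(text):
    findings = []
    for pat, label in LEAK_PATTERNS:
        for m in re.finditer(pat, text, flags=re.DOTALL):
            findings.append(f'{label}: {m.group(0)[:40]!r}')
    return findings
```

Running the same scan twice (once on the markdown source, once on text re-extracted from the rendered DOCX) is what makes the check two-pass: the second pass catches leaks introduced by the renderer itself.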

Script changes:
- 21_expanded_validation.py: added 0.9407, 0.977, 0.985 to canonical FAR
  threshold list so Table XII-B is reproducible from persisted JSON
- 30_yearly_big4_comparison.py: NEW; generates Fig. 4 + per-firm yearly
  data (writes to reports/figures/ and reports/firm_yearly_comparison/)
- 31_within_year_ranking_robustness.py: NEW; supports the within-year
  robustness check (no longer cited in paper but kept as repo-internal
  due-diligence artifact)

Partner handoff DOCX shipped to
~/Downloads/Paper_A_IEEE_Access_Draft_v3.20.0_20260505.docx (536 KB:
19 tables + 4 figures + 3 equation images).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 13:44:49 +08:00


#!/usr/bin/env python3
"""
Script 21: Expanded Validation with Larger Negative Anchor + Held-out Firm A
============================================================================

Addresses three weaknesses of Script 19's pixel-identity validation:

(a) Negative anchor of n=35 (cosine<0.70) is too small to give
    meaningful FAR confidence intervals.
(b) Pixel-identical positive anchor is a CONSERVATIVE SUBSET of the
    true non-hand-signed class, not representative of the broader
    positive class. Recall against this subset is therefore a
    lower-bound calibration check, not a generalizable recall
    estimate.
(c) Firm A is both the calibration anchor and a validation anchor
    (circular). The 70/30 fold split makes within-Firm-A sampling
    variance visible without claiming external validation.

This script:

1. Constructs a large inter-CPA negative anchor (~50,000 pairs) by
   randomly sampling pairs from different CPAs. Inter-CPA high
   similarity is highly unlikely to arise from legitimate signing.
2. Splits Firm A CPAs 70/30 into CALIBRATION and HELDOUT folds.
   Re-derives signature-level thresholds from the calibration fold
   only, then reports capture rates on the heldout fold.
3. Computes 95% Wilson confidence intervals for FAR at canonical
   thresholds (Table X in the manuscript).

Legacy / diagnostic-only metrics:
    Helper functions for EER, Precision, Recall, F1, and FRR remain in
    this script for backward compatibility. The manuscript intentionally
    OMITS these metrics from Table X because the byte-identical positive
    anchor has cosine ~= 1 by construction (so FRR / EER are arithmetic
    tautologies) and because positive and negative anchors are
    constructed from different sampling units, making prevalence
    arbitrary (so Precision and F1 have no meaningful population
    interpretation). Only FAR against the large inter-CPA negative
    anchor is reported as a biometric metric in the paper.

Output:
    reports/expanded_validation/expanded_validation_report.md
    reports/expanded_validation/expanded_validation_results.json
"""
import sqlite3
import json
import numpy as np
from pathlib import Path
from datetime import datetime
from scipy.stats import norm
DB = '/Volumes/NV2/PDF-Processing/signature-analysis/signature_analysis.db'
OUT = Path('/Volumes/NV2/PDF-Processing/signature-analysis/reports/'
           'expanded_validation')
OUT.mkdir(parents=True, exist_ok=True)
FIRM_A = '勤業眾信聯合'
N_INTER_PAIRS = 50_000
SEED = 42
def wilson_ci(k, n, alpha=0.05):
    if n == 0:
        return (0.0, 1.0)
    z = norm.ppf(1 - alpha / 2)
    phat = k / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    pm = z * np.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - pm), min(1.0, center + pm))

def load_signatures():
    conn = sqlite3.connect(DB)
    cur = conn.cursor()
    cur.execute('''
        SELECT s.signature_id, s.assigned_accountant, a.firm,
               s.max_similarity_to_same_accountant,
               s.min_dhash_independent, s.pixel_identical_to_closest
        FROM signatures s
        LEFT JOIN accountants a ON s.assigned_accountant = a.name
        WHERE s.max_similarity_to_same_accountant IS NOT NULL
    ''')
    rows = cur.fetchall()
    conn.close()
    return rows

def load_signature_ids_for_negative_pool(seed=SEED):
    """Load lightweight (sig_id, accountant) pool from the entire matched
    corpus. Per Gemini round-19 review, the prior implementation drew
    50,000 inter-CPA pairs from a tiny LIMIT-3000 random subset, reusing
    each signature ~33 times and artificially tightening Wilson FAR CIs.
    The corrected implementation samples pairs i.i.d. across the FULL
    matched corpus (~168k signatures); only the unique signatures that
    actually appear in the sampled pairs need feature vectors loaded.
    """
    conn = sqlite3.connect(DB)
    cur = conn.cursor()
    cur.execute('''
        SELECT signature_id, assigned_accountant
        FROM signatures
        WHERE feature_vector IS NOT NULL
          AND assigned_accountant IS NOT NULL
    ''')
    rows = cur.fetchall()
    conn.close()
    sig_ids = np.array([r[0] for r in rows], dtype=np.int64)
    accts = np.array([r[1] for r in rows])
    return sig_ids, accts

def load_features_for_ids(sig_ids):
    """Load feature vectors for the given signature ids, chunking the
    IN (...) list so ~100k unique ids never exceed SQLite's bound-
    parameter limit (999 before SQLite 3.32)."""
    conn = sqlite3.connect(DB)
    cur = conn.cursor()
    feat_by_id = {}
    CHUNK = 900
    ids = [int(s) for s in sig_ids]
    for start in range(0, len(ids), CHUNK):
        chunk = ids[start:start + CHUNK]
        placeholders = ','.join('?' * len(chunk))
        cur.execute(
            f'SELECT signature_id, feature_vector FROM signatures '
            f'WHERE signature_id IN ({placeholders})',
            chunk,
        )
        for sid, blob in cur.fetchall():
            feat_by_id[int(sid)] = np.frombuffer(blob, dtype=np.float32)
    conn.close()
    return feat_by_id

def build_inter_cpa_negative(sig_ids, accts, n_pairs=N_INTER_PAIRS, seed=SEED):
    """Sample i.i.d. random cross-CPA pairs from the full matched corpus
    and return their cosine similarities.
    """
    rng = np.random.default_rng(seed)
    n = len(sig_ids)
    pairs = []
    tries = 0
    seen_pairs = set()
    while len(pairs) < n_pairs and tries < n_pairs * 10:
        i = rng.integers(n)
        j = rng.integers(n)
        if i == j or accts[i] == accts[j]:
            tries += 1
            continue
        a, b = (i, j) if i < j else (j, i)
        if (a, b) in seen_pairs:
            tries += 1
            continue
        seen_pairs.add((a, b))
        pairs.append((a, b))
        tries += 1
    needed_ids = sorted({int(sig_ids[i]) for pair in pairs for i in pair})
    feat_by_id = load_features_for_ids(needed_ids)
    sims = []
    for i, j in pairs:
        fi = feat_by_id[int(sig_ids[i])]
        fj = feat_by_id[int(sig_ids[j])]
        sims.append(float(fi @ fj))
    return np.array(sims)

def classification_metrics(y_true, y_pred):
    y_true = np.asarray(y_true).astype(int)
    y_pred = np.asarray(y_pred).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    p_den = max(tp + fp, 1)
    r_den = max(tp + fn, 1)
    far_den = max(fp + tn, 1)
    frr_den = max(fn + tp, 1)
    precision = tp / p_den
    recall = tp / r_den
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    far = fp / far_den
    frr = fn / frr_den
    far_ci = wilson_ci(fp, far_den)
    frr_ci = wilson_ci(fn, frr_den)
    return {
        'tp': tp, 'fp': fp, 'fn': fn, 'tn': tn,
        'precision': float(precision),
        'recall': float(recall),
        'f1': float(f1),
        'far': float(far),
        'frr': float(frr),
        'far_ci95': [float(x) for x in far_ci],
        'frr_ci95': [float(x) for x in frr_ci],
        'n_pos': int(tp + fn),
        'n_neg': int(tn + fp),
    }

def sweep_threshold(scores, y, direction, thresholds):
    out = []
    for t in thresholds:
        if direction == 'above':
            y_pred = (scores > t).astype(int)
        else:
            y_pred = (scores < t).astype(int)
        m = classification_metrics(y, y_pred)
        m['threshold'] = float(t)
        out.append(m)
    return out

def find_eer(sweep):
    thr = np.array([s['threshold'] for s in sweep])
    far = np.array([s['far'] for s in sweep])
    frr = np.array([s['frr'] for s in sweep])
    diff = far - frr
    signs = np.sign(diff)
    changes = np.where(np.diff(signs) != 0)[0]
    if len(changes) == 0:
        idx = int(np.argmin(np.abs(diff)))
        return {'threshold': float(thr[idx]), 'far': float(far[idx]),
                'frr': float(frr[idx]),
                'eer': float(0.5 * (far[idx] + frr[idx]))}
    i = int(changes[0])
    w = abs(diff[i]) / (abs(diff[i]) + abs(diff[i + 1]) + 1e-12)
    thr_i = (1 - w) * thr[i] + w * thr[i + 1]
    far_i = (1 - w) * far[i] + w * far[i + 1]
    frr_i = (1 - w) * frr[i] + w * frr[i + 1]
    return {'threshold': float(thr_i), 'far': float(far_i),
            'frr': float(frr_i),
            'eer': float(0.5 * (far_i + frr_i))}

def main():
    print('=' * 70)
    print('Script 21: Expanded Validation')
    print('=' * 70)

    rows = load_signatures()
    print(f'\nLoaded {len(rows):,} signatures')
    sig_ids = [r[0] for r in rows]
    accts = [r[1] for r in rows]
    firms = [r[2] or '(unknown)' for r in rows]
    cos = np.array([r[3] for r in rows], dtype=float)
    dh = np.array([-1 if r[4] is None else r[4] for r in rows], dtype=float)
    pix = np.array([r[5] or 0 for r in rows], dtype=int)
    firm_a_mask = np.array([f == FIRM_A for f in firms])
    print(f'Firm A signatures: {int(firm_a_mask.sum()):,}')

    # --- (1) INTER-CPA NEGATIVE ANCHOR ---
    print(f'\n[1] Building inter-CPA negative anchor ({N_INTER_PAIRS} '
          f'i.i.d. pairs from full matched corpus)...')
    pool_sig_ids, pool_accts = load_signature_ids_for_negative_pool()
    print(f'    pool size: {len(pool_sig_ids):,} matched signatures')
    inter_cos = build_inter_cpa_negative(pool_sig_ids, pool_accts,
                                         n_pairs=N_INTER_PAIRS)
    print(f'    inter-CPA cos: mean={inter_cos.mean():.4f}, '
          f'p95={np.percentile(inter_cos, 95):.4f}, '
          f'p99={np.percentile(inter_cos, 99):.4f}, '
          f'max={inter_cos.max():.4f}')
    # --- (2) POSITIVES ---
    # Pixel-identical (gold) + optional Firm A extension
    pos_pix_mask = pix == 1
    n_pix = int(pos_pix_mask.sum())
    print(f'\n[2] Positive anchors:')
    print(f'    pixel-identical signatures: {n_pix}')

    # Build negative anchor scores = inter-CPA cosine distribution
    # Positive anchor scores = pixel-identical signatures' max same-CPA cosine
    # NB: the two distributions are not drawn from the same random variable
    # (one is intra-CPA max, the other is inter-CPA random), so we treat the
    # inter-CPA distribution as a negative reference for threshold sweep.
    # Combined labeled set: positives=pixel-identical sigs' max cosine,
    # negatives=inter-CPA random pair cosines.
    pos_scores = cos[pos_pix_mask]
    neg_scores = inter_cos
    y = np.concatenate([np.ones(len(pos_scores)),
                        np.zeros(len(neg_scores))])
    scores = np.concatenate([pos_scores, neg_scores])

    # Sweep thresholds
    thr = np.linspace(0.30, 1.00, 141)
    sweep = sweep_threshold(scores, y, 'above', thr)
    eer = find_eer(sweep)
    print(f'\n[3] Cosine EER (pos=pixel-identical, neg=inter-CPA '
          f'n={len(inter_cos)}):')
    print(f"    threshold={eer['threshold']:.4f}, EER={eer['eer']:.4f}")

    # Canonical threshold evaluations with Wilson CIs
    canonical = {}
    for tt in [0.70, 0.80, 0.837, 0.90, 0.9407, 0.945, 0.95, 0.973, 0.977,
               0.979, 0.985]:
        y_pred = (scores > tt).astype(int)
        m = classification_metrics(y, y_pred)
        m['threshold'] = float(tt)
        canonical[f'cos>{tt:.3f}'] = m
        print(f"    @ {tt:.3f}: P={m['precision']:.3f}, R={m['recall']:.3f}, "
              f"FAR={m['far']:.4f} (CI95={m['far_ci95'][0]:.4f}-"
              f"{m['far_ci95'][1]:.4f}), FRR={m['frr']:.4f}")
    # --- (3) HELD-OUT FIRM A ---
    print('\n[4] Held-out Firm A 70/30 split:')
    rng = np.random.default_rng(SEED)
    firm_a_accts = sorted(set(a for a, f in zip(accts, firms) if f == FIRM_A))
    rng.shuffle(firm_a_accts)
    n_calib = int(0.7 * len(firm_a_accts))
    calib_accts = set(firm_a_accts[:n_calib])
    heldout_accts = set(firm_a_accts[n_calib:])
    print(f'    Calibration fold CPAs: {len(calib_accts)}, '
          f'heldout fold CPAs: {len(heldout_accts)}')
    calib_mask = np.array([a in calib_accts for a in accts])
    heldout_mask = np.array([a in heldout_accts for a in accts])
    print(f'    Calibration sigs: {int(calib_mask.sum())}, '
          f'heldout sigs: {int(heldout_mask.sum())}')

    # Derive per-signature thresholds from calibration fold:
    # - Firm A cos median, 1st-pct, 5th-pct
    # - Firm A dHash median, 95th-pct
    calib_cos = cos[calib_mask]
    calib_dh = dh[calib_mask]
    calib_dh = calib_dh[calib_dh >= 0]
    cal_cos_med = float(np.median(calib_cos))
    cal_cos_p1 = float(np.percentile(calib_cos, 1))
    cal_cos_p5 = float(np.percentile(calib_cos, 5))
    cal_dh_med = float(np.median(calib_dh))
    cal_dh_p95 = float(np.percentile(calib_dh, 95))
    print(f'    Calib Firm A cos: median={cal_cos_med:.4f}, '
          f'P1={cal_cos_p1:.4f}, P5={cal_cos_p5:.4f}')
    print(f'    Calib Firm A dHash: median={cal_dh_med:.2f}, '
          f'P95={cal_dh_p95:.2f}')

    # Apply canonical rules to heldout fold
    held_cos = cos[heldout_mask]
    held_dh = dh[heldout_mask]
    held_dh_valid = held_dh >= 0
    held_rates = {}
    for tt in [0.837, 0.945, 0.95, cal_cos_p5]:
        rate = float(np.mean(held_cos > tt))
        k = int(np.sum(held_cos > tt))
        lo, hi = wilson_ci(k, len(held_cos))
        held_rates[f'cos>{tt:.4f}'] = {
            'rate': rate, 'k': k, 'n': int(len(held_cos)),
            'wilson95': [float(lo), float(hi)],
        }
    for tt in [5, 8, 15, cal_dh_p95]:
        rate = float(np.mean(held_dh[held_dh_valid] <= tt))
        k = int(np.sum(held_dh[held_dh_valid] <= tt))
        lo, hi = wilson_ci(k, int(held_dh_valid.sum()))
        held_rates[f'dh_indep<={tt:.2f}'] = {
            'rate': rate, 'k': k, 'n': int(held_dh_valid.sum()),
            'wilson95': [float(lo), float(hi)],
        }

    # Dual rule
    dual_mask = (held_cos > 0.95) & (held_dh >= 0) & (held_dh <= 8)
    rate = float(np.mean(dual_mask))
    k = int(dual_mask.sum())
    lo, hi = wilson_ci(k, len(dual_mask))
    held_rates['cos>0.95 AND dh<=8'] = {
        'rate': rate, 'k': k, 'n': int(len(dual_mask)),
        'wilson95': [float(lo), float(hi)],
    }
    print('    Heldout Firm A rates:')
    for k, v in held_rates.items():
        print(f'      {k}: {v["rate"]*100:.2f}% '
              f'[{v["wilson95"][0]*100:.2f}, {v["wilson95"][1]*100:.2f}]')
    # --- Save ---
    summary = {
        'generated_at': datetime.now().isoformat(),
        'n_signatures': len(rows),
        'n_firm_a': int(firm_a_mask.sum()),
        'n_pixel_identical': n_pix,
        'n_inter_cpa_negatives': len(inter_cos),
        'inter_cpa_cos_stats': {
            'mean': float(inter_cos.mean()),
            'p95': float(np.percentile(inter_cos, 95)),
            'p99': float(np.percentile(inter_cos, 99)),
            'max': float(inter_cos.max()),
        },
        'cosine_eer': eer,
        'canonical_thresholds': canonical,
        'held_out_firm_a': {
            'calibration_cpas': len(calib_accts),
            'heldout_cpas': len(heldout_accts),
            'calibration_sig_count': int(calib_mask.sum()),
            'heldout_sig_count': int(heldout_mask.sum()),
            'calib_cos_median': cal_cos_med,
            'calib_cos_p1': cal_cos_p1,
            'calib_cos_p5': cal_cos_p5,
            'calib_dh_median': cal_dh_med,
            'calib_dh_p95': cal_dh_p95,
            'heldout_rates': held_rates,
        },
    }
    with open(OUT / 'expanded_validation_results.json', 'w') as f:
        json.dump(summary, f, indent=2, ensure_ascii=False)
    print(f'\nJSON: {OUT / "expanded_validation_results.json"}')
    # Markdown
    md = [
        '# Expanded Validation Report',
        f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
        '',
        '## 1. Inter-CPA Negative Anchor',
        '',
        f'* N random cross-CPA pairs sampled: {len(inter_cos):,}',
        f'* Inter-CPA cosine: mean={inter_cos.mean():.4f}, '
        f'P95={np.percentile(inter_cos, 95):.4f}, '
        f'P99={np.percentile(inter_cos, 99):.4f}, max={inter_cos.max():.4f}',
        '',
        'This anchor is a meaningful negative set because inter-CPA pairs',
        "cannot arise from legitimate reuse of a single signer's image.",
        '',
        '## 2. Cosine Threshold Sweep (pos=pixel-identical, neg=inter-CPA)',
        '',
        f"EER threshold: {eer['threshold']:.4f}, EER: {eer['eer']:.4f}",
        '',
        '| Threshold | Precision | Recall | F1 | FAR | FAR 95% CI | FRR |',
        '|-----------|-----------|--------|----|-----|------------|-----|',
    ]
    for k, m in canonical.items():
        md.append(
            f"| {m['threshold']:.3f} | {m['precision']:.3f} | "
            f"{m['recall']:.3f} | {m['f1']:.3f} | {m['far']:.4f} | "
            f"[{m['far_ci95'][0]:.4f}, {m['far_ci95'][1]:.4f}] | "
            f"{m['frr']:.4f} |"
        )
    md += [
        '',
        '## 3. Held-out Firm A 70/30 Validation',
        '',
        '* Firm A CPAs randomly split by CPA (not by signature) into',
        f'  calibration (n={len(calib_accts)}) and heldout '
        f'(n={len(heldout_accts)}).',
        f'* Calibration Firm A signatures: {int(calib_mask.sum()):,}. '
        f'Heldout signatures: {int(heldout_mask.sum()):,}.',
        '',
        '### Calibration-fold anchor statistics (for thresholds)',
        '',
        f'* Firm A cosine: median = {cal_cos_med:.4f}, '
        f'P1 = {cal_cos_p1:.4f}, P5 = {cal_cos_p5:.4f}',
        f'* Firm A dHash (independent min): median = {cal_dh_med:.2f}, '
        f'P95 = {cal_dh_p95:.2f}',
        '',
        '### Heldout-fold capture rates (with Wilson 95% CIs)',
        '',
        '| Rule | Heldout rate | Wilson 95% CI | k / n |',
        '|------|--------------|---------------|-------|',
    ]
    for k, v in held_rates.items():
        md.append(
            f"| {k} | {v['rate']*100:.2f}% | "
            f"[{v['wilson95'][0]*100:.2f}%, {v['wilson95'][1]*100:.2f}%] | "
            f"{v['k']}/{v['n']} |"
        )
    md += [
        '',
        '## Interpretation',
        '',
        'The inter-CPA negative anchor (N ~50,000) gives tight confidence',
        'intervals on FAR at each threshold, addressing the small-negative',
        'anchor limitation of Script 19 (n=35).',
        '',
        'The 70/30 Firm A split breaks the circular-validation concern of',
        'using the same calibration anchor for threshold derivation and',
        'validation. Calibration-fold percentiles derive the thresholds;',
        'heldout-fold rates with Wilson 95% CIs show how those thresholds',
        'generalize to Firm A CPAs that did not contribute to calibration.',
    ]
    (OUT / 'expanded_validation_report.md').write_text('\n'.join(md),
                                                      encoding='utf-8')
    print(f'Report: {OUT / "expanded_validation_report.md"}')


if __name__ == '__main__':
    main()