F3 Score (strict): 48.6
F2 Score (strict): 50.3
Recall (strict): 46.9%
Precision (strict): 71.2%
Repos Scored: 25
Model: claude-opus-4-7
Total Cost: $32.40
Avg Latency: 76s
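The dashboard does not say how the F2 and F3 scores are derived. A minimal sketch, assuming they are the standard F-beta score (a weighted harmonic mean in which beta > 1 weights recall more heavily than precision) applied to the strict precision and recall shown above. The function name `f_beta` is illustrative and not taken from the benchmark's code.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score; beta > 1 favors recall over precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was detected
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Headline strict precision and recall from the dashboard, in percent.
precision, recall = 71.2, 46.9

f2 = f_beta(precision, recall, beta=2)
f3 = f_beta(precision, recall, beta=3)

print(f"F2 = {f2:.1f}")  # F2 = 50.3
print(f"F3 = {f3:.1f}")  # F3 = 48.6
```

Under this assumption the formula reproduces both headline scores. Per-repository F2 values may differ slightly from what the same formula gives on the rounded per-repository precision and recall, which could be explained by averaging over several runs, though the dashboard does not confirm this.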
Per-Repository Breakdown (TP / FP / FN)
Per-Repository Scores
Repository                                       F2  Recall  Precision  TP  FP  FN
damn-vulnerable-flask-application              62.5    62.2       64.6   9   5   6
damn-vulnerable-graphql-application            40.8    37.1       67.1  13   6  22
djangoat                                       42.3    39.3       62.6  20  13  30
dsvpwa                                         63.9    62.5       70.7  20   8  12
dsvw                                           73.3    70.4       88.3  19   3   8
dvblab                                         65.0    62.1       80.7  14   3   8
dvpwa                                          39.4    34.8       82.5   8   2  14
extremely-vulnerable-flask-app                 53.8    50.0       77.8  14   4  14
flask-xss                                      27.0    25.0       84.2   7   3  21
insecure-web                                   73.0    74.1       69.4   7   3   2
intentionally-vulnerable-python-application    81.6    81.0       84.9   6   1   1
lets-be-bad-guys                               52.1    48.6       74.0  12   4  12
owasp-web-playground                           60.1    58.6       70.1  17   8  12
pygoat                                         58.8    57.9       65.6  40  23  30
python-app                                     72.5    72.5       72.6  14   6   6
python-insecure-app                            58.5    54.2       87.8   4   1   4
pythonssti                                    100.0   100.0      100.0   2   0   0
threatbyte                                     56.2    52.8       75.7  13   4  11
vampi                                          72.5    71.8       75.6   9   3   4
vfapi                                          83.9    88.9       68.7   8   4   1
vulnerable-api                                 71.9    69.0       86.7  10   2   4
vulnerable-flask-app                           54.0    51.7       66.9  10   5  10
vulnerable-python-apps                         70.2    69.7       73.6  15   5   7
vulnerable-tornado-app                         73.2    71.4       82.8  10   2   4
vulpy                                          33.1    29.6       61.5  16  10  38
Detection by Severity
critical: 91% (TP 69 / FP 0 / FN 7)
high: 61% (TP 131 / FP 1 / FN 84)
medium: 43% (TP 109 / FP 0 / FN 142)
low: 30% (TP 17 / FP 0 / FN 39)
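The dashboard does not define the per-severity percentage. A short sketch, assuming it is recall within each severity bucket, that is TP / (TP + FN), rounded to a whole percent. The names `severity_counts` and `detection_rate` are illustrative only.

```python
# (TP, FN) counts per severity, copied from the figures above.
severity_counts = {
    "critical": (69, 7),
    "high": (131, 84),
    "medium": (109, 142),
    "low": (17, 39),
}


def detection_rate(tp: int, fn: int) -> float:
    """Percentage of known findings that were detected (recall)."""
    total = tp + fn
    return 100.0 * tp / total if total else 0.0


rates = {sev: round(detection_rate(tp, fn)) for sev, (tp, fn) in severity_counts.items()}

for sev, pct in rates.items():
    print(f"{sev}: {pct}%")
# critical: 91%
# high: 61%
# medium: 43%
# low: 30%
```

Under this assumption the computed values match the displayed percentages for all four severities. False positives do not enter the calculation, which is consistent with the FP counts being reported separately.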
LLM Operational Metrics
Model & Prompt
Model: claude-opus-4-7
Prompt Version: sha256:3481f1432c23
Prompt Label:
Token Usage (avg per run)
Input: 14
Output: 5,440
Total: 287,918
Cost
Total: $32.40
Per Repo: $0.49
Per 100 LOC: $0.1844
Reliability
Success Rate: 85%
Timeouts: 0
JSON Repair Rate: 0%
Avg Latency: 76.1s
CWE Family Heatmap (recall by repository)
Repository Broken Access Co.. Code Injection /.. Command / OS Inj.. Denial of Service Hardcoded Creden.. HTTP Header Inje.. Insecure Deseria.. Missing Authenti.. Open Redirect Other Path Traversal Security Misconf.. Sensitive Data E.. SQL Injection Server-Side Requ.. XPath Injection Cross-Site Scrip.. XML External Ent..
damn-vulnerable-flask-application 100% 100% 100% 0% 67% 0% 50% 100%
damn-vulnerable-graphql-application 50% 100% 0% 100% 33% 18% 100% 0% 20% 100% 100% 0%
djangoat 0% 100% 100% 67% 100% 29% 0% 23% 50% 50% 25% 100% 14%
dsvpwa 100% 0% 100% 0% 100% 50% 100% 67% 0% 100% 100% 67%
dsvw 100% 100% 100% 100% 100% 100% 100% 0% 100% 100% 50% 100% 100% 100% 50% 100%
dvblab 100% 75% 100% 0% 25% 100% 0% 100%
dvpwa 67% 22% 100% 0% 100% 0%
extremely-vulnerable-flask-app 100% 0% 67% 100% 100% 33% 0% 0% 100% 100% 60%
flask-xss 0% 50% 33% 100% 50% 100% 33% 0% 44%
insecure-web 100% 100% 0% 100% 100% 100%
intentionally-vulnerable-python-application 100% 100% 100% 50% 100% 0% 100%
lets-be-bad-guys 100% 100% 100% 100% 100% 43% 100% 33% 0% 0% 0%
owasp-web-playground 100% 50% 100% 67% 56% 100% 50% 100% 100% 0% 0%
pygoat 100% 100% 67% 89% 100% 75% 46% 100% 67% 60% 100% 100% 0% 100%
python-app 100% 100% 100% 100% 67% 100% 0% 100% 50% 100%
python-insecure-app 100% 100% 0% 0% 0% 50%
pythonssti 100% 100%
threatbyte 100% 50% 100% 22% 100% 50% 50% 100% 100% 33%
vampi 100% 0% 0% 0% 100% 100% 100%
vfapi 100% 100% 0% 100%
vulnerable-api 100% 100% 0% 67% 50% 50% 100% 50% 100%
vulnerable-flask-app 50% 50% 100% 57% 0% 25% 100% 0% 0%
vulnerable-python-apps 100% 50% 100% 60% 100% 0% 100% 100% 100%
vulnerable-tornado-app 100% 100% 0% 20% 100% 100% 100% 100% 100%
vulpy 0% 0% 50% 0% 14% 50% 50% 17% 83% 25%
CWE Family Detection (aggregate)