F3 Score (strict): 45.8
F2 Score (strict): 47.8
Recall (strict): 43.9%
Precision (strict): 75.3%
Repos Scored: 22
Model: glm-5
Total Cost: $6.55
Avg Latency: 409s
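The F2 and F3 figures above weight recall more heavily than precision. The dashboard does not state how it derives them; the following is a minimal sketch, assuming the standard F-beta definition applied to the strict precision and recall shown above. The function name `f_beta` is illustrative and not taken from the benchmark's code.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: beta > 1 weights recall above precision."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    # Define the score as 0 when both inputs are 0 to avoid dividing by zero.
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


# Aggregate strict precision and recall as displayed on the dashboard.
precision, recall = 0.753, 0.439

f3 = round(100 * f_beta(precision, recall, beta=3), 1)
f2 = round(100 * f_beta(precision, recall, beta=2), 1)
print(f"F3 = {f3}, F2 = {f2}")
```

With these rounded inputs the sketch gives F3 = 45.8, matching the dashboard, and F2 = 47.9 against the displayed 47.8. The small gap may come from the dashboard working with unrounded values or a different aggregation, but that is an assumption.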
Per-Repository Breakdown TP / FP / FN
Per-Repository Scores
Repository F2 Recall Precision TP FP FN
damn-vulnerable-flask-application 57.0 53.3 79.8 8 2 7
damn-vulnerable-graphql-application 42.1 42.9 39.5 15 23 20
djangoat 35.2 31.3 69.1 16 7 34
dsvpwa 69.2 65.6 88.8 21 3 11
dsvw 62.8 58.0 94.0 16 1 11
dvblab 65.2 61.4 87.1 14 2 8
dvpwa 37.2 34.1 61.9 8 6 14
flask-xss 53.0 48.8 80.3 14 3 14
insecure-web 70.9 74.1 62.9 7 4 2
intentionally-vulnerable-python-application 61.6 64.3 52.8 4 4 2
lets-be-bad-guys 55.6 51.4 83.5 12 3 12
pygoat 38.4 35.2 73.7 25 14 45
python-app 65.7 63.3 78.0 13 4 7
python-insecure-app 55.6 50.0 100.0 4 0 4
pythonssti 55.6 50.0 100.0 1 0 1
threatbyte 48.0 44.4 73.6 11 4 13
vfapi 83.1 88.9 70.2 8 4 1
vulnerable-api 60.3 57.1 77.6 8 2 6
vulnerable-flask-app 69.0 65.0 92.4 13 1 7
vulnerable-tornado-app 64.1 61.9 77.0 9 3 5
vulnpy 73.6 70.1 93.3 55 4 23
vulpy 30.0 25.9 81.2 14 3 40
Detection by Severity
critical: 90% (TP 69 / FP 0 / FN 8)
high: 57% (TP 119 / FP 0 / FN 89)
medium: 39% (TP 95 / FP 0 / FN 151)
low: 26% (TP 14 / FP 0 / FN 39)
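The per-severity percentages agree with recall computed as TP / (TP + FN) from the counts shown. A minimal sketch, assuming that definition (the dashboard does not spell it out); the dictionary and function names are illustrative.

```python
# (TP, FN) counts per severity, copied from the dashboard.
severity_counts = {
    "critical": (69, 8),
    "high": (119, 89),
    "medium": (95, 151),
    "low": (14, 39),
}


def recall_pct(tp: int, fn: int) -> int:
    """Recall as a whole-number percentage; 0 when there are no positives."""
    total = tp + fn
    return 0 if total == 0 else round(100 * tp / total)


severity_recall = {
    name: recall_pct(tp, fn) for name, (tp, fn) in severity_counts.items()
}
for name, pct in severity_recall.items():
    print(f"{name}: {pct}%")
```

This reproduces the displayed 90%, 57%, 39%, and 26%.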
LLM Operational Metrics
Model & Prompt
Model: glm-5
Prompt Version: sha256:828b00245b42
Prompt Label: default-v1
Token Usage (avg per run)
Input: 57,126
Output: 4,789
Total: 123,606
Cost
Total: $6.55
Per Repo: $0.11
Per 100 LOC: $0.0342
Reliability
Success Rate: 81%
Timeouts: 9
JSON Repair Rate: 1%
Avg Latency: 409.3s
CWE Family Heatmap (recall by repository)
Repository Broken Access Co.. Code Injection /.. Command / OS Inj.. Denial of Service Hardcoded Creden.. HTTP Header Inje.. Insecure Deseria.. Missing Authenti.. Open Redirect Other Path Traversal Security Misconf.. Sensitive Data E.. SQL Injection Server-Side Requ.. XPath Injection Cross-Site Scrip.. XML External Ent..
damn-vulnerable-flask-application 100% 100% 100% 100% 33% 0% 0% 100%
damn-vulnerable-graphql-application 100% 100% 0% 100% 17% 18% 100% 0% 60% 100% 100% 0%
djangoat 50% 100% 100% 33% 100% 14% 0% 15% 50% 0% 0% 100% 14%
dsvpwa 100% 50% 100% 0% 100% 40% 50% 100% 0% 100% 100% 67%
dsvw 100% 100% 100% 100% 0% 100% 100% 0% 100% 0% 0% 100% 100% 100% 50% 100%
dvblab 100% 75% 100% 0% 38% 0% 100% 100%
dvpwa 67% 22% 67% 0% 100% 20%
flask-xss 0% 50% 33% 100% 38% 0% 67% 0% 56%
insecure-web 100% 100% 33% 100% 100% 100%
intentionally-vulnerable-python-application 100% 100% 100% 0% 100% 0% 0%
lets-be-bad-guys 100% 100% 100% 100% 100% 43% 100% 0% 0% 0% 0%
pygoat 0% 100% 67% 56% 100% 50% 23% 100% 0% 20% 100% 100% 0% 100%
python-app 100% 100% 100% 0% 50% 50% 0% 100% 50% 100%
python-insecure-app 100% 100% 0% 0% 0% 50%
pythonssti 100% 0%
threatbyte 0% 50% 100% 33% 100% 50% 0% 100% 100% 0%
vfapi 100% 100% 0% 100%
vulnerable-api 100% 100% 0% 67% 50% 0% 100% 50% 100%
vulnerable-flask-app 50% 50% 100% 57% 0% 25% 100% 0% 0%
vulnerable-tornado-app 100% 100% 0% 0% 100% 100% 100% 100% 0%
vulnpy 67% 67% 100% 100% 62% 88% 100% 100% 69% 100% 92% 100%
vulpy 0% 0% 25% 0% 23% 0% 0% 0% 83% 50%
CWE Family Detection aggregate