F3 Score (strict): 54.6
F2 Score (strict): 56.4
Recall (strict): 52.9%
Precision (strict): 77.1%
Repos Scored: 25
Model: kimi-k2.6
Total Cost: $6.24
Avg Latency: 603s
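The dashboard does not state how the strict F2 and F3 scores are derived. As a minimal sketch, assuming they follow the standard F-beta definition (a weighted harmonic mean of precision and recall in which recall counts beta times as much as precision), the headline figures can be checked from the reported precision and recall. The function name `f_beta` is hypothetical and not part of the dashboard.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    Assumption: the dashboard's F2/F3 use this textbook definition.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was detected
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Headline strict precision and recall, as fractions.
precision, recall = 0.771, 0.529

f2 = round(100 * f_beta(precision, recall, beta=2), 1)
f3 = round(100 * f_beta(precision, recall, beta=3), 1)
print(f2, f3)  # 56.4 54.6
```

Under that assumption the computed values agree with the reported F2 of 56.4 and F3 of 54.6. Per-repository rows may not reproduce exactly this way if they are averaged over several runs.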
Per-Repository Breakdown TP / FP / FN
Per-Repository Scores
Repository                                       F2  Recall  Precision   TP   FP   FN
damn-vulnerable-flask-application              76.4    75.6       82.6   11    2    4
damn-vulnerable-graphql-application            46.4    42.9       78.3   15    6   20
djangoat                                       44.0    41.0       62.1   20   12   30
dsvw                                           74.2    70.4       95.0   19    1    8
dvblab                                         49.5    45.5       86.6   10    2   12
dvpwa                                          58.2    56.8       65.8   12    7   10
extremely-vulnerable-flask-app                 58.5    54.8       81.4   15    4   13
flask-xss                                      40.3    35.7       83.3   10    2   18
insecure-web                                   71.5    70.4       76.4    6    2    3
intentionally-vulnerable-python-application    72.5    71.4       79.4    5    1    2
lets-be-bad-guys                               67.1    62.5       95.7   15    1    9
owasp-web-playground                           68.8    65.5       86.2   19    3   10
pygoat                                         48.1    44.8       70.5   31   13   39
python-app                                     33.9    32.5       40.6    6   10   14
python-insecure-app                            57.5    54.2       76.7    4    1    4
pythonssti                                     85.2    83.3      100.0    2    0    0
threatbyte                                     61.4    58.3       77.8   14    4   10
vampi                                          70.5    69.2       76.2    9    3    4
vfapi                                          75.0    72.2      100.0    6    0    2
vulnerable-api                                 64.8    60.7       89.4    8    1    6
vulnerable-flask-app                           57.6    58.3       66.4   12    8    8
vulnerable-python-apps                         41.5    37.9       72.4    8    3   14
vulnerable-tornado-app                         56.1    52.4       78.5    7    2    7
vulnpy                                         87.3    88.5       84.3   69   14    9
vulpy                                          49.2    44.5       87.0   24    4   30
Detection by Severity
critical: 85% (TP 70 / FP 0 / FN 12)
high: 59% (TP 137 / FP 1 / FN 96)
medium: 50% (TP 134 / FP 1 / FN 134)
low: 20% (TP 12 / FP 0 / FN 49)
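The per-severity percentages are not labeled, but they are consistent with recall, that is TP / (TP + FN), rounded to a whole percent. The sketch below, which assumes that interpretation, recomputes them from the counts shown above. The names `counts` and `recall_pct` are hypothetical.

```python
# (TP, FP, FN) per severity, copied from the dashboard.
counts = {
    "critical": (70, 0, 12),
    "high": (137, 1, 96),
    "medium": (134, 1, 134),
    "low": (12, 0, 49),
}


def recall_pct(tp: int, fn: int) -> int:
    """Recall as a whole-number percentage; 0 when there are no positives."""
    total = tp + fn
    return round(100 * tp / total) if total else 0


# FP is not needed for recall, so it is unpacked and ignored.
recall_by_severity = {
    severity: recall_pct(tp, fn) for severity, (tp, _fp, fn) in counts.items()
}
print(recall_by_severity)
# {'critical': 85, 'high': 59, 'medium': 50, 'low': 20}
```

The recomputed values match the displayed 85%, 59%, 50%, and 20%, which supports reading the severity percentages as recall rather than precision.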
LLM Operational Metrics
Model & Prompt
Model: kimi-k2.6
Prompt Version: sha256:3481f1432c23
Prompt Label:
Token Usage (avg per run)
Input: 26,815
Output: 17,762
Total: 291,901
Cost
Total: $6.24
Per Repo: $0.10
Per 100 LOC: $0.0321
Reliability
Success Rate: 77%
Timeouts: 2
JSON Repair Rate: 6%
Avg Latency: 602.6s
CWE Family Heatmap (recall by repository)
Columns: Repository; Broken Access Co..; Code Injection /..; Command / OS Inj..; Denial of Service; Hardcoded Creden..; HTTP Header Inje..; Insecure Deseria..; Missing Authenti..; Open Redirect; Other; Path Traversal; Security Misconf..; Sensitive Data E..; SQL Injection; Server-Side Requ..; XPath Injection; Cross-Site Scrip..; XML External Ent..
damn-vulnerable-flask-application 100% 100% 100% 100% 33% 0% 100% 100%
damn-vulnerable-graphql-application 100% 100% 0% 0% 50% 18% 100% 0% 60% 100% 100% 100%
djangoat 0% 100% 100% 50% 100% 29% 100% 0% 100% 100% 0% 100% 43%
dsvw 100% 100% 100% 100% 100% 100% 100% 0% 100% 50% 50% 100% 100% 100% 100% 100%
dvblab 50% 25% 100% 0% 0% 0% 100% 50%
dvpwa 67% 22% 67% 100% 100% 80%
extremely-vulnerable-flask-app 100% 0% 100% 100% 100% 33% 0% 0% 100% 100% 60%
flask-xss 0% 50% 0% 100% 12% 100% 33% 0% 56%
insecure-web 100% 100% 0% 100% 100% 100%
intentionally-vulnerable-python-application 100% 100% 100% 50% 100% 0% 100%
lets-be-bad-guys 100% 33% 100% 100% 100% 29% 100% 0% 50% 0% 100%
owasp-web-playground 100% 50% 100% 67% 56% 0% 25% 67% 100% 100% 0%
pygoat 40% 75% 67% 100% 100% 50% 41% 100% 0% 20% 100% 100% 0% 100%
python-app 100% 100% 100% 100% 50% 100% 0% 100% 0% 100%
python-insecure-app 50% 100% 0% 100% 0% 50%
pythonssti 100% 0%
threatbyte 100% 100% 50% 44% 100% 50% 0% 100% 100% 67%
vampi 100% 0% 100% 50% 40% 100% 100%
vfapi 0% 0% 0% 80%
vulnerable-api 100% 100% 0% 67% 50% 50% 100% 50% 100%
vulnerable-flask-app 0% 100% 100% 43% 0% 75% 100% 0% 0%
vulnerable-python-apps 50% 0% 0% 60% 0% 0% 0% 100% 50%
vulnerable-tornado-app 100% 100% 0% 20% 100% 100% 100% 100% 0%
vulnpy 100% 100% 69% 100% 0% 100% 0% 100% 100% 100% 83% 100%
vulpy 50% 0% 62% 50% 27% 0% 0% 0% 100% 50%
CWE Family Detection aggregate