F3 Score (strict): 39.0
F2 Score (strict): 41.0
Recall (strict): 37.1%
Precision (strict): 70.7%
Repos Scored: 22
Model: MiniMax-M2.7
Total Cost: $1.11
Avg Latency: 119s
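The headline F2 and F3 scores above are consistent with the standard F-beta formula applied to the strict precision and recall. The sketch below assumes that standard definition (the dashboard does not state its formula), and the function name `f_beta` is illustrative only.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision.
    Assumption: the dashboard uses this textbook definition.
    """
    if precision <= 0 or recall <= 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Headline strict precision and recall from the summary, in percent.
precision, recall = 70.7, 37.1

f2 = round(f_beta(precision, recall, 2), 1)
f3 = round(f_beta(precision, recall, 3), 1)
print(f2, f3)  # 41.0 39.0
```

Not every per-repository row reproduces from its displayed precision and recall this way, so those rows may be aggregated differently (for example, averaged across runs).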
Per-Repository Breakdown TP / FP / FN
Per-Repository Scores
Repository                                      F2  Recall  Precision   TP   FP   FN
damn-vulnerable-flask-application             45.4    43.3       70.0    6    4    8
damn-vulnerable-graphql-application           39.0    37.1       49.9   13   14   22
dsvpwa                                        55.7    51.6       86.6   16    2   16
dsvw                                          58.6    55.6       75.1   15    5   12
dvblab                                        53.0    52.3       56.2   12    9   10
dvpwa                                         26.5    24.2       52.4    5    7   17
extremely-vulnerable-flask-app                40.7    35.7       90.9   10    1   18
flask-xss                                     41.1    36.9       75.5   10    3   18
intentionally-vulnerable-python-application   60.6    57.1       82.2    4    1    3
lets-be-bad-guys                              47.3    43.8       70.1   10    4   14
pygoat                                        41.6    38.1       66.4   27   13   43
python-app                                    46.1    43.3       63.9    9    5   11
python-insecure-app                           42.3    37.5       87.5    3    0    5
pythonssti                                    55.6    50.0      100.0    1    0    1
threatbyte                                    32.7    29.2       63.6    7    4   17
vampi                                         67.3    66.7       71.2    9    4    4
vfapi                                         59.9    61.1       70.0    6    4    4
vulnerable-api                                57.8    57.1       68.5    8    5    6
vulnerable-flask-app                          38.1    36.7       46.1    7    8   13
vulnerable-tornado-app                        46.2    42.9       66.7    6    3    8
vulnpy                                        65.0    61.1       89.7   48    5   30
vulpy                                         39.9    35.2       86.4   19    3   35
Detection by Severity
critical: 78% (TP 57 / FP 0 / FN 16)
high: 56% (TP 111 / FP 0 / FN 87)
medium: 34% (TP 81 / FP 0 / FN 155)
low: 5% (TP 3 / FP 0 / FN 56)
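The severity percentages above match TP / (TP + FN) for each bucket, that is, recall per severity level. The sketch below recomputes them from the listed counts. The helper name `detection_rate` is hypothetical, and reading the percentage as recall is an inference from the numbers rather than something the dashboard states. FP is shown as 0 in every bucket, so it plays no part here.

```python
# (TP, FN) per severity, copied from the "Detection by Severity" figures.
counts = {
    "critical": (57, 16),
    "high": (111, 87),
    "medium": (81, 155),
    "low": (3, 56),
}


def detection_rate(tp: int, fn: int) -> float:
    """Share of ground-truth findings that were detected, in percent."""
    total = tp + fn
    return 100.0 * tp / total if total else 0.0


rates = {sev: round(detection_rate(tp, fn)) for sev, (tp, fn) in counts.items()}
print(rates)  # {'critical': 78, 'high': 56, 'medium': 34, 'low': 5}
```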
LLM Operational Metrics
Model & Prompt
Model: MiniMax-M2.7
Prompt Version: sha256:828b00245b42
Prompt Label: default-v1
Token Usage (avg per run)
Input: 30,099
Output: 5,274
Total: 168,656
Cost
Total: $1.11
Per Repo: $0.02
Per 100 LOC: $0.0075
Reliability
Success Rate: 69%
Timeouts: 0
JSON Repair Rate: 6%
Avg Latency: 118.9s
CWE Family Heatmap (recall by repository)
Repository | Broken Access Co.. | Code Injection /.. | Command / OS Inj.. | Denial of Service | Hardcoded Creden.. | HTTP Header Inje.. | Insecure Deseria.. | Missing Authenti.. | Open Redirect | Other | Path Traversal | Security Misconf.. | Sensitive Data E.. | SQL Injection | Server-Side Requ.. | XPath Injection | Cross-Site Scrip.. | XML External Ent..
damn-vulnerable-flask-application 100% 100% 100% 0% 0% 0% 0% 100%
damn-vulnerable-graphql-application 100% 100% 0% 0% 0% 18% 100% 0% 40% 100% 100% 0%
dsvpwa 100% 0% 100% 0% 50% 20% 50% 33% 0% 100% 0% 67%
dsvw 100% 100% 100% 100% 0% 100% 100% 0% 100% 0% 0% 100% 100% 100% 25% 100%
dvblab 100% 25% 100% 100% 38% 0% 0% 100%
dvpwa 67% 11% 33% 0% 100% 0%
extremely-vulnerable-flask-app 100% 0% 33% 100% 100% 17% 0% 0% 100% 100% 20%
flask-xss 100% 50% 33% 100% 0% 100% 33% 0% 56%
intentionally-vulnerable-python-application 100% 100% 100% 0% 100% 0% 0%
lets-be-bad-guys 100% 67% 100% 100% 100% 43% 100% 0% 0% 0% 0%
pygoat 60% 100% 67% 44% 67% 50% 4% 100% 0% 20% 100% 100% 0% 100%
python-app 100% 50% 100% 0% 17% 50% 0% 100% 0% 100%
python-insecure-app 50% 0% 0% 0% 100% 50%
pythonssti 100% 0%
threatbyte 0% 50% 50% 11% 100% 50% 0% 100% 100% 0%
vampi 100% 0% 0% 0% 80% 100% 100%
vfapi 0% 0% 0% 100%
vulnerable-api 100% 100% 0% 33% 50% 0% 100% 50% 100%
vulnerable-flask-app 0% 50% 100% 14% 0% 0% 100% 0% 100%
vulnerable-tornado-app 100% 100% 0% 20% 50% 0% 0% 100% 100%
vulnpy 100% 67% 100% 100% 0% 88% 0% 100% 100% 100% 83% 100%
vulpy 50% 0% 62% 0% 23% 0% 0% 0% 100% 50%
CWE Family Detection aggregate