🗺️
SF LLM Spatial Reasoning Eval
An interactive dashboard comparing how accurately different LLMs identify San Francisco neighborhood boundaries against authoritative ground truth sources.
Launch Streamlit Dashboard →
Models
9 LLMs tested
Places
25 SF neighborhoods
Levels
5 levels of geospatial computation