500 Cities Project Data - Things to Keep in Mind
Updated: May 26, 2020
The 500 Cities Project was launched in 2015 (with an updated data release in 2017) by the Robert Wood Johnson Foundation and the Centers for Disease Control and Prevention (CDC) in partnership with the CDC Foundation. The project provides city- and census-tract-level data for three categories of indicators across the 500 largest cities in the US: (1) Health Outcomes, (2) Unhealthy Behaviors, and (3) Preventive Service Use. The project’s goal is to provide public health departments, professionals, and organizations (including hospital systems) with large-scale, local-level data to help identify emerging health problems and inform targeted community health improvement interventions.
The 500 Cities Project data can help us identify neighborhoods with unique needs.
Prior to the 500 Cities Project, city- and county-level data had been available, but large-scale neighborhood (census tract) data were not available to guide targeted public health interventions to smaller subgroups within a city. Now, local health departments can use the 500 Cities Project data to identify neighborhoods with unique needs. For example, those with high diabetes rates can be targeted for diabetes prevention and management programs. Also, because the 27 indicators in the dataset are consistent across the 500 cities, both individual cities and groups of cities can use the data to plan community health promotion efforts.
However, there are important limitations to the dataset.
Because sufficient primary data are not available for every neighborhood, hyper-localized numbers in the 500 Cities dataset are actually small-area estimates, calculated using statistical data modeling. In other words, more reliable data from a larger population (e.g. state level) are used to infer conclusions about a smaller population (e.g. neighborhood), and these are built into a statistical model for calculation of neighborhood-level numbers.
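To make the idea concrete, here is a deliberately simplified sketch of how rates from a larger population can be combined with a neighborhood's demographic makeup to produce a neighborhood-level estimate. It is not the 500 Cities Project's actual model, which is considerably more elaborate. All rates, tract names, and population counts below are hypothetical and chosen only for illustration.

```python
# Hypothetical diabetes prevalence by age group, as might be measured
# in a larger, more reliable survey (e.g., at the state level).
state_rate_by_age = {"18-44": 0.04, "45-64": 0.14, "65+": 0.24}

# Hypothetical adult population counts by age group for two census tracts.
tract_population = {
    "tract_A": {"18-44": 2000, "45-64": 1000, "65+": 500},
    "tract_B": {"18-44": 800, "45-64": 1200, "65+": 1500},
}


def estimate_prevalence(pop_by_age, rate_by_age):
    """Weight the larger-population rates by the tract's age composition."""
    expected_cases = sum(pop_by_age[group] * rate_by_age[group] for group in pop_by_age)
    total_population = sum(pop_by_age.values())
    return expected_cases / total_population


# One modeled estimate per tract; no one in the tract was actually surveyed.
estimates = {
    tract: estimate_prevalence(population, state_rate_by_age)
    for tract, population in tract_population.items()
}

for tract, value in estimates.items():
    print(f"{tract}: {value:.1%} (modeled estimate)")
```

In this sketch the two tracts differ only because their age mix differs. Any local factor that is not in the model, such as a neighborhood program, cannot show up in the result.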
Generally, small-area estimates are helpful for “telling a story” about communities and neighborhoods. However, users of these data must remember that they are estimates with the potential for large margins of error, rather than actual counts from the neighborhood. Additionally, the 500 Cities data should not be used to measure the impact of a community intervention, because local health promotion programming is not incorporated into the statistical models. Users might be tempted to look at diabetes rates from the 500 Cities data over time to determine whether their diabetes management programming is effective, but the data cannot realistically show whether any change in diabetes rates resulted from a particular program.
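As a rough illustration of why year-to-year comparisons deserve caution, the sketch below checks whether the uncertainty ranges around two estimates overlap. The `Estimate` class, the `intervals_overlap` helper, and all of the numbers are hypothetical. An overlap check is only a coarse screen, not a formal statistical test, and even a clear difference would not show that a particular program caused it.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A modeled prevalence (percent) with the bounds of its uncertainty range."""
    value: float
    lower: float
    upper: float


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """Return True when the two uncertainty ranges share any values."""
    return a.lower <= b.upper and b.lower <= a.upper


# Hypothetical diabetes estimates for one tract from two data releases.
earlier = Estimate(value=12.5, lower=10.1, upper=15.2)
later = Estimate(value=11.4, lower=9.0, upper=14.1)

if intervals_overlap(earlier, later):
    print("Ranges overlap: the apparent change may be noise in the estimates.")
else:
    print("Ranges do not overlap, but the cause of the change is still unknown.")
```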
No large-scale datasets are perfect.
Every large-scale dataset has limitations. Even so, access to data at the local level is invaluable for creating and implementing targeted health promotion interventions. The key to smart use of large-scale data is understanding its limitations and taking them into consideration when interpreting and acting upon results. We recommend pairing large-scale, publicly available data with primary and/or qualitative data collected at the local level to provide a more comprehensive picture of community health needs in a particular neighborhood.
Want to talk more about our simple, effective approach to community assessment? Get in touch!