Methodology
Current Crime Trends
How the Data is Received
There are four primary ways of gathering Current Crime Trend data from agencies for the Crime Index: 1) aggregated by an agency, 2) from an agency’s publicly available Records Management System (RMS), 3) from a state UCR program, or 4) from the FBI’s Crime Data Explorer (CDE). Data is either published by a local or state agency on its website, sent to the Crime Index by the local or state agency, or published by the FBI on the CDE.
Next Steps
Data comes to the Crime Index in many formats, so once the data is acquired it is cleaned and transformed into a standardized format. See information on translating NIBRS offenses to UCR offenses under the Glossary on the Data page. From there, submissions are audited to ensure faulty data is removed. Agencies that did not report every month, or that have been found to have severely underreported recent months, are removed from the state and national samples. Aggregations (e.g., “Full Sample,” “Population Groups,” and “Regions”) only include agencies with complete data through the most recent reported month. See agencies included in the current national and state samples (i.e., with complete data) in this list, on the participation map, or in the sourcing table.
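The completeness rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the Crime Index's actual pipeline: the `(agency, month, count)` record layout and the function name `complete_agencies` are hypothetical, and the audit for severe underreporting is not modeled because its criteria are not stated here.

```python
from collections import defaultdict


def complete_agencies(records, months):
    """Return the agencies that reported a count for every month in `months`.

    records: iterable of (agency, month, count) tuples, where count is None
             when the agency did not report that month (assumed layout).
    months:  the months that must all be present, through the most recent
             reported month.
    """
    reported = defaultdict(set)
    for agency, month, count in records:
        if count is not None:
            reported[agency].add(month)

    required = set(months)
    # An agency qualifies only if its reported months cover every required month.
    return sorted(a for a, seen in reported.items() if required <= seen)


# Illustrative values only.
sample = [
    ("Agency A", "2024-01", 5),
    ("Agency A", "2024-02", 3),
    ("Agency B", "2024-01", 4),
    ("Agency B", "2024-02", None),  # missing month, so Agency B is dropped
]
kept = complete_agencies(sample, ["2024-01", "2024-02"])
```

Only agencies returned by a check of this kind would feed the aggregated samples.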
Crime Data Caveats
Current crime counts are a snapshot in time and may vary from what is published by that agency or eventually published by the FBI, depending on the reporting methodology. These differences are almost always minor and should not impact that agency’s overall trend. Agencies have until April of the following year to report crime data for a given year, leading to slightly different figures depending on the source. The purpose of the Crime Index is to accurately portray crime trends rather than precise counts of crimes, which requires accepting that crime data is often inexact and incomplete. Since national crime estimates take a long time to produce, gathering data from hundreds of cities presents an achievable workaround that highlights trends as they develop without having to wait many months for national data.
For more information on sampling, see FAQ: Why Do You Sample Agencies?
Supplementing Data
In some cases, data from agencies for recent years has been supplemented with monthly data reported to the FBI, which has been consolidated by Jacob Kaplan. Some agencies supplied data on missing months upon direct request, and missing 2021 data for many California agencies was supplied by the California Department of Justice. In a few rare cases, historical monthly reporting for an agency has been inferred so that the agency can be included in the overall sample. This is done when a few reported months are missing and either crime totals for that year are known from other sources or the totals for the missing months can be averaged from the rest of the year. For example, the totals for October and November 2019 for Corvallis, Oregon were not reported to the FBI, but the counts for those months can be inferred from the number of crimes reported for Corvallis in 2019 by the Oregon State Police. These efforts improve the quality of the sample while avoiding any changes to current or previous year crime counts.
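The two inference routes described above (a known annual total, or an average of the reported months) can be sketched as follows. This is a rough illustration under stated assumptions: the function name `infer_missing_months` is hypothetical, the numbers are made up, and splitting the unreported remainder evenly across the missing months is an assumption, since the exact allocation method is not specified here.

```python
def infer_missing_months(monthly, annual_total=None):
    """Fill missing monthly counts (None) for one agency and one year.

    monthly:      list of 12 counts, with None for months not reported.
    annual_total: the year's total from another source, if known.
    """
    reported = [c for c in monthly if c is not None]
    n_missing = len(monthly) - len(reported)
    if n_missing == 0:
        return list(monthly)
    if not reported:
        raise ValueError("no reported months to infer from")

    if annual_total is not None:
        remainder = annual_total - sum(reported)
        if remainder < 0:
            raise ValueError("annual total is below the reported months")
        # Assumption: spread the unreported remainder evenly.
        fill = remainder / n_missing
    else:
        # Fall back to the average of the months that were reported.
        fill = sum(reported) / len(reported)

    return [fill if c is None else c for c in monthly]


# Illustrative values only: October and November are missing.
counts = [10] * 9 + [None, None, 10]
with_total = infer_missing_months(counts, annual_total=130)
with_average = infer_missing_months(counts)
```

With a known annual total of 130 and 100 crimes in the ten reported months, each missing month is assigned 15. Without a total, each is assigned the reported-month average of 10.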
Clearance Rate Data
We collect clearance rate data at two levels: nationwide and agency. Each level reports monthly clearance rates and 12-month rolling clearance rates. Reporting is through the most recent month of data in the Current Crime Trends visualization. Data is collected and published using two different pipelines, as described below:
- Nationwide - These data are scraped monthly from the FBI’s Crime Data Explorer (CDE) as the United States totals for each crime type.
- Agency-level - These data are derived from three sources, much like in the Current Crime Trends section. Data is either published by a local or state agency on its website, sent to the Crime Index by the local or state agency, or published by the FBI on the CDE. The base rate crime counts are the same as the monthly crime counts used in our Current Crime Trends sample for consistency's sake.
Clearance rates are percentages, calculated for individual crime types and aggregate crime categories (Violent and Property Crimes) using the following formula:
(# of Crimes Solved / Total # of Crimes) * 100
Current clearance rate visualizations display both monthly raw percentages and 12-month rolling percentages to account for seasonality.
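As a minimal sketch, the clearance rate formula and a 12-month rolling version might look like the following in Python. The function names are hypothetical. Two choices here are assumptions rather than documented behavior: the rolling rate is computed from summed clearances over summed crimes within each window (not as an average of monthly percentages), and a window with zero crimes returns `None`.

```python
def clearance_rate(cleared, total):
    """(# of crimes solved / total # of crimes) * 100, or None if total is 0."""
    if total == 0:
        return None
    return 100 * cleared / total


def rolling_clearance_rates(cleared, totals, window=12):
    """Rolling clearance rate over `window` consecutive months.

    cleared, totals: equal-length lists of monthly counts, oldest first.
    Returns one rate per full window, so the result has
    len(totals) - window + 1 entries.
    """
    if len(cleared) != len(totals):
        raise ValueError("cleared and totals must have the same length")
    rates = []
    for end in range(window, len(totals) + 1):
        window_cleared = sum(cleared[end - window:end])
        window_total = sum(totals[end - window:end])
        rates.append(clearance_rate(window_cleared, window_total))
    return rates


# Illustrative values only, using a 3-month window for brevity.
monthly = clearance_rate(25, 100)
rolling = rolling_clearance_rates([1, 2, 3, 4], [10, 10, 10, 10], window=3)
```

Summing counts before dividing keeps low-volume months from dominating the rolling figure, which is one reason a pooled rate may be preferred over averaging monthly percentages.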
Clearance Rate Data Caveats
Not every agency accounted for in the Current Crime Trends visualization will be included in the Current Agency-level Clearance Rate visualization due to the unavailability of data, though many will be included.
Agencies will be excluded from the Current Agency-level Clearance Rate visualization if they reported no data or under-reported data. Under-reported data is defined as having cleared (or solved) no more than one property crime in a given month. This property crime threshold is included to remove agencies that likely did not report but entered a 0 rather than a generic signifier or an empty dataset.
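The threshold above can be expressed as a simple per-month check. The sketch below is illustrative only: the function name and the month-keyed dictionary are hypothetical, and it only flags months. Whether a single flagged month removes an agency entirely is not stated here, so that decision is left to the caller.

```python
def flag_clearance_months(property_cleared_by_month):
    """Flag months with no data or with under-reported property clearances.

    property_cleared_by_month: dict mapping month -> number of property
    crimes cleared, or None when nothing was reported (assumed layout).
    """
    flags = {}
    for month, cleared in property_cleared_by_month.items():
        if cleared is None:
            flags[month] = "no data"
        elif cleared <= 1:
            # Not more than 1 property crime cleared in the month.
            flags[month] = "under-reported"
    return flags


# Illustrative values only.
flags = flag_clearance_months({
    "2024-01": 12,
    "2024-02": 0,
    "2024-03": None,
    "2024-04": 1,
    "2024-05": 2,
})
```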
Note: Historical clearance data visualizations only report annualized clearance rates and will not include monthly or 12-month rolling rates.
Non-Crime Data
Population figures and regional boundaries are as defined by the FBI for consistency. See FBI regional definitions at the bottom of the glossary.
One exception is made for assigning population to historical crime and staffing counts between 1930 and 1959. These population figures are derived from CESTA Stanford, "Historical US City Populations, 1790-2010," which can be accessed independently at <https://github.com/cestastanford/historical-us-city-populations>. The original data comes from the US Census Bureau, Decennial Censuses of Population, 1790-2010, a product of the decennial census program <https://www.census.gov/programs-surveys/decennial-census.html>. This dataset compiles decennial population figures for incorporated US cities spanning 1790 to 2010. It covers approximately 7,500 incorporated places and is provided in wide format with one column per census decade. The 1930, 1940, 1950, and 1960 figures are drawn directly from published Census Bureau enumeration results, not modeled estimates or intercensal approximations.
Secondary sources (used for cities absent from Census tabulations or for filling gaps) include State Data Centers and Jan Lahmeyer's Populstat database, which itself draws primarily from national statistical agencies and census reports.
Why not just use data directly from the Census Bureau?
The Census Bureau API does not expose place-level data prior to 1990. The National Historical GIS (NHGIS) provides county and state data for 1930-1960 but does not include place-level data tables for these years. The Stanford compilation is therefore one of the few machine-readable, city-level sources covering the 1930-1959 era.
How do we calculate annual estimates?
For years between decennial censuses (e.g., 1931-1939, 1941-1949, 1951-1959), annual population estimates are derived by linear interpolation between the two surrounding census anchor years. For example:
1935 estimate = 1930 value + (5/10) * (1940 value - 1930 value)
This is the standard approach for historical intercensal estimation when no official annual estimates are available. The Census Bureau itself used linear interpolation for most historical intercensal estimates prior to the establishment of the Population Estimates Program in the 1960s.
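The interpolation above can be written as a short Python function. This is a minimal sketch: the function name and the `{census_year: population}` dictionary are hypothetical, the population values are made up, and returning `None` when an anchor census is missing is an assumption about how gaps might be handled.

```python
def interpolate_population(year, census):
    """Linearly interpolate a city's population between decennial censuses.

    census: dict mapping census years (1930, 1940, ...) to population counts.
    Returns None when either surrounding census year is unavailable.
    """
    if year in census:
        return float(census[year])

    start = year - year % 10  # preceding census year, e.g. 1935 -> 1930
    end = start + 10          # following census year, e.g. 1940
    if start not in census or end not in census:
        return None

    fraction = (year - start) / 10
    return census[start] + fraction * (census[end] - census[start])


# Illustrative values only.
city = {1930: 10_000, 1940: 12_000}
estimate_1935 = interpolate_population(1935, city)
```

For the example values, the 1935 estimate is 10,000 + (5/10) * (12,000 - 10,000) = 11,000.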
This data does have a few caveats to consider:
- Cities below ~2,500 population may not appear in the dataset. Small agencies reporting to UCR but below this threshold will have no population match and will show a blank population in the combined output.
- Name and boundary changes between census years are not systematically tracked. A city that was renamed or incorporated differently across decades may appear as a partial match or no match.
- Roughly 6-9% of pre-1960 ORIs could not be matched due to name spelling variants in the historical source data, "Township" / "Village" suffixes not present in the Stanford dataset, or genuinely small jurisdictions not being covered. Manual matching was attempted to resolve many of these unmatched place names.
- Per capita rates are not calculated for those agencies with 0 population or for those years in which there were 0 months of data reported.
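The last caveat can be illustrated with a guarded rate calculation. The sketch below is an assumption-laden example: the function name is hypothetical, and the rate is expressed per 100,000 residents, a common convention that is not confirmed here as the Crime Index's own scaling.

```python
def per_capita_rate(crime_count, population, months_reported, per=100_000):
    """Crimes per `per` residents, or None when a rate should not be computed.

    Returns None when population is zero or blank (None), or when the agency
    reported zero months of data for the year.
    """
    if not population or months_reported == 0:
        return None
    return crime_count * per / population


# Illustrative values only.
rate = per_capita_rate(50, 100_000, months_reported=12)
no_population = per_capita_rate(5, 0, months_reported=12)
no_months = per_capita_rate(5, 1_000, months_reported=0)
```

Returning `None` rather than 0 keeps a missing rate distinct from a true rate of zero in downstream tables.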
Caution Against Ranking
The data collection methodology differs between agencies, not every city or county is included in the Crime Index sample, not every agency reports every offense type, and not every agency has complete data through the most recent reporting period. The FBI similarly cautions against ranking, writing: “Data users should not rank locales because there are many factors that cause the nature and type of crime to vary from place to place. UCR statistics include only jurisdictional population figures along with reported crime, clearance, or arrest data. Rankings ignore the uniqueness of each locale.”
As such, ranking between locations is imprecise and inadvisable, and users should be cautious when comparing crime counts for one agency against another agency’s counts. This caution is especially valid when using the current crime data given that recent agency counts are preliminary and subject to change.
Data Sourcing
Sourcing for every agency can be found in the table below, with a link to the source where available. This table identifies the type of data received by the Crime Index (aggregated by the agency, through an agency’s Records Management System, from a state UCR program, or from the FBI’s CDE) as well as how it was received by the Crime Index (through open data, direct from an agency, or direct from a state UCR program). For a list of agencies in the current national and state samples (i.e., with complete murder data), click here, or filter to “Nationwide” in the table below. Check out the Crime Index participation map here.
Additional Information
- Frequently Asked Questions
- Data Download
- The Crime Index GitHub Repository
- Contact Us if you see a mistake, are aware of a new potential data source, or wish to have your agency or state participate.
Search Sources by Agency
Historical Data Library
How the Data Was Obtained
Collecting historical crime data from 1930 to the present required a multi-step process. For data from 1930 to 1959, we worked directly from the FBI's annual Crime in the United States reports, which are only available as archived PDF scans. We identified all relevant tables on crime and law enforcement staffing, then used optical character recognition (OCR) software to convert each table into a spreadsheet. Because these documents are decades-old scans - often blurry, poorly copied, or marked with coffee stains and handwritten notes - the OCR process wasn't perfect and sometimes produced incorrect or unreadable numbers. To address this, we manually reviewed every table, comparing the extracted values against the original PDFs and correcting any errors by hand. We followed the same process for state and national tables covering 1960 to the present.
How the Data Is Maintained
Moving forward, this data will be updated annually once the FBI releases its figures for the previous year. This will happen sometime in late summer or early fall, depending entirely on the FBI’s capacity to update its annual files.
Other Data Sources
Population Figures
Population figures are sourced from the UCR program as part of Jacob Kaplan’s collection of Return A data from 1960-2024. Earlier population figures are documented above under “Non-Crime Data.”
Crime Index Sample
Historical crime and clearance data was updated where appropriate for the years 2017-2024 from the Crime Index sample (see above for more information on that collection process and sources). Much of this concerns only 2021 data, where the Crime Index team was able to source data to fill in the gaps created by the NIBRS transition. The Crime Index team identified multiple agencies where crime counts were empty or significantly under-reported. The most significant example is Chicago, IL, where Chicago PD’s UCR submissions were drastically under-reported (for more on this particular issue, see Jeff Asher’s Substack article).
California DOJ Data
As previously mentioned in the Supplementing Data section, data from the California Department of Justice was used to fill in the gaps for California agencies. This covers crime and clearance data only where it was missing for 2021.
Next Steps
These data will be updated annually as the FBI releases new master files. Previous years may be updated depending on the activity of individual agencies, but most new data will concern the new year (e.g., 2025).