Simpson’s Paradox

Rajiv Sethi published an article recently, A Fallacy of Composition, that again addresses bias and police violence. This time, stimulated by an article by Peter Moskos, he looks at the effects of grouping data and how that can obscure or reveal patterns:

But Moskos offers another, quite different reason why bias in individual incidents might not be detected in aggregate data: large regional variations in the use of lethal force.

To illustrate his point, Sethi constructs data for two cities that each show discrimination, but when their data are combined, the bias disappears.

Although not mentioned by name in Sethi’s article, the effect being discussed is sometimes labeled Simpson’s Paradox.

The appearance or disappearance of bias when data is combined or divided is counterintuitive. Working through some examples will help you see how it happens, and I may come back here to write more about this in the future.

For now, I just want to give you Simpson’s Paradox as a search term.
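
As a starting point, here is a minimal sketch of the effect in Python. The counts are invented purely for illustration (they are not Sethi's numbers and not real data): two hypothetical cities in which black civilians are shot at a higher rate per encounter than white civilians, yet the pooled rates come out identical.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only:
# (police encounters, shootings) for each group in each city.
cities = {
    "City A": {"black": (800, 16), "white": (200, 2)},   # low-force city
    "City B": {"black": (200, 20), "white": (800, 34)},  # high-force city
}

def rate(encounters, shootings):
    """Shootings per encounter, kept exact with Fraction."""
    return Fraction(shootings, encounters)

def city_rates(cities):
    """Rate for each group within each city."""
    return {
        name: {group: rate(*counts) for group, counts in groups.items()}
        for name, groups in cities.items()
    }

def combined_rates(cities):
    """Rate for each group after pooling the counts from all cities."""
    totals = {}
    for groups in cities.values():
        for group, (encounters, shootings) in groups.items():
            e, s = totals.get(group, (0, 0))
            totals[group] = (e + encounters, s + shootings)
    return {group: rate(e, s) for group, (e, s) in totals.items()}

per_city = city_rates(cities)
combined = combined_rates(cities)

for name, rates in per_city.items():
    print(name, {group: float(r) for group, r in rates.items()})
print("Combined", {group: float(r) for group, r in combined.items()})
```

In City A the rates are 2% versus 1%, and in City B they are 10% versus 4.25%, so each city shows a disparity. Pooled, both groups come out at 3.6%. The disparity vanishes only because most of the hypothetical black encounters happen in the low-force city and most of the white encounters in the high-force one.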

How is the New York Times Reporting of Fryer’s Police Violence Study Flawed?

The New York Times reporting on Fryer’s police violence study is flawed (and will remain flawed) because all three articles, including the Public Editor’s piece, accept the interpretation that Fryer has given: that he has gathered data that represents a set of civilians at equivalent risk of getting shot by police based on civilian behavior and that the actual shootings display no bias (or have blacks less likely to be shot at than whites).

Providing context around the number of stops or arrests for blacks and whites (which the NYT didn’t do sufficiently in their original article) would lessen the damage, but wouldn’t do anything to fix the flaw at the heart of the analysis and interpretation of the data. No amount of context can fix an article built around a mistake.

Simply put, Fryer has not gathered data for a set of civilians at equivalent risk of getting shot by police.

You can read more about the problem with Fryer’s interpretation of the data here: FRYER STUDY DEMONSTRATES POLICE SHOW BIAS WHEN APPLYING CHARGES THEY BELIEVE JUSTIFY EXTREME FORCE

New York Times Reporting on AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE

This page contains links to the New York Times articles on AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE (Roland G. Fryer, Jr), Working Paper 22399, NATIONAL BUREAU OF ECONOMIC RESEARCH [https://www.nber.org/papers/w22399.pdf]

I find the New York Times reporting to be misleading at best. You can see some of the reasons by reading Articles Related to AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE.

All the same, here are the links to the main New York Times articles about Fryer’s study on police violence:

Surprising New Evidence Shows Bias in Police Use of Force but Not in Shootings
http://www.nytimes.com/2016/07/12/upshot/surprising-new-evidence-shows-bias-in-police-use-of-force-but-not-in-shootings.html
Quoctrung Bui : @qdbui
Amanda Cox : @amandacox

Roland Fryer Answers Reader Questions About His Police Force Study
http://www.nytimes.com/2016/07/13/upshot/roland-fryer-answers-reader-questions-about-his-police-force-study.html
Amanda Cox : @amandacox

Did a Hot-Button Upshot Story Lose Sight of the Readers?
http://publiceditor.blogs.nytimes.com/2016/07/14/upshot-fryer-liz-spayd-public-editor/
Liz Spayd : @spaydl

Articles Related to AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE

This page contains links to articles and posts commenting on AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE (Roland G. Fryer, Jr), Working Paper 22399, NATIONAL BUREAU OF ECONOMIC RESEARCH [https://www.nber.org/papers/w22399.pdf]

The selection of links makes no attempt to be exhaustive, nor does it attempt to be representative of all published content (if for no other reason than that most articles regurgitate the New York Times reporting on this paper). I’m particularly interested in analyses based on the paper itself, so if you have an interesting link, please send it my way. Apologies to anyone I’ve left out.

FRYER STUDY DEMONSTRATES POLICE SHOW BIAS WHEN APPLYING CHARGES THEY BELIEVE JUSTIFY EXTREME FORCE
https://europile.wordpress.com/2016/07/06/fryer-shows-police-bias-in-justifying-extreme-force/
Europile : @europile

yes, there is racial “bias” in police shootings
https://scatter.wordpress.com/2016/07/11/yes-there-is-racial-bias-in-police-shootings/
Michelle S. Phelps : @MichelleSPhelps

WHY IT’S SO HARD TO MEASURE RACIAL BIAS IN POLICE SHOOTINGS
http://www.mtv.com/news/2903957/why-its-so-hard-to-measure-racial-bias-in-police-shootings/
Ezekiel Kweku : @theshrillest

Stop using that one study to pretend racism doesn’t exist in police shootings
https://medium.com/@samswey/stop-using-that-one-study-to-pretend-racism-doesnt-exist-in-police-shootings-17a9e47a7117
Samuel Sinyangwe : @samswey

On Arrest Filters and Empirical Inferences
https://rajivsethi.blogspot.co.uk/2016/07/on-arrest-filters-and-empirical.html
Rajiv Sethi : @rajivatbarnard

Police Use of Force: Notes on a Study
https://rajivsethi.blogspot.co.uk/2016/07/police-use-of-force-notes-on-study.html
Rajiv Sethi : @rajivatbarnard

Roland Fryer is wrong: There is racial bias in shootings by police
http://scholar.harvard.edu/jfeldman/blog/roland-fryer-wrong-there-racial-bias-shootings-police
Justin Feldman

[50] Teenagers in Bikinis: Interpreting Police-Shooting Data
http://datacolada.org/50
Uri Simonsohn : @uri_sohn

About that claim that police are less likely to shoot blacks than whites
http://andrewgelman.com/2016/07/14/about-that-claim-that-police-are-less-likely-to-shoot-blacks-than-whites/
Josh Miller
Andrew Gelman : @StatModeling

Why it’s impossible to calculate the percentage of police shootings that are legitimate
https://www.washingtonpost.com/news/the-watch/wp/2016/07/14/why-its-impossible-to-calculate-the-percentage-of-police-shootings-that-are-legitimate/
Radley Balko : @radleybalko

Here’s why I’m skeptical of Roland Fryer’s new, much-hyped study on police shootings
http://www.vox.com/2016/7/11/12149468/racism-police-shootings-data
Dara Lind : @DLind

Bayes, Race, and Police Killing
http://www.samefacts.com/2016/07/crime-incarceration/bayes-race-and-police-killing/
W. David Ball : @WDavidBall

Did a study really find there aren’t racial disparities in police shootings? Not so fast.
http://www.vox.com/2016/7/11/12148452/police-shootings-racism-study
German Lopez : @germanrlopez

Don’t believe the headlines on the Harvard study purporting to show racial bias does not exist in police shootings
http://gritsforbreakfast.blogspot.co.uk/2016/07/harvard-economist-publishes-study.html
Amanda Woog : @grits4breakfast

Race and Police Shootings: Why Data Sampling Matters
https://mathbabe.org/2016/07/19/race-and-police-shootings-why-data-sampling-matters/
Brian D’Alessandro : @delbrians

Reality Check: A new study claims that while black people might experience more use of force by the police, they’re no more likely to be shot – but the data is misleading
https://www.theguardian.com/news/reality-check/2016/jul/11/study-finds-no-racial-bias-police-shootings-data
Mona Chalabi : @MonaChalabi

WHAT DOES “CRIME AGAINST COP” MEAN?

“Crime against cop” is an informal categorization of those crimes that require the presence of a police officer during the commission of the crime and do not require a third-party complainant. This set of crimes is important because it gives police officers an opportunity to use their discretion in criminalizing civilians or imposing extra penalties. If police officers are unfair, vindictive, or biased, it’s relatively easy for them to add “crime against cop” charges.

This category of crimes is particularly relevant given the release of Fryer’s working paper, AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE. The surprising result in that paper is based entirely on treating “crime against cop” arrests as truthful indicators of a civilian’s propensity for violence. Fryer uses arrest codes “for the following offenses, from 2000 – 2015: aggravated assault on a peace officer, attempted capital murder of a peace officer, resisting arrest, evading arrest, and interfering in an arrest.” All of these arrest codes are in the category “crime against cop”.

Importantly, the Fryer paper contains no discussion of how this set of arrest codes was arrived at or how the data might compare with a different set of arrest codes. Understanding this set of arrest codes is key to interpreting the data.

FRYER STUDY DOES NOT SHOW POLICE LESS LIKELY TO SHOOT BLACKS IN SAME CIRCUMSTANCES

Despite what Fryer claims and the New York Times reports, the data in AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE does not show that police are less likely to shoot blacks in the same circumstances. It simply shows that blacks are underrepresented in one set of data compared to another. Fryer has interpreted that difference in a particular way, but there are a number of possible interpretations. A more likely explanation is that the data in AN EMPIRICAL ANALYSIS OF RACIAL DIFFERENCES IN POLICE USE OF FORCE shows police bias in using “crime against cop” arrests (which the Fryer study highlights can be used by police to justify extreme force).

You can find a more detailed explanation in FRYER STUDY DEMONSTRATES POLICE SHOW BIAS WHEN APPLYING CHARGES THEY BELIEVE JUSTIFY EXTREME FORCE.

FRYER STUDY DEMONSTRATES POLICE SHOW BIAS WHEN APPLYING CHARGES THEY BELIEVE JUSTIFY EXTREME FORCE

The Fryer report shows racial bias in how police use force (even after controlling for all kinds of variables).

I’m going to say that again: the Fryer report shows that police are racially biased in how they apply force.

Unfortunately, this report is being publicized not for the demonstration of police bias, but for a surprising result which the media has reported like this: “when it comes to the most lethal form of force — police shootings — the study finds no racial bias.”

Let’s look closer at the surprising result because both the Fryer paper and the reporting on it are misleading at best.

The surprising result in the Fryer report is based on data from one police department (Houston).

The analysis takes a set of data that is claimed to represent “interactions with police that might have resulted in the use of lethal force” and compares it with data that represents “officer involved shootings” (incidents that did result in the use of lethal force, even if no one actually died).

We’ll call the datasets POTENTIAL and ACTUAL, since I suspect this is how Fryer thinks of this data.

The paper states that the POTENTIAL dataset is a “random sample of police-civilian interactions from the Houston police department from arrests codes in which lethal force is more likely to be justified”.
The arrest codes used are “for the following offenses, from 2000 – 2015: aggravated assault on a peace officer, attempted capital murder of a peace officer, resisting arrest, evading arrest, and interfering in an arrest.”
The report does not discuss in detail how these arrest codes were chosen, which arrest codes were omitted, and whether any analysis was done using arrest codes other than the ones published.

The report states that the ACTUAL dataset is “all officer involved shootings in Houston from 2000 – 2015”.
The report does not discuss which arrest codes end up in incidents contained in the ACTUAL dataset.

Note that the POTENTIAL dataset is not obviously a superset of the ACTUAL dataset (neither is the set of incidents from which the POTENTIAL dataset is sampled).

Fryer combines and analyzes the POTENTIAL and ACTUAL datasets to produce a conclusion: “Blacks are 23.8 percent less likely to be shot by police, relative to whites.”

Unsurprisingly, this statement has been taken out of context and reported extensively. (Admittedly, even in context this statement is incredibly misleading.)

It doesn’t mean what they think it means.

All it means is blacks are more strongly represented in the POTENTIAL dataset than in the ACTUAL dataset.

Let’s think about this:

The paper claims that the arrest codes used for the POTENTIAL dataset are more likely to justify extreme force. I think we can say that they are more likely to be used to justify extreme force. Arrest codes are controlled by police, who have an incentive to add charges. If there are racist police, they will inflate the number of charges against black civilians relative to white ones. Of particular interest is that none of the arrest codes used for the POTENTIAL dataset requires a crime to be committed against a civilian. All of the arrest codes used in the POTENTIAL dataset are for crimes against police, so there is no third-party validation (unlike, say, a burglary, where there would typically be a non-police complainant).

On the other hand, data that ends up in the ACTUAL dataset is different. Police have no incentive to inflate the figures for shootings. Racist police have an incentive to underreport shootings against blacks compared to shootings against whites (whether they ever do is another matter).

The point is that blacks being more strongly represented in the POTENTIAL dataset than in the ACTUAL dataset is easily explained by racism. In fact, there’s an argument to be made that, far from showing that police shootings show no racial bias, what this study actually demonstrates is racial bias in the charges police choose to apply; specifically, that police show racial bias when applying charges that they believe justify extreme force. Given that we know that police apply force with bias, this result isn’t so surprising, is it?
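
To see how charge inflation alone could produce a “less likely to be shot” figure, here is a minimal arithmetic sketch in Python. Every number in it is invented for illustration and none comes from the Houston data. It also uses a simple ratio of rates, not the regression in Fryer's paper. It assumes equal numbers of genuinely dangerous encounters for each group, shootings that are biased against black civilians, and extra discretionary “crime against cop” arrests applied more heavily to black civilians.

```python
from fractions import Fraction

# All numbers below are hypothetical, chosen only to illustrate the mechanism.
genuine = {"black": 1000, "white": 1000}        # encounters where lethal force might truly be at issue
shootings = {"black": 50, "white": 40}          # ACTUAL: biased against black civilians (5% vs 4%)
discretionary = {"black": 1000, "white": 200}   # extra "crime against cop" arrests added by officers

# POTENTIAL as recorded: genuine encounters plus discretionary arrests.
potential = {group: genuine[group] + discretionary[group] for group in genuine}

def relative_risk(events, baseline):
    """Black rate divided by white rate, where rate = events / baseline."""
    black = Fraction(events["black"], baseline["black"])
    white = Fraction(events["white"], baseline["white"])
    return black / white

true_ratio = relative_risk(shootings, genuine)        # measured against genuine encounters
apparent_ratio = relative_risk(shootings, potential)  # measured against the inflated arrest counts

print(f"true relative risk:     {float(true_ratio):.2f}")
print(f"apparent relative risk: {float(apparent_ratio):.2f}")
```

Under these assumptions the true relative risk is 5/4 (black civilians are 25% more likely to be shot), while the apparent relative risk is 3/4 (black civilians look 25% less likely to be shot). The shootings did not change between the two calculations, only the denominator they were compared against.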