By Tamara Sepper
It’s a tactic that has “served to undermine legitimate law enforcement efforts in this country” and should “be relegated to the dark corridors of our past.” This pointed criticism of fake stash house sting operations came in an opinion by Rubén Castillo, a federal judge in Chicago. Increasingly, the controversial practice is falling into disrepute with jurists across the U.S. Though legal, it has been challenged on constitutional grounds in an unprecedented “criminal class action” litigation spearheaded by Alison Siegler, Founding Director of the Federal Criminal Justice Clinic at the University of Chicago Law School. She and a team of lawyers represented 43 defendants across 12 cases who alleged that the ATF, the Bureau of Alcohol, Tobacco, Firearms, and Explosives, engaged in racial discrimination in violation of the Equal Protection Clause. I spoke with her about the Herculean effort for the second edition of Office Hours.
The following transcript has been edited for brevity and clarity.
What are fake stash house sting operations and how do they generally work?
In these fake stash house cases, what happens is that a federal law enforcement agency literally creates a crime and then chooses people to commit that crime. Sometimes it’s the FBI or the DEA. Often, it’s a federal law enforcement agency known as the ATF, the Bureau of Alcohol, Tobacco, Firearms, and Explosives. These cases typically start with an informant working for the ATF who will go up to a guy on the street who’s down on his luck, and say, “Hey, I know of this stash house, it’s filled with drugs. You want to go and rob this thing with me? If we do this, you will make like half a million dollars for just one day’s work.”
If the targeted person agrees to participate, which often people do because it’s sort of a pot of gold at the end of the rainbow for a lot of people who are poor, a law enforcement officer will get involved and say to this person, “okay, so this stash house is going to be swarming with guards, so you need to bring your friends, you need to bring your guns, you need to have all your friends bring guns,” and they kind of amp this up. In the end, this whole thing turns out to be completely made up. There’s no stash house. There are no drugs. There are no guards. The whole thing’s a total fiction, nothing exists. But the ATF gets it all on tape and then federal prosecutors step in and bring not fictitious, but very real charges against the people who’ve been targeted by the ATF.
To make matters worse, overwhelmingly the ATF goes after poor people of color to commit these made up crimes. A nationwide study showed that 91% of the people they selected were people of color. And in our Chicago cases, it was 92% people of color.
What types of charges are brought in these cases and what kind of sentences do they carry?
Everybody who’s caught up in this gets charged with three federal crimes: drug crimes, gun crimes, and robbery crimes. Two of those crimes carry really high, mandatory minimum penalties. The drug conspiracy carries a 10-year mandatory minimum. But if somebody has even one prior conviction for another drug case, they’re looking at a 15- or 20-year mandatory minimum, just on that crime. And then the gun case that gets brought is a 5-year mandatory minimum. If you were charged with a second gun, you would get a 30-year minimum just for the guns, plus 10 or 20 for the drugs. So you could be looking at a 50-year minimum penalty. Fifty years in prison for this fabricated crime.
Looking at the historical backdrop, how did these sting operations get started and what was the reasoning behind them?
This all got started a while back in Miami on the heels of the war on drugs, which we all know now has been a failed war. And there was one ATF agent who was spearheading this strategy and going all around the country teaching other ATF agents and offices. The reason federal prosecutors and the ATF like this strategy is that it’s like shooting fish in a barrel. They orchestrate it and then take people down. I think they convince themselves and publicly say, we’re going after the worst of the worst, but that’s absolutely not borne out in practice.
How often does the ATF end up arresting dangerous, violent criminals and seizing weapons as opposed to arresting people with minor criminal records?
In our cases, the ATF said we are targeting established robbery crews, people with serious prior convictions, people with access to guns. In reality, the vast majority of the black and Latinx individuals who they went after didn’t meet those requirements. Many weren’t currently criminally active, many didn’t have ready access to guns. They didn’t meet the requirement of being an established robbery crew. In one of our cases, our clients could only find an ancient gun from like the 1800s that they had to duct tape together. I read about another case where the ATF literally provided the gun to the people that it had swept up to commit this crime, taking this fabricated nature of the crime to a whole new level.
To a non-lawyer’s ear, this sounds like entrapment. Why wasn’t the entrapment defense viable here?
People do often feel like this sounds like entrapment. They’re like, how is this even legal? But in practice, the entrapment defense is virtually impossible to prove at trial as a matter of law, because if the prosecution can show that someone was predisposed to commit a crime, then it’s not entrapment. And they can use almost anything to show predisposition, like drug addiction, prior criminal history, things like that. And so, almost everybody who’s tried to fight these cases as entrapment has been convicted and gone to prison for decades.
This litigation had a lot of firsts, but one of the strategies that was unique was that you treated it as a class action. In the criminal law context, there are no class actions. How did this work procedurally?
Before my clinic even got involved, a group of lawyers in Chicago came up with a brilliant idea of attacking these cases as race discrimination, because they saw almost everybody the ATF was targeting to commit this made up crime was black or Latinx. Those lawyers included criminal defense attorney Steven Saltzman, then-federal public defender Candace Jackson-Akiwumi, who’s now a federal judge, and a lot of her colleagues at the federal public defender’s office. They convinced a number of federal judges to do something really rare, which is to order prosecutors to turn over discovery that included internal ATF documents about how they construct these stings and data about what the race of everybody ever charged in a Chicago case was. This was a remarkable discovery victory. A couple months later, lawyers from that group reached out to my clinic and asked us to get involved. And we joined forces with them to convince federal judges in Chicago to dismiss these cases on race discrimination grounds. I was fortunate to be working with this incredible team. The original lawyers, my colleagues in the clinic, including professors Judith Miller and Erica Zunkel, both of whom are incredibly accomplished former federal public defenders, and many law students of ours who contributed monumentally over the years.
Two things made our strategy unique. First, we decided to prove racially biased policing with statistics — to hire an expert and develop statistical evidence that wasn’t available in a lot of the prior cases. And second, this collective or class action strategy. There were 43 individual clients charged in 12 separate cases and they were all pending trial in federal court in Chicago. We became pro bono co-counsel for all these clients and decided to go after this in a coordinated programmatic way, which is rare because usually in the world of criminal defense, you’re talking one client, one case, individual representation. I describe it as a “criminal class action” because the idea was to get the judges to see that the ATF is engaged in a problematic practice that spans a whole lot of cases and has swept up a lot of individual people.
In what other context might this sort of “criminal class action” approach work?
Any time that a lawyer believes that there’s something racially discriminatory happening, especially operations that have been spearheaded by the police or the law enforcement agencies, there’s a possibility of going after it in this kind of way. In fact, during the time that our cases were pending, there was a very successful and somewhat similar litigation for a totally different kind of drug operation happening in San Francisco where they also coordinated across cases and brought a similar racial discrimination claim against the police and had some real success. So anytime things are happening across a whole bunch of people in a similar kind of case, it’s worth considering whether to do it in this coordinated way.
Let’s talk about the legal standards. In order to prove racial discrimination, you have to show that the government engaged in selective enforcement. What’s the legal standard for proving that and why is it so hard?
It goes back to a 1996 case called United States v. Armstrong. That case created a virtually insurmountable standard for proving race discrimination by prosecutors. There, the defense presented evidence that 100% of the people charged in federal crack cocaine cases in LA were black. They said, this is race discrimination, we want information from the prosecution about how they are selecting the people they’re going after in these crack cases. The Supreme Court said, no, you don’t get that discovery. And they created what I call a “catch-22.” The Armstrong Court said, to even get discovery to support your claim of race discrimination, you have to present evidence of race discrimination, the very thing you’re asking for. It’s this totally circular requirement. And it just closed off almost all avenues to allege racial discrimination by prosecutors. And then, courts expanded Armstrong to race discrimination claims against the police.
What was the Armstrong Court’s rationale for creating this almost impossible to meet standard?
It was very much based on the concept that we have to be really careful about asking questions about why prosecutors are charging cases in the way that they are and what their strategy is. That was basically the Armstrong Court’s concern, which does not apply to the police. And so after Armstrong, part of what we were facing when we began the Chicago litigation was that no court since had ever dismissed a case, either for race discrimination by prosecutors or for race discrimination by the police. And no court had ever even granted discovery information to the defense in support of that kind of race discrimination claim.
Armstrong creates a two-prong test where you have to prove (1) that there was a discriminatory effect and (2) that there was discriminatory intent. What kind of evidence are you presenting to prove these two prongs?
Discriminatory effect is a very high bar. The accused has to present evidence that similarly situated people were available to be targeted by the prosecutors or police, but weren’t in fact targeted. And when we say “similarly situated people,” we mean similarly situated people of other races, similarly situated white people, for example, in this instance. And the defense in Armstrong lost because they couldn’t make that showing. This similarly situated requirement is really hard to meet because, how do you find somebody who was not prosecuted? How do you find somebody who the ATF did not go after? How do you identify an individual person? There’s no database of those people. As far as discriminatory intent goes, it’s the idea that the policing agency intentionally chose someone, not in spite of their race, but because of their race. And that’s very hard to meet also.
How do you impute discriminatory intent to an organization like the ATF or a police department when there is no smoking gun evidence that would prove discrimination?
The way we decided to prove it was to say, let’s get statistics. We hired an expert, Professor Jeffrey Fagan from Columbia Law School, to do statistical analysis. The 94 people charged in Chicago in every stash house case after 2006 were 92% people of color. Professor Fagan had to compare the racial composition of the people who were targeted by the ATF to the racial composition of a similarly situated benchmark group, which is people out there in the world available to be targeted by the ATF, who met the selection criteria, but were not in fact chosen by the ATF to commit this fake crime.
Professor Fagan said, if you look at the selected group of people in Chicago who the ATF went after, those 94 people, that group is 79% black. If you look at the comparative population of people who also had prior convictions for guns, drugs, and violence out there in the same universe, the same geographical area around the same time period, that group of people is just 55% black. And Professor Fagan, using statistical analysis, said there’s an approximately 0% likelihood that you’re going to get this high number of black people just by chance, which suggests that there’s selectivity afoot by the ATF.
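Editor’s illustration: the chance comparison described above can be sketched with a simple one-sample binomial calculation. This is not Professor Fagan’s actual method, which the interview does not detail. It assumes, for illustration only, that each of the 94 people was drawn independently from a benchmark pool that is 55% black, and it rounds 79% of 94 to 74 people. The function and variable names are hypothetical.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures quoted in the interview.
n_charged = 94                # people charged in Chicago stash house cases
share_black_charged = 0.79    # share of the charged group that is black
share_black_benchmark = 0.55  # share of the benchmark group that is black

# Assumption: convert the quoted percentage to a whole number of people.
k_black = round(n_charged * share_black_charged)

# Probability of seeing at least this many black defendants if selection
# were race-neutral draws from the benchmark pool (simplifying assumption).
p_value = binomial_upper_tail(n_charged, k_black, share_black_benchmark)

print(f"{k_black} of {n_charged} charged; chance probability = {p_value:.2e}")
```

Under these simplifying assumptions the probability comes out far below one in ten thousand, which is consistent with the “approximately 0%” figure quoted above. A real analysis would also need to account for co-defendants being recruited together rather than independently.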
He also then did regression analysis, ruling out any race neutral explanation. And he said, I cannot find any reason that justifies or explains this racial disparity. The only thing left is race. We said, this is evidence of discriminatory effect. This is also evidence of discriminatory intent.
But there were some other things. The ATF has internal criteria for how they select people to commit these crimes. It turns out we were able to show that when they were going after people of color, they absolutely were not following their criteria. But in the few cases where they targeted white individuals for these stings, they were following their criteria very closely. And that, under another Supreme Court case, we argued, was clear evidence of discriminatory intent.
Also, in some of our cases, the ATF had a racialized script that they would use. They would send an agent into the field, someone who either was or appeared to be Mexican and the agent would make a point of explaining or alluding to the fact of his own race, and then would say to a black individual, don’t bring me people who look like me. Suggesting, I want people who look like you, meaning I want you to bring me other black individuals. To us, that was clear evidence of discriminatory intent.
The government hired their own expert witness and made a bunch of counter arguments. Among them that the 290,000 people that were part of the comparison group in Fagan’s report were “absurdly broad,” that it should have been confined to people who were actually arrested for home invasions. They also argued that he didn’t control for various factors like employment and level of education. Also, that the geographic location was too wide, that this effort was part of a surge to stem an uptick in violence, and that local officials had asked for federal assistance. In a hearing, you said, “If you torture the data long enough, it will confess to anything. That’s what the government’s done here.” What did you mean by that? Was the government engaging in bad faith arguments?
It’s not bad faith, no. What I meant by saying, “if you torture the data enough, it will confess to anything” is, look, everybody knows that statistics and data can say a lot of things. And some people will say, it will say whatever you want it to say, really. And so the government’s expert tried to poke holes in Professor Fagan’s report. They said, well, his benchmark group isn’t the right comparison. They’re not similarly situated enough to meet the Armstrong discriminatory effect standard, all of that. But it was very frustrating because Professor Fagan’s statistics clearly met the legal standard. In trying to meet the standard, we ended up raising the bar for ourselves or one of the judges raised the bar for us and said, you’ve finally come closer than anybody else has ever come to meeting the standard, let’s just move it up a little bit and make it even harder. That’s the way it felt when we were litigating this.
One of the judges ultimately agreed with the government’s expert. And he said, the defense evidence is insufficient. That was either wrong or it shows that Armstrong is totally unmeetable or both because it underscores the idea that it’s pretty much impossible to prove race discrimination, even if you have the resources and the time to mount the kind of gold standard defense that our team mounted.
There are two phases where the Armstrong standard comes up: the discovery phase and the merits phase. You distinguished the case at hand from Armstrong in arguing that for law enforcement agencies like the ATF, FBI, or on the local level, police, there should be a lower standard of proof in selective enforcement cases because the nature of their jobs is different from that of prosecutors. Can you explain that distinction?
This is one of the biggest victories of our collective litigation. We actually changed the law on this issue. We made it a lot easier for the defense to get discovery, meaning information that would support a claim of race discrimination by the police specifically, separate and apart from prosecutors. Our clinic collaborated with the federal public defender’s office and their chief of appeals over there, and we litigated a stash house case all the way to the Seventh Circuit Court of Appeals in Chicago that covers Illinois, Indiana, and Wisconsin. We convinced the Seventh Circuit to draw this distinction between prosecutors, which the Armstrong case deals with, and police, which our cases were dealing with, the ATF, the law enforcement folks.
The Seventh Circuit agreed with us. They said, look, that impossible to meet Armstrong standard really shouldn’t apply when you’re trying to get evidence about race discrimination by the police, because the police are different; there’s not the same deference that is paid to prosecutorial decisions. In fact, we often question policing decisions, as we’ve seen in this country. We don’t put them on such a pedestal as prosecutorial decisions. Since then, a couple of other federal courts of appeals have agreed with the Seventh Circuit and even taken this distinction further. So now you’ve got three courts of appeals that have gotten rid of the really high Armstrong hurdle to getting discovery about race discrimination by the police. Now you can get discovery simply by saying, I am aware that there’s a group of people who have been targeted by law enforcement and that group has a really high percentage of people of color. And that alone is enough to get discovery. So it’s a really big and important change.
This change is now leading to an even more important distinction between police and prosecutors. It’s changing the standard for proving discrimination on the merits, getting a case dismissed for race discrimination. Because under Armstrong, if you want to show race discrimination by prosecutors, you have to meet the “clear and convincing evidence” standard. But the Seventh Circuit Court of Appeals just recently became the very first court in the country to say it’s different for police. If you’re trying to get your case dismissed, because the police discriminated on the basis of race, you just have to prove discrimination by a “preponderance of the evidence.” Very different, 51% basically, which, looking back, if that was the standard that the judge had held us to, I think we would’ve met it. He held us to that “clear and convincing evidence” standard.
There was a hearing that took place in December 2017 and it’s been described as unprecedented. What made it so extraordinary?
There were 43 individual people charged across 12 separate cases. Well, those cases were presided over by different judges, and there were nine different judges because a couple of judges had two cases. But those nine judges could have each had their own individual hearing on the experts, on the evidence, on the motion to dismiss because we were filing motions to dismiss across all these cases. But instead, we said to the judges, look, it’s going to be a lot more efficient and practical, and a lot less expensive to have the experts testify just once in one hearing that is in front of all the judges sitting together. And it was amazing, because the judges cleared their schedules for several days to hold this joint hearing. If you could picture this, it’s kind of like the Supreme Court where it’s all those judges sitting up there on one high bench presiding over a single court case, that’s basically what it looked like physically.
It was this big, huge, long bench in the ceremonial courtroom of our federal district court in Chicago with nine judges sitting up there and all the evidence and all the witnesses were directed to all nine judges. But the interesting thing is, because each judge had their own case, they were able to make their own individual decisions. They were not in any way bound to decide things collectively, they were just hearing the evidence together. It was like having a jury made up entirely of federal lifetime appointment judges. And that’s what’s so unprecedented and I don’t think I’ve ever heard of that happening anywhere else in federal court in the country. Certainly it never happened before in our Chicago courthouse.
These cases were ultimately resolved through plea deals. What did the plea bargaining process look like and what sentences did the defendants you represented get?
So we did this big hearing with the experts and the question then was how are these cases going to be resolved? Are the judges going to dismiss the cases? Are the judges going to let the cases continue forward to trial and deny our motion? Are some judges going to do one thing and some judges do another? We really didn’t know what was going to happen. And in the course of the post-hearing briefing and the weeks and months that followed, the prosecutors came to us and said, we’d like to present you and your clients with some plea offers. And they said, this is basically a plea offer that applies to every single client, but no one is bound to anybody else. Each person gets to decide whether they want to take this deal or not.
The plea offers were extraordinary and really rare because the prosecutor said we will dismiss all of the charges with mandatory minimums, meaning the drug conspiracy and the gun charges, which, those charges alone had most of our clients looking at either a 15-year minimum time in prison or 25 years minimum in prison. What was left was a conspiracy-to-rob charge that would still carry some substantial time under the federal sentencing guidelines, like 4 to 10 years approximately, but no minimum that would tie the judges’ hands from going lower than the guidelines would suggest.
Then we, and also a lot of the other lawyers involved in the cases, had to sit down with our clients individually and speak with them. And this was something Professor Zunkel and I spent a lot of time talking about with a number of the clients that we were working with. How do you make this decision? Because if you decide to take the plea deal, the kicker was, then you have to walk away from the race discrimination motion. You have to say, I’m admitting to being involved, at least in the robbery, even though I’m not pleading guilty to the drugs or guns. And that was a really hard decision. It was a very individual decision that had to be made in communication with and consultation with the lawyers. Eventually, all 43 of our clients decided to take that plea deal, because the risk of spending 15 to 25 years in federal prison far from their families was unthinkable. And so that’s where it ended up.
You have written, “The principle that underlies mandatory minimums is dehumanization.” What was the legislative thinking behind mandatory minimums?
There have been times in our country’s history when the legislature has been very concerned that judges are not going to hammer people hard enough with prison time. And so they created these mandatory minimum penalties that would basically tie the judges’ hands, prevent them from making individualized decisions, and force them to give really high sentences. And there are some cases where judges are forced to give a mandatory minimum of life in prison, simply because the prosecutor chose to charge the case in that way. A judge can’t say, but wait a second, the thing that was motivating this crime was not greed, for example, it was desperation. Or, this guy isn’t a high up leader of this; he’s just a small player in some bigger conspiracy; he shouldn’t be looking at the same sentence as the leaders. And when I say they’re “dehumanizing,” what I mean is it takes away any ability to consider somebody as a human being.
There is all this evidence of the racial disparities that are created by mandatory minimum penalties. I’m convinced that no one would tolerate these kinds of high penalties without any discretion for the judge if they were mostly being used against white people. Part of why our society tolerates this is that the people who are serving really high sentences for mandatory minimums are almost exclusively people of color. And it’s a terrible travesty. And frankly, it’s something that President Biden and his Attorney General, Merrick Garland, have said they want to get rid of, but they’re not really putting their money where their mouth is. AG Garland is still letting his federal prosecutors all over the country bring mandatory minimum charges and ask judges to impose those high sentences. So it’s an ongoing problem.
How do prosecutors use these mandatory minimums as a tool when they’re plea bargaining or negotiating a cooperation agreement?
If somebody’s charged with a mandatory minimum, the prosecutor has the ability to bargain it away, just as it happened in our stash house cases. The bargain was, we’ll dismiss the mandatory minimum if you withdraw your motion to dismiss for race discrimination. But the more common bargain that happens is they say, look, if you cooperate against somebody higher up than you, then we will give you a break below the mandatory minimum. And there are various statutory and guideline mechanisms for doing that. But there’s this idea of using the mandatory minimum as a hammer to get people to cooperate against other people and to then make it easier for the prosecution to convict more and more people. Part of the problem is, you end up with people pleading guilty and admitting to things sometimes just to avoid that mandatory minimum.
Let’s zoom out on stings in general, putting the stash house stings aside. Are they necessary for public safety, and what is their legal footing?
Sting operations that involve creating or inciting people to commit crimes are not necessary for public safety. Why don’t the police and prosecutors just go after people who are actually committing crimes? Why make something up and target people to commit this crime? So much of our criminal legal system is built on outsized fear. It’s fear of poor people. It’s fear of people of color. No one is going after white people with these kinds of things, certainly not to the same degree and in the same mass way as it’s happening. Yes, they’re legal, unfortunately. But I think that it’s something that I wish that our current administration would really look at hard and back off from.
Are there any legal limitations on these stings? For example, do you have to have some minimum evidence of criminality? Is there a limit on the amount of deception that an undercover agent can engage in? What guidance, if any, is there, or is it up to an agency to decide exactly how they want to do it?
These kinds of sting operations that we’re talking about are virtually unlimited. There is a lot that policing agencies can do. They can deceive. They can lie. They can create their own personas. They can tape people. They can record people. They don’t need warrants. I mean, no, there are virtually no legal limitations on these operations.
Often we talk about policies and laws and we may lose sight of the fact that there are real people behind these cases. So I do want to highlight the story of one of your defendants named David Cousins. Can you tell us about him and what made him vulnerable to accepting this offer from the ATF?
When my students and I first met David, he had been locked up for like two-and-a-half years, waiting for his case to go to trial. He had spent 841 days in jail, pretrial, before being convicted of anything, presumed innocent. David and his wife had six kids. Before he was targeted by the ATF, he was a devoted stay-at-home dad. His wife worked downtown in Chicago and he was at home with the children who were all fairly young. But the ATF, they did this thing that they do, they offered him half a million dollars for one day’s work. He took this offer in a misguided and desperate desire to be able to provide for his family. All of a sudden, he found himself facing a 25-year-minimum sentence in prison. David was ripped away from his family. He wasn’t there to feed the kids breakfast in the morning, to get them to school, to put them to bed at night. He wasn’t there to change their diapers, help them with their homework…sometimes, we forget the human side of having somebody locked up in a jail cell.
David’s wife couldn’t make it financially without him because she had to get child care. The family was rendered homeless. They ended up going door to door, sleeping on the floors of relatives and family and friends, and depending on the kindness of others. His kids struggled emotionally. Ultimately, my clinic got involved in his case and one of my students and I did the bail hearing where we were able to argue to the judge to release David while he’s awaiting trial and waiting for his motion to dismiss to come to fruition. And it was amazing. The judge released David.
After he got out of jail, David was a model citizen. This was somebody who had never before spent time behind bars. It was his first experience. So he did everything right. He went back to being the devoted dad that he’d always been. He became a medivan driver where he would help people with wheelchairs get to their doctor’s appointments.
There’s this amazing photo of David at his son’s eighth grade graduation, which happened really soon after we got him released from jail. He would never have been able to go had he stayed locked up pretrial. When his case went to sentencing, we were able to convince the judge to do something that happened in a lot of these cases, to give David what’s known as “time served,” meaning he got credit for the two-and-a-half years he had spent in jail and he was not sentenced to another single day behind bars. So instead of getting 25 years, he was able to stay out, stay with his family, stay in his community and continue to be the really upstanding guy that he is.
Why do you do this work? What drives you?
I’m really mission driven and passionate about what I do and about this work. I’ve spent my whole career trying to help people who are caught up in the criminal legal system, trying to advocate on their behalf because I feel that there are very few people in this world who need to be behind bars and be taken away from their families and communities in order to keep other people safe. I’m not saying nobody, but I’m saying there are very few people. As a society, we have such a fear-based and racially disparate way of approaching crime and poverty. It’s outrageous to me, it’s upsetting to me, it troubles me deeply.