Predicting arrests, but little else: Limits of recidivism risk assessments

By Katharine Neill Harris, Ph.D.
Alfred C. Glassell, III, Fellow in Drug Policy

Risk assessment tools (RATs) are a central feature of the modern criminal justice toolkit. By estimating the likelihood that an individual who has been charged with or convicted of a crime will commit a future offense, they are intended to “manage” risk more effectively and reduce incarceration in the process. Prior to the development of these instruments, the justice system relied on professionals in the field (judges, probation officers, parole boards) to make clinical evaluations of an individual’s recidivism risk or general “dangerousness.” Such assessments were often subjective, inconsistent, and heavily influenced by individual biases that negatively and disproportionately impacted minority defendants.

In contrast, RATs promise algorithmically derived objectivity. They draw on actuarial methods, originally developed by the insurance industry to estimate the likelihood of various risk scenarios, weighing several factors thought to affect recidivism and combining them into an overall score. This score is then used to classify someone as being at low, moderate, or high risk of recidivism. Specifics vary by instrument, but most include such factors as criminal history, substance use, gender, and age. Some include questions about criminal thinking or attitudes and peer associations. None ask about race or ethnicity, but some ask about employment, financial situation, neighborhood, and family ties, variables that critics oppose on the grounds that they contribute to racial and socioeconomic bias in the assessment.
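As a purely illustrative sketch of the scoring logic described above, an actuarial tool of this kind could be expressed in a few lines of Python. The factors, point weights, cap, and cut points below are all hypothetical; they are not drawn from any actual instrument, and real tools use their own items and thresholds.

```python
# Hypothetical factors and point weights, chosen only for illustration.
HYPOTHETICAL_WEIGHTS = {
    "prior_arrest": 2,    # points per prior arrest, up to a cap
    "substance_use": 3,
    "under_25": 2,
    "unemployed": 1,
}
PRIOR_ARREST_CAP = 5  # assumption: arrests beyond this add no further points


def risk_score(prior_arrests, substance_use, under_25, unemployed):
    """Add up the weighted factors into a single overall score."""
    counted_arrests = min(prior_arrests, PRIOR_ARREST_CAP)
    score = counted_arrests * HYPOTHETICAL_WEIGHTS["prior_arrest"]
    if substance_use:
        score += HYPOTHETICAL_WEIGHTS["substance_use"]
    if under_25:
        score += HYPOTHETICAL_WEIGHTS["under_25"]
    if unemployed:
        score += HYPOTHETICAL_WEIGHTS["unemployed"]
    return score


def risk_category(score):
    """Map a score to a risk band using hypothetical cut points."""
    if score <= 4:
        return "low"
    if score <= 9:
        return "moderate"
    return "high"


if __name__ == "__main__":
    example = risk_score(prior_arrests=2, substance_use=True,
                         under_25=False, unemployed=True)
    print(example, risk_category(example))
```

Even in this toy version, a factor such as unemployment raises the score regardless of the person's conduct, which is the kind of variable critics object to.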

RAT proponents claim the tools can reduce incarceration while protecting the public by accurately distinguishing between low-risk individuals, who can be released safely to the community, and high-risk individuals, who need the bulk of justice-based interventions. RATs have been widely adopted for use at various points in the criminal justice continuum. Several states now require jurisdictions to use RATs for pretrial, probation supervision, and/or prison release decisions, and at least five mandate their use during sentencing, although policy specifics vary considerably across states.

Jurisdictions may develop their own RATs, tweak tools already in existence, or purchase RATs created by for-profit companies. The extent to which these tools are validated prior to adoption varies; often the parties responsible for implementing the tools also oversee their evaluation. Generally speaking, research has found structured risk assessments to be more accurate and consistent predictors of recidivism than human evaluators, although this claim is disputed. Some scholars argue, for example, that RATs only perform better under certain conditions, and others have found that RATs do not really remove professional discretion from the assessment process, suggesting that the imposed dichotomy between clinical and actuarial evaluations is false.

Given the widespread and increasing adoption of RATs, their validity is crucial; that is, they must correctly predict the behavior they are intended to predict. It is also important to consider the behaviors that are being predicted, and whether predicting these behaviors advances the purported goals of improving public safety while reducing (or at least not increasing) incarceration.

Some RATs used at the pretrial phase focus on predicting failure to appear for court, a relevant behavior when the main consideration is whether someone will honor their court date. Other tools, more commonly applied to probationers and parolees and sometimes, controversially, to individuals awaiting sentencing, assess general recidivism risk, typically measured as any arrest. A general risk prediction may identify characteristics that are theoretically or empirically linked to reoffending, such as substance use, criminal thinking, and unemployment, that could be improved with targeted interventions, but it is arguably a problematic metric when applied to other decisions, such as supervision level or sentence length. (Judges in some jurisdictions can use RATs to set a specific sentence within a predetermined range; RATs cannot be used to depart from sentencing guidelines.)

General recidivism, especially when defined as “any arrest” rather than a conviction, depends in part on one’s likelihood of having an encounter with law enforcement. That likelihood is higher for certain populations, such as those who are addicted to drugs and homeless or living in high-poverty neighborhoods, and is not necessarily a reflection of the danger an individual poses to society.

Prostitution and drug possession, for example, are offenses for which arrests may be relatively frequent (in Texas, possession of a controlled substance accounted for 34 percent of all felony cases in 2018) but that have a negligible impact on public safety. Criminal history is not an automatic predictor of future conduct, but given that prostitution and problematic drug use are often part of a pattern of behavior, individuals charged with or convicted of these offenses may be especially likely to get arrested for them again, compared to someone convicted of an offense, such as assault or murder, that was the product of unique circumstances.

In other words, an individual may be at high risk for getting arrested without being a high risk to public safety. Lumping all risk into a measure of “any arrest” can obscure important distinctions between different classes of offenses. This problem is exacerbated when assessments of general recidivism extend beyond arrestable offenses to include violations of supervision conditions, which may refer to such infractions as missing a probation meeting or failing a drug test. These broad measures of risk may unnecessarily and incorrectly label some individuals as high risk. The potential for this error is further magnified by the fact that most assessment tools, while reasonably good at correctly identifying low-risk individuals, are more likely to falsely classify someone as moderate or high risk when they are actually low risk.
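The arithmetic behind false high-risk labels can be sketched with a short Python example. All of the numbers here are hypothetical and are not taken from any study or instrument; `base_rate`, `sensitivity`, and `specificity` are assumed inputs used only to show how the share of wrongly flagged people is calculated.

```python
def flagged_breakdown(population, base_rate, sensitivity, specificity):
    """Split the people a tool flags as higher risk into correct and incorrect flags.

    base_rate:   share of the population that actually goes on to reoffend
    sensitivity: share of actual reoffenders the tool flags
    specificity: share of non-reoffenders the tool correctly does not flag
    """
    reoffenders = population * base_rate
    non_reoffenders = population - reoffenders
    true_positives = reoffenders * sensitivity
    false_positives = non_reoffenders * (1 - specificity)
    flagged = true_positives + false_positives
    return {
        "flagged": flagged,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "share_of_flags_that_are_false": false_positives / flagged,
    }


if __name__ == "__main__":
    # Hypothetical scenario: 1,000 people, 20 percent of whom reoffend,
    # assessed by a tool that is right 70 percent of the time for each group.
    result = flagged_breakdown(1000, base_rate=0.20,
                               sensitivity=0.70, specificity=0.70)
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Under these assumed figures, roughly 240 of the 380 people flagged would not in fact reoffend. When the behavior being predicted is relatively uncommon, even a tool that performs fairly well can produce more false flags than correct ones.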

There are measures of risk that focus on specific types of recidivism, such as for violent, sexual, or domestic abuse offenses, that may address public safety concerns more directly. This assumes, of course, that the tool being used predicts risk accurately; whether it does depends on a number of factors including how one defines “accurate” and the training and competency of evaluators implementing and interpreting the tool. There is evidence to suggest that violence-specific assessments perform moderately better than general recidivism tools, but the inherent challenges in predicting something as complex as recidivism mean that even the best instruments face significant limits in the degree to which they are able to correctly calculate future individual behavior.

This is not to say that RATs do not have a place in the justice system. Most RATs, general and specific, can help identify areas of need for clients that should be addressed while incarcerated or on supervision. (Whether these needs are actually addressed is another issue; the scant research suggests a gap between identification of recidivism risk factors and reduction of those risk factors.) But RATs should not be used to make decisions about sentence length or release readiness, as improper high-risk designations could lead to longer sentences or prolonged incarceration. Their potential to influence decisions to lengthen supervision time or to impose more requirements for probation is also a concern, albeit a less extreme one, because the longer someone is on supervision and the more intense that supervision is, the more likely they may be to violate conditions of probation and wind up in jail or prison. Reliance on broad general risk measures could also lead to ineffective allocation of resources and continued criminalization of certain behaviors, particularly drug use and addiction.

Despite RAT caveats, the clear trend in the US is toward greater adoption. Given the stakes involved, it is imperative to ensure that RATs accurately identify appropriate behaviors, and do not over-generalize about what constitutes a risk to public safety.