4

THE ALLEGHENY ALGORITHM

It’s a week before Thanksgiving, and I’m squeezed into the far corner of a long row of gray cubicles in the call center for the Allegheny County Office of Children, Youth and Families (CYF) child neglect and abuse hotline. I’m sharing a desk and a tiny purple footstool with intake screener Pat Gordon. We’re both studying the Key Information and Demographics System (KIDS), a blue screen filled with case notes, demographic data, and program statistics. We are focused on the records of two families: both are white, living in the city of Pittsburgh, one has two children, the other has three. Both were referred to CYF by a “mandated reporter,” a professional who is legally required to report any suspicion that a child may be at risk of harm from their caregiver. Pat and I are competing to see if we can guess how a new predictive risk model the county is using to forecast child abuse and neglect, called the Allegheny Family Screening Tool (AFST), will score them.

Pat Gordon is the kind of woman who keeps pictures of other people’s children in her cubicle. Gordon, a Pittsburgh native and Pirates fan, wears a telephone headset that pushes back her ear-length bob. She will say only that she is “over forty.” Six lines are busy on her phone as she stands to greet me. Her long-sleeved pink t-shirt complements her warm brown skin, and her mischievous laugh quickly transitions to quiet seriousness when we talk about the kids she serves.

In the noisy glassed-in room, intake screeners like Pat interview callers who have phoned the hotline to report suspicions of child abuse or neglect. Mostly female and about evenly split between African American and white, intake screeners search for information about families in a vast system of interconnected county databases. They have records from Drug and Alcohol Services, Head Start, Mental Health Services, the Housing Authority, the Allegheny County Jail, the state’s Department of Public Welfare, Medicaid, the Pittsburgh Public Schools, and more than a dozen other programs and agencies at their fingertips.

Pat hands me a double-sided piece of paper called the “Risk/Severity Continuum.” It took her a minute to find it, protected by a clear plastic envelope and tucked in a stack of papers near the back of her desk. She’s worked in call screening for five years, and, she says, “Most workers, you get this committed to memory. You just know.”

But I need the extra help. I am intimidated by the weight of this decision, even though I am only observing. From its cramped columns of tiny text, I learn that kids under five are at highest risk of neglect and abuse, that substantiated prior reports increase the chance that a family will be investigated, and that parent hostility toward CYF investigators is considered high-risk behavior. I take my time, cross-checking information in KIDS against the risk/severity handout while Gordon rolls her eyes at me, teasing, threatening to click the big blue button that runs the risk model.

The first child is a six-year-old boy I’ll call Stephen. Stephen’s mom, seeking mental health–care for anxiety, disclosed to her county-funded therapist that someone—she didn’t know who—put Stephen out on the porch of their home on an early November day. She found him crying outside and brought him in. That week he began to act out, and she was concerned that something bad had happened to him. She confessed to her therapist that she suspected he might have been abused. Her therapist reported her to the state child abuse hotline.

But leaving a crying child on a porch isn’t abuse or neglect as the state of Pennsylvania defines it. So the intake worker screened out the call. Even though the report was unsubstantiated, a record of the call and the call screener’s notes remain in the KIDS system. A week later, an employee of a homeless services agency reported Stephen to a hotline again: he was wearing dirty clothes, had poor hygiene, and there were rumors that his mother was abusing drugs. Other than these two reports, the family had no prior record with CYF.

The second child is a 14-year-old I’ll call Krzysztof. On a community health home visit in early November, a case manager with a large nonprofit found a window and a door broken and the house cold. Krzysztof was wearing several layers of clothes. The caseworker reported that the house smelled like pet urine. The family sleeps in the living room, Krzysztof on the couch and his mom on the floor. The case manager found the room “cluttered.” It is unclear whether these conditions actually meet the definition of child neglect in Pennsylvania, but the family has a long history with county programs.

No one wants children to suffer, but the appropriate role of government in keeping kids safe is complicated. States derive their authority to prevent, investigate, and prosecute child abuse and neglect from the Child Abuse Prevention and Treatment Act, signed into law by President Richard Nixon in 1974. The law defines child abuse and neglect as the “physical or mental injury, sexual abuse, negligent treatment, or maltreatment of a child … by a person who is responsible for the child’s welfare under circumstances which indicate that the child’s health or welfare is harmed or threatened.”

Even with recent clarifications that the harm must be “serious,” there is considerable room for subjectivity in what exactly constitutes neglect or abuse. Is spanking abusive? Or is the line drawn at striking a child with a closed hand? Is letting your children walk to a park down the block alone neglectful? Even if you can see them from the window? The first screen of the list of conditions classified as maltreatment in KIDS illustrates just how much latitude call screeners have to classify parenting behaviors as abusive or neglectful. It includes: abandoned infant; abandonment; adoption disruption or dissolution; caretaker’s inability to cope; child sexually acting out; child substance abuse; conduct by parent that places child at risk; corporal punishment; delayed/denied health care; delinquent act by a child under 10 years of age; domestic violence; educational neglect; environmental toxic substance; exposure to hazards; expulsion from home; failure to protect; homelessness; inadequate clothing, hygiene, physical care, or provision of food; inappropriate caregivers or discipline; injury caused by another person; and isolation. The list scrolls on for several more screens.

Three-quarters of child welfare investigations involve neglect rather than physical, sexual, or emotional abuse. Where the line is drawn between the routine conditions of poverty and child neglect is particularly vexing. Many struggles common among poor families are officially defined as child maltreatment, including not having enough food, having inadequate or unsafe housing, lacking medical care, or leaving a child alone while you work. Unhoused families face particularly difficult challenges holding on to their children, as the very condition of being homeless is judged neglectful.

In reality, most child welfare caseworkers aren’t looking to put children into foster care simply because their parents are poor; investigators are often reluctant to define as “neglect” conditions that parents have little control over. On the contrary, child welfare workers sometimes use threats of putting a child in foster care to secure resources to keep a family safe. They may call the public assistance office to help a family get food stamps, force a landlord to make needed repairs, or offer a struggling parent counseling or community supports.

In Pennsylvania, abuse and neglect are relatively narrowly defined. Abuse requires bodily injury resulting in impairment or substantial pain, sexual abuse or exploitation, causing mental injury, or imminent risk of any of these things. Neglect must be a “prolonged or repeated lack of supervision” serious enough that it “endangers a child’s life or development or impairs the child’s functioning.” So, as Pat Gordon and I run down the risk/severity matrix, I think both Stephen and Krzysztof should score pretty low.

In neither case are there reported injuries, substantiated prior abuse, a record of serious emotional harm, or verified drug use. I’m concerned about the inadequate heat in teenaged Krzysztof’s house, but I wouldn’t say that he is in imminent danger. Pat is concerned that there have been two calls in two weeks on six-year-old Stephen. “We literally shut the door behind us and then there was another call,” she sighs. It might suggest a pattern of neglect or abuse developing—or that the family is in crisis. The call from a homeless service agency suggests that conditions at home deteriorated so quickly that Stephen and his mom found themselves on the street. But we agree that for both boys, there seems to be low risk of immediate harm and few threats to their physical safety.

On a scale of 1 to 20, with 1 being the lowest level of risk and 20 being the highest, I guess that Stephen will be a 4, and Krzysztof a 6. Gordon smirks and hits the button. The numbers come up exactly as she predicted. Stephen gets a 5. Krzysztof? A 14.

*   *   *

I have come to Pittsburgh to explore the impacts of the Allegheny Family Screening Tool (AFST) on poor and working-class families. The stakes are high. According to the U.S. Centers for Disease Control and Prevention, approximately 1 in 4 children will experience some form of abuse or neglect in their lifetimes. The agency’s Adverse Childhood Experience Study concluded that the experience of abuse or neglect has “tremendous, lifelong impact on our health and the quality of our lives,” including increased occurrences of drug and alcohol abuse, suicide attempts, and depression.1

The administrative offices of the Allegheny County CYF are just a stone’s throw from where the Allegheny, Monongahela, and Ohio Rivers come together at the center of the city of Pittsburgh. Allegheny County has been a working-class stronghold with conservative Democratic leanings and a history of revolt against government interference since the Whiskey Rebellion started here in 1791. At the turn of the last century, it was home to the world’s first billion-dollar corporation: J.P. Morgan and Andrew Carnegie’s United States Steel Corporation.

Several decades of post-industrial economic disinvestment and population decline followed the abrupt closure of US Steel plants throughout the county in the mid-1980s. But in the last decade, Pittsburgh has seen a wave of young college graduates flocking to the region for jobs in the health professions, higher education, technology, and the arts. What was once Steel City now houses an estimated 1,600 technology companies, including a 450-employee office of Google and Uber’s robotic self-driving car division.

Marc Cherna, director of the Allegheny County Department of Human Services, arrived in February 1996 to run what was then known as Children and Youth Services (CYS) in the wake of two very public scandals. In the first, known as the “Baby Byron” case, a white foster family, the Derzacks, refused to return an African American toddler, Byron Griffin, to the agency so he could be reunited with his mother. Then-director Mary Freeland, upholding standard policies of the time that discouraged foster parents from adopting children in their care and restricted transracial adoption, traveled to the Derzack family home with a police escort to remove Byron on December 27, 1993. After Byron was returned to his mother, LaShawn Jeffrey, the Derzacks made the rounds of national talk shows, characterizing themselves as the infant’s thwarted saviors, and wrote a tell-all book about their experience.

Then, in March 1994, the body of two-year-old Shawntee Ford was found in a Pittsburgh motel. The chief forensic pathologist concluded that the toddler had been beaten to death, just weeks after being placed in the care of her father. CYS caseworkers had removed Shawntee from her mother, Mable Ford, while she underwent drug treatment. The two were later reunited. But when they were discovered living in a car in Buffalo, New York, Shawntee was removed again and her father, Maurice Booker, Sr., petitioned for custody.

During the hearing, a CYS worker told the judge that Booker had been investigated and that the agency didn’t have any concerns about his caretaking. The caseworkers failed to mention that Booker had a record of arrests for drunk driving and reckless endangerment. In February, after the custody hearing but before Shawntee’s death, Booker was also charged with holding his girlfriend and two other children hostage in a New Year’s Eve standoff with police. Shortly after Shawntee died, the state Department of Public Welfare denied CYS a full license, citing 72 violations of regulations, including failure to complete timely criminal background checks on parents. Within a year, Mary Freeland, under pressure to resign, accepted a new post overseeing a children’s commission in Florida.

*   *   *

“When I came here to run Children and Youth, it was a national disgrace,” said Marc Cherna. When he arrived in 1996, there were 1,600 children waiting to be adopted, and the agency was only managing to process 60 adoptions a year. Caseworkers made 35 percent less than caseworkers in neighboring Erie County. Most did not have a degree in social work. They were burdened with excessive caseloads, serving 30 or more families at a time. A blue-ribbon commission characterized the agency’s relationship to Pittsburgh’s African American community as one of “severe antagonism.”2 Seventy percent of children in the foster care system were Black, though African Americans made up only 11 percent of the population of Allegheny County. The agency struggled to recruit and retain people of color as adoptive families, caseworkers, and administrators.

Around the time Marc Cherna was hired, a commission called ComPAC21 convened to study the county’s political structure. It recommended shrinking county government by merging 30 distinct departments into nine large agencies. They combined the offices of aging, children and youth services, intellectual disability, behavioral health, and community services. They named the resulting agency the Department of Human Services (DHS) and appointed Cherna to lead it.

Formerly assistant director of the New Jersey Division of Youth and Family Services, Cherna is a ruddy-faced cheerful man who often sports a signature Save the Children necktie: kids’ drawings of multiethnic toddlers on a brown background. He is deeply proud that he’s managed to stay in his position 20 years, an impressive tenure for the leader of such a challenging agency. Today, DHS serves 200,000 people, employs 940 county workers, manages 417 contracting agencies, and operates with an $867 million annual budget.

Early in his tenure, Cherna proposed the creation of a data warehouse, a central repository that would pull together information collected by DHS, other county agencies, and state public assistance programs. With $2.8 million from a collection of local foundations, Cherna built the data warehouse in 1999. Today, it lives on two servers in DHS headquarters and holds more than one billion electronic records, an average of 800 records for every person in Allegheny County.

Twenty-nine different programs—including adult probation, the bureau of drug and alcohol services, the housing authority, the county jail, the juvenile probation office, the Allegheny County police department, the state office of income maintenance, the office of mental health and substance abuse services, the office of unemployment compensation, and almost 20 local school districts—send regular data extracts. The extracts include client names, social security numbers, dates of birth, addresses, and the type and amount of services they’ve received. The annual cost of the data warehouse, managed primarily through a contract with the multinational consulting firm Deloitte Touche Tohmatsu Ltd., tops $15 million a year, about 2 percent of DHS’s annual budget.

Marc Cherna and Erin Dalton, his deputy director of Data Analysis, Research and Evaluation, see the data warehouse as a tool to increase agency communication and accountability, provide wraparound services for clients, and cut costs. The department can match internal to external data, verify a client’s identity, establish eligibility for program resources, and keep a watchful eye on client behavior across all interactions with public services.

But the administration hasn’t just focused on collecting and analyzing data. Early in his tenure, Cherna reached out to foster, adoptive, and birth parents; service providers; child advocates; lawyers; and judges. In a case study of his administration written by Stewards of Change, a management consulting firm, Cherna explained, “The goal is for the child welfare agency to be viewed in the community as a friend, not a foe.”

“Marc has really solid relationships with private funders in this town. He has really positive relationships with the agencies,” said Laurie Mulvey of the University of Pittsburgh’s Office of Child Development. “He’s clear that it’s all about relationships. He’s honest, and straightforward, and works hard.” Nearly every community member I spoke to in my travels to Pittsburgh agreed with Mulvey, praising Cherna’s team for their participatory approach, clear communication, and high ethical standards. Today’s CYF is more diverse, more responsive, more transparent. It invites community input and leadership. Over the past 20 years, Cherna has earned the community’s trust and goodwill.

In 2012, the Pennsylvania General Assembly reduced its human services allocations by 10 percent, cutting about $12 million from DHS. The budget reduction sharpened a crisis already created by steadily declining county revenues and increased demand for services following the 2007 recession. Rich in data but poor in material resources, Cherna and his team put together an RFP to “design and implement decision support tools and predictive analytics in human services.” DHS offered up to one million dollars—provided by a Richard King Mellon Foundation grant—to build an automated triage system that would help them focus resources where they would do the most good.

The proposal they chose was submitted by a team from New Zealand’s Auckland University of Technology, led by economist Rhema Vaithianathan and Emily Putnam-Hornstein, director of the Children’s Data Network at the University of Southern California. They proposed to design, develop, and implement a decision-making tool that would mine Cherna’s warehoused data to make predictions about which Allegheny County children might be at greatest risk for abuse and neglect.

*   *   *

Rhema Vaithianathan and Emily Putnam-Hornstein met because they share an ambition to predict child maltreatment at the moment of birth, or even before. A 2011 paper by Putnam-Hornstein and Barbara Needell concluded that a prenatal maltreatment-predicting algorithm was theoretically possible: “A risk assessment tool that could be used on the day of birth to identify those children at greatest risk of maltreatment holds great value,” they wrote. “[P]renatal risk assessments could be used to identify children at risk … while still in the womb.”3 On the other side of the world, Rhema Vaithianathan, associate professor of economics at the University of Auckland, was on a team developing just such a tool.

As part of a larger program of welfare reforms led by conservative Paula Bennett, the New Zealand Ministry of Social Development (MSD) commissioned the Vaithianathan team to create a statistical model to sift information on parents interacting with the public benefits, child protective, and criminal justice systems to predict which children were most likely to be abused or neglected. Vaithianathan reached out to Putnam-Hornstein to collaborate. “It was such an exciting opportunity to partner with Rhema’s team around this potential real-time use of data to target children,” said Putnam-Hornstein.

Vaithianathan’s team developed a predictive model using 132 variables—including length of time on public benefits, past involvement with the child welfare system, mother’s age, whether or not the child was born to a single parent, mental health, and correctional history—to rate the maltreatment risk of children in the MSD’s historical data. They found that their algorithm could predict with “fair, approaching good” accuracy whether these children would have a “substantiated finding of maltreatment” by the time they turned five. In a paper released in September 2013, the team suggested that the ministry, after performing a feasibility study and an ethical review, deploy the model to generate risk scores that would trigger targeted, voluntary early intervention programs “with the aim of preventing maltreatment.”4

When the New Zealand public learned of the project in 2014, they responded with concern. Academic researchers warned that the model might not be as accurate as the team claimed: it was wrong about nearly 70 percent of the children it identified as at highest risk of harm in the historical data.5 Others cautioned that the model was primarily a tool of surveillance of the poor.6 Project reviewers raised concerns that the special needs of Māori families, which face child removal at dramatically disproportionate rates, were not adequately considered.7
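The "wrong about nearly 70 percent" criticism is a claim about what statisticians call precision: of the children a model flags as highest risk, what share actually go on to have a substantiated finding. The sketch below shows the arithmetic in Python. The counts are hypothetical, chosen only to reproduce a figure of roughly 70 percent; they are not the actual numbers from the New Zealand data.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of flagged cases that turn out to be correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no cases were flagged")
    return true_positives / flagged


# Hypothetical counts, for illustration only: 1,000 children flagged as
# highest risk, of whom 300 later had a substantiated finding.
flagged_and_substantiated = 300
flagged_not_substantiated = 700

p = precision(flagged_and_substantiated, flagged_not_substantiated)
error_rate_among_flagged = 1 - p

print(f"precision: {p:.0%}, wrong among flagged: {error_rate_among_flagged:.0%}")
```

A model can look accurate overall and still have low precision among the families it flags, which is the group that would feel the consequences of a high score.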

In 2015, Social Development Minister Anne Tolley, who had replaced Bennett the year before, halted a plan to launch an observational experiment that would risk-rate 60,000 newborns to test the accuracy of the Vaithianathan team’s tool. In the margin of a project briefing that was later leaked to the press, she wrote, “Not on my watch! These are children not lab rats.” The experiment collapsed in the face of public resistance. But by that time, the Vaithianathan team had won the contract to create a similar predictive risk model in Allegheny County.

*   *   *

Back in the call center, Pat Gordon and I consider Stephen and Krzysztof’s scores. As 4 p.m. rolls around, the noise level in the call center rises steeply. From cubicles all around us, I overhear the questions of other intake screeners: “What kind of drugs is she on?” “Do you have any kind of support systems right now? Even like good friends that help you out in these kinds of situations?” “How do you spell Duquan?” In the next cubicle, a caseworker is scrolling through custody documents from the Allegheny County Court of Common Pleas. Another is using Facebook to try to identify a family who has been reported by a caller who only knows the mother’s first name and phone number. The banter between intake workers gets saltier as the stress peaks.

Screeners like Pat Gordon take phone calls for the county’s child abuse and neglect hotline and receive electronic reports from Pennsylvania’s state hotline, called ChildLine. For each report, they collect information: the nature of the caller’s concern, circumstances of the incident, and demographic information on the child and any other involved person, including names, ages, location, and addresses. They also collect history on all the people associated with the allegation of neglect or abuse. Intake screeners have high-level access to ClientView, the DHS’s application for searching the data warehouse. They also search publicly accessible sources: court records, divorce filings, birth records, social media.

Krzysztof’s case came over ChildLine, the state system. The report Gordon receives reads: “[Name redacted], Case manager with Diversified Care Management, reported that the window in the house is messed up and a door is broke. When its cold outside, the house ends up being very cold. C[hild] ends up wearing several layers of clothing. The house smells of urine from the cats and dogs. There has been feces on the floor. There’s a lot of clutter in the living room. C[hild] sleeps in the living room on a couch by choice. M[other] sleeps on the floor in the living room.”

Because there is an ongoing case on Krzysztof, Pat Gordon won’t be deciding whether or not to screen the family in for investigation. She will simply document this report and try to provide Krzysztof’s caseworker with a sense of the urgency of the allegation. If she had to make a decision whether to screen this case in or out, Gordon says, “There’s tons of questions that I would ask [the case manager]: When is the last time you were in the home? How long have you been working with this family? What brings you to work with the family? Does the family know that you’re making a report to us?”

Pat explains that, though the AFST has been getting a lot of attention lately, it’s only the final step in a three-part intake process that determines if a family will be screened in for investigation. Intake screeners consider the nature of an allegation: Does it rise to Pennsylvania’s legal definition of maltreatment? Is it within CYF’s jurisdiction? They then consider the immediate risk to the child: Is there impending danger? Present danger? Finally, intake screeners search through all available data sources to determine a family history. The AFST supplements a call screener’s work in developing that history.

The pairing of the human discretion of intake screeners like Pat Gordon with the ability to dive deep into historical data provided by the predictive risk model is the most important fail-safe of the system. “This is the place where we have the least information,” said Erin Dalton. “The callers don’t know that much. We know a lot about these families. There’s so much history [in the data]. We can make a more informed recommendation.”

Pat walks me through Krzysztof’s case. “This kiddo is older,” she says, “So his vulnerability is going to be low. There’s no real injury or anything like that. Prior abuse and neglect? Well, there is an open GPS [General Protective Services] case on the family already. I don’t get a mental health for the parents or the kiddo in this allegation.” She chooses “Low” for the severity of the allegation. Then she considers the immediate safety of the child. A broken window and door is uncomfortable, she says, but “it’s certainly not impending danger, it doesn’t sound like present danger.” Then, she clicks the button that runs the AFST. Krzysztof’s score appears on her screen in a graphic that looks like a thermometer: it’s green down at the bottom and progresses up through yellow shades to a vibrant red at the top. Krzysztof’s 14 is at the bottom of the red section, in the “Emergency!” part of the scale.

I’m shocked that Krzysztof received a score nearly three times as high as Stephen’s. Krzysztof is in his teens, while Stephen is only 6. The hotline report shows no harm beyond the crowded conditions and poor housing stock common to being poor. Why was he rated so highly? Pat tries to explain. His family’s record with public services stretches back to when his mother was a child. So though the allegation is not severe and Krzysztof seems safe, the family’s AFST score is high.

*   *   *

Though the screen that displays the AFST score states clearly that the system “is not intended to make investigative or other child welfare decisions,” an ethical review released in May 2016 by Tim Dare from the University of Auckland and Eileen Gambrill from University of California, Berkeley, cautions that the AFST risk score might be compelling enough to make intake workers question their own judgment. Rhema Vaithianathan insists that the model is built in such a way that intake screeners will question its predictive accuracy and defer to their own judgment. “It sounds contradictory, but I want the model to be slightly undermined by the call screeners,” she said. “I want them to be able to say, this [screening score] is a twenty, but this allegation is so minimal that [all] this model is telling me is that there’s history.”

But from what I saw in the call center during my visit, the model is already subtly changing how some intake screeners do their jobs. “The score comes at the end of the report, after we’ve already done all this research,” said intake manager Jessie Schemm. “If you get a report and you do all the research, and then you run the score and your research doesn’t match the score, typically, there’s something you’re missing. You have to back-piece the puzzle.”

We all tend to defer to machines, which can seem more neutral, more objective. But it is troubling that managers believe that if the intake screener’s and the computer’s assessments conflict, the human should learn from the model. The AFST, like all risk models, offers only probabilities, not perfect prediction. Though it might be able to identify patterns and trends, it is routinely wrong about individual cases. According to Vaithianathan and Putnam-Hornstein, intake screeners have asked for the ability to go back and change their risk assessments after they see the AFST score, suggesting that they believe that the model is less fallible than human screeners. So far, Cherna and Dalton have resisted. Intake screeners’ risk and safety assessments are locked in and can’t be changed after the AFST is run, except by a manager.

In the face of the seeming authority and objectivity of a computerized score, risk aversion, or an understandable excess of caution with children’s lives at stake, it is easy to see how a flashing red number might short-circuit an intake screener’s professional judgment. The AFST is supposed to support, not supplant, human decision-making in the call center. And yet, in practice, the algorithm seems to be training the intake workers.

What’s more, if a family’s AFST risk score is high enough, the system automatically triggers an investigation unless a supervisor overrides it. “Once the algorithm is run and the wheels start to turn,” says Bruce Noel, regional intake manager of Allegheny County CYF, “one of the possibilities is that the model says you must screen this in.”

A 14-year-old living in a cold and dirty house gets a risk score almost three times as high as a 6-year-old whose mother suspects he may have been abused and who may now be homeless. In these cases, the model does not seem to meet a commonsense standard for providing information useful enough to guide call screeners’ decision-making. Why might that be?

Data scientist Cathy O’Neil has written that “models are opinions embedded in mathematics.”8 Models are useful because they let us strip out extraneous information and focus only on what is most critical to the outcomes we are trying to predict. But they are also abstractions. Choices about what goes into them reflect the priorities and preoccupations of their creators. Human decision-making is reflected in three key components of the AFST: outcome variables, predictive variables, and validation data.

*   *   *

Outcome variables are what you measure to indicate the phenomenon you are trying to predict. In the case of the AFST, Allegheny County is concerned with child abuse, especially potential fatalities. But the number of child maltreatment–related fatalities and near fatalities in Allegheny County is very low—luckily, only a handful a year. A statistically meaningful model cannot be constructed with such sparse data.

Failing that, it might seem logical to use child maltreatment as substantiated by CYF caseworkers to stand in for actual child maltreatment. But substantiation is an imprecise metric: it simply means that CYF believes there is enough evidence that a child may be harmed to accept a family for services. Caseworkers will substantiate a case in order to get a family access to needed resources like food stamps or affordable housing. Some will substantiate because, though they don’t have credible evidence, they have a strong suspicion that something’s going on with a child. Other cases will be substantiated because frightened parents admit abuse or neglect they didn’t actually commit. Substantiation is not clear-cut, so it can’t be used as an outcome variable, either.

Though it would be best to use a more direct measure, the AFST uses two related variables—called proxies—as stand-ins for child maltreatment. The first proxy is community re-referral, when a call to the hotline about a child was initially screened out, but CYF receives another call on the same child within two years. The second proxy is child placement, when a call to the hotline about a child is screened in and results in the child being placed in foster care within two years. So the AFST actually predicts decisions made by the community (which families will be reported to the hotline) and by the agency and the family courts (which children will be removed from their families), not which children will be harmed.
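The two proxy definitions above can be written out as a labeling rule. What follows is only a minimal sketch: the function name, its arguments, and the use of 730 days for "two years" are assumptions made for illustration, not the county's actual code.

```python
from datetime import date, timedelta

# Assumption: "within two years" is approximated as 730 days.
TWO_YEARS = timedelta(days=730)

def proxy_outcome(screened_in, call_date, later_call_dates, placement_date):
    """Return True if a referral counts as a positive case under the two proxies.

    screened_in      -- whether the original call was screened in
    call_date        -- date of the original call
    later_call_dates -- dates of any later calls about the same child
    placement_date   -- date of foster care placement, or None
    """
    window_end = call_date + TWO_YEARS
    if screened_in:
        # Child placement: screened in, then placed in foster care within two years.
        return placement_date is not None and call_date < placement_date <= window_end
    # Community re-referral: screened out, then another call within two years.
    return any(call_date < d <= window_end for d in later_call_dates)

# Hypothetical referrals, for illustration only.
rereferred = proxy_outcome(False, date(2012, 1, 1), [date(2013, 6, 1)], None)
late_call = proxy_outcome(False, date(2012, 1, 1), [date(2015, 1, 1)], None)
placed = proxy_outcome(True, date(2012, 1, 1), [], date(2012, 9, 1))
not_placed = proxy_outcome(True, date(2012, 1, 1), [date(2012, 3, 1)], None)
```

Note that nothing in the rule asks whether a child was actually harmed; it records only that someone called again or that a placement occurred.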

Predictive modeling requires clear, unambiguous measures with lots of associated data in order to function accurately. In practice, though, the model has to make do with what’s available. “We don’t have a perfect outcome variable,” said Erin Dalton. “We don’t think there are perfect proxies for harm.”

Predictive variables are the bits of data within a data set that are correlated with the outcome variables. To find the predictive variables for the AFST, the Vaithianathan team ran a statistical procedure called a stepwise probit regression, a common, but somewhat controversial, data mining process. This computerized method knocks out variables that are not highly correlated enough with the outcome variables to reach statistical significance. In other words, it searches through all available information to pluck out any variables that vary along with the thing you are trying to measure—which leads to charges that the method is a kind of “data dredging,” or a statistical fishing expedition.

For the AFST, the Vaithianathan team tested 287 variables available in Cherna’s data warehouse. The regression knocked out 156 of them, leaving 131 factors that the team believes predict child harm.9

Even if a regression finds factors that predictably rise and fall together, correlation is not causation. In a classic example, shark attacks and ice cream consumption are highly correlated. But that doesn’t mean that eating ice cream makes swimmers too slow to avoid aquatic predators, or that sharks are attracted to soft-serve. There is a third variable that influences both shark attacks and ice cream consumption: summer. Both ice cream eating and shark attacks go up when the weather is warmer.

Validation data is used to see how well your model performs. In Allegheny County, the model was tested on 76,964 referrals received by CYF between April 2010 and April 2014.10 Vaithianathan and her team split the referrals into two stacks: 70 percent of them were used to determine the weights of the predictor variables (how important each variable is to the outcomes they are trying to predict). Then, the resulting model, with its 131 predictive variables properly weighted, was run against the other 30 percent of the cases to see if the model could reliably predict the actual outcomes of children in the historical data.
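The 70/30 division described above can be sketched in a few lines. The text does not say how referrals were assigned to each stack, so the seeded random shuffle here is an assumption, as are the function and parameter names.

```python
import random

def split_referrals(referral_ids, train_fraction=0.7, seed=0):
    """Shuffle referral IDs and split them into a training stack and a held-out stack."""
    ids = list(referral_ids)
    random.Random(seed).shuffle(ids)          # assumption: random assignment
    cut = int(len(ids) * train_fraction)      # size of the training stack
    return ids[:cut], ids[cut:]

# 76,964 referrals, as in the text; the IDs themselves are placeholders.
train, holdout = split_referrals(range(76_964))
print(len(train), len(holdout))
```

The held-out stack plays no part in setting the weights, which is what makes it a fair test of the finished model.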

A perfectly predictive model would have what’s called 100 percent fit in the area under the receiver operating characteristic (ROC) curve. A model that has no degree of predictive ability—its chances of being right are about the same as the chances of guessing heads or tails in a coin toss—would have a 50 percent fit under the ROC curve. The AFST’s initial fit in the area under the ROC curve is 76 percent, about the same as the predictive accuracy of a yearly mammogram.11
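One way to read the area under the ROC curve is as the chance that a randomly chosen case with the outcome receives a higher score than a randomly chosen case without it. The sketch below computes it that way on made-up scores; it illustrates the measure and is not the evaluation code used for the AFST.

```python
def roc_auc(scores, outcomes):
    """Share of (positive, negative) pairs in which the positive case scores higher.

    Ties count as half a win. outcomes holds 1 for cases where the outcome
    occurred and 0 where it did not.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up scores, for illustration only.
perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])    # every positive outranks every negative
coin_toss = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])  # scores carry no information
mixed = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])      # three of four pairs ranked correctly
```

Here `perfect` comes out at 1.0, `coin_toss` at 0.5, and `mixed` at 0.75, close to the AFST's reported 76 percent.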

Seventy-six percent might sound pretty good, but it’s roughly halfway between a coin toss and perfect prediction. And while the mammogram comparison is persuasive, it’s also important to remember that in 2009, the U.S. Preventive Services Task Force stopped recommending mammograms for women in their 40s, and recommended fewer mammograms for women over 50, due to concerns about the impacts of false positives, false negatives, and yearly radiation.12 In 2016, there were 15,139 reports of abuse and neglect in Allegheny County. At its current rate of accuracy, the AFST would have produced 3,633 incorrect predictions.
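The arithmetic behind these two figures can be checked directly. The sketch reproduces the chapter's own calculation, which reads the 76 percent figure as a rate of correct predictions; the variable names are illustrative.

```python
reports_2016 = 15_139   # reports of abuse and neglect cited in the text
auc = 0.76              # the AFST's initial area under the ROC curve

# Where 76 percent sits between a coin toss (50) and perfect prediction (100).
position = (auc - 0.50) / (1.00 - 0.50)

# The chapter's estimate: the remaining 24 percent, applied to one year of reports.
incorrect = round(reports_2016 * (1 - auc))

print(round(position, 2), incorrect)
```

The position works out to 0.52 of the way from chance to perfect, and the estimate to 3,633 reports.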

To sum up: the AFST has inherent design flaws that limit its accuracy. It predicts referrals to the child abuse and neglect hotline and removal of children from their families—hypothetical proxies for child harm—not actual child maltreatment. The data set it utilizes contains only information about families who access public services, so it may be missing key factors that influence abuse and neglect. Finally, its accuracy is only average. It is guaranteed to produce thousands of false negatives and positives annually.

*   *   *

A model’s predictive ability is compromised when outcome variables are subjective. Was a parent re-referred to the hotline because she neglects her children? Or because someone in the neighborhood was mad that she had a party last week? Did caseworkers and judges put a child in foster care because his life was in danger? Or because they held culturally specific ideas about what a good parent looks like, or feared the consequences if they didn’t play it safe?

In the call center, I mention to Pat Gordon that I’ve been talking to parents in the CYF system about how the AFST might impact them. Most parents, I tell her, are concerned about false positives: the model rating their child at high risk of abuse or neglect when little risk actually exists. I see how Krzysztof’s mother might feel this way if she was given access to her family’s score.

But Pat reminds me that I should be concerned with false negatives as well—when the AFST scores a child at low risk though the allegation or immediate risk to the child might be severe. “Let’s say they don’t have a significant history. They’re not active with us. But [the allegation] is something that’s very egregious. [CYF] gives us leeway to think for ourselves. But I can’t stop feeling concerned that … say the child has a broken growth plate, which is very, very highly consistent with maltreatment … there’s only one or two ways that you can break it. And then [the score] comes in low!”

Allegheny County has an extraordinary amount of information about the use of public programs stored in its data warehouse. But the county has no access to data about people who do not use public services. Parents accessing private drug treatment, mental health counseling, or financial support are not represented in DHS data. Because variables describing their behavior have not been defined or included in the regression, crucial pieces of the child maltreatment puzzle might be omitted from the AFST. It could be missing the crucial “summer” variable that links ice cream and shark attacks.

Geographical isolation might be an important factor in child maltreatment, for example, but it won’t be represented in the data set because most families accessing public services in Allegheny County live in dense urban neighborhoods. I ask Pat Gordon if she is concerned with those cases in which a family lives in the suburbs and no one’s ever called a hotline on them before, or a caregiver accesses private services for mental health or addiction so he’s not in the system. “Exactly,” she replies. “I wonder if people downtown really get that. I mean, we’re not looking for this to do our job. We’re really not. I hope they get that.”

*   *   *

I met Angel Shepherd and Patrick Grzyb at the Duquesne Family Support Center, one of 26 community hubs where families attend programs, access resources, and connect with others. I was speaking with members of the organization’s Parent Council on a crisp autumn day in 2016. It was a rollicking, wide-ranging, often heated conversation. The atmosphere in the conference room swung wildly from exasperated contempt to tearful appreciation to shocked dread as parents spoke about their experiences with the Allegheny County CYF.

Angel and Patrick didn’t stand out right away because their experience is so utterly average, characteristic of the routine, mundane indignities experienced by the white working class. Since moving in together in 2002, they’ve worked a variety of service jobs, from clerking at Dollar General to providing armed security for a high school to catering. Patrick was born in nearby Munhall two decades before its primary employer, the Homestead Steel Works, closed in 1986. He left school after the ninth grade. He describes himself as “a slow learner,” but is smart and diligent enough to raise three children, mostly on his own, while working full time. Angel took an audacious risk, boarding a bus from California to join Patrick after a two-year online courtship. More recently, Angel gambled again when she decided to pursue a college degree in cybersecurity. But this time, she wasn’t as lucky. The for-profit online university left her deeply in student loan debt with no clear path to employment.

They are a blended, multigenerational family. Tabatha, one of Patrick’s adult daughters, lives with them in their small rented duplex with her own daughter, an expansive and eager-to-please redheaded six-year-old charmer named Deseraye. Harriette, Angel’s daughter, is a precocious, energetic, nine-year-old whirlwind of mocha skin and wavy black hair. She loves Scholastic’s I Survived series of books with their covers featuring young people fleeing fires, tornados, volcanic eruptions, or Nazi invasion. During my November 2016 visit to their home, Harriette showed me her current favorite, I Survived Hurricane Katrina.

Patrick and Angel are creative, involved parents. When the two girls bicker, they put them in the “Get-Along Shirt,” one of Patrick’s roomy button-downs, together. Each girl puts one arm through a sleeve and one arm around the waist of the other. They stay in the Get-Along Shirt until they stop fighting. “Even if they got to go to the bathroom,” Patrick explains, laughing, hazel eyes flashing.

Despite the St. Francis of Assisi blessing on the door of their brown asphalt-shingled home, the family has been touched by all the usual traumas of being working class in America: health crises, stretches of unemployment, and physical disability. Nevertheless, they remain remarkably resilient, funny, and generous. Angel tends to smack Patrick while they’re talking, for emphasis, while he remains placid, like a Buddhist ex-biker, broad shoulders relaxed and elaborate facial hair twitching. He calls her “my Angel,” beaming at her in unguarded moments. Now that Patrick’s diabetes has cost him three toes and Angel is unemployed, they spend most of their time volunteering at the Family Support Center. Patrick works with the “Ready Freddy” program during the summers, helping prepare young children to enter kindergarten. Angel helps around the office with administrative tasks and takes minutes at all the meetings.

Angel and Patrick have racked up a lifetime of interactions with CYF. Patrick was investigated for medical neglect in the early 2000s when he was unable to afford his daughter Tabatha’s antibiotic prescription after an emergency room visit. When her condition worsened and he took her back to the ER the next day, a nurse threatened to call CYF on him. Frightened and angry, Patrick picked his daughter up and walked out. An investigation was opened. “They came late at night,” he remembers. “It was like 11 or 12 o’clock, my kids were already asleep. They came up with the police, told us why they were there, came in, looked at the house, looked where the girls were sleeping. And then two or three days later I received a letter saying I’m going to be on file for child neglect until she’s eighteen.”

CYF has been in Harriette’s life since birth. Angel placed Harriette in foster care the day she was born, but fought to bring her back home when she began to suspect that the foster family was mistreating her. She asked for and received parenting classes and counseling from the agency, and her experience regaining custody was largely positive. Her caseworker even found an electrical problem in the nursery after Harriette came home and called Angel’s landlord, threatening to pull the family from the house unless he sent a certified electrician out to repair it.

When Harriette was five, someone phoned in a string of reports to the child abuse and neglect hotline. The anonymous tipster explained that Harriette was running around the neighborhood unsupervised. “The most she has ever been unsupervised is two minutes,” Angel counters, “but we had some people on the street who would call and [they] said all this stuff.” CYF examiners opened an investigation on Harriette and came out to the house to interview the family and their neighbors. The investigator took Harriette by the hand and tried to walk her down the street, away from her mother, to talk. “To our pride, and my daughter’s self-preservation,” Angel remembers, “she said, ‘I’m not allowed to go there. It’s against the rules. I’m out of bounds.’” The worker instead took Harriette to the back porch and exiled Angel to the front.

After speaking to Harriette, the caseworker took Angel aside and said, “Wow. That’s a pretty obedient child.” Angel told her, “You have no idea what it took to get her like that.” She explained the family’s approach to discipline, and gave an example: they drew a stop sign on the sidewalk, writing the word “Stop” inside. If Harriette goes past the sign, she has to sit on the porch steps in a time-out. The investigator closed the case.

Another call was made to the hotline, reporting that Harriette was down the block teasing a dog. Angel knew that Harriette had been sneaking out of the yard when she went to the bathroom, throwing food just out of the dog’s reach, barking at it. Angel tried everything to address the behavior. She explained that she might get hurt if she kept it up. She took away cartoons for the day. She made her go up the street to the dog’s owner and apologize. “Which I made her do the day before CPS got called!” Angel says, shrugging. “I told the lady, ‘I’m not going to lie to you. She’s been caught teasing this dog multiple times. I’m working with her to resolve the situation.’” But the investigator wasn’t convinced that Harriette was safe. “That could be child neglect,” Angel remembers her saying. When Angel explained to a supervisor that she could see Harriette at all times, even from the bathroom, CYF closed the case.

Another series of calls to the hotline was made, claiming that Harriette wasn’t being properly clothed, fed, or bathed and that she wasn’t getting her anti-seizure medication. Angel and Patrick explained to the investigating caseworker that her neurologist had canceled two appointments in a row and then withheld a prescription because it had been more than a year since she had been examined. The medical device she was wearing on her head to measure her epilepsy made washing her hair difficult. But she wasn’t running around in the cold barefoot, as the caller had claimed, and they were working on finding a new neurologist. Angel signed a waiver so CYF could access Harriette’s medical file. After seeing that their story checked out, CYF closed the case.

Patrick and Angel suspect a neighbor or family member was placing nuisance calls to harass them. Angel wants to press charges, but there is little she can do. Voluntary callers to child abuse and neglect hotlines can remain anonymous if they choose, and mandated reporters have immunity from civil or criminal liability if they report in good faith. “It seemed like every other week they were coming out,” Angel explains, frustrated. “They haven’t found anything—our cases are closed. But every now and then I feel like they drive by just to see.”

The lesson Patrick learned from his experience with CYF is this: always act deferential. Comply with everything CYF asks, even if you think you are being treated unfairly. “I didn’t think it was fair, but I wasn’t going to fight it,” he says. “I thought maybe if I fought it they would actually come and take her.” The deck is always stacked in the agency’s favor, he explains. “It’s scary. I’m thinking, ‘They’re coming to take my kids.’ That’s the first thing you think: CYF takes your kids away. It’s a very sick feeling in the stomach, especially with the police there. I’ll never forget it.”

*   *   *

Angel Shepherd and Patrick Grzyb, like all the CYF-involved parents I spoke to, have deeply mixed feelings about their experiences with the agency. While they describe frightening, frustrating experiences, they are also grateful for the support and resources they received. They hope that their time volunteering at the Family Support Center helps other families keep their kids safe, but they also suspect that any interaction with CYF might drive up their AFST score.

Most parents reacted with fear and exasperation when I asked them about the AFST. Some think the system unfairly targets them for surveillance. Some find having their entire history as parents summed up in a single number dehumanizing. Some believe the model will make it even more difficult to exert the limited rights they have in the system.

This was particularly true for African American parents. Janine, who asked that I refer to her only by her first name for fear of CYF retribution, is an outspoken advocate for poor families from Rankin, PA. When I asked her what she thought about the predictive model, she shot back decisively, “That’s going to fail. There’s too many risks. Everybody is a risk.”

When Janine says that “everybody is a risk,” she doesn’t mean that anyone might hit their child. She means that every parent in her community could be profiled by the AFST, simply for being poor and Black. According to statistics gathered by the National Council of Juvenile and Family Court Judges, in 37 states, the District of Columbia, and Puerto Rico, African American and Native American children are removed from their homes at rates that significantly exceed their representation in the general population. For example, in 2011, 51 percent of children in foster care in Alaska were Native American, though Native Americans make up only 17 percent of the youth population. In Illinois, 53 percent of the children in foster care were African American, though African Americans make up only 16 percent of the youth population.

In 2016, 48 percent of children in foster care in Allegheny County were African American, though they made up only 18 percent of the county’s children and youth. In other words, African American children are more than two and a half times as likely to end up in foster care as they should be, given their proportion of the population. Cherna and Dalton see the AFST as a tool to take the guesswork out of intake, hoping it will provide data that will uncover patterns of bias in intake screener decision-making. “I see a lot of variability now,” said Dalton, “I would not go so far as to say that [the AFST] can correct disproportionality. But we can at least observe it more clearly.” By mining the wealth of data in the warehouse, she suggested, the AFST can help subjective intake screeners make more objective recommendations.

But a 2010 study of racial disproportionality in Allegheny County CYF found that the great majority of disproportionality in the county’s child welfare services arises from referral bias, not screening bias.13 The community calls child abuse and neglect hotlines about Black and biracial families more often than they call about white families. Though there were three and a half times as many white children as African American and biracial children in Allegheny County in 2006, there were equal numbers of reports—roughly 3,500—submitted to CYF for each group.

The study found that disproportionate referrals were often made based on mandated reporters’ misunderstandings of CYF’s mission and role, perceptions of problems in neighborhoods where people of color live, and class-inflected expectations of parenting. “I’ll never forget one I got,” said one of their interviewees, “I finally got a hold of this kid’s therapist and I’m like what’s going on here? This kid can go home. And the therapist, no lie, said it’s a bad environment for the kid. You know, community violence in the neighborhood.” Another reported that a clinic routinely called CYF to report parents for missing children’s medical appointments, even if they made the appointments up at a later time.

The study showed that once children were referred to CYF, screener discretion didn’t make much difference in disproportionality. Intake workers were only slightly more likely to screen Black and biracial children in for investigation than white children. They chose to screen-in 69 percent of cases focused on Black and biracial children, and 65 percent of cases focused on white children. For those screened in for investigation, roughly equal proportions were substantiated: 71 percent of cases involving Black or biracial children and 72 percent of those involving white children.
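Setting the study's figures side by side shows where the gap opens. The sketch below uses only the numbers quoted above; the unit population of 1.0 is a placeholder, since only the ratio between the two groups is given.

```python
# Figures from the 2010 study as quoted in the text.
reports_per_group = 3_500        # roughly equal report counts for each group
black_biracial_children = 1.0    # placeholder unit population (assumption)
white_children = 3.5             # three and a half times as many white children

# Referral: equal report counts over unequal populations.
referral_rate_black = reports_per_group / black_biracial_children
referral_rate_white = reports_per_group / white_children
referral_ratio = referral_rate_black / referral_rate_white

# Screening and substantiation: percentages of cases at each later step.
screen_in_ratio = 69 / 65
substantiation_ratio = 71 / 72

print(referral_ratio, round(screen_in_ratio, 2), round(substantiation_ratio, 2))
```

Per child, Black and biracial children were reported about 3.5 times as often as white children, while the screen-in and substantiation ratios both sit close to 1.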

*   *   *

The AFST focuses all its predictive power and computational might on call screening, the step it can experimentally control, rather than concentrating on referral, the step where racial disproportionality is actually entering the system. Behind the scenes, the AFST produces two scores: the likelihood that another call will be made to the hotline about the child, and the likelihood of that child being placed in foster care. The AFST does not average the two, which might use the professional judgment of CYF investigators and family court judges to mitigate some of the disproportionality coming from community referral. The model simply uses whichever number is higher.
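The difference between taking the higher score and averaging the two is easy to see with a pair of hypothetical numbers. The function names and the example scores below are assumptions for illustration; the text specifies only that the higher of the two scores is used.

```python
def higher_of_two(rereferral_score, placement_score):
    """The combination described in the text: report whichever score is higher."""
    return max(rereferral_score, placement_score)

def averaged(rereferral_score, placement_score):
    """The alternative the text mentions: average the two scores."""
    return (rereferral_score + placement_score) / 2

# Hypothetical family: many calls to the hotline, little predicted chance of placement.
rereferral_score, placement_score = 18, 6

shown = higher_of_two(rereferral_score, placement_score)
alternative = averaged(rereferral_score, placement_score)
print(shown, alternative)
```

With these numbers the higher-of-two rule reports 18, driven entirely by the re-referral score, while an average would report 12.0.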

Nuisance calls like those experienced by Angel and Patrick introduce contaminated data into the model and further compromise its accuracy. Feuding neighbors, estranged spouses seeking custody, landlords, and family members with interpersonal axes to grind routinely call CYF as punishment or retribution. While there is little research on the subject, a study of data from the 1998 Canadian Incidence Study of Reported Child Abuse and Neglect found that approximately 4 percent of reports of child maltreatment were intentionally false. Of the 15,139 total reports of child abuse and neglect Allegheny County received in 2016, we can conservatively estimate that 605 were intentionally false. It is illegal to call a malicious report into a child abuse and neglect hotline. But Pennsylvania currently accepts reports from anonymous callers, so there is little a parent can do if a neighbor, relative, or acquaintance decides to harass or intimidate them in this way. The AFST has no way of recognizing or screening out nuisance calls.

Call referral is a deeply problematic proxy for maltreatment. It can be easily manipulated. CYF’s own research shows that it creates nearly all the racial disproportionality in the county’s child protective system. In other words, the activity that introduces the most racial bias into the system is the very way the model defines maltreatment. This easily gameable, discriminatory variable threatens to reverse all of the extraordinary work Cherna and his team have done.

“We don’t control the calls,” said Marc Cherna. “How the folks respond when they get questioned in the emergency room, cultural factors, and all that other stuff … that’s something we don’t control.” But the county does control what data it collects and which variables it chooses.

*   *   *

The overwhelming majority of families involved with CYF in Allegheny County, Black and white, are working class or poor. Though only 27 percent of Pittsburgh children receive public assistance, 80 percent of children placed in foster care in 2015 were removed from families relying on Temporary Assistance for Needy Families (TANF) or the Supplemental Nutrition Assistance Program (SNAP). That is, in Allegheny County, class-based disproportionality is worse than racial disproportionality. But unlike other historically disadvantaged groups, the poor are not widely recognized as a legally protected class, so the disproportionate and discriminatory attention paid to poor families by child welfare offices goes largely unchallenged.

The AFST sees the use of public services as a risk to children. A quarter of the predictive variables in the AFST are direct measures of poverty: they track use of means-tested programs such as TANF, Supplemental Security Income, SNAP, and county medical assistance. Another quarter measure interaction with juvenile probation and CYF itself, systems that are disproportionately focused on poor and working-class communities, especially communities of color. The juvenile justice system struggles with many of the same racial and class inequities as the adult criminal justice system.14 A family’s interaction with CYF is highly dependent on social class: professional middle-class families have more privacy, interact with fewer mandated reporters, and enjoy more cultural approval of their parenting than poor or working-class families.15

The overwhelming majority of child welfare investigations in the United States involve neglect, not abuse. According to the U.S. Department of Health and Human Services Administration for Children and Families, of the 3.4 million children involved in child welfare investigations in 2015, 75 percent were investigated for neglect, while only a quarter were investigated for physical, emotional, or sexual abuse.16

Defining neglect requires more subjective judgment than physical or sexual abuse. “Neglect is so wide,” said Tanya Hankins from the Family Support Center in East Liberty, a mostly African American neighborhood of Pittsburgh. “I’ve had a situation where two people are arguing and mom runs out the door and the baby is in the house and somebody calls CYF. I had a mom, when CYF knocked on the door, she didn’t answer. She was petrified. So they didn’t get a chance to see the baby, and put in for the baby to be removed.”

Nearly all of the indicators of child neglect are also indicators of poverty: lack of food, inadequate housing, unlicensed childcare, unreliable transportation, utility shutoffs, homelessness, lack of health care. “The vast, vast majority of cases are neglect, stem[ming] from people who have difficult, unsafe neighborhoods to live in,” said Catherine Volponi, director of the Juvenile Court Project, which provides pro bono legal support for parents facing CYF investigation or termination of their parental rights. “We have housing issues, we have inadequate medical care, we have drugs and alcohol. It’s poverty. The reality is that most children [investigated by CYF] are not physically or sexually abused.”

Child welfare services are not means-tested; you don’t have to be low-income to access them. CYF can offer parents a multitude of useful resources: respite care for a new mom who needs an hour off to do some laundry, early childhood education and development programs, even a visiting home aide to help with household chores. But professional middle-class families rely instead on private sources for family support, so their interactions with helping professionals are not tracked or represented in the data warehouse.

It is interesting to imagine the response if Allegheny County proposed including data from nannies, babysitters, private therapists, Alcoholics Anonymous, and luxury rehabilitation centers to predict child abuse among wealthier families. “We really hope to get private insurance data. We’d love to have it,” says Erin Dalton. But, as she herself admits, getting private data is likely impossible. The professional middle class will not stand for such intrusive data gathering.

Families avoid CYF if they can afford to, because the agency mixes two distinct and contradictory roles: provider of family support and investigator of maltreatment. Accepting resources means accepting the agency’s authority to remove your children. This is an invasive, terrifying trade-off that parents with other options are not likely to choose. Poor and working-class families feel forced to trade their rights to privacy, protection from unreasonable searches, and due process for a chance at the resources and services they need to keep their children safe.

Poverty is incontrovertibly harmful to children. It is also harmful to their parents. But by relying on data that is only collected on families using public resources, the AFST unfairly targets the poor for child welfare scrutiny. “We definitely oversample the poor,” said Dalton. “All of the data systems we have are biased. We still think this data can be helpful in protecting kids.”

We might call this poverty profiling. Like racial profiling, poverty profiling targets individuals for extra scrutiny based not on their behavior but rather on a personal characteristic: living in poverty. Because the model confuses parenting while poor with poor parenting, the AFST views parents who reach out to public programs as risks to their children.

*   *   *

Janine and I are sitting in a bus shelter behind a CVS pharmacy in a small borough just east of Pittsburgh on a warm September day in 2016. A middle-class suburb for most of its existence, Wilkinsburg lost about half its population in the last five decades, reeling from the closure of the Homestead Steel Works. The Kentucky Fried Chicken is celebrating its Day of Giving by distributing 10,000 free meals, and Janine and her friends are using the opportunity to register people to vote. In her late 40s, Janine wears a white tank top and a black rubber bracelet for the Poise Foundation, an African American community foundation “focused on building sustainable black communities and strengthening black families.”

I found it surprising that someone who has lost a child to the foster care system now volunteers for a CYF-funded organization. But Janine acknowledges that she needed help with her son, Jeremiah, more than a decade ago. She had insecure housing, struggled with transportation to get to work, and was managing health problems. Jeremiah started skipping school and disappearing, and someone called the hotline on her.

From Janine’s perspective, the system’s support requires heart-wrenching choices. Caseworkers opened an investigation when a call came in about her son’s truancy, she said, but closed it before she could access any services. Eventually, the agency required her to give up her son to access the basic material resources that would have allowed her to care for him effectively herself. “Instead of giving me help, they were like, ‘Put [Jeremiah] in foster care and we’ll help you,’” she explains. “You’ve got to put your kid in.” Her son went into foster care. She got help finding stable housing and medical care. Although she is still in touch with him today—Jeremiah’s now 22 and enrolled in college—she never regained custody.

And yet, she does not hesitate to call the abuse and neglect hotline if she believes someone is endangering a child. “It’s not being mean,” she explains. “You just have to understand that if something happened, I’m not going to feel bad, [thinking], ‘Why didn’t you call? You should have called!’ I’m not trying to do no harm, but to protect kids. One thing’s for sure and two’s for certain, I am a mother and I love all kids.”

While we talk on the bus stop bench, Sarah, a dark-haired white woman in her late 20s, jumps into the conversation unprompted to share her own story. Sarah is raising her daughter after fighting to get her back from seven years in foster care. It is her only day off from work that week, she says. She is running from appointment to appointment, trying to fulfill CYF’s expectations. Getting support for your parenting is great, she agrees. But the agency’s services often feel more like barriers than benefits, adding a frustrating new layer of responsibility on top of work and single motherhood. “People who have never been in the system don’t understand,” she says. “They don’t know what it’s like. Drug and Alcohol come to my house [for drug screenings] once a week. I go to court every three months. I have to go to therapy for me, and therapy for my kids.”

Every organization that Sarah, Janine, Angel, and Patrick access for help with their parenting is staffed by mandated reporters. In 2015, in the wake of the Jerry Sandusky scandal—the ex–Penn State football coach is currently serving 30 to 60 years for molesting ten boys—Pennsylvania lowered the standard for what constitutes child abuse. The state also created 15 categories of mandated reporters, including health-care and school employees, volunteers, clergy, and librarians. Under the law, mandated reporters must report any suspicion of child neglect or abuse, whether they learned about it through direct experience or heard about it secondhand. Mandated reporters do not have to identify how they learned about alleged abuse or neglect. They have immunity from legal prosecution. They are protected if they breach mental health or medical confidentiality. In fact, they can face legal prosecution, fines, and even jail time if they fail to report their suspicions. In the year after the changes, calls to abuse and neglect hotlines increased 40 percent.

The people most likely to offer poor parents help and support are all mandated reporters: teachers, doctors and nurses, psychiatrists and therapists, childcare providers, priests, volunteers at afterschool programs, employees of social service agencies. The pressure in the face of such invasive scrutiny and the cost of failing to meet the agency’s expectations are immense. The pressure often overwhelms parents who are already struggling.

Sarah is puzzled that so many caseworkers don’t seem to understand why a mom might lose her temper with them: “They’re like, ‘Why are you so angry?’ Because I’m tired of you being here! Leave me alone. I’m trying to get you to go away. We want you to go away.” I give her my card and Janine tells her to drop in at a Family Support Center. Then, spying her bus, Sarah dashes off to her next appointment.

*   *   *

If a child abuse and neglect investigation were a benign act, it might not matter that the AFST is imperfectly predictive. If a child abuse and neglect investigation inevitably resulted in adequate, culturally appropriate, and nonpunitive resources being offered to families, it wouldn’t matter that the system overrepresents poor and working-class people. But CYF resources come with increased surveillance and strict behavioral compliance requirements. For many, a child abuse and neglect investigation is an intrusive, frightening event with lasting negative impacts.

The price of help from CYF can be high. Janine argues that you have to “put your kid in” to foster care before you get support. Sarah’s schedule is filled with appointments with helping professionals she needs to please with displays of servility. Twenty years after he was accused of medical neglect, Patrick Grzyb still remembers feeling watched, monitored, and judged. “When they come to your house, they are looking around, watching your every move,” he explained. “It was like I was under a microscope. Every time one of my kids got sick, I had to take them to the emergency room. You walk in there and it’s like all these eyes [on you]. ‘Hey, he’s the one. We had to call on him.’ I felt like that for a long time.”

Many poor and working-class parents in Allegheny County are thankful that the data warehouse and other changes at DHS have narrowed resource gaps and eased the often cumbersome process of applying for multiple services. But there are others who feel that, once they are in “the system,” microscopic scrutiny ups the ante on their parenting, raising the stakes so high that they are bound to lose. “We try to comply,” said Janine. “But look, we can’t do it all. You’re opening up a door for ten other things I’ve got to do. It’s just a downward spiral.”

Parenting while poor means parenting in public. The state of Pennsylvania’s goal for child safety, “being free from immediate physical or emotional harm,” can be difficult to reach, even for well-resourced families. Each stage of the process introduces the potential for subjectivity, bias, and the luck of the draw. “You never know exactly what’s going to happen,” said Catherine Volponi in her office at Pittsburgh’s Juvenile Court Project. “Let’s say there was a call because the kids were home alone. Then they’re doing their investigation with mom, and she admits marijuana use. Now you get in front of a judge who, perhaps, views marijuana as a gateway to hell. When the door opens, something that we would not have even been concerned about can just mushroom into this big problem.”

At the end of each child neglect or abuse investigation, a written safety plan is developed with the family, identifying immediate steps that must be followed and long-term goals. But each safety action is also a compliance requirement, and parents’ responses are carefully monitored. Sometimes, factors outside parents’ control make it difficult for them to implement their plan. Contractors who provide services to CYF-involved families fail to follow through. Public transportation is unreliable. Overloaded caseworkers don’t always manage to arrange promised resources. Sometimes parents resist CYF’s dictates, resenting government intrusion into their family.

Failing to meet safety goals increases the likelihood that a child will be removed. “We don’t try to return CYF families to the level at which they were operating before,” said Volponi. “We raise the standard on their parenting, and then we don’t have enough resources to keep them up there. It results in epic failures too much of the time.”

*   *   *

A report of abuse or neglect that is found credible has a profound impact on a parent’s life for decades. Most jobs and volunteer positions that involve interaction with children in the state of Pennsylvania require that applicants submit a child abuse history certification. If the applicant is listed in the state’s ChildLine Abuse Registry as a perpetrator of abuse or neglect, she cannot apply for a job working with children. If she already has a job working with children, she will lose it. She can’t be a Girl Scout troop leader, softball coach, or volunteer at her child’s school.

“You [have to] change the way you support your family,” says Amanda Green Hawkins, a Pittsburgh attorney who argued a pro bono CYF expungement case in 2015. A child abuse record “can keep you from getting employment in a lot of areas—anything having to do with kids. You can’t be a teacher anymore. You can’t be program manager … at the Boys and Girls Club anymore. How those people get their lives back—that can be very tricky.”

Parents who go through a CYF investigation and a family court hearing and are found guilty of child maltreatment—the agency term for this is “indicated” or “founded”—receive notice that they have been included in the ChildLine registry. Within 90 days, they can request an administrative review to amend or expunge their record. At the hearing, the county presents the evidence it used to prove abuse or neglect, and the parent rebuts it. Sometimes, when poor families challenge the child welfare system, they win. But not many dare to take CYF on in court.

Tracey McCants Lewis, attorney and pro bono program coordinator for Duquesne University School of Law, told me that she’s never represented a client in a CYF expungement case, in part because it is a “much more extensive process than criminal expungement.” Amanda Green Hawkins agrees that such challenges are vanishingly rare. “[CYF] expungements are very difficult,” she said. “You are going up against the government. It’s like David taking on Goliath.” While Allegheny County has a nonprofit organization that will represent parents when they go to court in child protection matters, there is no public defender for those seeking to expunge their record. They must find an attorney willing to work for free or they have to represent themselves. If a “founded” or “indicated” ruling is not promptly expunged, the parent remains in the state abuse registry until the child who is the subject of the investigation turns 23.

The expungement process applies only to those reported to Pennsylvania’s ChildLine Abuse Registry for grievous neglect or abuse. Any allegations that involve “non-serious injury or neglect” are referred to General Protective Services (GPS). GPS data is kept in the Allegheny County DHS data warehouse indefinitely. So the multiple calls on Harriette, Angel’s feisty but mostly obedient daughter? There is no way to expunge them, even though they were clearly nuisance calls. When and if Harriette becomes a mom, she’ll start out with a higher AFST score, because she had interactions with the child protective system as a kid. The assumption might be that she had a bad mother, and so she had no mental model of how to parent, and the county needs to keep an eye on her. No one will know about the chalked stop sign on the sidewalk, the vocabulary games played on the living room floor, or the obvious pride that shines in Angel’s eyes when she looks at her daughter.

Marc Cherna and Erin Dalton argue that allowing parents to expunge hotline reports, no matter how spurious, would rob CYF of critical data they need to identify and prevent abuse. “The stuff stays in the system,” said Cherna. “A lot of times where there’s smoke there’s fire.” Dalton agreed. “I personally am sympathetic to the idea of redemption,” she said, “but getting rid of data that might predict abuse and neglect is like taking away the biggest tool we have in preventing future abuse.”

Amanda Green Hawkins is not convinced that data’s potential predictive power outweighs parents’ constitutional rights. “Everyone is entitled to due process in our system,” she said. “That process will determine whether or not [CYF is] able to keep a report on someone for the rest of their life. That no one should be entitled to due process to do anything about it? That runs afoul of our Constitution. That’s pitiful.”

*   *   *

Marc Cherna and his team hope that the AFST will provide better, more timely information to help target CYF interventions to the families who need them most. They see little downside to data collection because they understand the agency’s role as primarily supportive, not punitive. Even if a family is screened in for investigation, Cherna and Dalton explained to me, most will be offered services rather than have their children removed. But the social stigma that comes from being involved with CYF is significant, and the level of intrusion is intense.

Having your child-rearing choices constantly watched, monitored, and corrected can heighten parents’ perceptions that they are being targeted and trapped. “There’s so many women walking around here who don’t have their children,” said Carmen Alexander, senior operations manager of New Voices Pittsburgh, a grassroots organization dedicated to the complete well-being of Black women and girls. “It’s almost like you can’t even sneeze the wrong way around your children. You have to keep quiet. It builds a culture of distrust.”

When a CYF investigation is launched, parents have only two meaningful options: either resist the agency’s dictates and risk losing their children, or defer to the agency’s authority completely. Research by University of Denver sociologist Jennifer Reich shows that, like police officers, many child welfare caseworkers see resistance as an indicator of guilt. The risk/severity document that Pat Gordon showed me underscores her point. If a parent is “appropriately responsive to requirements” of CYF, “acknowledges problems,” and “initiates contact with Caseworker [to] seek additional services,” she is considered a minimal risk to her children. If she “actively resists any agency contact or involvement … will not permit investigation to occur” or “denies problems,” she is considered high risk. But a mother who is falsely accused of abuse or neglect may resist agency contact and involvement. And parents who fight for their children may also fight CYF.

“If we are painting with a really broad brush, there are two types of clients that come to my door. One comes in, gets in my face, yells at me, and tells me I’m part of the problem,” said Catherine Volponi. “The other comes in and assumes the position to be kicked again. I would much rather have the one who got in my face, because they are still in it. These are the people who will eventually prevail.”

Professional middle-class families reach out for support all the time: to therapists, private drug and alcohol rehabilitation, nannies, babysitters, afterschool programs, summer camps, tutors, and family doctors. But because it is all privately funded, none of those requests ends up in Allegheny County’s data warehouse. The same willingness to reach out for support by poor and working-class families, because they are asking for public resources, labels them a risk to their children in the AFST, even though CYF sees requesting resources as a positive attribute of parents.17 “If a mom has accessed county mental health services in the past, why does that hurt her? Or drug and alcohol services?” asked Pittsburgh civil rights attorney and Duquesne University law professor Tiffany Sizemore-Thompson. “Shouldn’t that show that she’s actually a responsible person who went and got services that she felt she needed?”

*   *   *

CYF-involved families acknowledge the fallibility of human decision-making. They understand perfectly well that the call screeners, caseworkers, administrators, and judges who decide who will be investigated, what kind of services they will receive, which children will be removed, and how quickly children in foster care are reunited with their birth families have biases that influence their work. Nevertheless, they’d rather have an imperfect person making decisions about their families than a flawless computer. “You can teach people how you want to be treated,” said Pamela Simmons, staffing the voter registration table across the street from the Kentucky Fried Chicken in Wilkinsburg. “They come with their own opinions but sometimes you can change their opinion. There’s opportunity to fix it with a person. You can’t fix that number.”

Human bias has been a problem in child welfare since the field’s inception. In its earliest days, Charles Loring Brace’s orphan trains carried away so many Catholic sons and daughters that the religious minority had to create an entirely parallel system of child welfare organizations. Scientific charity workers had religious biases that tended to skew their decision-making. They believed that the children of Protestants could be redeemed by their families, but Catholics were incorrigible and had to be sent to labor on (mostly Protestant) farms in the Midwest. Today, racial disproportionality shatters the bonds of too many Black and Native American families. Some of that disproportion can certainly be traced to human discretion in child welfare decision-making.

But human bias is a built-in feature of the predictive risk model, too.

The outcome variables are proxies for child harm; they don’t reflect actual neglect and abuse. The choice of proxy variables, even the choice to use proxies at all, reflects human discretion.

The predictive variables are drawn from a limited universe of data that includes only information on public resources. The choice to accept such limited data reflects the human discretion embedded in the model—and an assumption that middle-class families deserve more privacy than poor families.

The model’s validation data is a record of decisions made by human caseworkers, investigators, and judges, bearing all the traces of their humanity.

Once the big blue button is clicked and the AFST runs, it manifests a thousand invisible human choices. But it does so under a cloak of evidence-based objectivity and infallibility. Intake screeners reflect a variety of experiences and life paths, from the suburban white Penn State postgraduate to an African American Pittsburgh native, like Pat Gordon, with over a decade of experience. The automated discretion of predictive models is the discretion of the few. Human discretion is the discretion of the many. Flawed and fallible, yes. But also fixable.

Parents in Allegheny County helped me articulate an inchoate idea that had been echoing in my head since I started my research. In Indiana, Los Angeles, and Allegheny County, technologists and administrators explained to me that new high-tech tools in public services increase transparency and decrease discrimination. They claimed that there is no way to know what is going on in the head of a welfare caseworker, a homeless service provider, or an intake call screener without using big data to identify patterns in their decision-making.

I find the philosophy that sees human beings as unknowable black boxes and machines as transparent deeply troubling. It seems to me a worldview that surrenders any attempt at empathy and forecloses the possibility of ethical development. The presumption that human decision-making is opaque and inaccessible is an admission that we have abandoned a social commitment to try to understand each other. Poor and working-class people in Allegheny County want and deserve more: a recognition of their humanity, an understanding of their context, and the potential for connection and community.

“A computer is only what a person puts in it,” Janine reflected. “I trust the caseworker more.… You can talk, and be like, ‘You don’t see the bigger problems?’”

*   *   *

Like the Indiana automated eligibility system, the AFST interprets the use of public resources as a sign of weakness, deficiency, even villainy. Marc Cherna spent the greater part of his career creating a culture of strength-based practice, open community communication, and peer support in CYF. Unfortunately, he has commissioned an automated tool that sees parents using public programs as a danger to their children.

Targeting “high-risk” families might lead them to withdraw from networks that provide services, support, and community. According to the US Centers for Disease Control’s Division of Violence Prevention, the largest risk factors for the perpetration of child abuse and neglect include social isolation, material deprivation, and parenting stress, all of which increase when parents feel watched all the time, lose resources they need, suffer stigma, or are afraid to reach out to public programs for help. A horrible irony is that the AFST might create the very abuse it seeks to prevent.

It is difficult to say a predictive model works if it produces the outcome it is trying to measure. A family scored as high risk by the AFST will undergo more scrutiny than other families. Ordinary behaviors that might raise no eyebrows before a high AFST score become confirmation for the decision to screen them in for investigation. A parent is now more likely to be re-referred to a hotline because the neighbors saw child protective services at her door last week. Thanks in part to the higher risk score, the parent is targeted for more punitive treatment, must fulfill more agency expectations, and faces a tougher judge. If she loses her children, the risk model can claim another successful prediction.

*   *   *

The AFST went live on August 1, 2016, three and a half months before my visit with Pat Gordon. In the model’s first nine months, the intake center received more than 7,000 calls. Data released by the Office of Data Analysis, Research and Evaluation (DARE) in May 2017 show that slightly more calls (6 percent) were screened in for investigation by intake workers using the AFST than by those working without the model the previous year. However, the number of screened-in calls that went on to be investigated and substantiated jumped by nearly a quarter (22 percent). Calls scored more highly by the AFST, on average, were more likely to be substantiated: 48 percent of calls receiving an AFST score between 16 and 20, 43 percent of those between 11 and 15, 42 percent of those between 6 and 10, and 28 percent of those between 1 and 5. DARE’s admittedly preliminary analysis concludes that referrals scored more highly by intake screeners using the AFST were substantiated and accepted for services by child welfare investigators at higher rates. Because only intake screeners, not child welfare investigators, receive the AFST scores, DARE believes that these early results “perhaps validat[e] the real risk differences the tool has identified.”

But if you look closer at the data, some troubling idiosyncrasies emerge. Of the 333 calls the AFST scored above 20, thereby triggering a mandatory investigation, 94 (28 percent) were overridden by a manager and dismissed out of hand. Only half (51 percent) of the remaining mandatory screen-ins resulted in substantiation. In other words, only 37 percent of calls that triggered a mandatory investigation were found to have merit. And there are other discrepancies. Intake workers screened in about the same number of calls scoring 20 as they did calls scoring 12. Roughly the same number of 9’s were substantiated by later investigation as 19’s. That the number of screen-ins has not changed much while the number of substantiated investigations has risen could suggest that the AFST is simply modeling the agency’s own decision-making.
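The 37 percent figure follows from the other numbers in that paragraph, and the arithmetic is easy to retrace. The short Python sketch below uses only the counts and percentages quoted above. The variable names are illustrative, and the substantiated count is an estimate, because only a rounded percentage is reported.

```python
# Figures quoted in the paragraph above; names are illustrative only.
mandatory_calls = 333            # calls the AFST scored above 20
overridden = 94                  # overridden by a manager and dismissed
substantiated_share = 0.51       # share of the remaining calls that were substantiated

# Calls that actually went forward as mandatory screen-ins.
remaining = mandatory_calls - overridden

# Share of all mandatory calls that a manager overrode.
override_rate = overridden / mandatory_calls

# Estimated substantiated calls (an estimate: only a rounded percentage is given).
substantiated_estimate = substantiated_share * remaining

# Share of ALL mandatory calls that ended in substantiation.
overall_rate = substantiated_estimate / mandatory_calls

print(f"overridden: {override_rate:.0%}")
print(f"remaining screen-ins: {remaining}")
print(f"substantiated, of all mandatory calls: {overall_rate:.0%}")
```

Roughly half of the 239 calls that survived a manager’s override, measured against all 333 that triggered a mandatory investigation, comes to a little more than a third.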

A few days after I visited the intake call center, on November 29, 2016, the Vaithianathan team implemented a major data fix to the AFST. Twenty percent of the families reported to the hotline in the months after the AFST launched received no score. “We weren’t scoring cases where only the parents had human services experience,” said Erin Dalton. “The most vulnerable kids tend to be young; infants don’t have social services history. [The AFST was] not generating a score for these infants where I have Jack the Ripper for the father and his bride for the mother.” The updated model now evaluates the entire household—paramours, uncles, cousins, grandmothers, housemates, and every single child living together—and the AFST rating is based on the child who receives the highest score, even if she was not the child reported to the hotline. The AFST now produces a score for more than 90 percent of families reported to the hotline, and it is returning many more scores of 18 and above.
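The household rule described above, in which the rating follows the highest-scoring child, can be sketched in a few lines of Python. This is an illustration only, not the county’s code: the function name `household_rating`, the dictionary layout, and the example scores are all hypothetical, and nothing here says how any individual child’s score is computed.

```python
from typing import Mapping, Optional


def household_rating(child_scores: Mapping[str, Optional[int]]) -> Optional[int]:
    """Return the highest available child score in a household.

    Hypothetical sketch: a value of None means no score was generated
    for that child. If no child has a score, the household has none.
    """
    scored = [score for score in child_scores.values() if score is not None]
    return max(scored) if scored else None


# Made-up example: the child reported to the hotline scores lower than a sibling.
example = {"reported_child": 7, "sibling": 16, "infant": None}
print(household_rating(example))
```

Under a rule like this, a call about one child can be rated on the record attached to another child in the same home.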

In many ways, the AFST is the best-case scenario for predictive risk modeling in child welfare. The design of the tool was open, participatory, and transparent. Elsewhere, child welfare prediction systems have been designed and implemented by private companies with very little input or discussion from the public. Implementation in Allegheny County has been thoughtful and slow. The goals of the AFST are intentionally limited and modest. The tool is meant to support human decision-making, not replace it.

Nevertheless, Allegheny County’s experiment in predicting child maltreatment is worth watching with a skeptical eye. It is an early adopter in a nationwide algorithmic experiment in child welfare: similar systems have been implemented recently in Florida, Los Angeles, New York City, Oklahoma, and Oregon.

As this book goes to press, Cherna and Dalton continue to experiment with data analytics. The next iteration of the AFST will employ machine learning rather than traditional statistical modeling. They also plan to introduce a second predictive model, one that will not rely on reports to the hotline at all. Instead, the planned model “would be run on a daily or weekly basis on all babies born in Allegheny County the prior day or week,” according to a September 2017 email from Dalton. Running a model that relies on the public to make calls to a hotline does not capture the whole population of potential abusers and neglecters; at-birth models are much more accurate. But the primary goal is not to use a more precise model, insists Dalton. “We aren’t considering this because it is more accurate,” she wrote, “but because we have the potential to prevent abuse and neglect.”

Nevertheless, using a model to risk-rate every child born to families using county resources raises vexing questions about how the results will be used. “We have a home-visiting hotline and home-visiting services. If we have limited resources, do we prioritize higher-risk populations with those services?” asks Erin Dalton. “It feels to me like that might be ethical and there might be community acceptance for that sort of thing. Another step beyond that is, let’s say somebody walks into a family support center and requests services and wants to get engaged. Do you get a flag that doesn’t necessarily say high risk, but says something like ‘Really try to engage, keep them engaged?’” Marc Cherna insists that CYF is “not about to knock on your door and say ‘You’re at high risk of abusing your kid.’” But this is exactly how other risk models, such as the algorithm that produces the Chicago Police Department’s violent crime “heat list,” have been implemented.

Cherna’s administration wants to identify those families who could use help earlier, when interventions could make the most difference. But community members wonder if data collected with the best of intentions might be used against them in the future. “People have concerns about what happens when Marc and Erin leave,” said Laurie Mulvey from the Office of Child Development. The DHS held a series of meetings introducing local agencies, funders, and community members to the predictive model. At those meetings, explained Mulvey, people were saying, “We trust you, Erin. We trust you, Marc. What happens when you’re gone?”

Under the right conditions—fiscal austerity, a governor looking to downsize public agencies, or a rash of child deaths—the AFST could easily become a machine for automatically removing children from their homes. It wouldn’t even require reprogramming the model. Today, if a family’s risk score exceeds 20, CYF must open an investigation. Tomorrow, a score of 20 might trigger an emergency removal. Or a score of 10 … or of 5.

When I asked the AFST’s designer Rhema Vaithianathan if she worries about possible abuses of the model, she offered me a hypothetical solution. “The one thing that we could do is say [in our contract], ‘If we feel that it ever gets used unethically, we have the right to say something about that.’” But the assumption that academics speaking out against the way their research is used will have a significant impact on public policy or agency practice is naïve.

*   *   *

If a neighbor or an emergency room nurse calls the hotline about Angel and Patrick’s family again, they will undoubtedly receive a high AFST score. One of the children in the household is six. There are multiple caregivers, and while they are a tight-knit family, not all of them are biologically related. The household has a long history with public assistance. Angel is seeing a counselor and taking medication for PTSD. They have been involved with CYF for decades, though for the last nine years their relationship with CYF has largely consisted of their volunteer service and Angel requesting parenting classes, hands-on help, and respite care.

Near the end of our interview, Angel reflected on the double bind she faces. “I know I’m not the only one that has had positive experiences with CYF,” she said, “reaching out to them saying, ‘Hey, I need your help here.’ [But] I do have a history because of my daughter. I’ve also used county services. They would plug me high for that reason. [The AFST] would flag me big time.”

Patrick and Angel live in fear that there will be another call on their family and that the AFST will target their daughter or granddaughter for investigation, and possibly for removal to foster care. “My daughter is now nine,” said Angel, “and I’m still afraid that they are going to come up one day and see her out by herself, pick her up, and say, ‘You can’t have her anymore.’”