How U-M data ‘nerds’ helped Flint find homes with dangerous lead pipes
In early 2016, months after Flint’s water crisis created international headlines, a group of University of Michigan professors and students realized no one had answered some of residents’ biggest questions.
Among them: How many of Flint’s homes and buildings tapped lead-tainted water? And where were they?
The academics searched for answers alongside city officials. The city’s water testing data covered just a third of Flint’s nearly 33,000 occupied structures at the time. And more daunting, investigators knew almost nothing about which structures were linked to pipes that were more susceptible to problems.
Knowing which homes had lead or galvanized steel pipelines that may have corroded and leached lead into Flint’s drinking water — as opposed to safer copper lines — would shed light on which homes to dig into, and which to ignore.
But that wasn’t simple, either. Records of Flint’s buried pipelines were a scattered mess, and the city had neither the time nor the millions of dollars to blindly excavate beneath each home to examine its pipelines. There had to be another way to unravel the riddle, the researchers figured.
It involved statistics.
Using machine learning, any data they could scrounge, and a $150,000 grant from Google, researchers constructed a model that could predict lead risks from factors such as a building’s age, size, location and value. The researchers teamed with more experts at UM-Flint to build a web application for residents wanting to see the risk in their homes, all at no cost to the city.
In 2017, the university team’s research revealed more than 20,000 structures that likely had unsafe lead or galvanized steel service lines — far above the city’s original estimate of 15,000.
Through 2017, the number crunching played a vital role in Flint’s ongoing recovery, helping to save money as the city pinpoints and replaces hazardous lines. Expected to last into 2020, the project is drawing up to $97 million in state funds and federal funds awarded last year.
Had Flint used the algorithm from the project’s start, it could have saved up to $11 million in avoided inspection and excavation costs throughout the project, researchers wrote in a paper they presented last week at a data conference in London.
“We were just a bunch of nerds who cared a lot about data...and we saw a crisis here,” said Eric Schwartz, a professor of marketing at the University of Michigan’s Ross School of Business and a co-adviser on the Michigan Data Science Team.
Schwartz, computer science professor Jacob Abernethy and the students earned a “best student paper” award at the Association for Computing Machinery’s KDD Conference in London. They see their effort as a blueprint for other Michigan cities that may be required to swap out lead service lines in the coming years.
Across the state, Michigan has roughly 460,000 lead service lines, according to a 2016 American Water Works Association survey. The cost to replace each line typically runs about $5,000.
National Guard Brig. Gen. Michael McDaniel, who led Flint’s pipeline replacement efforts from 2016 to 2017, agrees other cities could use the researchers’ algorithm.
“Absolutely, I hope they do” use it, he told Bridge.
Even so, and for reasons that remain unknown, Flint stopped seeking the university data team’s pro bono expertise in 2018 after signing a $5 million contract to hire a private engineering firm — Los Angeles-based AECOM — to coordinate pipeline replacement efforts.
Though U-M researchers offered their data and capabilities to AECOM, the company does not appear to be using the statistical model, according to the city’s court filings in a federal lawsuit relating to service line replacements.
Now, the academics say they aren’t sure how their work in Flint is being used.
“We provided the city with predictions and recommended addresses to visit to help them get the biggest bang for their buck,” Schwartz said. “We hoped they’d continue to use the work and collaborate to update the model and improve the predictions.”
Neither AECOM nor the city responded to detailed questions from Bridge Magazine.
‘Groping around in the dark’
The chain of blunders that caused lead levels in Flint’s drinking water to skyrocket is now well known.
Led by an emergency manager appointed by Michigan Gov. Rick Snyder, the city switched drinking water sources from Detroit’s system to the Flint River in 2014 in an attempt to cut costs. The Michigan Department of Environmental Quality approved the change – but didn’t require treatment to control corrosion of aging water mains. Highly corrosive river water then rusted the mains, causing lead to leach into drinking water.
DEQ regulators first ignored the problem, then tried to discredit whistleblowers before the episode swirled into a public health, social justice and political crisis.
Drawing less attention, Schwartz said, was Flint’s information crisis. Even after the city switched back to Detroit water and began replacing its lead and galvanized steel pipes, many Flint neighborhoods were in the dark about their lead risks.
McDaniel, a former assistant adjutant general for Homeland Security for the Michigan National Guard, recalled it well. In early 2016, Mayor Karen Weaver appointed him to coordinate the city’s pipeline replacement project, called the Flint Action and Sustainability Team, or FAST. The team felt serious political pressure while trying to figure out where to start digging, but it lacked accurate records to guide it.
The city found some pipeline records, but they spanned more than 100,000 index cards — many tough to decipher — in the Flint water department's basement. It also found a hand-drawn map last updated in the 1980s.
McDaniel wasn’t initially sure how much funding FAST would get; this was before a 2017 settlement between the city, state and citizens group guaranteed up to $97 million in state funds for pipe replacements.
“We were groping around in the dark,” McDaniel said.
That changed, he said, after Schwartz and Abernethy, a former U-M prof who now teaches at the Georgia Institute of Technology, arrived with their engineering students.
“The more data we fed them, the more accurate they could be in determining what neighborhoods and streets we should try,” McDaniel said.
When the data team first contacted Flint, the city had signed a contract to replace lead lines at just 36 homes. The city has since replaced nearly 7,000 service lines, according to city data current as of Aug. 16, and estimates about 12,000 remain to be replaced, though that number is now a matter of dispute.
Without solid pipe records, cities are left to flag lead pipelines in two ways.
One option is to excavate, something contractors must do anyway to replace a line. But what if a crew claws through the pavement only to find pipelines made from copper or another safe material? That’s like throwing thousands of dollars, along with time and resources, down a hole.
A cheaper option emerged during Flint’s replacement efforts: hydrovac trucks, which can inspect pipes by quickly digging a precise hole through the soil. But even those inspections run about $250 a pop, and they might not always be necessary.
“That became our job,” Schwartz said. “How can we help the team minimize the waste and excessive spending so that they can get bigger bang for their buck?”
By spitting out how likely it was that a neighborhood sat atop lead lines, the algorithmic technique, dubbed “ActiveRemediation,” helped Flint determine where to excavate or hydrovac, and which areas to place at the bottom of a priority list.
The technique can lower the costly “error rate” — homes selected for pipeline replacement that turn out to have safe pipelines — from nearly 19 percent to just 2 percent, according to the data team’s paper. That can shave nearly 11 percent off a project’s costs, or up to $11 million in Flint’s case.
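The arithmetic behind that "error rate" can be sketched in a few lines of Python. This is only an illustration, not the researchers' calculation: the number of homes selected and the cost of a wasted visit are hypothetical placeholders, and the sketch does not attempt to reproduce the paper's $11 million figure. Only the two error rates (rounded to 19 and 2 percent) come from the article.

```python
def wasted_visits(homes_selected: int, error_rate_pct: int) -> float:
    """Expected number of selected homes that turn out to have safe pipes."""
    return homes_selected * error_rate_pct / 100


def wasted_cost(homes_selected: int, error_rate_pct: int,
                cost_per_wasted_visit: float) -> float:
    """Expected money spent on visits that find no hazardous line."""
    return wasted_visits(homes_selected, error_rate_pct) * cost_per_wasted_visit


# Hypothetical inputs chosen for illustration; neither is from the paper.
HOMES_SELECTED = 1000
COST_PER_WASTED_VISIT = 3000.0

# Error rates reported in the article, rounded to whole percents.
cost_before = wasted_cost(HOMES_SELECTED, 19, COST_PER_WASTED_VISIT)
cost_after = wasted_cost(HOMES_SELECTED, 2, COST_PER_WASTED_VISIT)

print(f"Wasted spending at 19% error rate: ${cost_before:,.0f}")
print(f"Wasted spending at 2% error rate:  ${cost_after:,.0f}")
print(f"Difference:                        ${cost_before - cost_after:,.0f}")
```

Under these assumed inputs, nearly all of the money lost to unnecessary digs disappears; the real savings depend on actual per-visit costs and how many homes a city selects.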
How it works
The algorithm can draw on all sorts of information known about a city. In Flint, that included the limited pipeline information from the basement-dwelling index cards. (Captricity, a software company, helped digitize some of the data.)
Also feeding the algorithm in Flint: data on nearly 56,000 housing parcels in the city, including addresses, property values and building characteristics like age. Knowing the age and value and location of Flint’s homes proved particularly revealing.
“For instance, homes built during and before World War II and those that are lower in value are more likely to contain lead in their public service line,” the data team wrote in its study.
The algorithm cross-referenced these records with the partial pipeline and water records to generate lead probabilities throughout the city. It grew more accurate as the city conducted more inspections. From the field, contractors feed new information into the database — using an app built by the data researchers.
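The general idea of predicting from parcel records and then refitting as inspections arrive can be sketched as follows. This is a minimal sketch, not the U-M team's model: it swaps in a plain logistic regression fitted by gradient descent, uses only two features (year built and property value), and runs on a handful of made-up parcels. Every name and number in it is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(year_built, value_usd):
    # Rough hand scaling so both inputs are of order 1 (an assumption).
    return [(year_built - 1950) / 20.0, (value_usd - 40000) / 20000.0]


def fit(inspections, steps=2000, lr=0.1):
    """Fit logistic regression by gradient descent.

    inspections: list of (year_built, value_usd, has_lead) tuples.
    Returns (weights, bias).
    """
    rows = [features(y, v) for y, v, _ in inspections]
    labels = [lead for _, _, lead in inspections]
    weights, bias = [0.0, 0.0], 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, label in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - label
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def lead_probability(model, year_built, value_usd):
    """Predicted probability that a parcel has a hazardous service line."""
    weights, bias = model
    x = features(year_built, value_usd)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up inspection results: (year built, value in dollars, 1 = lead found).
inspected = [
    (1925, 20000, 1), (1935, 25000, 1), (1940, 30000, 1), (1948, 35000, 1),
    (1958, 28000, 1), (1945, 55000, 0), (1955, 45000, 0), (1965, 60000, 0),
    (1975, 70000, 0), (1990, 80000, 0),
]

model = fit(inspected)
p_old_cheap = lead_probability(model, 1930, 22000)
p_new_pricey = lead_probability(model, 1980, 75000)
print(f"Older, lower-value home:  {p_old_cheap:.2f}")
print(f"Newer, higher-value home: {p_new_pricey:.2f}")

# A contractor reports one more inspection from the field; refit with it.
inspected.append((1952, 32000, 1))
model = fit(inspected)
```

In this toy data, older and cheaper homes come out with a higher predicted probability, mirroring the pattern the researchers describe, and each new inspection becomes one more training example the next time the model is fitted.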
It’s a process that could hold promise for other cities where old service lines are also drawing scrutiny.
A model for other cities?
In June, Michigan began enforcing a slate of new regulations aimed at reducing lead risks in drinking water across the state. Beginning in 2021, some utilities must start replacing all of their lead service lines within 20 years. Some utility officials balk at the cost and complicated logistics.
Schwartz says he hopes more cities will use similar math to steer replacement efforts. He said he’s discussed the algorithm with officials in Detroit and cities in Missouri, New York and Pennsylvania.
“Most cities don’t know how many lead pipes they have, and cities don’t know where most of their lead pipes are,” Schwartz said.
But Flint’s pipeline project isn’t controversy-free.
Last week, attorneys for citizens groups that sued Flint and Michigan over the city’s pipelines returned to a federal courthouse in Detroit. They argued Flint wasn’t carrying out its end of the settlement that earmarked $97 million in state funds for pipeline replacements.
Among the plaintiffs’ allegations: that Flint hasn’t properly explained why it believes no more than 18,000 lines, in a city with 28,400 occupied homes, needed replacing in March 2017, when the parties forged the settlement.
Flint can trigger that state funding if an analysis concludes more than 18,000 lines need replacing. But the city can’t recoup additional funding if it discovers the higher number further along in the project, the plaintiffs, represented by the American Civil Liberties Union and the Natural Resources Defense Council, wrote in a legal brief.
In a sworn affidavit, Alan Wong, the Flint program manager for AECOM, the engineering firm, says data from excavations in 2018 support Flint’s projection that it will see fewer lead lines as it gets into the newer parts of the city — and other calculations that give the city no reason to believe more than 18,000 lines needed replacing.
Wong wrote that the plaintiffs’ suggestion that Flint develop a “sound predictive model” estimating lead service lines could cost up to $350,000 in consulting fees that the city doesn’t have. Wong also said he was “not aware [of] any algorithm that has been developed to reliably predict the composition of buried infrastructure that has been installed over 5 to 10 decades.”
Wong’s statement surprised Schwartz of U-M, who learned of the filing from Bridge.
The researchers’ algorithm — if it were still being used — was designed and delivered to answer those exact questions, Schwartz said, and its accuracy has been statistically validated.
Curiously, an appendix in Wong’s declaration included a map that was based on the U-M researchers’ predictions from December 2017.
“We’d love to update those predictions to help them as they're spending the rest of the $100 million on a tight schedule to remove thousands more lead service lines,” Schwartz said.
It’s not clear why Flint and AECOM aren’t erring on the side of estimating too many pipelines in need of replacement, rather than too few. A broader estimate would ensure the city more state funds.
Reached last week, Wong told Bridge to email him questions, which he also had to run by spokespeople at AECOM and Flint. Neither the company nor the city has yet responded to repeated messages from Bridge.