With the growing adoption of large language models, such as OpenAI's ChatGPT, the enormous environmental footprint of artificial intelligence, or AI, is increasingly in the news. This cost is often discussed in terms of electricity consumed, carbon released into the atmosphere, and water needed to operate massive data centers.
But there is a hidden cost of AI that needs to be considered: the toll on public health associated with the resulting increase in air pollution. So says a new study by Caltech and UC Riverside (UCR) scientists published online December 9 on the arXiv preprint server.
This air pollution is expected to result in as many as 1,300 premature deaths per year by 2030 in the United States. Total public health costs from cancers, asthma, and other diseases, as well as from missed work and school days, are approaching an estimated $20 billion per year.
"When we talk about the costs of AI, there has been a lot of focus on measurements of things like carbon and water usage. And while those costs are really important, they are not what's going to impact the local communities where data centers are being built," says Adam Wierman, the Carl F Braun Professor of Computing and Mathematical Sciences and director of Information Science and Technology at Caltech, who is a corresponding author of the paper. "Health is a way of focusing on the local impact these data centers are having on their communities and understanding, quantifying, and managing those impacts, which are significant."
Wierman notes that AI is going to be a significant part of our lives, offering clear benefits that have the potential to improve societal systems. "At the same time," he says, "we need to make sure that we have our house in order, and that the negative impacts that come from it are recognized, quantified, minimized, and shared equitably."
The authors, including Shaolei Ren, a UCR associate professor of electrical and computer engineering, recommend that standards and methods be adopted that require tech companies to report the air pollution caused by their power consumption and backup generators.
The researchers further recommend that tech companies properly compensate the communities hit hardest by air pollution for the health burden caused by generating electricity for data-processing centers.
The authors also found that air pollution stemming from AI disproportionately affects certain low-income communities, partly because of their proximity to power plants or backup generators at the data-processing centers. Additionally, the pollution drifts across county and state lines, creating health impacts on communities far and wide, Ren says.
"The data centers pay local property taxes to the county where they operate," Ren says. "But this health impact is not just limited to a small community. Actually, it travels across the whole country, so those other places are not compensated at all."
For example, pollution from backup generators at data centers in Northern Virginia drifts into Maryland, West Virginia, Pennsylvania, New York, New Jersey, Delaware, and the District of Columbia, creating regional public health costs of some $190 million to $260 million a year. If these backup generators emit at their maximum permitted level, the annual costs will increase tenfold and reach $1.9 billion to $2.6 billion.
In some areas, the public health cost associated with AI processing centers may exceed what the tech companies pay for electricity, the study shows.
As tech companies race to provide AI services that are reshaping how we work and play, the resulting air pollution in the form of lung-penetrating fine particles (those smaller than 2.5 micrometers) and other federally regulated pollutants, such as nitrogen oxides, is expected to steeply increase. The public health burden is expected to be double that of the U.S. steel-making industry by 2030, when it could rival that of all the cars, buses, and trucks in California, the study projects.
"The growth of AI is driving an enormous increase in demand for data centers and energy, making it the fastest-growing sector for energy consumption across all industries," Ren says.
As an example, the scientists calculated the emissions from training a large language model, or LLM, at the scale of Meta's Llama-3.1, an advanced open-weight LLM released by the parent company of Facebook in July to compete with leading proprietary models like OpenAI's GPT-4. The researchers found that generating the electricity needed to train this model produced an air pollution equivalent of more than 10,000 round trips by car between Los Angeles and New York City.
The authors estimated the health costs, including premature deaths, using statistical methods developed by the U.S. Environmental Protection Agency; these methods account for known epidemiological risks associated with air pollution from power plants and backup diesel generators. The 1,300 expected annual deaths by 2030 represent the midpoint of a range between 940 and 1,590.
"If you have family members with asthma or other health conditions, the air pollution from these data centers could be affecting them right now. It's a public health issue we need to address urgently," Ren says.
Additional authors of "The Unpaid Toll: Quantifying the Public Health Impact of AI" are Yuelin Han, Zhifeng Wu, and Pengfei Li of UCR's Bourns College of Engineering.