Machine Learning can be used to improve energy use in cities

Drexel researchers use machine learning to predict Philadelphia's energy use.


The City of Philadelphia has set a goal of halving greenhouse gas (GHG) emissions from the built environment by 2030. Carbon reduction programs aimed at the commercial and residential building sectors are a top priority, as commercial buildings and facilities are the region’s single largest contributor to GHG emissions.

Researchers in Drexel University’s College of Engineering are attempting to forecast how energy use will change as neighborhoods change, using a machine learning model they developed.

In 2017, the city declared a target of becoming carbon neutral by 2050, to be reached in large part by reducing greenhouse gas emissions from building energy use, which accounted for approximately three-quarters of Philadelphia’s carbon footprint at the time. The problem for Philadelphia, one of the oldest cities in the nation, is that the huge variety of building styles rules out a one-size-fits-all approach to energy use.

But to accomplish this goal, it’s important to incorporate energy use forecasts into the zoning decisions that will shape future development, not just for new construction but also for buildings already in use.

Simi Hoque, Ph.D., a professor in the College of Engineering who led research into using machine learning for granular energy-use modeling recently published in the journal Energy & Buildings, said, “For Philadelphia in particular, neighborhoods vary so much from place to place in the prevalence of certain housing features and zoning types that it’s important to customize energy programs for each neighborhood, rather than trying to enact blanket policies for carbon reduction across the entire city or county.” 

Hoque’s team believes that existing machine learning programs, properly applied, can provide insight into how zoning decisions may affect future greenhouse gas emissions from buildings.

She said, “Right now, there is a huge volume of energy use data, but it’s often just too inconsistent and messy to be reasonably put to use. For example, one dataset corresponding to certain housing characteristics may have usable energy estimates. However, another dataset corresponding to socioeconomic features is missing too many values to be usable. Machine learning is well equipped to handle this challenge because they can iteratively learn and improve through the training process to reduce bias and variance despite these data limitations.”

The researchers developed a technique to extract information from the fragmented data by combining two machine learning programs: one that can extract patterns from huge amounts of data and use them to make predictions about future energy use, and a second that can pinpoint the details in the model that most likely had the greatest impact on the projections.

They used extensive commercial and residential energy-use data for Philadelphia from the 2015 U.S. Energy Information Administration’s Residential Energy Consumption Survey and Commercial Buildings Energy Consumption Survey, as well as the city’s demographic and socioeconomic information from the U.S. Census Bureau’s American Community Survey, to train a machine learning program called Extreme Gradient Boosting (XGBoost), which builds its predictions from an ensemble of decision trees.

The program learned enough from the data to correlate energy use for each home or structure with a long list of factors, including building density, the population in a specific area, building size, occupant count, and the number of days heating or cooling was used.
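The idea behind gradient boosting can be illustrated with a minimal sketch. This is plain gradient boosting with one-split decision trees ("stumps") on squared error, not XGBoost itself, which adds regularization and other refinements. The feature columns, the six sample buildings, and the energy figures are hypothetical and are not taken from the study.

```python
def fit_stump(rows, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put data on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted(rows, targets, rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to what is still unexplained."""
    base = sum(targets) / len(targets)
    preds = [base] * len(targets)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(targets, preds)]
        j, threshold, left_mean, right_mean = fit_stump(rows, residuals)
        stumps.append((j, threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if row[j] <= threshold else right_mean)
                 for p, row in zip(preds, rows)]
    return base, learning_rate, stumps


def predict(model, row):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lm if row[j] <= t else rm)
                      for j, t, lm, rm in stumps)


def mean_squared_error(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical buildings: [square feet, occupants] -> energy use (arbitrary units).
rows = [[800, 1], [1200, 2], [1500, 3], [2000, 4], [2600, 4], [3200, 5]]
energy = [40, 62, 80, 105, 130, 160]

model = fit_boosted(rows, energy)
mean_energy = sum(energy) / len(energy)
baseline_mse = mean_squared_error([mean_energy] * len(energy), energy)
boosted_mse = mean_squared_error([predict(model, row) for row in rows], energy)
```

Each round corrects part of the error left by the previous rounds, which is why the boosted model's error on this toy data ends up below that of simply predicting the average.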

Although machine learning models like XGBoost are excellent for producing accurate forecasts, the intricacy of their inner workings can make it difficult to understand how they arrive at those forecasts.

To decode the so-called “black box” program’s estimates and recommendations, the researchers used a Shapley additive explanations (SHAP) analysis, a method drawn from game theory that distributes credit among the factors that contributed to an outcome.

They were able to determine how much a change in building density or square footage, for instance, affected the projection made by the program.
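How Shapley values split credit among inputs can be sketched with an exact calculation on a toy model. This is an illustration of the general technique only: the `toy_model` function, its coefficients, and the two feature names are hypothetical stand-ins, and the study's own analysis was run on its trained XGBoost model rather than on code like this.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exactly attribute predict(instance) - predict(baseline) to each feature.

    Features outside a coalition are held at their baseline value.
    """
    names = list(instance)
    n = len(names)
    values = {}
    for target in names:
        others = [name for name in names if name != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {name: (instance[name] if name in coalition else baseline[name])
                           for name in names}
                with_target = dict(without)
                with_target[target] = instance[target]
                # Marginal effect of switching the target feature on.
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


def toy_model(x):
    """Hypothetical stand-in for a trained model (energy use, arbitrary units)."""
    return (0.05 * x["square_feet"] + 2.0 * x["occupants"]
            + 0.001 * x["square_feet"] * x["occupants"])


baseline = {"square_feet": 1000.0, "occupants": 2.0}  # reference building
building = {"square_feet": 3000.0, "occupants": 4.0}  # building being explained

phi = shapley_values(toy_model, building, baseline)
# phi["square_feet"] is about 106 and phi["occupants"] about 8; together they
# account for the full difference of 114 between the two predictions.
```

The attributions always add up to the gap between the two predictions, so each value reads as that feature's share of the change in the model's output, not as a measured effect on real-world energy use.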

Hoque said, “Machine learning models like XGBoost learn how to chug through datasets to fulfill a specific task, like generating a reliable forecast of a system, but they do not claim to understand or represent the on-the-ground relationships that underlie a phenomenon. And while a Shapley analysis cannot tell us which features have the greatest impact on energy use, it can explain which features had the greatest impact on the model’s energy use prediction, which is still quite a useful piece of information.”

The team then tested the model by feeding it data from a hypothetical scenario provided by the Delaware Valley Regional Planning Commission, which forecast continued economic development in Philadelphia through 2045.

The scenario predicted a 17% rise in population with a corresponding increase in households and a variety of work and income opportunities by region around the city. 

The model projected how future residential and commercial development under this scenario would affect greenhouse gas emissions from building energy use in 11 distinct sections of the city, and which variables played essential roles in creating the projections.

In the 2045 scenario, six of the 11 areas, largely lower-income ones, would reduce their energy usage. Mixed-income regions, such as the city’s northernmost half, including Oak Lane, would likely increase their energy use.

The presence of single-family attached (lower energy use) versus detached (higher energy use) dwellings played an important role in the Shapley analysis, with high monthly electricity costs, lot sizes of less than one acre, and a lower number of rooms per building all contributing to lower energy use projections.

Overall, the residential energy prediction model tied characteristics associated with lower construction intensity to lower energy consumption predictions.

They wrote, “These results give reason to reinvestigate the effects of upzoning policies, commonly presented as an affordable housing solution in Philadelphia and other cities across the U.S., and subsequent changes in energy use for these areas.”

For commercial buildings, the machine learning model predicted little change in energy use under 2045 conditions, and the Shapley analysis identified building square footage and employee count as the most important predictors of energy use for most building types.

Hoque said, “I see much potential in using machine learning models like XGBoost to forecast energy use increases or decreases due to new construction projects or policy changes. For example, building a new rail line in a neighborhood may change the demographics and employment of a neighborhood. Our methods would be ideal for incorporating that information in the context of an energy prediction model.”

The team acknowledges that further testing is required and expects the program to improve as more data becomes available. They propose that the research continue by focusing on parts of the city with known high energy use and performing a Shapley analysis to identify some of the elements that may be contributing to it.

The researcher said, “We hope this will provide a resource for future researchers and policymakers so they don’t have to scope through the entire city of Philadelphia but can hone in on neighborhoods and variables which we have flagged as areas of potential importance. Ideally, future studies would use more interpretable methods to test whether these features correspond to higher or lower energy estimates in a given area.”

According to the study, commercial buildings in the top quantiles of square footage and employee count should be the primary targets for energy reduction programs, with buildings above an approximate threshold of 10,000 square feet of total building area prioritized because of their disproportionate influence on the model’s energy prediction.

The researchers caution against assuming a direct relationship between variables and changes in energy use in the model. However, they believe it is still quite helpful because of its ability to provide planners with both a high-level and granular look at the interplay of zoning decisions and development and their effect on energy use.

Journal Reference:

  1. Shideh Shams Amiri, S., Mueller, et al. Investigating the application of a commercial and residential energy consumption prediction model for urban planning scenarios with machine learning and Shapley additive explanation methods. Energy and Buildings. DOI: 10.1016/j.enbuild.2023.112965

