Benchmarking ordinances expose grade inflation in Energy Star scores

Many Gridium customers are proud of their Energy Star rating and believe it represents superlative efficiency. But new data sets are raising questions about the validity of the EPA’s building benchmarking system. In short, your Energy Star score probably doesn’t mean what you think it means.

Let’s start with the EPA’s definition of an Energy Star score: “The ENERGY STAR score, expressed as a number on a simple 1 – 100 scale, rates performance on a percentile basis: buildings with a score of 50 perform better than 50% of their peers; buildings earning a score of 75 or higher are in the top quartile of energy performance.”

“That’s the news from Lake Wobegon, where all the women are strong, all the men are good looking, and all the children are above average.”

So far, so simple. It turns out, though, that constructing this percentile index isn’t simple at all. In fact, the rating system is largely a black box. The EPA starts with a sample of building energy performance from the CBECS database. It then adjusts the data using statistical models that attempt to correct for factors such as building size, age of construction, occupancy, building type, computer load, etc. At the other end, an Energy Star score pops out. The precise methodology and input data set have never been exposed.

Until now. The very success of Energy Star has allowed intrepid outsiders to peer inside the black box. Building rating policies have spread like wildfire and now cover 51,000 commercial buildings and 5.8 billion square feet. That enormous data set allows smart scientists to reverse-engineer the percentile scale. What they’ve discovered raises troubling questions.

Consider the fundamental promise of Energy Star’s percentile index: your score indicates the percentage of buildings that consume more energy than yours, on an apples-to-apples basis. That is, if you’re a medical office, you’ll be compared to similarly sized medical offices in similar climates.

Here are the recent median Energy Star scores from city-wide disclosures:

  • New York: 70 (pdf)
  • Washington DC: 74 (pdf)
  • Philadelphia: 64 (pdf)
  • Chicago: 76 (pdf)

By definition, the median score across a large, representative sample of buildings should be close to 50, the (theoretical) halfway point on the index. It’s possible that every city that has reported numbers is above average (maybe all the bad buildings are in Kansas?), but it’s not terribly likely.
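To make the arithmetic concrete, here is a minimal Python sketch of what a percentile index implies. It assumes a plain rank-based score with none of the EPA's adjustments. The helper `percentile_scores` and the synthetic peer group are hypothetical, invented purely for illustration. Only the four city medians come from the disclosures listed above.

```python
import random
from bisect import bisect_right
from statistics import median

# Median scores as reported in the city disclosures listed above.
CITY_MEDIANS = {"New York": 70, "Washington DC": 74, "Philadelphia": 64, "Chicago": 76}


def percentile_scores(euis):
    """Score each building 0-100 by the share of its peers that use more energy.

    Hypothetical helper: a plain rank-based percentile, not the EPA's model.
    """
    ordered = sorted(euis)
    peers = len(ordered) - 1
    return [100 * (len(ordered) - bisect_right(ordered, eui)) / peers for eui in euis]


# Made-up peer group: 1,001 synthetic EUIs, NOT real building data.
rng = random.Random(0)
synthetic_euis = [rng.lognormvariate(4.0, 0.5) for _ in range(1001)]

# When a group is ranked against itself, the median score lands at the midpoint.
baseline_median = median(percentile_scores(synthetic_euis))

# How far each reported city median sits above the theoretical midpoint of 50.
excess = {city: score - 50 for city, score in CITY_MEDIANS.items()}

print(f"Median score in a self-ranked peer group: {baseline_median:.1f}")
for city, gap in excess.items():
    print(f"{city}: {gap:+d} points above the theoretical midpoint")
```

Whatever the shape of the underlying energy data, ranking a group against itself puts the median score at about 50. Every reported city median sits well above that, which is the gap this article is about.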

Part of the alchemy of Energy Star is “adjustments” that facilitate the aforementioned apples-to-apples comparisons. Unsurprisingly, this is a really tricky area. Every building is architecturally unique, sits on a unique site, and houses a unique set of occupants and activities. This graph from the excellent Philadelphia Sustainability office shows the relationship between Energy Star score and energy use intensity (that is, the energy used per square foot).

[Figure: The strained relationship between energy use and Energy Star score. Source: City of Philadelphia.]

You’d expect a strong inverse correlation between energy use intensity and Energy Star scores. While it’s true that every building is different, in general, across a large set of buildings, lower energy use per square foot should indicate greater energy efficiency. Instead, you see a massive spread. Even in the relatively homogeneous office category, we see buildings with EUIs from 2.5 to 100 kBtu/sq ft achieving high Energy Star ratings (75 and above), and non-rated buildings run a similar spectrum. Certainly there are good reasons why, in individual cases, high EUIs can go hand-in-hand with high efficiency. But the overall lack of a relationship is a red flag. If you’re the office building in this sample running 74 kBtu/sq ft with an Energy Star score of 34, you’re right to wonder if you’re being graded correctly.
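One way to put a number on "strong inverse correlation" is a rank (Spearman) correlation between EUI and score. The sketch below is a rough illustration, not the City of Philadelphia's analysis. The function names and both five-building samples are hypothetical values made up to show the two extremes. A real check would use the disclosed building records.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical EUIs in kBtu/sq ft (illustrative values, not disclosure data).
euis = [30, 45, 60, 74, 90]

# Scores that fall steadily as energy use rises: the relationship you'd expect.
expected_scores = [95, 80, 60, 34, 10]

# Scores with no clear ordering against EUI: the kind of spread described above.
scattered_scores = [40, 90, 20, 85, 60]

print(f"Expected pattern:  {spearman(euis, expected_scores):+.2f}")
print(f"Scattered pattern: {spearman(euis, scattered_scores):+.2f}")
```

A coefficient near -1 means scores fall reliably as energy use rises. A coefficient near zero means the score tells you little about energy use per square foot.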

Recent research into the models behind Energy Star from Professor John Scofield at Oberlin raises sharp challenges to the Energy Star methodology. In particular, Professor Scofield estimates that Energy Star scores are uncertain by about 35 points, that savings claims based on Energy Star scores are spurious, and that the true average Energy Star score is probably closer to 60 or 70.

For many experienced operators, these criticisms will ring true. Achieving an Energy Star score of 75 is not that hard, and just because you have a score of 90 doesn’t mean your energy management mission is complete. Finally, smart teams realize that Energy Star is just a number, and for many buildings the dollar costs of energy relative to local market norms are much more important.

Nevertheless, these issues go deep and to the core of the Energy Star rating system, and if you’ve led your company into Energy Star-based management, you should get smart about the underlying issues and your own likely grade inflation. I highly recommend John Scofield’s ACEEE paper (pdf) and accompanying YouTube video.

About Tom Arnold

Tom Arnold is co-founder and CEO of Gridium. Prior to Gridium, Tom was the Vice President of Energy Efficiency at EnerNOC and co-founder of TerraPass. Tom has an MBA from the Wharton School of Business at the University of Pennsylvania and a BA in Economics from Dartmouth College. When he isn't thinking about the future of buildings, he enjoys riding his bike and chasing after his two daughters.
