Peeling Back the Onion: Understanding What Goes into an ESG Rating

The divergence of environmental, social, and governance (ESG) ratings across providers is drawing increasing attention as regulators rely on the ratings for policymaking and investors rely on them for investment decisions. Here, the authors discuss at a granular level what goes into an ESG rating, particularly the modeling choices involved in its construction.

The authors provide a step-by-step illustration using companies in the global automobiles industry with “raw” ESG data from four leading ESG data providers. They discuss the differences in which metrics are measured, how they are measured, and how they are combined and aggregated. ESG ratings are complex, but complexity does not make them any less useful or informative. The authors highlight the rich set of information that underlies ESG ratings, which can be used by investors. They also stress the need for analyses using ESG ratings to acknowledge (at a minimum) and address (ideally) the differences between rating frameworks.

A version of this paper was published in The Journal of Impact and ESG Investing, Vol. 4, Issue 1, Fall 2023. Portfolio Management and Research,

ESG Ratings: A Subject Well-Trod, but Not Well Understood

Once an evolving concept, ESG ratings have become a term almost everyone, investors and non-investors alike, is familiar with. In the US, ESG ratings have come under scrutiny in recent years; Hester Peirce, commissioner at the Securities and Exchange Commission (SEC), was famously quoted in The Economist1 as decrying ESG ratings as “labelling based on incomplete information, public shaming, and shunning wrapped in moral rhetoric.” We believe the blowback against ESG ratings in recent years is at least partly a function of investors not fully understanding how they are constructed.

Whether it is because ESG ratings have complicated methodologies or because they tend to use a multitude of data inputs (more so, for instance, than typical income statement or balance sheet measures), most investors and casual observers of ESG investing cannot readily describe what goes into a rating.

In this article, we lay out what is actually involved in an ESG rating. We believe this is a critical first step for anyone trying to understand or analyze ESG ratings and their implications for financial modeling and investment applications. With the explosion of empirical analysis concerning ESG ratings, we fear that the underlying traits and dynamics of rating systems are not fully appreciated. Most concerningly, broad claims about ESG ratings and their relationship to returns, risk, or other security characteristics are often made based on a specific ESG rating system/provider.

ESG ratings and what goes into them reflect a rich set of information that is likely not yet fully utilized. In our view, understanding the differences between ESG rating systems, and thinking through how those differences might affect any empirical analysis, is a necessary first step for all ESG ratings research.

We are not the first to point out that ESG ratings differ. Chatterji, Durand, Levine, and Touboul were among the earliest researchers to document a lack of agreement; they analyzed six social rating data providers. More recently, Berg, Kölbel, and Rigobon show a low level of correlation among six ESG ratings providers, about 54% on average.3 Jacobs and Levy discuss how the rating disparities can make it difficult to assess whether ESG ratings are aligned with companies’ ESG performance and whether ESG investing affects investment performance.4

This piece is meant to complement this ongoing discussion about ESG ratings’ divergence by bringing intuition around what we mean when we talk about modeling choices made in building a rating. These include the choice of measurement, the weighting of metrics, and the normalization of metrics, among others. Overall, we aim to help further the discussion of ESG ratings by going under the hood of model decisions, illustrating in detail the different choices that vendors make and the impact of these modeling choices.
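To make these modeling choices concrete, here is a minimal Python sketch under stated assumptions. The three companies, their raw values, the two metrics, and the two "vendors" are all hypothetical and do not represent any provider's actual data or methodology. The sketch feeds the same raw data through two different normalization and weighting choices and compares the resulting rankings.

```python
from statistics import mean, pstdev

# Hypothetical raw data for three made-up companies (not real figures).
# Both metrics are oriented so that a LOWER raw value is better.
raw = {
    "AutoCo A": {"emissions_intensity": 120.0, "employee_turnover": 0.08},
    "AutoCo B": {"emissions_intensity": 150.0, "employee_turnover": 0.05},
    "AutoCo C": {"emissions_intensity": 110.0, "employee_turnover": 0.20},
}


def min_max(values):
    """Rescale to [0, 1], where 1 is the best (lowest) raw value."""
    lo, hi = min(values), max(values)
    return [(hi - v) / (hi - lo) for v in values]


def z_score(values):
    """Standardize against the peer group; positive means better than average."""
    mu, sigma = mean(values), pstdev(values)
    return [(mu - v) / sigma for v in values]


def rate(normalize, weights):
    """Normalize each metric across companies, then take a weighted sum."""
    names = list(raw)
    scores = dict.fromkeys(names, 0.0)
    for metric, weight in weights.items():
        normalized = normalize([raw[name][metric] for name in names])
        for name, value in zip(names, normalized):
            scores[name] += weight * value
    return scores


def ranking(scores):
    """Company names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical vendor 1: min-max normalization, emphasis on emissions.
vendor_1 = rate(min_max, {"emissions_intensity": 0.7, "employee_turnover": 0.3})
# Hypothetical vendor 2: z-score normalization, emphasis on turnover.
vendor_2 = rate(z_score, {"emissions_intensity": 0.3, "employee_turnover": 0.7})

print(ranking(vendor_1))
print(ranking(vendor_2))
```

In this toy setup the two vendors agree on the top company but swap the order of the other two, even though they start from identical raw data. The sketch changes the normalization and the weights at the same time, so it does not isolate the effect of either choice.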

An Abbreviated History: How ESG Ratings Came To Be and Their Importance Today

While ESG has only recently exploded into the mainstream lexicon, an earlier form was around as early as the 1960s, when socially responsible investing (SRI) began to gain popularity. Then, as is still largely true today, SRI meant selecting what to invest in based on a company’s social or environmental impact, in addition to, or sometimes irrespective of, its financial performance. Examples of early providers of SRI screens and data include KLD, founded in 1988, and Jantzi Research, founded in 1992. (The approaches to SRI created by KLD and Jantzi Research now underlie the rating frameworks for MSCI and Sustainalytics, respectively.) Early firms offering SRI assessments had a broad, loosely defined approach to evaluating firms. They analyzed the environmental and social “worthiness” of companies in how they conducted their business but also how they “impacted” the larger world around them. (The latter aspect, the notion of “impact investing,” eventually became its own concept, and today refers to investing in a way that creates positive environmental or social impact.) During the 2000s and 2010s, as SRI evolved into ESG investing, whether a firm was SRI/ESG “friendly” or not came down to individual analysts’ assessments of these firms, since hard data were extremely rare and difficult to come by.

Today’s ESG rating systems really came about in the 2010s, as computing power and available data (everything from company websites to employee reviews to business analytics) rapidly increased. Throughout these years, the amount and granularity of information that could be used for ESG assessments exploded. MSCI acquired RiskMetrics (KLD’s acquirer) in 2010 and GMI Ratings in 2014, revising and expanding its ESG ratings framework. Sustainalytics expanded its framework as well during this decade and was eventually acquired by Morningstar in 2020, where the framework now powers Morningstar ESG fund ratings. Other major vendors also invested heavily in ESG data and analytics, including S&P, which acquired the ESG division of RobecoSAM, an affiliate of Robeco, in 2019, and the London Stock Exchange, which consolidated various ESG ratings (FTSE, Beyond Ratings, and Refinitiv). Moody’s Vigeo Eiris, ISS, FactSet, and RepRisk round out the list of leading providers today.

Anatomy of an ESG Rating

Typically, ESG ratings are constructed as an amalgamation of raw data points. (We focus here only on ESG ratings that are “structured” ratings, as opposed to “unstructured” ratings, which can use artificial intelligence and machine learning techniques or other black-box algorithmic approaches to creating ratings.) Most structured ESG ratings are centered on the E, S, and G pillars of ESG; see Figure 1. Usually, the ESG rater starts by identifying broad themes, or subcategories, within E, S, and G (for instance, human capital management would be a subcategory within the “S” pillar). Then a range of raw metrics is identified to measure each subcategory (raw metrics for human capital management in Figure 1 might include employee satisfaction from surveys, employee turnover, paid leave policy, training opportunities, etc.). Note that some rating frameworks may have more than two layers, the implications of which we will discuss later.
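The layered structure described above can be sketched as a small tree that is rolled up with weighted averages. This is a minimal illustration rather than any provider's actual method. The weights, the scores (assumed to be already normalized to a 0 to 100 scale), and the E and G subcategories and metrics are hypothetical; only the human capital management metrics echo the examples in the text.

```python
# Each node is either a leaf score (a number) or a dict mapping a child's
# name to a (weight, child_node) pair. All weights and scores are made up.
rating_tree = {
    "E": (0.40, {
        "carbon_emissions": (1.0, {
            "emissions_intensity": (0.5, 60.0),
            "reduction_target": (0.5, 80.0),
        }),
    }),
    "S": (0.35, {
        "human_capital_management": (1.0, {
            "employee_satisfaction": (0.25, 70.0),
            "employee_turnover": (0.25, 50.0),
            "paid_leave_policy": (0.25, 90.0),
            "training_opportunities": (0.25, 70.0),
        }),
    }),
    "G": (0.25, {
        "board_structure": (1.0, {
            "board_independence": (1.0, 80.0),
        }),
    }),
}


def aggregate(node):
    """Weighted average of a node's children; a leaf returns its own score."""
    if isinstance(node, (int, float)):
        return float(node)
    total_weight = sum(weight for weight, _ in node.values())
    weighted_sum = sum(weight * aggregate(child) for weight, child in node.values())
    return weighted_sum / total_weight


# Pillar-level scores, then the overall rating.
pillar_scores = {name: aggregate(child) for name, (_, child) in rating_tree.items()}
overall = aggregate(rating_tree)

print(pillar_scores)
print(round(overall, 2))
```

Because `aggregate` is recursive, the same function handles frameworks with more than two layers: adding another level of nesting to `rating_tree` requires no change to the code.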

[Figure 1: image not reproduced]

The leading providers we analyze in this article (MSCI, Sustainalytics, Moody’s Vigeo Eiris, and ISS) all employ E, S, and G pillars. However, we note that not all ESG ratings use the E, S, and G pillars. SASB, for instance, groups activities into five “sustainability dimensions”: environment, human capital, social capital, business model and innovation, and leadership and governance. For the rest of this article, we will use the SASB categories, given SASB’s prominence in disclosure standards and the transparency of its framework. (In full disclosure, we centered on SASB to build R-factor™, an ESG rating tool used at State Street Global Advisors.)5 While SASB does not use E, S, and G pillars, all the concepts we describe here are directly portable to those ratings frameworks that do.

We start with the choice of metrics and how they are measured.

