Data Strategist Corner: Leveraging AI to Solve the ESG Challenge

By Christina Ho and Trent Bradberry

Peter Drucker, who was recognized as the world’s greatest management consultant, famously said, “If you can’t measure it, you can’t improve it.” This quote is directly relevant to the Environmental, Social, and Corporate Governance (ESG) challenge.

The History of ESG

ESG refers to the non-financial factors used to assess the sustainability and societal impact of an investment. It represents a fundamental shift from the 20th to the 21st century in how social responsibility is perceived to affect a company’s financial performance.

Historically, and for most of the 20th century, the prevailing theory was that social responsibility negatively affects a company’s financial performance. American economist Milton Friedman, who won the Nobel Prize in 1976, was the leading voice of the shareholder (or stockholder) theory, which holds that a company’s sole responsibility is to its shareholders.

Toward the end of the 20th century, the notion of the “triple bottom line,” coined by author and serial entrepreneur John Elkington, began to emerge and gain momentum. Under Elkington’s framework, a company’s value is determined by social and environmental factors in addition to the traditional financial factors.

At the beginning of the millennium, in response to the growing ESG investment market, some institutions began to develop related products and services. Research was also being published to challenge the assumption that social responsibility was a cost that yielded no financial returns.

In 2006, the United Nations launched the Principles for Responsible Investment (PRI), which is dedicated to achieving a sustainable global financial system for long-term value creation. Since then, the number of signatories has grown to 3,000, representing over $100 trillion in assets under management.

ESG Reckoning

Unfortunately, this commitment has not resulted in substantive improvement in responsible investing. In a recent Harvard Business Review blog post, “An ESG Reckoning Is Coming,” authors Michael O’Leary and Warren Valdmanis offered the following critical assessment:

According to research last year, investors who signed onto the United Nations principles did not improve the social and environmental performance of their investments. According to the researchers, signatories “use the PRI status to attract capital without making notable changes to ESG.” Similarly, signatories to the Business Roundtable statement have performed no better than other companies in protecting jobs and worker safety during the pandemic.

The authors suggested that standardized and required public disclosures would lay the foundation for further accountability. Currently, the Securities and Exchange Commission (SEC) does not mandate ESG reporting for U.S. companies. As a result, performance metrics are neither standardized nor comparable across companies. The SEC has been under pressure from the investor community to require ESG disclosures.

On February 24, 2021, Acting SEC Chair Allison Herren Lee announced that the agency is taking action to update its 2010 guidelines on climate-related disclosures in public company filings. In subsequent weeks, the SEC also announced a series of other ESG-related priorities and changes. These steps are clear signals that changes are coming.

Possible Solutions

The ultimate challenge of ESG is summarized in Dr. Drucker’s quote at the beginning of this post. Although companies already provide a significant amount of information (and data), investors lack the tools to efficiently extract reliable and comparable insights for measuring a company’s ESG performance.

In the sections below, we explore how natural language processing (NLP) can help solve this daunting problem from both an investor and a regulator perspective.

Tackling the Lack of Transparency in ESG Reporting

One way to promote accountability through greater transparency of a company’s ESG status is by revealing relevant insights through analytical methods aimed at processing descriptive, natural language from documents about a company. Over the past decade, NLP has made great strides within the domains of Artificial Intelligence and Machine Learning. In short, NLP uses computer algorithms to efficiently manipulate, summarize, classify, and extract information from text and speech. Applying NLP to text containing ESG information can offer descriptive insights for making informed decisions about a company’s sustainability, and it can do so at a faster rate than a human analyst.

In the domain of ESG, textual sources are available along a continuum of transparency. Three sources of ESG text ordered from least to most transparent are:

Self-Reported Sustainability Assessments

A company’s sustainability assessment is often published on its website. While the report can provide detailed ESG-related information, it can also be biased because it is voluntary and unregulated. Companies may avoid reporting negative information.

Nonetheless, NLP can be used to automate tasks such as summarization, keyword extraction, topic modeling, document classification, and sentiment analysis. These techniques are especially useful when one has sustainability assessments from multiple companies and wants to process them together to reveal trends and outliers. For example, topic modeling could reveal that most companies report on topics related to COVID-19 and employee health, yet a few companies avoid reporting on this topic. Text summarization could then be applied to quickly assess these outliers’ overall messaging.
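The outlier scenario above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real topic model: a hand-written keyword lexicon stands in for the topics that a method such as latent Dirichlet allocation would learn from the data. The company names, report snippets, `TOPIC_KEYWORDS` lexicon, and function names are all hypothetical.

```python
import re

# Hypothetical topic lexicon; a real topic model would learn topics from the corpus.
TOPIC_KEYWORDS = {
    "employee_health": {"covid", "pandemic", "health", "safety", "wellbeing"},
    "emissions": {"emissions", "carbon", "energy", "climate"},
}

# Hypothetical report snippets, one per company.
REPORTS = {
    "Company A": "We expanded health benefits and safety protocols during the "
                 "COVID pandemic and cut carbon emissions.",
    "Company B": "Our pandemic response protected employee health. "
                 "Energy use and emissions fell.",
    "Company C": "We reduced carbon emissions and invested in renewable energy.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def topic_coverage(text):
    """Return the set of topics that have at least one keyword in the text."""
    tokens = set(tokenize(text))
    return {topic for topic, words in TOPIC_KEYWORDS.items() if tokens & words}


def find_outliers(reports, min_share=0.5):
    """Flag companies that skip a topic covered by more than min_share of companies."""
    coverage = {name: topic_coverage(text) for name, text in reports.items()}
    outliers = {}
    for topic in TOPIC_KEYWORDS:
        covering = [name for name, topics in coverage.items() if topic in topics]
        if len(covering) / len(reports) > min_share:
            missing = sorted(set(reports) - set(covering))
            if missing:
                outliers[topic] = missing
    return outliers


print(find_outliers(REPORTS))
```

With these made-up snippets, only Company C is flagged, because it is silent on employee health while the other two companies cover it. In practice the lexicon would be replaced by learned topics, and the flagged reports would be passed on to a summarization step.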

SEC-Related Documents

SEC-related documents are, in theory, more transparent and regulated. Required Form 10-Ks may contain ESG information that a company discloses to the SEC. A company’s annual meeting proxy statements, also required by the SEC, are another good source of potential ESG disclosures. In addition, shareholder resolutions can be a valuable source of information on the types of ESG issues raised by shareholders.

As with self-reported sustainability assessments, NLP techniques such as topic modeling and text summarization can be used to extract insight from these documents. In fact, recent NLP work on shareholder resolutions has surfaced topics related to emissions and energy, boards, regulations, politics, governance, product management, and accountability.

While these SEC-related text sources are typically more transparent and reliable than self-reported sustainability reports, there is increased scrutiny around them, leading the SEC to recently announce the formation of an ESG task force to consider a more well-defined regulatory framework.

News Articles

The most transparent text source is news articles. Not only is news more transparent, but the frequency at which timely information is provided allows NLP to be used to monitor ESG-related events as they are reported. In addition to applying NLP to news articles in the manner described in the previous two sections, two reported examples exist where state-of-the-art techniques for language understanding were applied to automatically classify news articles into at least 20 ESG-related categories, such as physical impacts of climate change, employee diversity and inclusion, and business ethics (Mukherjee, 2020; Nugent et al., 2020).

Elder Research has developed a proof-of-concept called the News Analyzer that scrapes news articles from the Internet and uses these NLP technologies to filter them by relevance and sentiment before displaying them to users. This technology could be extended into the ESG domain to offer clients informative monitoring of breaking ESG-related news about companies.
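To make the idea of filtering by relevance and sentiment concrete, here is a minimal sketch. It does not describe how the News Analyzer is actually implemented: it assumes simple word lists in place of trained models, and the lexicons, headlines, company name, and function names are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicons; a production system would typically use trained models.
ESG_TERMS = {"emissions", "diversity", "governance", "pollution", "safety", "climate"}
POSITIVE = {"improved", "award", "reduced", "praised"}
NEGATIVE = {"fined", "violation", "lawsuit", "spill"}


@dataclass
class ScoredArticle:
    headline: str
    relevance: int  # number of distinct ESG terms found
    sentiment: int  # positive word count minus negative word count


def score(headline):
    """Score one headline for ESG relevance and sentiment using the word lists."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    relevance = len(ESG_TERMS & set(tokens))
    sentiment = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return ScoredArticle(headline, relevance, sentiment)


def filter_articles(headlines, min_relevance=1, max_sentiment=None):
    """Keep ESG-relevant headlines, optionally only those at or below max_sentiment,
    sorted from most negative to most positive."""
    kept = [a for a in map(score, headlines) if a.relevance >= min_relevance]
    if max_sentiment is not None:
        kept = [a for a in kept if a.sentiment <= max_sentiment]
    return sorted(kept, key=lambda a: a.sentiment)


# Hypothetical headlines about a fictional company.
HEADLINES = [
    "Acme fined for pollution violation at river plant",
    "Acme praised after it reduced emissions",
    "Acme launches new smartphone",
]

for article in filter_articles(HEADLINES):
    print(article.sentiment, article.headline)
```

The smartphone headline is dropped as irrelevant, and the remaining two are listed with the negative story first. Passing `max_sentiment=-1` would keep only adverse ESG news, which is the kind of alerting a monitoring tool might offer.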

ESG Regulation Development

In addition to deriving insights about companies to monitor their ESG status, NLP could aid in the development of disclosure standards. These NLP techniques can reveal ways in which companies may evade disclosure of adverse behavior.

Consider a scenario where text classifiers are leveraged in a process to collect news articles about companies’ environmental violations. NLP can be used to compare these companies’ SEC-related documents and self-reported sustainability assessments against a group of non-offenders’ documents to reveal distinguishing characteristics that could inform new regulations to prevent future circumvention of the system.
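One simple way to surface such distinguishing characteristics is to rank terms by how much more often they appear in offenders’ documents than in non-offenders’ documents. The sketch below uses a smoothed log-ratio of term frequencies; this is one illustrative choice among many, and the document snippets and function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(docs):
    """Count alphabetic tokens across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return counts


def distinguishing_terms(offender_docs, other_docs, top_n=3, alpha=1.0):
    """Rank terms by the smoothed log-ratio of their relative frequency in
    offender documents versus other documents (highest first)."""
    a, b = term_counts(offender_docs), term_counts(other_docs)
    vocab = set(a) | set(b)
    # Add-alpha smoothing so terms absent from one group do not divide by zero.
    total_a = sum(a.values()) + alpha * len(vocab)
    total_b = sum(b.values()) + alpha * len(vocab)
    scores = {
        term: math.log(((a[term] + alpha) / total_a) / ((b[term] + alpha) / total_b))
        for term in vocab
    }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical snippets: vague language from offenders, concrete figures from others.
OFFENDERS = [
    "We aim to explore possible environmental improvements where feasible.",
    "We may explore feasible options for environmental stewardship.",
]
OTHERS = [
    "We cut emissions by 20 percent and publish audited environmental data.",
    "Audited emissions data show a 15 percent reduction.",
]

print(distinguishing_terms(OFFENDERS, OTHERS, top_n=2))
```

In this toy data the top-ranked terms are hedging words (“explore” and “feasible”), which hints at how vague, non-committal language might be flagged for regulators’ attention. A real analysis would need far more documents, stop-word handling, and statistical testing before drawing conclusions.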


The current lack of ESG standardization, comparability, and transparency makes it difficult for individual investors, portfolio managers, and government agencies to fully understand a company’s sustainability. However, NLP, a core competency of Elder Research, offers techniques for data-driven monitoring and regulation development that could dramatically change the present reality for the common good.

