A Practical Use Case of Textual Data Analysis on Credit Ratings Research

This article is written and published by S&P Global Market Intelligence, a division independent from S&P Global Ratings. Lowercase nomenclature is used to differentiate S&P Global Market Intelligence credit scores from the credit ratings issued by S&P Global Ratings.


In a previous article, we explored the ability to extract forward-looking and credit-sensitive keywords from S&P Global Ratings’ research reports via the machine-readable RatingsXpress®: Research dataset. In this article, we demonstrate the creation of a credit signal that can be used by risk managers to anticipate potential rating moves.

In our research, we collected more than 65,000 S&P Global Ratings research reports on corporates (Full Reports, Summary Reports, and Research Updates) covering the period from 2013 to the latest year, via RatingsXpress. Because the machine-readable research reports are divided into tagged sections, we could easily obtain the parts we were interested in: the Downside Scenario and Upside Scenario sections.

With the textual data extracted from the downside/upside scenario sections, we first used regular expressions to keep only the sentences that contain numbers or percentages. Then, we applied Natural Language Processing (NLP) algorithms to identify, separate, and extract the financial keywords, signs, and numerical thresholds. Table 1 below shows a few examples of the algorithm outputs.

Table 1: Examples of NLP algorithm outputs from downside/upside scenario sections

Source: S&P Global Market Intelligence. As of August 20, 2021. For illustrative purposes only.
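The two-step extraction described above can be sketched as follows. This is a minimal illustration, not the algorithm used in the study: the article does not publish its regular expressions or NLP models, so the patterns, the short keyword list, and the function names `split_sentences` and `extract_thresholds` are all assumptions, and plain regular expressions stand in here for the NLP step.

```python
import re

# Hypothetical patterns (assumptions; the study's actual rules are not published).
RATIO_RE = re.compile(
    r"\b(FFO\s*/\s*debt|debt\s*/\s*EBITDA|FFO to debt|debt to EBITDA)\b",
    re.IGNORECASE,
)
SIGN_RE = re.compile(
    r"\b(below|under|less than|above|over|more than|exceeds?|exceeding)\b",
    re.IGNORECASE,
)
THRESHOLD_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(%|x)")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_thresholds(text):
    """Return keyword, sign, and numerical threshold for each qualifying sentence."""
    results = []
    for sentence in split_sentences(text):
        # Step 1: keep only sentences that contain a number.
        if not re.search(r"\d", sentence):
            continue
        # Step 2: pull out the financial keyword, the sign, and the threshold.
        ratio = RATIO_RE.search(sentence)
        sign = SIGN_RE.search(sentence)
        threshold = THRESHOLD_RE.search(sentence)
        if ratio and sign and threshold:
            results.append({
                "keyword": ratio.group(1),
                "sign": sign.group(1).lower(),
                "value": float(threshold.group(1)),
                "unit": threshold.group(2),
            })
    return results


# Made-up sample text, written only to exercise the sketch.
sample = (
    "We could lower the rating if FFO/debt falls below 20% on a sustained basis. "
    "Management remains committed to its financial policy. "
    "We could also lower the rating if debt to EBITDA rises above 4.5x."
)
for row in extract_thresholds(sample):
    print(row)
```

A production version would need a far richer keyword vocabulary and handling for sentences that mention several ratios or ranges, which is presumably where the NLP models mentioned in the article come in.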

To validate the algorithm, we randomly selected 1,000 research reports, manually tagged the Downside/Upside Scenario sections, and compared the manual outputs against the algorithm outputs. The algorithm achieved an accuracy above 95% for numerical thresholds and above 80% for financial keywords.
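A comparison of this kind can be sketched as a per-field accuracy over aligned records. The article does not say exactly how accuracy was computed, so the exact-match rule, the record layout, and the name `field_accuracy` below are assumptions for illustration.

```python
def field_accuracy(manual, predicted, field):
    """Share of aligned records where the algorithm matches the manual tag exactly."""
    if len(manual) != len(predicted):
        raise ValueError("manual and predicted records must be aligned one-to-one")
    if not manual:
        raise ValueError("at least one record is required")
    hits = sum(1 for m, p in zip(manual, predicted) if m[field] == p[field])
    return hits / len(manual)


# Toy records, not data from the study.
manual_tags = [
    {"keyword": "FFO/debt", "value": 20.0},
    {"keyword": "debt/EBITDA", "value": 4.5},
    {"keyword": "FFO/debt", "value": 12.0},
    {"keyword": "debt/EBITDA", "value": 3.0},
]
algorithm_tags = [
    {"keyword": "FFO/debt", "value": 20.0},
    {"keyword": "debt/EBITDA", "value": 4.5},
    {"keyword": "FOCF/debt", "value": 12.0},
    {"keyword": "debt/EBITDA", "value": 3.0},
]
print(field_accuracy(manual_tags, algorithm_tags, "value"))
print(field_accuracy(manual_tags, algorithm_tags, "keyword"))
```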

Financial Ratio Thresholds for Rating Downgrade/Upgrade

In the S&P Global Ratings methodology for rating corporate industrial companies and utilities, two core financial ratios and five supplementary financial ratios are defined, with corresponding benchmark ranges, for the assessment of a company’s financial risk.[1]

In the Outlook sections of rating reports, we can find the downside/upside scenarios under which a rating action may be triggered if the specific financial ratios breach pre-defined thresholds. The top five most frequently mentioned financial ratios in downside/upside scenarios are:

Table 2: Top five financial ratios in downside/upside scenario sections

Source: S&P Global Market Intelligence. As of August 20, 2021. For illustrative purposes only.

The list of extracted financial ratios is, in general, consistent with the financial risk ratios listed in the rating criteria. When we look at the relative frequencies of the extracted threshold values, however, we see some values that are mentioned in the upside/downside scenarios but are not explicitly defined in the rating criteria. An example is given in Table 3.

Table 3: Threshold values of FFO/Debt extracted from downside/upside scenario sections

Source: S&P Global Market Intelligence. As of August 20, 2021. For illustrative purposes only.
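Relative frequencies like those summarized in Table 3 can be computed directly from the list of extracted threshold values. The sketch below uses made-up values purely to show the calculation; the function name `relative_frequencies` is hypothetical.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each distinct threshold value to its share of all extracted values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in sorted(counts.items())}


# Illustrative FFO/Debt thresholds in percent (not the study's data).
toy_thresholds = [12, 20, 20, 30]
print(relative_frequencies(toy_thresholds))
```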

A Credit Signal for Potential Rating Downgrade

In the second part of our empirical study, we defined a credit signal using the financial thresholds extracted from the downside scenarios and the annual financial statements from the S&P CreditStats dataset. We tested the signal on the two core financial ratios (i.e., FFO/Debt and Debt/EBITDA) individually and assessed its performance in predicting rating downgrades over a one-year time horizon.

To construct the credit signal, we compared the threshold values in the downside scenario of each rating report against the latest financial ratios of the corresponding company as at the report publication date. The financial ratios and thresholds were normalized, and the differences were computed. The sign of a credit signal indicates whether a financial ratio has breached its downside scenario threshold, while the magnitude indicates the normalized distance of the financial ratio from that threshold. For example, a positive credit signal for FFO/Debt (Debt/EBITDA) means the actual financial ratio is higher (lower) than the downside scenario threshold. Our hypothesis was that the higher the value of the credit signal (i.e., the further the actual financial ratio is from the threshold mentioned in the report on the safe side), the less likely a downgrade is to happen. We tested this using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
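The sign convention above can be made concrete with a short sketch. The article does not specify how ratios and thresholds were normalized, so dividing the difference by the absolute threshold is an assumption made here for illustration, as are the function name `credit_signal` and the `higher_is_better` flag.

```python
def credit_signal(actual, threshold, higher_is_better):
    """Signed, normalized distance of a ratio from its downside scenario threshold.

    Positive: the ratio is on the safe side of the threshold.
    Negative: the ratio has breached the threshold.
    Normalization by |threshold| is an assumption, not the study's method.
    """
    if threshold == 0:
        raise ValueError("threshold must be non-zero for this normalization")
    distance = (actual - threshold) / abs(threshold)
    # FFO/Debt: higher is safer. Debt/EBITDA: lower is safer, so flip the sign.
    return distance if higher_is_better else -distance


# FFO/Debt of 25% against a 20% downside threshold: positive (not breached).
print(credit_signal(0.25, 0.20, higher_is_better=True))
# Debt/EBITDA of 5.0x against a 4.0x downside threshold: negative (breached).
print(credit_signal(5.0, 4.0, higher_is_better=False))
```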

The ROC measures (AUC values) of the credit signal are only slightly above 0.5,[2] which indicates that the distance of a historical financial ratio from its downside scenario threshold is not a good predictor of rating downgrades. This may not be surprising, because historical financial performance is not necessarily predictive of future financial performance.
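For readers who want to reproduce this kind of test, the AUC can be computed without any external library through its rank interpretation: the probability that a randomly chosen downgraded sample receives a higher risk score than a randomly chosen non-downgraded one, with ties counted as one half. In the sketch below the risk score is taken as the negative of the credit signal, in line with the hypothesis that a higher signal means a lower downgrade likelihood. The function name `auc` and the toy inputs are assumptions, not the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: higher means more likely to be a positive (downgraded) sample.
    labels: 1 for downgraded within the horizon, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: credit signals and one-year downgrade flags (illustrative only).
signals = [0.3, -0.1, 0.2, -0.2]
downgraded = [1, 0, 0, 1]
risk_scores = [-s for s in signals]  # lower signal -> higher downgrade risk
print(auc(risk_scores, downgraded))
```

The pairwise form is quadratic in the sample size, which is acceptable for a sketch; a sorting-based implementation would be preferable for a full dataset.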

We then tested the usefulness of the credit signal as a surveillance indicator. We examined the annual financial statements published after the rating reports to see whether the newer financial ratios breached the original downside scenario thresholds. We divided our samples into four groups according to the sign of the credit signal as at the report publication date (T) and after the report publication date (T+1). The percentages of samples downgraded within one year are reported in Table 4. The significant differences between the downgrade percentages for positive and negative credit signals after the report publication show that the credit signal is a good surveillance indicator.

Table 4: Percentage of samples downgraded

Source: S&P Global Market Intelligence. As of August 20, 2021. For illustrative purposes only.
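The four-way grouping behind Table 4 can be sketched as follows. Each sample carries the credit signal at T, the credit signal at T+1, and a flag for whether a downgrade occurred within one year. The tuple layout, the choice to treat a signal of exactly zero as positive, and the name `downgrade_rates` are assumptions for illustration; the toy samples are not the study's data.

```python
from collections import defaultdict


def downgrade_rates(samples):
    """Percentage of downgraded samples per (sign at T, sign at T+1) group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [downgrades, total]
    for signal_t, signal_t1, was_downgraded in samples:
        group = (
            "+" if signal_t >= 0 else "-",
            "+" if signal_t1 >= 0 else "-",
        )
        counts[group][0] += int(was_downgraded)
        counts[group][1] += 1
    return {group: 100.0 * d / n for group, (d, n) in counts.items()}


# (signal at T, signal at T+1, downgraded within one year): illustrative only.
toy_samples = [
    (0.2, 0.1, False),
    (0.2, 0.3, False),
    (0.1, -0.2, True),
    (0.1, -0.1, False),
    (-0.1, 0.2, False),
    (-0.3, -0.4, True),
]
for group, rate in sorted(downgrade_rates(toy_samples).items()):
    print(group, rate)
```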

Additional Considerations

Our empirical study demonstrates the use of NLP techniques on rating reports for the automatic generation of credit signals. The results show that the credit signals are suitable for credit surveillance purposes. The financial thresholds mentioned in the downside/upside scenario sections are, of course, only one of the many considerations for rating migrations. The reports contain a lot of other useful information, which may require manual extraction or more advanced NLP techniques, that could enhance our surveillance signal.


[1] S&P Global Ratings, “General Criteria: Corporate Methodology”, May 27, 2021.

[2] A random binary classifier has a ROC measure of 0.5.
