Artificial Intelligence Powering Synthetic Biology: The Fundamentals

This is a thought leadership report issued by S&P Global. This report does not constitute a rating action, nor was it discussed by a rating committee.

Highlights

The combination of AI and synthetic biology (synbio) will accelerate and scale up the research, testing, and production of novel genes that have the potential to transform economies and societies.

The resulting industrialization of synbio will affect most industries and expand existing applications in healthcare, agriculture, food enhancement, renewable fuel development, and environmental remediation.

The scientific community has worked to minimize the risks associated with synbio, but wider and new applications of genetic engineering, including in humans, come with moral and ethical issues and dangers, including to biodiversity and from malevolent actors.

Living organisms and AI may seem worlds apart, yet both share a fundamental reliance on data to determine how they function. In organisms, that information is stored in DNA sequences, or genes. For AI, the information is largely external and typically delivered in massive troves as training data (though it is also found in the code that governs models' algorithms).

The fit between biology's wealth of data and AI's input requirements is evident. And it promises to be a boon for a relatively new field of science, called synthetic biology, that is concerned with reading (sequencing), editing (synthesizing), and writing (printing) DNA to create new entities. Synbio, as it is known, is already changing industries, such as manufacturing, pharmacology, and agriculture. And its influence will continue to spread as genetic innovations and inventions give rise to new applications, including in areas such as renewable energy, healthcare, foods, and crops.

AI's ability to sort data and find meaningful (and often novel) connections means it has quickly become a central element of synbio's technology platform, which, with engineering and biology, is part of the three-component loop that powers this emerging field of science (see figure 1).

What is synthetic biology and why does it matter?

Genetic manipulation's history can be traced back thousands of years through farmers and agriculturalists who have used selective breeding to foster desirable traits in crops and animals. More recently, in the 1850s and 1860s, many of the rules that underpin our knowledge of genetics were codified when Gregor Mendel's experiments with pea plants established a code of genetic inheritance known as the laws of Mendelian inheritance.

Synbio, which traces its history back just a few decades, changes the paradigm of genetic manipulation by enabling the direct alteration of DNA through gene editing and the application of engineering principles. This enables more radical change, accelerates the process of creating new traits in existing organisms, and (for the first time) allows for the creation of completely new organisms.

These new abilities open the door to reengineering of the natural world, creating massive and wide-ranging opportunities for beneficial innovation. White blood cells could be tweaked to seek out and kill specific cell mutations, such as cancer. Organisms that eat atmospheric carbon could be tailored to help arrest climate change. And any number of innovations in crops and foods could improve health, reduce food vulnerability, and lower production costs.

Yet synbio also comes with significant risks that must be managed. Putting aside Frankenstein stories (though the risk of unintended consequences is real, as are similar risks from creations by malevolent actors), the world may have to come to grips with longer-lived populations, unintended ecological consequences, and disruption to long-established industries.

Technology is enhancing synbio's potential

Synbio has traditionally been expensive and slow. But that is changing as technological advancements at the crossroads of genomics and digitization push the science in a variety of ways. For example, the digital domain's relatively easy and low-cost data manipulation has created unprecedented economies of scale, accelerated development of new processes, and cut the cost and length of experiments. This is evident in the cost of DNA sequencing. In 2001, the first human genome was published after more than a decade of work and at a cost of about $100 million. That same process can now be done in a day for about $600, according to the National Human Genome Research Institute.
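
The scale of that decline is easier to appreciate as a ratio. The short Python sketch below uses only the two cost figures quoted above; the function name `fold_reduction` is illustrative and does not come from the National Human Genome Research Institute.

```python
# Cost figures as quoted in the text (attributed to the NHGRI).
COST_2001_USD = 100_000_000  # approximate cost of sequencing a human genome in 2001
COST_NOW_USD = 600           # approximate cost today


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


reduction = fold_reduction(COST_2001_USD, COST_NOW_USD)
print(f"Roughly {reduction:,.0f}-fold cheaper")  # Roughly 166,667-fold cheaper
```

On those figures, sequencing a genome is now about 167,000 times cheaper than it was in 2001.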

AI promises to provide synbio with the next technological leap forward. Advancements in machine learning and the improving quality and quantity of data have the potential to generate new and useful genes, speed analysis of possible applications, and conduct (low-cost, low-risk, and rapid) virtual testing.

An example of those possibilities emerged in April 2024, when researchers published a paper describing a large language model (LLM), dubbed CRISPR-GPT, capable of automating and enhancing gene editing experiments (for more on large language models see "Language Modelling: The Fundamentals," Jan. 23, 2024). Such a model could enhance our ability to edit DNA using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats), a technology that can be used to cure diseases (and potentially reverse aging). The potential is clear, but the advances come with ethical issues. They include the possibility of illegal altering of human genomes to enhance traits and, because edits can be passed to future generations, the possibility of compounded mistakes and questions of consent.

How AI is used in synthetic biology

The power of the partnership between synbio and AI is based on three elements:

Data: Even the simplest organisms have more than 100,000 base pairs of DNA, while complex life forms, such as humans, have more than three billion. Each of those base pairs is a data point that can be used by machine learning algorithms for training and to create outputs.

Hidden patterns: The massive variety of DNA combinations leads to complexity and optionality that is difficult for the human mind to comprehend, but machine learning is adept at uncovering variations, patterns, and interconnections (and can be trained to identify the most promising candidates).

Scalability: Laboratory testing by humans is characterized by lengthy and often wasteful trial and error, whereas virtual and automated testing by AI can be far quicker, cheaper, and safer. That in turn makes synthetic biology scalable.
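
To illustrate the first two elements, the Python sketch below shows one common way a DNA sequence can be turned into numbers that a machine learning algorithm can consume. It is a minimal illustration under stated assumptions, not a description of any particular synbio pipeline: the toy sequence, the helper names `one_hot` and `kmer_counts`, and the choice of three-base windows are all hypothetical.

```python
from collections import Counter

BASES = "ACGT"  # adenine, cytosine, guanine, thymine


def one_hot(sequence: str) -> list[list[int]]:
    """Encode each base as a 4-element indicator vector in A, C, G, T order."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence.upper():
        if base not in index:
            raise ValueError(f"unexpected base: {base!r}")
        row = [0] * len(BASES)
        row[index[base]] = 1
        encoded.append(row)
    return encoded


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count every overlapping substring of length k (a simple pattern summary)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


toy = "ATGCGATG"                # hypothetical 8-base sequence
features = one_hot(toy)         # one numeric row per base: each base is a data point
counts = kmer_counts(toy, k=3)  # recurring motifs surface as higher counts
print(features[0])              # [1, 0, 0, 0] because the first base is A
print(counts.most_common(1))    # [('ATG', 2)]
```

Real genomes run to millions or billions of bases, which is why the pattern-finding step is left to algorithms rather than human inspection.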

A breakthrough demonstration of AI's potential to aid synthetic biology came in 2018, when Google’s AI research laboratory, DeepMind, used machine learning software, called AlphaFold 1, to predict 25 protein structures. A team of humans taking part in the same experiment correctly predicted just three. Subsequent development of the AI led to the release, in 2024, of AlphaFold 3, which moved beyond proteins to more accurately predict "the structure and interaction of all life’s molecules," according to a Google blog published on May 8, 2024.

New developments in AI, and notably the advent and improvement of generative AI, have opened the door to additional creativity in synbio. For example, LLMs, which use data and algorithms to generate meaningful text responses to queries, have been adapted to the lexicon of biology by replacing words with nucleotide bases, such as adenine, cytosine, thymine, and guanine. This enables the LLMs to optimize experiments to generate new DNA sequences (and thus new virtual organisms) precisely, quickly, and cheaply, in response to human prompts. The resulting molecules and organisms promise to be useful in accelerating drug discovery, food engineering, and climate remediation. For instance, experiments and studies suggest that microalgae could be engineered to (more effectively) remove toxins from the air, treat heavy metal pollution, and desalinate water.
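
The idea of replacing words with nucleotide bases can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in: it uses a first-order Markov chain (each base predicted only from the one before it) rather than an LLM, and the vocabulary mapping, training string, and function names are all hypothetical. It shows only the mechanics of tokenizing bases and extending a prompt sequence.

```python
import random
from collections import defaultdict

# Hypothetical vocabulary: each nucleotide base plays the role of a "word".
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}
ID_TO_BASE = {i: base for base, i in VOCAB.items()}


def tokenize(sequence: str) -> list[int]:
    """Map a DNA string to integer token ids."""
    return [VOCAB[base] for base in sequence.upper()]


def detokenize(ids: list[int]) -> str:
    """Map token ids back to a DNA string."""
    return "".join(ID_TO_BASE[i] for i in ids)


def fit_bigram(token_ids: list[int]) -> dict:
    """Count how often each token follows each other token."""
    counts = defaultdict(lambda: [0] * len(VOCAB))
    for prev, nxt in zip(token_ids, token_ids[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict, prompt_ids: list[int], n_new: int,
             rng: random.Random) -> list[int]:
    """Extend a non-empty prompt by sampling each next token from the counts."""
    if not prompt_ids:
        raise ValueError("prompt must contain at least one token")
    out = list(prompt_ids)
    for _ in range(n_new):
        weights = counts.get(out[-1])
        if not weights or sum(weights) == 0:
            weights = [1] * len(VOCAB)  # unseen context: fall back to uniform
        out.append(rng.choices(range(len(VOCAB)), weights=weights, k=1)[0])
    return out


training = "ATGCGATGCCATG"  # toy training sequence, not real genomic data
model = fit_bigram(tokenize(training))
sample = detokenize(generate(model, tokenize("ATG"), 5, random.Random(0)))
print(sample)  # an 8-base string that begins with the prompt "ATG"
```

A production model would learn far longer-range dependencies from far more data, but the interface is the same: a sequence of base tokens goes in as a prompt, and a longer sequence comes out.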

Generative AI could also be used to predict the outcomes of gene editing experiments, reducing time spent investigating eventual dead ends, broadening the scope of testing, and delivering resultant savings in cost and time. Those applications are similar to software engineers' use of generative AI to test code.

What are the opportunities of AI and synthetic biology?

The potential created by the combination of AI and synbio is vast. Up to 60% of physical inputs into the economy could be subject to bio-innovation, according to an estimate by McKinsey & Co., a consulting firm. And while many of those potential applications are still being assessed, or yet to be discovered, the ability to use AI to scale synbio research and production should prove a boon for economies and society (see figure 2).

The breadth of synbio's applications across various industries (see table 2) ensures that it will be difficult to predict where benefits will emerge most quickly and have the most effect. But it seems clear that the industrialization of synbio through the application of AI could make a material difference in addressing some of the world's most pressing issues, including treatments for infectious and intransigent diseases, and the remediation of the environment, including through CO2 reduction and water purification.

AI could magnify and expedite synbio's risks

AI's ability to automate, accelerate, and scale synbio has the potential to expedite and magnify many of the risks associated with gene editing. An exhaustive list of those potential risks would be long and granular, but some of the key risks that we believe should be monitored include:

Threats to species diversity and ecosystem balance from newly created synthetic organisms that will co-exist with natural ones. Past introduction of non-native animals into ecosystems has shown the unpredictability of interactions between new organisms and native populations, and the risk that synthetic organisms become invasive cannot be discounted. AI's potential to accelerate the design and creation of new organisms magnifies this risk to native organisms and biodiversity. It also exposes native species to the potential of gene modification through interaction with synthetically engineered organisms, which could lead to unintended ecological consequences.

Deficient human control combined with AI's automation of the design, engineering, and testing of organisms could lead to unintended consequences, such as creating and releasing harmful pathogens.

The potential to create new biological weapons either through the modification of existing pathogens or by synthetically designing new ones (to create engineered pandemics). In both cases AI-powered synbio could more efficiently and accurately increase malicious organisms' virulence, transmissibility, and resistance to current treatments, and make them less detectable. Synbio and AI-assisted bioterrorism could develop pathogens tailored to target specific regions (e.g., based on the specificity of regional agriculture or livestock) or target populations based on certain biological or genetic traits.

Increased threat of biosecurity hacks and data breaches, notably due to AI models' susceptibility to data poisoning (e.g., the modification of an AI model's training data to undermine its performance) and data security breaches. With biological data, the potential negative outcomes could materialize in biological weapons or bioterrorism.

Dangers inherent to human-machine interconnection, such as brain-computer interfaces, which are implanted in humans’ neural systems usually for medical reasons. Ethical and moral concerns include the potential for inequality stemming from the digital divide, the possibility for human enhancement (including for military purposes), potential privacy concerns (including due to ownership of models with access to humans’ neural activity), and cybersecurity risks (including the potential for hackers to access and even take control of brain activity).

Understanding the risks associated with AI's combination with synbio (and considering the likely evolution of those risks) is central to putting in place tools to mitigate the potential dangers (see figure 3).

The scientific community has been working toward reducing synbio's risks for decades and has undertaken numerous initiatives. They include:

international safety and governance frameworks, including bio-governance regimes and bio-safety protocols (such as the Cartagena Protocol on Biosafety);

consortiums to screen synbio activity and promote beneficial applications (e.g., International Gene Synthesis Consortium);

forums to discuss and assess risks (e.g., NTI Bio forum);

safety and security audits;

and employee laboratory training.

On the regulatory front, we expect that new regulatory requirements for AI models (e.g., the EU AI Act) will complement existing regulations for genetically modified organisms, medicine, and food safety, among others. It seems likely that many AI-model applications in synbio will be considered high-risk because they will collect and analyze biological human data, which could be used to profile humans based on genetic traits.

Key trends to watch in the next five years

Our usual practice is to identify key trends to watch over the course of a decade. But the likely speed of development and the breadth of potential inherent to the combination of AI and synbio are such that looking beyond a five-year horizon seems overly adventurous and a step too far into speculation.

With that caveat, the key trends that we will be watching are:

The growth of industrial biomanufacturing: This implies success in the scaling up of the production of biological materials and processes. The Internet-of-Things' (IoT) sensors and other devices should contribute to real-time collection and monitoring of biological data, which AI models will process accurately and at scale. Also, digital twins (virtual replicas of a physical process or system) will enable the testing of biomanufacturing processes in a virtual environment and experiments on organisms' behavior in different environments and outside the laboratory. This should significantly reduce production costs and improve the efficiency of biomanufacturing processes. The application of industrialization and scalability to DNA printing should also support next-generation DNA printers to improve precision and add automation, which will facilitate the scaling up of processes to create organisms and other biomaterials.

Support for the circular economy: AI, IoT, and data analytics could be catalysts for a thriving circular economy, particularly by enabling more sustainable use of renewable resources to minimize and recycle waste, and by supporting energy generation from waste. Generative AI could facilitate innovation in biomaterials design and industrial production at affordable prices, supporting local production and the growth of biological goods. Taking a step up, we think AI could become the engine for a bioeconomy (based on products, services and processes derived from plants or microorganisms) and thus help to address issues related to food security, climate change, and sustainability.

Big data in biology: Multi-omics is the aggregation of different types of scientific data, including information from proteomics, genomics, metabolomics, and transcriptomics. Multi-omic data can be used to train AI models in much the same way that we currently use text, video, images, and code as inputs into multi-modal generative AI models. The combination of multi-omics with AI will give scientists a more holistic view, for example, of the human body. And that could be still further augmented with data such as X-ray images, patient health records, and personal and family history.


Contributors

S&P Global Ratings

Sudeep Kesh
Chief Innovation Officer

S&P Global Ratings

Alexander Gombach
Director

S&P Global Ratings

Paul Whitfield
Editor & Writer

Editorial, Design & Publishing

Cat VanVliet
Senior Design Manager,
Data Visualization

External Research

National Human Genome Research Institute

The Bio Revolution: Innovations transforming economies, societies, and our lives, McKinsey, May 13, 2020

Pfeifer, Blain A., et al., Harnessing synthetic biology for advancing RNA therapeutics and vaccine design, npj Systems Biology and Applications, Nature, Nov. 30, 2023

The Bioeconomy: A Primer, Congressional Research Service, Sept. 19, 2022

Huang, K., et al., CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments, April 27, 2024
