AI/ML-Driven Antibody Discovery

Posted by Evotec on Jun 10, 2022 12:18:43 PM

Antibodies generated in the lab are important as potential treatments for a broad spectrum of diseases, in particular infectious diseases caused by viruses. They can be obtained either from animal-derived B cells or from antibody library display platforms. Evotec’s strategy for obtaining lead candidates is to offer access to both sources of antibodies for discovery, coupled with state-of-the-art technologies, to ensure success for a broad range of targets and disease states. In addition, selected lead candidates can be further optimized using powerful computational platforms to enhance productivity, manufacturability, and formulation stability. This is the end-to-end J.Design biologics platform, which is fueled by the front-end discovery platform, J.HAL™ (Just Humanoid Antibody Library), and the associated data-driven, company-wide machine learning methodology.

By using artificial intelligence (AI) and machine learning (ML), J.HAL can generate novel, humanoid antibody sequences that both represent natural repertoires and are biased towards desirable features. To enable properties such as broad target and epitope engagement, focused efficacy, and suitable developability, Just-Evotec Biologics has devised an Antibody-GAN (Generative Adversarial Network), a new synthetic approach to designing a novel class of antibody therapeutics termed humanoid antibodies.

At the International Conference on Antiviral Research (ICAR) 2021 and at Antibody Engineering & Therapeutics Europe 2022, researchers from Evotec and Just-Evotec Biologics presented results obtained by using the GAN to generate novel sequences that mimic the natural human response and provide the necessary diversity and developability features.


Competing Neural Networks


The GAN is based on competing deep neural networks that learn and reproduce the features of the mature human antibody repertoire, including sequence characteristics and structural properties. This allows key properties of interest to be encoded into diverse libraries for a feature-biased discovery platform. It works to:

  • capture the complexity of the entire variable region of the standard human antibody sequence space,
  • provide a basis for generating novel antibodies that span a larger sequence diversity than standard in silico generative approaches, and
  • incorporate transfer learning, a critical feature for antibody discovery, to bias the physical properties of the generated antibodies towards broader efficacy traits such as CDR lengths and surface properties, improved developability (e.g., improved thermal and pH stability), and diverse chemical and biophysical properties.

The GAN is trained on hundreds of thousands of human antibody sequences so that its discriminator network learns to recognize legitimate human v-genes. The generator network produces sequences, random at first, that are meant to fool the discriminator, while continually receiving feedback from the discriminator on sequence validity. Over time, the two networks get progressively better at their tasks. After full training, the Antibody-GAN generator is able to produce fully human, novel antibody sequences for the germline on which the GAN was trained.
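The generator/discriminator feedback loop described above can be illustrated with a deliberately tiny sketch. This is not the Antibody-GAN itself, which uses deep neural networks on full variable-region sequences. Everything here is an assumption made for illustration: a four-letter toy alphabet, a made-up "real" repertoire (`PREFERRED`), a logistic-regression discriminator, and a generator made of per-position letter probabilities that is updated with a REINFORCE-style policy-gradient step (swapped in because sampling discrete letters is not differentiable in plain Python).

```python
import math
import random

ALPHABET = "ACDE"     # toy alphabet (assumption), not the 20 amino acids
PREFERRED = "ACDEAC"  # hypothetical "real" repertoire consensus
LENGTH = len(PREFERRED)


def sample_real(rng):
    """Draw a 'real' sequence: mostly the consensus, with occasional random letters."""
    return "".join(c if rng.random() < 0.9 else rng.choice(ALPHABET) for c in PREFERRED)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_fake(gen_logits, rng):
    """Generator: sample one letter per position from its current distribution."""
    return "".join(rng.choices(ALPHABET, weights=softmax(row))[0] for row in gen_logits)


def d_score(w, b, seq):
    """Discriminator: probability that seq is 'real' (logistic regression on one-hot letters)."""
    z = b + sum(w[i][ALPHABET.index(c)] for i, c in enumerate(seq))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def train(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    gen = [[0.0] * len(ALPHABET) for _ in range(LENGTH)]  # generator logits
    w = [[0.0] * len(ALPHABET) for _ in range(LENGTH)]    # discriminator weights
    b = 0.0
    for _ in range(steps):
        real, fake = sample_real(rng), sample_fake(gen, rng)
        # Discriminator step: push score towards 1 for real, 0 for generated.
        for seq, label in ((real, 1.0), (fake, 0.0)):
            err = label - d_score(w, b, seq)
            b += lr * err
            for i, c in enumerate(seq):
                w[i][ALPHABET.index(c)] += lr * err
        # Generator step (REINFORCE): reward is how "real" the discriminator finds the fake.
        reward = d_score(w, b, fake) - 0.5
        for i, c in enumerate(fake):
            probs = softmax(gen[i])
            for k, letter in enumerate(ALPHABET):
                indicator = 1.0 if letter == c else 0.0
                gen[i][k] += lr * reward * (indicator - probs[k])
    return gen


if __name__ == "__main__":
    trained = train()
    # Most likely letter at each position after training.
    print("".join(ALPHABET[row.index(max(row))] for row in trained))
```

The point of the sketch is the alternation: the discriminator is updated on one real and one generated sequence, then the generator is nudged towards whatever the discriminator currently scores as realistic.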

Antibodies Targeting SARS-CoV-2

To demonstrate the usefulness of this platform, the researchers used their newly constructed phage Fab library, which has a theoretical diversity of 1 billion, to discover antibodies to the SARS-CoV-2 spike protein. Candidates that specifically bound SARS-CoV-2 spike protein and did not bind an irrelevant antigen were further characterized for dose-dependent binding using AlphaLISA technology. In the primary “yes/no” binding screen, a total of 73 unique antibody sequences specific for SARS-CoV-2 spike protein were identified. The researchers then performed binding assays using unpurified transfection supernatants and later reproduced the results with purified material.

The candidate antibody supernatants that specifically bound SARS-CoV-2 spike protein were subsequently tested for their ability to block binding of this protein to the human ACE2 receptor. The team identified multiple antibodies that effectively blocked the interaction between spike and the human ACE2 receptor, demonstrating the feasibility of screening unpurified transfection supernatants for functional activity. After further rounds of panning, the top candidates expressed at flask scale were purified and tested for SARS-CoV-2 neutralization ability across multiple strains. The researchers identified multiple candidates with neutralizing activity against several strains of SARS-CoV-2. Nine of these antibodies blocked binding of the spike protein to the ACE2 receptor in an in vitro functional assay. Of note, all antibody data shown here were from native library candidates without any affinity maturation.
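The triage logic of the screen described above (keep candidates that bind spike but not an irrelevant antigen, then keep those that also block ACE2 binding) can be written down as a small filter. This is only a sketch of the decision rules: the `Candidate` record, its field names, and the example panel are hypothetical and do not represent the researchers' data or software.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str               # hypothetical identifier
    binds_spike: bool       # positive in the "yes/no" spike binding screen
    binds_irrelevant: bool  # also binds the irrelevant control antigen
    blocks_ace2: bool       # blocks spike binding to human ACE2


def specific_binders(candidates):
    """Keep candidates that bind spike and do not bind the irrelevant antigen."""
    return [c for c in candidates if c.binds_spike and not c.binds_irrelevant]


def ace2_blockers(candidates):
    """Of the specific binders, keep those that also block the spike/ACE2 interaction."""
    return [c for c in specific_binders(candidates) if c.blocks_ace2]


if __name__ == "__main__":
    # Made-up panel for illustration only.
    panel = [
        Candidate("Ab-1", True, False, True),
        Candidate("Ab-2", True, True, True),    # non-specific: dropped
        Candidate("Ab-3", True, False, False),  # specific, but not a blocker
    ]
    print([c.name for c in ace2_blockers(panel)])
```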

The presentation demonstrates that applying machine learning algorithms in antibody discovery “promotes efficient learning from the least expensive and most abundant data encoded in the DNA of antibodies, to validation of this learning through less abundant, more expensive, but most relevant data from GMP manufacturing at full commercial scale,” stated James N. Thomas, retired Executive Vice President, Global Head of Biotherapeutics and President U.S. Operations at Just-Evotec Biologics. “This is a systems approach to platform definition and continuous improvement, and it is unique in the industry, made possible by a number of factors that will be difficult for others to replicate.”

To learn more about Evotec's capabilities, read our related poster.