Article

The human antibody sequence space and structural design of the V, J regions, and CDRH3 with Rosetta

S. Schmitz, E. Schmitz, J. Crowe, Jr, and J. Meiler.
MAbs, 14 (1): 2068212 (January 2022)

Abstract

The human adaptive immune response enables the targeting of epitopes on pathogens with high specificity. Infection with a pathogen induces somatic hyper-mutation and B-cell selection processes that govern the shape and diversity of the antibody sequence landscape. To date, even the largest immunome repertoires of adaptive immune receptors acquired by next-generation sequencing cannot fully capture the vast antibody sequence space of a single individual, which is estimated to be at least 10^12 potential sequences. Degeneracy of the genetic code means that the number of possible nucleotide triplets (64) is greater than the number of canonical amino acids (20), resulting in some amino acids being encoded by multiple triplets and different amino acids sharing the same nucleotide in 1 or 2 positions in the triplet. We hypothesize that the degeneracy of the genetic code can be used to statistically model an enlarged space of human antibody amino acid sequences, accommodating for the discrepancy between the observed and the hypothesized antibody sequence space. Facilitated by Bayesian statistics and immunome repertoire clustering, we calculated amino acid probabilities from single nucleotide frequencies to infer a human amino acid sequence space that is used to design human-like antibodies with Rosetta. We show that antibodies designed with our restraints are on average up to 16.6% more human-like in the V and J regions compared to the Rosetta designs produced without constraints. The human-likeness of the heavy-chain CDR3 region (CDRH3) could be increased for 8 of 27 antibodies compared to Rosetta designs with a similar number of mutations and could be successfully applied on Mus musculus antibodies to demonstrate humanization.
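
The idea of deriving amino acid probabilities from single nucleotide frequencies can be illustrated with a minimal sketch. It is not the authors' method: the abstract mentions Bayesian statistics and repertoire clustering, neither of which is modeled here. The sketch assumes that the three codon positions are independent, uses the standard genetic code, and drops stop codons before renormalizing. The function name `amino_acid_probabilities` and the example frequencies are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_probabilities(position_freqs, drop_stop=True):
    """Turn per-position nucleotide frequencies into amino acid probabilities.

    position_freqs: three dicts (one per codon position), base -> frequency.
    Assumption (illustrative only): positions are statistically independent,
    so P(codon) is the product of the three single-nucleotide frequencies.
    """
    probs = Counter()
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for base, freqs in zip(codon, position_freqs):
            p *= freqs.get(base, 0.0)
        probs[aa] += p
    if drop_stop:
        probs.pop("*", None)  # ignore stop codons, then renormalize
    total = sum(probs.values())
    if total == 0:
        raise ValueError("no codon has non-zero probability")
    return {aa: p / total for aa, p in probs.items() if p > 0}


if __name__ == "__main__":
    # Hypothetical frequencies at one codon position of an alignment
    # in which ATG (Met) dominates.
    freqs = [{"A": 0.9, "G": 0.1}, {"T": 1.0}, {"G": 0.8, "A": 0.2}]
    for aa, p in sorted(amino_acid_probabilities(freqs).items()):
        print(aa, round(p, 3))
```

In this toy input, combining the single-nucleotide frequencies gives non-zero probability to Ile and Val as well as Met, which is the sense in which nucleotide-level statistics describe a larger amino acid space than a set of observed codons alone.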

BibTeX key: Schmitz2022-bo
entry type: article
year: 2022
month: jan
journal: MAbs
number: 1
pages: 2068212
publisher: Informa UK Limited
volume: 14
language: en

Cite this publication

%0 Journal Article
%1 Schmitz2022-bo
%A Schmitz, Samuel
%A Schmitz, Emily A
%A Crowe, Jr, James E
%A Meiler, Jens
%D 2022
%I Informa UK Limited
%J MAbs
%K topic_lifescience Human-likeness; HumanizationABBREVIATIONS antibody biostatistics; design; immunome repertoire; rosetta;
%N 1
%P 2068212
%T The human antibody sequence space and structural design of the V, J regions, and CDRH3 with Rosetta
%V 14
%X The human adaptive immune response enables the targeting of epitopes on pathogens with high specificity. Infection with a pathogen induces somatic hyper-mutation and B-cell selection processes that govern the shape and diversity of the antibody sequence landscape. To date, even the largest immunome repertoires of adaptive immune receptors acquired by next-generation sequencing cannot fully capture the vast antibody sequence space of a single individual, which is estimated to be at least 10^12 potential sequences. Degeneracy of the genetic code means that the number of possible nucleotide triplets (64) is greater than the number of canonical amino acids (20), resulting in some amino acids being encoded by multiple triplets and different amino acids sharing the same nucleotide in 1 or 2 positions in the triplet. We hypothesize that the degeneracy of the genetic code can be used to statistically model an enlarged space of human antibody amino acid sequences, accommodating for the discrepancy between the observed and the hypothesized antibody sequence space. Facilitated by Bayesian statistics and immunome repertoire clustering, we calculated amino acid probabilities from single nucleotide frequencies to infer a human amino acid sequence space that is used to design human-like antibodies with Rosetta. We show that antibodies designed with our restraints are on average up to 16.6% more human-like in the V and J regions compared to the Rosetta designs produced without constraints. The human-likeness of the heavy-chain CDR3 region (CDRH3) could be increased for 8 of 27 antibodies compared to Rosetta designs with a similar number of mutations and could be successfully applied on Mus musculus antibodies to demonstrate humanization.

@article{Schmitz2022-bo,
  abstract = {The human adaptive immune response enables the targeting of epitopes on pathogens with high specificity. Infection with a pathogen induces somatic hyper-mutation and B-cell selection processes that govern the shape and diversity of the antibody sequence landscape. To date, even the largest immunome repertoires of adaptive immune receptors acquired by next-generation sequencing cannot fully capture the vast antibody sequence space of a single individual, which is estimated to be at least $10^{12}$ potential sequences. Degeneracy of the genetic code means that the number of possible nucleotide triplets (64) is greater than the number of canonical amino acids (20), resulting in some amino acids being encoded by multiple triplets and different amino acids sharing the same nucleotide in 1 or 2 positions in the triplet. We hypothesize that the degeneracy of the genetic code can be used to statistically model an enlarged space of human antibody amino acid sequences, accommodating for the discrepancy between the observed and the hypothesized antibody sequence space. Facilitated by Bayesian statistics and immunome repertoire clustering, we calculated amino acid probabilities from single nucleotide frequencies to infer a human amino acid sequence space that is used to design human-like antibodies with Rosetta. We show that antibodies designed with our restraints are on average up to 16.6\% more human-like in the V and J regions compared to the Rosetta designs produced without constraints. The human-likeness of the heavy-chain CDR3 region (CDRH3) could be increased for 8 of 27 antibodies compared to Rosetta designs with a similar number of mutations and could be successfully applied on Mus musculus antibodies to demonstrate humanization.},
  added-at = {2024-09-10T11:54:51.000+0200},
  author = {Schmitz, Samuel and Schmitz, Emily A and {Crowe, Jr}, James E and Meiler, Jens},
  biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/2a07912a2aeb09bd4cd697feabe6d96de/scadsfct},
  interhash = {3e6aa299b98b7396ae5d05f71b7dc2de},
  intrahash = {a07912a2aeb09bd4cd697feabe6d96de},
  journal = {MAbs},
  keywords = {topic_lifescience Human-likeness; HumanizationABBREVIATIONS antibody biostatistics; design; immunome repertoire; rosetta;},
  language = {en},
  month = jan,
  number = 1,
  pages = 2068212,
  publisher = {Informa UK Limited},
  timestamp = {2024-11-28T17:41:24.000+0100},
  title = {The human antibody sequence space and structural design of the V, {J} regions, and {CDRH3} with Rosetta},
  volume = 14,
  year = 2022
}