Machine learning-assisted prediction of structure and function of cystine-stabilized peptides and optimization of expression in an E. coli system.


Cystine-stabilized peptides are promising candidates for the pharmaceutical industry as biologics. These peptides carry out a variety of useful functions that could be exploited to treat diseases and kill unwanted organisms. In addition, an array of disulfide bonds makes the peptides highly stable against temperature, enzymatic degradation, pH and other adverse physiological conditions. A vast number of cystine-stabilized peptides act as antimicrobial peptides, immunological modulators and ion channel blockers, among other functions, across a wide array of taxa, from fungi and bacteria to plants and humans. Practical access to these bioactive molecules could be greatly accelerated if it were possible to efficiently mine cystine-stabilized peptide sequences from genomic databases, determine the function and structure of each candidate from the primary sequence alone, and then express the top candidates in E. coli for biological analysis. In this way, only the natural, presumably functional, variants of a particular family of cystine-stabilized peptides could be collected in large quantities. Going further, it would be desirable to convert the nonspecific activity of antimicrobial peptides into a specific activity that targets a single pathogen and leaves the rest of the microbiome intact; in essence, to develop a targeted antibiotic. To contribute to this pipeline, I developed the machine learning-assisted algorithms PredSTP and CSPred to predict structural and functional characteristics, respectively, of cystine-stabilized peptides from primary sequence data. In addition, I developed an E. coli-based expression system for high-yield production of recombinant antimicrobial peptides specifically targeted to Staphylococcus aureus. These techniques are now available to collect large libraries of cystine-stabilized peptide sequences, to express top candidates in E. coli, and to target the peptides to specific pathogens.



Machine learning. Cystine-stabilized peptides. Antimicrobial peptides.