Development of chemical classification and annotation tools for metabolomics and lipidomics MS analyses.


Within the context of mass spectrometry (MS), chemical annotation is the process of assigning elements of chemical identity, such as structure, stereochemistry, elemental composition (EC), and class, to previously unknown features detected by MS. Annotation in metabolomics and lipidomics (i.e., the study of metabolite and lipid populations in biological samples, respectively) allows investigators to analyze samples in terms of relevant labels and to draw inferences about the biological and/or chemical significance of a sample from the character of the data to which the labels are attached. Annotation strategies vary significantly in the level of specificity with which they describe a feature. A highly specific annotation accounts for all structural and stereochemical elements that describe a unique molecule, whereas a less specific annotation uses terms that describe a group of related molecules (i.e., a class) to which the annotated feature belongs. The former is preferable but often inaccessible in exploratory, untargeted MS workflows; class information, by contrast, is readily accessible and can help investigators make global inferences about their data. Most instrumental MS classification strategies depend to some degree on assignment of EC before analyte class is determined; often, this process is made trivial by metabolomics or lipidomics databases that contain chemical ontology data for all entries. However, this dissertation documents a classification approach that operates orthogonally to conventional identification workflows and is thus independent of identity assignment by instrumental methods, allowing classification to provide sample information and guide feature identification. Additionally, this dissertation details an instrumental annotation method for MS imaging of lipids that integrates class-based annotation and image filtering for intuitive interrogation of lipid populations.
Chapters two and three discuss the development and application of In Silico Fractionation (iSF), a feedforward neural network (FFNN)-based tool that uses neural decision trees (NDTs) to classify biological analytes detected by MS. Chapter two demonstrates an application to a wide variety of biomolecules, and Chapter three details a focused application of iSF to lipid subclassification in lipidomics workflows. Chapter four details a referenced Kendrick mass defect (RKMD)-based tool developed for integrated annotation and class-based image filtering of lipids in MS imaging data. This method enables intuitive examination of lipid spatial distributions in MS imaging data via a class data-driven approach. Chapter five summarizes the work detailed in this dissertation and explores potential future directions, including application of iSF to classification in whole x-ome analysis and application of the RKMD-based annotation method to larger-scale, between-sample MS imaging analyses.
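As a brief illustration of the RKMD quantity named above, the following minimal sketch uses a formulation of referenced Kendrick mass defect commonly found in the lipidomics literature: the Kendrick mass rescales m/z so that CH2 weighs exactly 14, the Kendrick mass defect (KMD) is the fractional remainder, and the RKMD is the difference from a class reference KMD divided by the defect shift per H2 (about 0.0134). It is not taken from the tool described in Chapter four. The function names, the floor-based defect convention, the tolerance, and the example m/z value are assumptions chosen for illustration only.

```python
import math

CH2_MASS = 14.01565          # monoisotopic mass of CH2 (IUPAC scale)
H2_KENDRICK_SHIFT = 0.0134   # approximate Kendrick mass defect change per H2


def kendrick_mass(mz: float) -> float:
    """Rescale m/z so that one CH2 unit weighs exactly 14."""
    return mz * 14.0 / CH2_MASS


def kendrick_mass_defect(mz: float) -> float:
    """Fractional part of the Kendrick mass (floor convention, an assumption)."""
    km = kendrick_mass(mz)
    return km - math.floor(km)


def rkmd(mz: float, reference_kmd: float) -> float:
    """Referenced KMD: defect relative to a class reference, in units of H2."""
    return (kendrick_mass_defect(mz) - reference_kmd) / H2_KENDRICK_SHIFT


def matches_class(mz: float, reference_kmd: float, tol: float = 0.1) -> bool:
    """True when the RKMD lies near zero or a negative integer.

    Under this formulation, 0 corresponds to a saturated member of the
    reference class and -n to a member with n degrees of unsaturation.
    The tolerance is a hypothetical value, not a recommended setting.
    """
    value = rkmd(mz, reference_kmd)
    nearest = round(value)
    return nearest <= 0 and abs(value - nearest) <= tol


# Illustrative reference: a saturated phosphatidylcholine [M+H]+ ion.
saturated_mz = 734.5694
reference = kendrick_mass_defect(saturated_mz)

# Same class with two more CH2 units and one double bond (one H2 removed).
unsaturated_mz = saturated_mz + 2 * CH2_MASS - 2.01565
print(round(rkmd(unsaturated_mz, reference), 3))
print(matches_class(unsaturated_mz, reference))
```

Because adding CH2 leaves the KMD unchanged, any saturated member of a class can serve as the reference in this sketch. A real workflow would also need adduct handling, a mass-accuracy-based tolerance, and care with defects that wrap across an integer boundary.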



Mass spectrometry. Annotation. Lipidomics. Chemical classification. Metabolomics.