
    Statistical methods for high-throughput data integration : methodologies in disease research and drug discovery.

    View/Open
    CHAN-DISSERTATION-2019.pdf (3.879Mb)
    Jinyan_Chan_CopyrightAvailabilityform.pdf (530.6Kb)
    Jinyan_Chan_Publisher Permissions.pdf (156.9Kb)
    Access rights
    No access - Contact librarywebmaster@baylor.edu
    Date
    2019-07-09
    Author
    Chan, Jinyan, 1990-
    Abstract
    The wide application of high-throughput technologies in biomedical research calls for integrative approaches to data mining and knowledge discovery. Consequently, methodologies that deliver robust, systems-level integration are in unprecedented demand. Two important sub-disciplines of biomedical research, disease research and drug discovery, have become ever-evolving frontiers for the integration of “big data”. In disease research, p-value combination has been broadly employed to integrate statistical evidence from multiple studies. Conventional p-value combination methods commonly assume independence and homogeneity of the combined tests, and these assumptions are constantly challenged by the complex nature of high-throughput biomedical datasets. In this dissertation, we propose a novel and robust p-value combination algorithm based on the Pareto dominance principle from multi-objective optimization, which accounts for dependency and heterogeneity in the data. Compared to existing methods, the Pareto method attains adaptive rejection regions by “learning” the multivariate null distribution estimated by permutations. It therefore achieves superior performance when combining heterogeneous effects from multiple datasets while maintaining appropriate error control for correlated tests. The Pareto meta-gene-set-analysis tool, PEACH, was developed and tested on a 16-cancer pan-cancer dataset from The Cancer Genome Atlas (TCGA). The PEACH algorithm demonstrated significantly improved statistical power and the ability to detect important pathways related to sub-groups of cancers.
    Computational drug repurposing based on gene expression data, meanwhile, has gained increasing popularity in the field of drug discovery. The Connectivity Map (CMap) is a major database for repurposing drugs from gene expression data. However, key limitations of the current signature-based drug-repurposing paradigm have prevented accurate and unbiased repurposing. In the second part of this dissertation, we developed a frame-breaking statistical approach, Dr. Insight, that removes the need to subjectively select a gene signature to query the CMap database. We performed comprehensive studies using simulation data and disease datasets and validated the superior performance of Dr. Insight compared to previous methods. A TCGA breast cancer case study was also performed to showcase the application of Dr. Insight to breast cancer drug repurposing, from drug redirection to systematic construction of disease-specific drug-target networks.
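    For readers unfamiliar with p-value combination, the following is a minimal sketch, not the dissertation's PEACH algorithm. The first function is Fisher's method, a standard conventional combination that assumes independent tests. The second is a hypothetical illustration of a dominance count against a permutation null: it reports how often a null p-value vector is at least as small as the observed vector in every coordinate. The function names, the weak-dominance definition, and the (count + 1) / (n + 1) estimate are assumptions made for this example only.

```python
import math


def fisher_combined_p(pvalues):
    """Fisher's method for k independent p-values, each in (0, 1]."""
    k = len(pvalues)
    # Statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of
    # freedom under the null; we keep X / 2 for convenience.
    half_x = -sum(math.log(p) for p in pvalues)
    # Closed-form chi-square survival function for even degrees of freedom.
    return math.exp(-half_x) * sum(
        half_x ** j / math.factorial(j) for j in range(k)
    )


def dominates(a, b):
    """True if vector a is at least as small as b in every coordinate."""
    return all(x <= y for x, y in zip(a, b))


def dominance_p(observed, null_vectors):
    """Share of null p-value vectors that dominate the observed vector.

    Hypothetical illustration: null_vectors would come from permutations,
    so any dependence between the tests is carried by the null itself.
    """
    hits = sum(dominates(v, observed) for v in null_vectors)
    return (hits + 1) / (len(null_vectors) + 1)


if __name__ == "__main__":
    print(fisher_combined_p([0.01, 0.20, 0.04]))
    nulls = [(0.5, 0.5), (0.005, 0.01), (0.005, 0.5)]
    print(dominance_p((0.01, 0.02), nulls))
```

    Because the second function compares whole vectors rather than a single summary statistic, the shape of its rejection region depends on the null sample. This is only loosely analogous to the adaptive rejection regions described in the abstract.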
    URI
    https://hdl.handle.net/2104/10767
    Collections
    • Electronic Theses and Dissertations
    • Theses/Dissertations - Biomedical Studies

    Copyright © Baylor® University All rights reserved. Legal Disclosures.
    Baylor University Waco, Texas 76798 1-800-BAYLOR-U
    Baylor University Libraries | One Bear Place #97148 | Waco, TX 76798-7148 | 254.710.2112 | Contact: libraryquestions@baylor.edu
    If you find any errors in content, please contact librarywebmaster@baylor.edu
    DSpace software copyright © 2002-2016  DuraSpace