
DMS Statistics and Data Science Seminar

Time: Feb 25, 2021 (02:00 PM)
Location: ZOOM


Speaker: Antony Pearson (Auburn University)

Title: Quantifying Structure within Unstructured Symbolic Data


Abstract: Modern biological research is epitomized by “omics” experiments, which produce millions to billions of symbolic outcomes in the form of reads (i.e., DNA sequences of a few dozen to a few hundred nucleotides). Unfortunately, these intrinsically non-numerical datasets are often highly contaminated, and the possible sources of contamination are usually poorly characterized. The latter contrasts with continuous datasets, where it is often well-justified to assume that the distribution of contaminating samples is Gaussian. To overcome hurdles associated with these data, I will introduce the notion of “latent weights,” which measure the largest expected fraction of samples from a contaminated probabilistic source that conforms to a model in a well-structured class of desired models. As proof of concept, I use latent weights to reevaluate a long-standing assumption used in most modern DNA methylation analysis.





Roberto Molinari is inviting you to a scheduled Auburn University Zoom e-meeting. If you're a new participant, we have a quick start guide here:


Topic: DMS - Data Science Seminar

Time: This is a recurring meeting. Meet anytime.

Join from PC, Mac, Linux, iOS or Android:

    Password: 098550

Connect using Computer/Device audio if possible.

Or Telephone: Meeting ID: 832 9968 1626

    Dial: +1 301 715 8592 (US Toll)

        or +1 312 626 6799 (US Toll)


Or an H.323/SIP room system:

    H.323: (US West) or (US East)

    Meeting ID: 832 9968 1626

    Password: 098550





See Statistics & Data Science Seminar - Auburn University College of Sciences and Mathematics for the complete calendar.