Statistics Discussion Group

DMS Statistics Seminar
Dec 08, 2017 11:00 AM
Parker Hall 246

Speaker: Dr. Christopher G. Burton, Assistant Professor, Department of Geosciences

Title: Natural Hazards Risk: The Development of Open-source Tools to Measure the Physical and Human Components

Abstract: At the forefront of the Sendai Framework for Disaster Risk Reduction is the understanding of disaster risk, including risk from climate-related hazards and earthquakes. To understand and communicate disaster risk, a multitude of initiatives have developed state-of-the-art modeling capabilities and software tools. Few, however, have incorporated the ability to assess natural hazard impact potential beyond direct physical impacts and loss of life to account for the interconnectedness between natural hazards, the built environment, and the socio-economic characteristics of populations that create the potential for harm or loss.

This seminar will discuss the development and validation of data, methods, models, open-source software, and best practices designed to facilitate the meaningful assessment of natural hazard risk from a multivariate perspective. Particular focus will be placed on the development of an open-source GIS that allows researchers and risk analysts to draw from results on probabilistic hazard, exposure, property loss, and the vulnerability (i.e., characteristics that create the potential for loss) and resilience (i.e., the ability of systems to prepare for, respond to, and recover from damaging hazard events) of populations.

DMS Statistics Seminar
Dec 01, 2017 03:00 PM
Parker Hall 224

Speaker: Li Chen, Assistant Professor of Pharmacy

Title: A sparse regression framework for integrating phylogenetic tree in predictive modeling of microbiome data

Abstract: The development of next-generation sequencing offers an opportunity to predict patients' disease outcomes from microbiome sequencing data. Because a typical microbiome dataset contains more taxa than samples, and all the taxa are related to each other through the phylogenetic tree, we propose a smoothness penalty, the Laplacian penalty, that incorporates the prior information in the phylogenetic tree to achieve coefficient smoothing in a sparse regression model. Moreover, we observe that sparsifying the Laplacian matrix usually results in better prediction performance; however, the optimal threshold for sparsifying varies from dataset to dataset and is unknown in real data analysis. To overcome this limitation, we further develop another phylogeny-constraint penalty based on evolutionary theory to smooth the coefficients with respect to the phylogenetic tree. Using simulated and real datasets, we demonstrate that the proposed methods have better prediction performance than competing methods.

DMS Statistics Seminar
Nov 17, 2017 11:00 AM
Parker Hall 246

Speaker: Debswapna Bhattacharya, Assistant Professor, Department of Computer Science & Software Engineering

Title: Probabilistic Graphical Model for Protein Folding 

Abstract: In recent years, graphical models have emerged as one of the most powerful ways to capture the dynamics of natural systems. Protein folding is a grand puzzle of nature that remains unsolved even after more than 50 years of intense research. In this talk, I will first introduce the nature of protein molecules and present a computational abstraction of the protein folding problem. Then, I will present my latest research on developing algorithms to simulate protein folding in silico by formulating a probabilistic graphical model that represents and learns systematic patterns observed in naturally occurring proteins. The model integrates probability distributions from directional statistics, a field of statistics concerned with angles and orientations that is used, for example, to model wind directions or astronomical observations. Using the trained model as a proposal distribution and employing Markov chain Monte Carlo sampling, protein folding can be computationally replicated and predicted with a high degree of accuracy and speed. The method overcomes the limitations of existing approaches and was ranked as one of the 19 most innovative methods in community-wide blind assessments, demonstrating the great potential of graphical models in tackling the protein folding problem in particular and other fundamental problems in the life sciences in general.

DMS Statistics Seminar
Nov 10, 2017 11:00 AM
Parker Hall 246

Speaker: Liang Liu, Associate Professor, Department of Statistics, University of Georgia

Title: An age-dependent birth-death model for gene family evolution

Abstract: In this study, we describe a generalized birth-death process for modeling the evolution of gene families. Use of mechanistic models in a phylogenetic framework requires an age-dependent birth-death process. Starting with a single population corresponding to the lineage of a phylogenetic tree, and assuming a clock that starts ticking for each duplicate at its birth, an age-dependent birth-death process is developed by extending the results from the time-dependent birth-death process. The implementation of such models in a full phylogenetic framework is expected to enable large-scale probabilistic analysis of duplicates in comparative genomic studies.

DMS Statistics Seminar
Nov 03, 2017 11:00 AM
Parker Hall 246

Jordan Harshman, Assistant Professor, Department of Chemistry and Biochemistry, Auburn University
Marilyne Stains, Department of Chemistry, University of Nebraska-Lincoln

Title: Characterizing teacher and student behaviors in over 2,000 classes: A journey into mixed-model clustering

Abstract: As evidence-based instructional practices (EBIPs) have gained momentum, interest in assessing the effect of their implementation has grown. This requires a national baseline that characterizes how instructors currently teach their courses, against which changes to the norms can be tracked. To address this, we codified 13 student and 12 instructor behaviors in 2,028 classes taught by 545 STEM faculty from over 25 institutions. A total of seven instructional profiles, which are groups of student and instructor behaviors, were determined using mixture model clustering. Data sets this large and with this many variables are notorious for producing unstable solutions when mixture model clustering techniques are applied, but we propose a novel means of choosing an empirically superior model. The most prominent behaviors were lecturing for instructors and listening for students, resembling a traditional didactic instructional style (~55%). Another ~25% of instructors were classified into interactive lecture categories, while the remaining ~20% demonstrated predominantly student-centered strategies. Additionally, we discovered that an instructor’s practices can vary significantly within a course, indicating that multiple observations are required to gain a complete picture of that faculty member’s instruction.

Statistics Seminar
May 05, 2017 11:00 AM
Parker Hall 324

Speaker: Hans Werner van Wyk

Title: Numerical Statistics; Statistical Numerics

Abstract: Numerical analysis (the study of algorithms and approximation) and statistics (the study of data) have long been considered distinct fields of research. However, they are increasingly being used in conjunction to solve interesting real-world problems. For example, efficient numerical optimization and linear algebra routines facilitate the statistical analysis and processing of large, complex datasets. Also, the rise of scientific computing has resulted in the prevalent use of numerical simulations to supplement physical observations. This has led to the need to consider the efficiency and accuracy of numerical approximations in the statistical estimation and prediction of these systems. On the other hand, statistics is playing an increasingly important role in the development and analysis of numerical algorithms themselves, such as investigations into the statistical properties of numerical errors, fault detection for computational nodes in a supercomputer, or the use of randomization to speed up and stabilize numerical linear algebra routines. This informal talk explores the fascinating interplay between these two complementary areas of mathematics within the context of a few simple numerical examples.

This is the last Statistics Seminar talk for this semester.

Statistics Seminar
Apr 28, 2017 11:00 AM
Parker Hall 324

CANCELED: NO SEMINAR

Statistics Seminar
Apr 21, 2017 11:00 AM
Parker Hall 324

Speaker: Yujin Chung

Title: Joint distribution of tree shape and tree distance for recombination detection

Abstract: Ancestral recombination events can cause the underlying genealogy of a site to vary along the genome. To simultaneously detect recombination breakpoints in very long sequence alignments and estimate the phylogenetic tree of each block between breakpoints, I consider a Bayesian model using distance between trees in the prior distribution to favor similar trees at neighboring loci. The main hurdle in using such models is the need to calculate the normalizing function of a prior distribution on trees. I will explain how to compute the normalizing function exactly, for a distribution based on the Robinson-Foulds (RF) distance. At the core is the calculation of the joint distribution of the shape of a random tree and its RF distance to a fixed tree. I will also introduce fast approximations to the normalizing function.

Statistics Seminar
Apr 14, 2017 11:00 AM
Parker Hall 324

Speaker: Laurie Stevison

Title: Approaches and challenges in quantifying recombination rate and hybridization along the genome

Statistics Seminar
Apr 07, 2017 11:00 AM
Parker Hall 324

Speaker: Xiaoyu Li

Title: Statistical Analysis with Missing Data: Some Examples

Last Updated: 09/11/2015