Statistics and Data Science Seminar

Department of Mathematics and Statistics




The Statistics & Data Science Seminar is hosted by the Department of Mathematics and Statistics and provides a weekly platform for academics and researchers from different domains to present and discuss problems and solutions regarding data collection, management and analysis.

Spring 2026 Seminars

Welcome to the Spring 2026 Seminar series! The seminar takes place on Wednesdays at 2 p.m. CT in Parker Hall 358. Seminars are either hybrid (in person and over Zoom) or virtual only (over Zoom). For any questions or requests, please contact Huan He or Haotian Xu. The speakers for this series are listed in the table below, followed by the title and abstract of each talk.


Speaker            Institution             Date       Format
                                           Feb. 4
Sayar Karmakar     U of Florida            Feb. 11    In-person
                                           Feb. 18
Jiajin Sun         Florida State           Feb. 25    In-person
Shuoyang Wang      U of Louisville         Mar. 4     In-person
NA                 NA                      Mar. 11    NA
Florian Gunsilius  Emory                   Mar. 18    In-person
Yan Li             Auburn                  Mar. 25    In-person
Rich Lehoucq       Sandia National Labs    Apr. 1     In-person
Mine Dogucu        UC Irvine               Apr. 8
                                           Apr. 15
Shivam Kumar       U Chicago               Apr. 22    In-person

 

Sayar Karmakar (U of Florida)

Title: Epidemic Changepoints: Applications in spatial anomaly detection and localizing LLM watermarks

 

Abstract: We present epidemic change-points as a unifying lens for two localization problems: (i) detecting spatial anomalies and (ii) segmenting watermarked regions in mixed-source text. For spatial data, we formalize a 'spatial' change-point as an anomalous region (an epidemic in space), provide detection-accuracy results for single and multiple breaks, and propose a block-based scan that delivers substantial computational savings with guarantees. Next, we move to a seemingly unrelated but very pertinent topic.


As large language models proliferate, ensuring content provenance has become a statistical challenge. For the problem of finding localized segments of modified text, we introduce WISER, a fast epidemic-segmentation approach with finite-sample error bounds and consistency for multiple watermarked segments, and we demonstrate empirical gains over state-of-the-art baselines on benchmark datasets.


We emphasize how classical changepoint ideas tailored to epidemic and transient departures yield principled, scalable solutions to modern problems in text provenance and spatial anomaly detection. Simulations and empirical studies corroborate the theory and point to open questions for PhD-level research.


Joint work with Soham Bonnerjee and Subhrajyoty Roy (watermarks) and with Soham Bonnerjee and George Michailidis (spatial anomaly).

Jiajin Sun (Florida State)

Title: Efficient Analysis of Latent Spaces in Heterogeneous Networks

 

Abstract: This work proposes a unified framework for efficient estimation under latent space modeling of heterogeneous networks. We consider a class of latent space models that decompose latent vectors into shared and network-specific components across networks. We develop a novel procedure that first identifies the shared latent vectors and further refines estimates through efficient score equations to achieve statistical efficiency. Oracle error rates for estimating the shared and heterogeneous latent vectors are established simultaneously. The analysis framework offers remarkable flexibility, accommodating various types of edge weights under general distributions.

Shuoyang Wang (U of Louisville)

Title: Deep Learning for Complex Functional Data Analysis

Abstract: Functional data are realizations of random functions observed over a continuum, such as signals and images. In many modern applications, including neuroscience and biomedical research, observations are more naturally represented as random functions rather than finite-dimensional vectors. The intrinsic complexity of such data stems from high-dimensional functional domains, cross-cohort heterogeneity, and unknown data-generating distributions, which together complicate principled modeling and performance guarantees. Although deep learning has shown strong empirical performance in biomedical studies, its methodological and theoretical foundations for complex functional data settings remain limited. In this talk, I will present two methodological contributions that develop principled deep learning frameworks for complex functional data.

First, I will introduce a federated deep learning approach for functional data classification across multiple heterogeneous cohorts. The learner visits each cohort once, performs local updates, and transmits only compressed model weights, thereby preserving privacy and reducing communication and computational costs. To address cross-cohort heterogeneity, we develop an adaptive sequential weight updating strategy that progressively corrects distributional shifts and improves performance on a target cohort. We establish minimax optimal excess risk bounds and characterize a sharp sampling threshold governing learnability under both densely and sparsely observed functional data.

Second, I will present a deep-learning-based functional graphical modeling framework for learning conditional independence structures in multivariate functional data. Each node's neighborhood is estimated via flexible functional regression with embedded feature selection, allowing a fully nonparametric specification, and the overall graph is recovered by aggregating the neighborhood estimates. The method avoids restrictive distributional assumptions and does not rely on a well-defined functional precision operator. We prove global model selection consistency and establish convergence rates that attain the classical nonparametric regression rate up to a logarithmic factor, with a fundamental sampling threshold determining the estimator's convergence behavior. Empirical performance is demonstrated through simulations and real data applications, including analyses of the ADNI dataset and the ADHD-200 Consortium.

Florian Gunsilius (Emory)

Title: 

Abstract: 

Yan Li (Auburn)

Title: 

Abstract: 

Rich Lehoucq (Sandia National Labs)

Title: Poisson tensor completion density estimator

Abstract: 

Mine Dogucu (UC Irvine)

Title: 

Abstract: 

Shivam Kumar (U Chicago)

Title: 

Abstract: