Cody Mitchell Oral Defense of Dissertation
Jun 29, 2021, 11:00 AM - 12:30 PM
Oral Defense of Doctoral Dissertation
Doctor of Philosophy in Computational Sciences and Informatics
Department of Computational and Data Sciences
College of Science
George Mason University
CODY MITCHELL
Bachelor of Arts in Chemistry, Washington & Jefferson College, 2014
Bachelor of Arts in Economics, Washington & Jefferson College, 2014
Master of Science in Data Analytics Engineering, George Mason University, 2017
Conjugated Learning: Semi-Supervised Learning with Bayesian Inference
June 29, 2021, 11:00 a.m.
https://gmu.zoom.us/j/96242301160?pwd=Yk5JRzU1QndmalpWZ1BlZ0hSUHRLQT09
Meeting ID: 962 4230 1160
Passcode: 316025
All are invited to attend.
Committee
Hamdi Kavak, Chair
Igor Griva
Jason Kinser
Olga Gkountouna
Modern machine learning algorithms require significant quantities of labeled data to train accurate models. Unfortunately, labeled data is often scarce and expensive to obtain. Semi-supervised learning promises to make complex, accurate models accessible for problems that lack substantial labeled data at the outset, without undertaking expensive data-labeling operations. However, current semi-supervised methods are narrow in capability or often produce low-quality results.
To address these challenges and build upon current state-of-the-art solutions, Conjugated Learning is proposed as a flexible and accurate framework for semi-supervised learning. Conjugated Learning creates an ensemble of classifiers, each trained on a different bootstrap sample of the labeled data. The trained classifiers then predict labels for the unlabeled observations. Each unlabeled observation therefore receives many predictions, one from each classifier in the ensemble, and is represented as a distribution of predictions that is reduced to sufficient statistics.
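The ensemble-and-sufficient-statistics step described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the dissertation's implementation: the toy nearest-centroid base learner, the ensemble size of 25, and the one-dimensional data are all assumed purely for the example (the framework itself is classifier-agnostic).

```python
import random

def bootstrap(data):
    """Resample the labeled data with replacement (one bootstrap sample)."""
    return [random.choice(data) for _ in data]

def train_centroid_classifier(sample):
    """Toy 1-D base learner: predict the class whose mean is closest.
    A stand-in for whatever base classifier the ensemble actually uses."""
    means = {}
    for label in {y for _, y in sample}:
        xs = [x for x, y in sample if y == label]
        means[label] = sum(xs) / len(xs)
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

random.seed(0)
labeled = [(0.1, 0), (0.3, 0), (0.8, 1), (0.9, 1)]  # (feature, class)
unlabeled = [0.2, 0.85]

# Ensemble: one classifier per bootstrap sample of the labeled data.
ensemble = [train_centroid_classifier(bootstrap(labeled)) for _ in range(25)]

# Each unlabeled point receives one prediction per classifier; the vote
# count (k positives out of n classifiers) is the sufficient statistic.
stats = [(sum(clf(x) for clf in ensemble), len(ensemble)) for x in unlabeled]
print(stats)
```

The point near the class-0 centroid collects few positive votes while the point near the class-1 centroid collects many, so the pair (k, n) summarizes each observation's prediction distribution.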
This distribution of predictions is modeled with a Beta-Binomial distribution. Using Bayesian conjugate updating rules, the observed predictions update a prior over several epochs, increasing accuracy over time. Initially unlabeled observations whose posteriors satisfy defined thresholds are assigned labels and added to the training data for the classifiers in the next epoch. This process repeats, incorporating the newly labeled data as additional training points in each subsequent epoch.
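The conjugate update itself reduces to simple arithmetic: a Beta(alpha, beta) prior combined with k positive votes out of n ensemble predictions yields a Beta(alpha + k, beta + n - k) posterior. The sketch below uses hypothetical vote counts and an assumed 0.95 labeling threshold; the actual priors and thresholds used in the dissertation are not specified in this abstract.

```python
def beta_binomial_update(alpha, beta, k, n):
    """Conjugate update: Beta(alpha, beta) prior plus k positive votes
    out of n predictions gives a Beta(alpha + k, beta + n - k) posterior."""
    return alpha + k, beta + (n - k)

# Uninformative Beta(1, 1) prior over the positive-class probability.
alpha, beta = 1.0, 1.0

# Illustrative vote counts for one unlabeled observation over three epochs
# (hypothetical numbers, not results from the dissertation).
for k, n in [(24, 25), (25, 25), (25, 25)]:
    alpha, beta = beta_binomial_update(alpha, beta, k, n)

posterior_mean = alpha / (alpha + beta)
print(round(posterior_mean, 3))  # → 0.974

# Pseudo-label once the posterior mean clears a confidence threshold
# (0.95 is an assumed value for illustration).
if posterior_mean >= 0.95:
    label = 1
```

Because the Beta prior is conjugate to the Binomial vote counts, each epoch's update is a closed-form addition rather than a numerical inference step, which is what makes repeating the process over many epochs cheap.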
In an extensive evaluation across 64 different structured datasets, Conjugated Learning outperforms existing state-of-the-art solutions under all low-data scenarios, increasing their accuracy by an average of 8 percentage points. Additionally, this research shows that the technique can be quickly adapted to Learning from Positive and Unlabeled Observations, exceeding state-of-the-art performance by 21 percentage points. Finally, this research concludes by discussing future work that may enable Conjugated Learning to adapt to Active Learning and Multi-view Learning.