
Mason CSI Grad Student Wins Presentation Competition at 2021 Joint Statistical Meetings

tjones

Tommy Jones, a PhD candidate in George Mason’s CSI program, was named one of two winners of the 2021 Text Analysis Interest Group (TAIG) presentation competition for a talk he gave at the 2021 Joint Statistical Meetings, the largest gathering of statisticians and data scientists held in North America. The conference, held August 8 through 12, draws attendees from 12 national and international statistical groups. The award consists of $500, coverage in AMSTATNews (01_JANUARY_2022_web.pdf, amstat.org), and a speaker slot at a future session of Data Science DC. Jones is also a senior member of the technology staff at In-Q-Tel and vice president of Data Community DC.

“The conference was virtual,” Jones noted, “. . . but it was still possible to make connections. I reached out to several speakers whose work I liked and received emails from some of the audience for my presentation. We connected over VTC links between sessions. So, it is still possible to meet and talk with people, even in a virtual conference.

“My research is developing some statistical theory for analyzing language, focused on Latent Dirichlet Allocation (LDA). Statistical theory for language, which I call ‘corpus statistics,’ can allow us to measure linguistic phenomena with the same rigor that we use to measure, for example, economic phenomena like unemployment and prices. Then, businesses, researchers, and policymakers can incorporate cultural zeitgeist into their analyses in a principled way. LDA is a great model for corpus statistics because it's a Bayesian probability model, allowing us to leverage existing best practices and embedding language into a probability space where relationships between points are interpretable and well-defined.”

Jones has also thought a lot about the skills associated with presenting technical topics. “I've developed some principles to help myself,” he explains. “First, focus on telling a story, not on conveying every detail of the research. I believe the goal of a presentation is to inspire the audience to want to read the paper, not to replace the paper itself. Second, be judicious in the use of math in the presentation. Math can be hard for the audience to follow in real time unless they are very familiar with your model. Finally, practice, practice, practice. Rehearse the presentation many times and time yourself. That way, by the time you present, you sound like you are speaking naturally and maintain a good pace to keep the audience engaged.”


Software implementing some of Tommy’s work:  https://CRAN.R-project.org/package=tidylda