Why Data Science and Astrobiology?


Biosignatures Detection in Exoplanet Atmospheric Data, source:

Astrobiology is one of the newest and most interdisciplinary scientific fields of study, and a very exciting one. It researches the potential for life in the Universe, the origins of life, and the potential for life to expand in the Universe. Formerly known as exobiology, the field includes not only biology but also chemistry, physics, geology, astrophysics, social sciences, complexity sciences, information and computer sciences, and many more. It is a truly interdisciplinary, fascinating field of study that emerged from the necessity of answering some of the largest questions in science, from the origin of life to how we define life and how we can detect it in other parts of the Universe.

Astrobiology involves searches and observations beyond our planet, but also in extreme environments on Earth. The searches for data and observations largely fall into two categories: biosignatures and technosignatures.

A biosignature is a substance or phenomenon that is indicative of life, past or present. The term is defined within astrobiology, and it represents evidence of life in the context of the system where it is detected. More specifically, if this system is a planet or a moon, the biosignature is the detection of chemical compounds on the surface or in the atmosphere (if there is one) that are indicative of life on that planet or moon. For example, biosignatures can be chemical compounds (particularly organic compounds), visible macroscopic patterns, or atmospheric gases.

Visualization of Radio Astronomy Data Collection of All Southern Sky from WALLABY ASKAP HI Telescope, Australia, source:

The past couple of decades have seen an explosion in biosignature science, but so far only the Viking mission has been equipped to search for biosignatures in our solar system. Upcoming missions target the moons Europa and Titan for searches. Earth’s own signatures include the atmospheric composition of gases, particularly oxygen and methane in strong thermodynamic disequilibrium; the surface reflectance of vegetation at red wavelengths; and narrow-band, pulse-modulated radio signals. These “signatures” together suggest an inhabited planet, but each signature by itself might be a “false positive” in confirming the presence of life.

Beyond using our understanding of life on Earth as a starting point for biosignature searches on exoplanets, considerable work has gone into understanding “agnostic biosignatures”, or biosignatures that have very little in common with life on Earth. Agnostic biosignatures rest on a broader definition of life, based on processes and activities rather than on specific molecular structures.

On the other hand, technosignatures involve searches for anomalies in astronomy (radio or optical) data. In terms of data and data science approaches, the difference between biosignatures and technosignatures lies in the scarcity versus abundance of data, and the two can be polar opposites. Very few of the exoplanets observed in the Kepler and TESS missions, or of the planets and moons within our solar system, have atmospheres, while radio and optical astronomy data is very abundant: it trails in petabytes the amount of data created by all social media on Earth. Therefore, the data science problems differ in the two contexts. The first can involve simulations and generating new data, or problems of aligning data from very different resources, while the second is about detecting anomalies in vast amounts of data. Both involve machine learning and deep learning, as well as agent-based or other types of simulations. In this blog, we will explore current published research, the techniques and methods used, and coding problems related to this fascinating field.
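
To give a first taste of the anomaly-detection side, here is a minimal sketch in Python. It is not taken from any real technosignature pipeline: the synthetic spectrum, the function names, the injected signal, and the threshold of 6 are all assumptions made for illustration. It flags channels in a toy power spectrum whose power stands far above the noise, using a robust z-score built from the median and the median absolute deviation so that the outlier itself does not distort the noise estimate.

```python
import random
import statistics


def robust_z_scores(values):
    """Return robust z-scores based on the median and the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: nothing can be called an outlier.
        return [0.0 for _ in values]
    # 1.4826 scales the MAD so it matches the standard deviation for Gaussian noise.
    return [(v - med) / (1.4826 * mad) for v in values]


def find_anomalous_channels(spectrum, threshold=6.0):
    """Return indices of channels whose robust z-score exceeds the threshold (hypothetical value)."""
    scores = robust_z_scores(spectrum)
    return [i for i, z in enumerate(scores) if z > threshold]


def make_synthetic_spectrum(n_channels=1024, signal_channel=300, signal_power=12.0, seed=0):
    """Build a toy power spectrum: Gaussian noise plus one injected narrow-band spike."""
    rng = random.Random(seed)
    spectrum = [rng.gauss(1.0, 0.1) for _ in range(n_channels)]
    spectrum[signal_channel] += signal_power  # illustrative injected signal, not real data
    return spectrum


spectrum = make_synthetic_spectrum()
flagged = find_anomalous_channels(spectrum)
print(flagged)
```

Real radio data is far messier than this: terrestrial interference, drifting signals, and non-Gaussian noise all produce false alarms, which is one reason machine learning methods are attractive for this kind of search.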

#biosignatures #technosignatures #ML #simulations