Galaxies in our universe form hierarchically, continuously merging with and absorbing smaller galaxies over cosmic time. In this project we aim to identify the most important features of the merger histories of galaxies, and to generate efficient new features from them: namely, features that predict (or, physically speaking, determine) the properties of galaxies, e.g. their shape or color. This will be done using the results of a large cosmological simulation, IllustrisTNG (www.tng-project.org). We will begin by identifying ways to represent the rich information in the merger history. We will then compare various ML methods oriented towards feature selection or importance analysis: random forests or gradient-boosted trees, L1-regularized SVMs, and neural networks (through analysis of e.g. saliency maps). More advanced models can also be applied, such as neural network models designed for feature selection. Finally, we wish to apply or develop methods that can build ‘interpretable’ new features by constructing them as algebraic formulas from the original input features (inspired by e.g. https://science.sciencemag.org/content/324/5923/81). The overarching goal is to better understand what in the merger history is most crucial in determining a galaxy’s present-day properties, an answer that can be widely applicable to problems in galaxy formation.
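
As a rough illustration of the first step (representing a merger history as input features), the sketch below reduces a list of merger events to a few scalar summaries. It is only a sketch under stated assumptions: the `Merger` record, its field names, the 1:4 mass-ratio threshold for a "major" merger, and the example values are all hypothetical, and none of it reflects the actual IllustrisTNG data format or the features the project will ultimately use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Merger:
    """One merger event in a galaxy's history (hypothetical schema)."""
    lookback_gyr: float  # how long ago the merger happened, in Gyr
    mass_ratio: float    # secondary / primary mass, in (0, 1]


def summarize_history(mergers: List[Merger],
                      major_threshold: float = 0.25) -> Dict[str, float]:
    """Collapse a merger history into a fixed-length set of scalar features.

    The 1:4 threshold for a 'major' merger is an assumed convention,
    chosen here only for illustration.
    """
    major = [m for m in mergers if m.mass_ratio >= major_threshold]
    return {
        "n_mergers": float(len(mergers)),
        "n_major": float(len(major)),
        # Most recent major merger; NaN if the galaxy never had one.
        "last_major_lookback_gyr": min(
            (m.lookback_gyr for m in major), default=float("nan")
        ),
        "max_mass_ratio": max((m.mass_ratio for m in mergers), default=0.0),
        "sum_mass_ratio": sum(m.mass_ratio for m in mergers),
    }


# Made-up example history: two minor and two major mergers.
history = [
    Merger(lookback_gyr=10.2, mass_ratio=0.05),
    Merger(lookback_gyr=6.1, mass_ratio=0.40),
    Merger(lookback_gyr=1.3, mass_ratio=0.30),
    Merger(lookback_gyr=0.4, mass_ratio=0.02),
]
features = summarize_history(history)
empty_features = summarize_history([])
```

A fixed-length dictionary like this could be fed to any of the tabular methods listed above (tree ensembles, L1-regularized SVMs), at the cost of discarding the tree structure; richer representations would be part of the project itself.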

Selected candidate(s) will receive a stipend directly from the faculty advisor. The amount is subject to available funding.

Faculty Advisor

  • Professor: Shy Genel
  • Department/School: Columbia Astrophysics Laboratory
  • I study galaxy formation (how galaxies grow and evolve over cosmic time) using large supercomputer simulations that contain both dark matter and normal matter (gas and stars). These simulations generate large data sets that make it possible to follow, visualize, and create mock observations of the evolution of thousands of individual simulated galaxies.

Project Timeline

  • Earliest starting date: 10/15/2019
  • End date: 05/31/2020
  • Number of hours per week of research expected during Fall 2019: N/A

Candidate requirements

  • Skill sets: knowledge of Python and of basic machine learning methods; some knowledge of deep learning methods and of PyTorch or TensorFlow is desirable
  • Student eligibility: freshman, sophomore, junior, senior, master’s
  • International students on F1 or J1 visa: eligible