Title: Summarizes the main idea of your project. Deep RNN for Supernovae Classification

Who: Names and logins of all your group members. Katie Li (kli154), AJ Wu (axwu), Arin Idhant (aidhant), Stella Tsogtjargal (stsogtja)

Introduction: What problem are you trying to solve and why? We are solving the problem of supernovae classification, a relevant task in astronomy today, especially since future large, wide-field photometric surveys will produce vast amounts of data. The scale of this data lends itself well to analysis methods that can learn abstract representations of complex data.

We are implementing an existing paper, “Deep Recurrent Neural Networks for Supernovae Classification” by Tom Charnock and Adam Moss, which addresses the same classification task. We chose this paper because many members of our group are interested in astronomy/astrophysics, and it seemed like an interesting application of deep learning.

Type Ia supernovae are rarer than other types of supernovae, occurring roughly once every 500 years in the Milky Way. While other supernovae occur when massive stars collapse under their own weight and explode, type Ia supernovae originate from binary star systems that contain at least one white dwarf. If the white dwarf accretes mass from a binary companion or merges with another white dwarf, its core reaches the ignition temperature for carbon fusion and the star explodes.

Unlike most celestial objects, type Ia supernovae are standard candles – they have a known intrinsic luminosity, from which we can measure their distance from Earth. From this distance, we can determine how long ago the supernova occurred. We can also measure the redshift from the spectrum of the supernova, which tells us how much the Universe has expanded since the explosion. By studying many supernovae at different distances, astronomers can piece together a history of the expansion of the Universe.

Related Work: Are you aware of any, or is there any prior work that you drew on to do your project? Please read and briefly summarize (no more than one paragraph) at least one paper/article/blog relevant to your topic beyond the paper you are re-implementing/novel idea you are researching.

One commonly used dataset of supernovae, the Joint Light-curve Analysis (JLA), contains only 740 supernovae, while the Dark Energy Survey has detected thousands. The astronomy community expects to detect hundreds of thousands of supernovae from future projects like the Vera C. Rubin Observatory (expected to be completed in 2025). Detecting and confirming these supernova types by hand will be impossible at that scale, which is why AI/ML solutions are necessary. A January 2024 paper used an ML algorithm to classify supernovae from five years of Dark Energy Survey data.

In this section, also include URLs to any public implementations you find of the paper you’re trying to implement. Please keep this as a “living list”–if you stumble across a new implementation later down the line, add it to this list.

https://github.com/adammoss/supernovae

Data: What data are you using (if any)? If you’re using a standard dataset (e.g. MNIST), you can just mention that briefly. Otherwise, say something more about where your data come from (especially if there’s anything interesting about how you will gather it).

We will use data from the Supernovae Photometric Classification Challenge (SPCC), a set of 21k simulated light curves. It can be found here: https://www.hep.anl.gov/SNchallenge/

How big is it? Will you need to do significant preprocessing?

The dataset consists of 21,319 simulated supernova light curves. Each supernova sample consists of a time series of flux measurements, with errors, in the g, r, i, z bands (one band per time step), along with the position on the sky and dust extinction. We will need a small amount of preprocessing to extract the g, r, i, z fluxes and errors at each sequential step.
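As a sketch of the preprocessing step, the snippet below parses a simplified, hypothetical SPCC-style record into a time-ordered sequence (the real files contain more columns per OBS line, so the exact field layout here is an assumption for illustration):

```python
# Hypothetical, simplified SPCC-style record: each OBS line gives
# MJD (time), filter band, flux, and flux error.
sample = """\
OBS: 56178.156 g 21.36 3.76
OBS: 56178.172 r 10.27 2.86
OBS: 56178.188 i -1.05 4.51
OBS: 56178.203 z 5.93 6.11
"""

def parse_light_curve(text):
    """Parse OBS lines into a time-ordered sequence of
    (time, band, flux, flux_error) tuples."""
    seq = []
    for line in text.splitlines():
        if line.startswith("OBS:"):
            _, mjd, band, flux, err = line.split()
            seq.append((float(mjd), band, float(flux), float(err)))
    seq.sort(key=lambda obs: obs[0])
    # Shift times so each light curve starts at t = 0,
    # since absolute MJD is not meaningful to the classifier.
    t0 = seq[0][0]
    return [(t - t0, band, f, e) for t, band, f, e in seq]

seq = parse_light_curve(sample)
print(seq[0])
```

Each tuple then becomes one time step of the input sequence fed to the recurrent network.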

Methodology: What is the architecture of your model? How are you training the model?

We will implement several different architectures and compare the results: a vanilla RNN, a GRU, and an LSTM. We’ll also try different numbers of layers and different hyperparameters and compare the results. Given that the dataset is relatively small, we will also try dropout to prevent overfitting.

We may also consider using transformers.
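To make the vanilla RNN variant concrete, here is a minimal numpy sketch of its forward pass: a hidden state updated over the light-curve sequence, followed by a softmax classifier on the final state. The per-step feature count and layer sizes are placeholder assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_forward(x, Wxh, Whh, Why, bh, by):
    """Vanilla RNN: carry the hidden state across the light-curve
    sequence, then classify from the final hidden state."""
    h = np.zeros(Whh.shape[0])
    for x_t in x:                      # one time step per observation
        h = np.tanh(Wxh @ x_t + Whh @ h + bh)
    logits = Why @ h + by
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()             # softmax over supernova classes

# Placeholder sizes: 9 features per step, 16 hidden units, 2 classes
n_in, n_hid, n_cls = 9, 16, 2
Wxh = rng.normal(0, 0.1, (n_hid, n_in))
Whh = rng.normal(0, 0.1, (n_hid, n_hid))
Why = rng.normal(0, 0.1, (n_cls, n_hid))
bh, by = np.zeros(n_hid), np.zeros(n_cls)

seq = rng.normal(size=(12, n_in))      # a fake 12-step light curve
probs = rnn_forward(seq, Wxh, Whh, Why, bh, by)
print(probs)
```

The GRU and LSTM variants replace the single `tanh` update with gated updates, and in practice we would use a framework's built-in recurrent layers rather than hand-rolled numpy.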

If you are implementing an existing paper, detail what you think will be the hardest part about implementing the model here.

We think the hardest part of implementing the model will be 1) preventing overfitting, 2) understanding and preprocessing the data, 3) interpreting the results, and 4) figuring out the syntax to do what we want.

Metrics: What constitutes “success?”

Like the paper, we plan on using accuracy and the confusion matrix. The confusion matrix splits predictions into true positives, false positives, false negatives, and true negatives. We also plan on calculating the area under the curve (AUC); a perfect classifier would have an AUC of 1 while a random classifier would have an AUC of 0.5. We plan on conducting 5 randomized runs of 200 epochs each.
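As a sketch, both metrics can be computed directly from predictions; the labels and scores below are made-up toy values for illustration (in practice we would likely use a library such as scikit-learn):

```python
import numpy as np

def confusion_matrix(y_true, y_pred):
    # 2x2 matrix: rows = actual class (0 = non-Ia, 1 = Ia), cols = predicted
    m = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m

def auc(y_true, scores):
    # AUC = probability that a random positive example is scored
    # higher than a random negative example (ties count half)
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 0, 1, 1, 0, 0]            # toy ground-truth labels
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1]  # toy model scores for class Ia
y_pred = [int(s >= 0.5) for s in scores]

cm = confusion_matrix(y_true, y_pred)
print(cm)                   # [[TN, FP], [FN, TP]]
print(auc(y_true, scores))  # 8/9 for these toy values
```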

The goal of the paper was to determine the supernovae type in the test set. They considered two problems: (1) categorizing two classes (type Ia versus non-type Ia), and (2) categorizing three classes (supernovae types 1, 2, and 3). We will follow the same approach. Our base goal is to reach 80% accuracy and AUC for categorizing type Ia vs non-type Ia. Our target goal is to reach 80% accuracy and AUC for both that and the three-class problem. Our stretch goal is to reach 90%+ accuracy and AUC for both problems.

Ethics: Choose 2 of the following bullet points to discuss; not all questions will be relevant to all projects so try to pick questions where there’s interesting engagement with your project. Remember that there’s not necessarily an ethical/unethical binary; rather, we want to encourage you to think critically about your problem setup.

Why is Deep Learning a good approach to this problem?

Given the amount of data available from the SPCC, it’s useful to train a model that can infer the type of a supernova from relatively little, easily accessible data (in this case, the light curve). If successful, this model will dramatically decrease the amount of work and data necessary to classify supernovae.

What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?

Since our dataset is simulated, there are concerns about how we can interpret our results due to biases and inconsistencies introduced by the data-generation process. Inaccurate labels, for example, would have extremely negative implications for our model. That said, the simulated data is highly credible and has been vetted and widely used by astronomers.

Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

Major stakeholders are mainly scientists interested in classifying supernovae for research purposes; an incorrect classification could then potentially result in incorrect scientific extrapolations from output data (e.g., about the universe’s chemical makeup). As a result, it’s important to not rely entirely on the model’s output, even if the accuracy is fairly high.

How are you planning to quantify or measure error or success? What implications does your quantification have?

We are most interested in comparing accuracy between different types of architectures. However, this accuracy is dependent on the data, which might have inconsistencies/biases.

Division of labor: Briefly outline who will be responsible for which part(s) of the project.

Preprocessing - everyone
Training (writing the code and having it work):
- Vanilla RNN - everyone
- LSTM - AJ, Katie
- GRU - Stella, Arin
- Transformer -
Testing (same architectures; gauging accuracy and changing hyperparameters, number of layers, etc.) - leads: Katie and AJ
Summarizing results - leads: Stella and Arin
Poster - leads: Stella and Arin
