Hands-on 05: Time series data and RNNs: Identifying cosmic rays in radio signals#

Credit: Adapted from deeplearningphysics.org

Large arrays of radio antennas can be used to measure cosmic rays by recording the electromagnetic radiation generated in the atmosphere. These radio signals are strongly contaminated by galactic noise as well as by signals of human origin. Because the cosmic-ray signals look similar to this background, detecting cosmic-ray events is challenging.

Identification of signals#

In this exercise, we design an RNN to classify whether a recorded radio signal contains a cosmic-ray event or only noise.

The signal-to-noise ratio (SNR) of a measured trace \(S(t)\) is defined as follows:

\[\mathrm{SNR}=\frac{S^{\mathrm{signal}}(t)_\mathrm{max}}{\mathrm{RMS}[S(t)]},\]

where \(S^{\mathrm{signal}}(t)_\mathrm{max}\) denotes the maximum amplitude of the (true) signal.

Typical cosmic-ray observatories enable a precise reconstruction at an SNR of roughly 3.
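As a concrete illustration of this definition, the SNR of a single simulated event could be computed as follows (a minimal sketch; trace and signal are assumed to be 1D NumPy arrays holding the measured trace and the true signal):

import numpy as np

def snr(trace, signal):
    # maximum amplitude of the true signal divided by the RMS of the measured trace
    return np.abs(signal).max() / np.sqrt(np.mean(trace**2))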

We choose a challenging setup in this task and try to identify cosmic-ray events in signal traces with an SNR of 2.
Training RNNs can be computationally demanding; we therefore recommend using a GPU for this task.
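Whether TensorFlow actually sees a GPU can be checked quickly (a small sanity check, not required for the rest of the notebook):

import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))  # an empty list means CPU only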

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

Load and prepare dataset#

In this task, we use a simulation of cosmic-ray-induced air showers that are measured by radio antennas.
For more information, see arXiv:1901.04079. The task is to design an RNN that can identify whether a measured signal trace (shortened to 500 time steps) contains a cosmic-ray signal or not.

import os
import gdown

url = "https://drive.google.com/u/0/uc?export=download&confirm=HgGH&id=1R-qfxO1jVh88TC9Gnm9JGMomSRg0Zpkx"
output = "data/radio_data.npz"

os.makedirs(os.path.dirname(output), exist_ok=True)  # make sure the data/ directory exists
if not os.path.exists(output):
    gdown.download(url, output, quiet=True)

# there are 50k traces total, but let's just look at the first 5k
f = np.load(output)
n_train = 4000
n_test = 1000

x_train, x_test = f["traces"][:n_train], f["traces"][n_train : n_train + n_test]  # measured traces (signal + colored noise)
signal_train, signal_test = (
    f["signals"][:n_train],
    f["signals"][n_train : n_train + n_test],
)  # signal part (only available for cosmic-ray events)

# define training label (1=cosmic event, 0=noise)
y_train = (signal_train.std(axis=-1) != 0).astype(float)
y_test = (signal_test.std(axis=-1) != 0).astype(float)
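A quick sanity check (not part of the original notebook) of the array shapes and the class balance of the labels just defined:

print("train/test traces:", x_train.shape, x_test.shape)
print("signal fraction (train): %.2f" % y_train.mean())
print("signal fraction (test):  %.2f" % y_test.mean())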

Plot example signal traces#

Left: signal trace containing a cosmic-ray event. The underlying cosmic-ray signal is shown in red, the background plus signal is shown in blue. Right: background noise.


fs = 180e6  # Sampling frequency of antenna setup 180 MHz
t = np.arange(500) / fs * 1e6

plt.figure(1, (12, 4))

plt.subplot(1, 2, 1)
plt.plot(t, np.real(x_train[y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.plot(t, np.real(signal_train[y_train.astype(bool)][0]), linewidth=1, color="r", label="CR signal")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Cosmic-ray event")
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(t, np.real(x_train[~y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Noise event")
plt.grid(True)

plt.tight_layout()
# normalize all traces with the standard deviation of the training set;
# the same factor is applied to the test set to avoid information leakage
sigma = x_train.std()
x_train /= sigma
x_test /= sigma

Define RNN model#

In the following, we design an RNN-based classifier to identify cosmic-ray events.

from tensorflow.keras import layers

model = keras.models.Sequential(name="sequential_1")
# stack of LSTM layers; return_sequences=True keeps the output for every time step
model.add(layers.LSTM(32, return_sequences=True, input_shape=(500, 1), name="lstm_1"))
model.add(layers.LSTM(64, return_sequences=True, name="lstm_2"))
model.add(layers.LSTM(10, return_sequences=True, name="lstm_3"))
# flatten the (500, 10) sequence output and classify with a single sigmoid unit
model.add(layers.Flatten(name="flatten_1"))
model.add(layers.Dropout(0.3, name="dropout_1"))
model.add(layers.Dense(1, activation="sigmoid", name="dense_1"))

model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_1 (LSTM)               (None, 500, 32)           4352      
                                                                 
 lstm_2 (LSTM)               (None, 500, 64)           24832     
                                                                 
 lstm_3 (LSTM)               (None, 500, 10)           3000      
                                                                 
 flatten_1 (Flatten)         (None, 5000)              0         
                                                                 
 dropout_1 (Dropout)         (None, 5000)              0         
                                                                 
 dense_1 (Dense)             (None, 1)                 5001      
                                                                 
=================================================================
Total params: 37185 (145.25 KB)
Trainable params: 37185 (145.25 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
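As a cross-check of the summary above: an LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias, giving 4 * n * (i + n + 1) parameters for i input features and n units (a small illustrative helper, not part of the original notebook):

def lstm_params(i, n):
    # 4 gates, each with an (i x n) input kernel, an (n x n) recurrent kernel, and n biases
    return 4 * n * (i + n + 1)

print(lstm_params(1, 32), lstm_params(32, 64), lstm_params(64, 10))  # 4352 24832 3000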

Pre-processing of data and RNN training#

# decay the learning rate continuously: lr = 1e-3 * 0.8^(step / 100)
lr_schedule = keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=1e-3, decay_steps=100, decay_rate=0.8)

model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=lr_schedule), metrics=["accuracy"])


results = model.fit(
    x_train[..., np.newaxis],
    y_train,
    batch_size=2048,
    epochs=20,
    verbose=0,
    validation_split=0.1,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5, verbose=0, min_lr=1e-5),
        keras.callbacks.EarlyStopping(patience=15, verbose=0),
    ],
)
y_pred = model.predict(x_test[..., np.newaxis], verbose=0)

Plot loss and accuracy#

from plotting import plot_model_history

plot_model_history(results)
from plotting import make_roc
from sklearn.metrics import accuracy_score

make_roc(y_test, y_pred, ["CR signal"])
accuracy_score(y_test, y_pred > 0.5)
0.714
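As a complement to the ROC plot, the area under the curve can be computed directly (a sketch using scikit-learn's roc_auc_score):

from sklearn.metrics import roc_auc_score

print("ROC AUC: %.3f" % roc_auc_score(y_test, y_pred.ravel()))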

Exercises#

  1. Replace the LSTM layers with bidirectional LSTM layers: layers.Bidirectional(layers.LSTM(...)). How does the number of parameters of the model change? How does the performance change? (A minimal sketch of this replacement is given after this list.)

  2. Replace the LSTM layers with GRU layers. How does the number of parameters of the model change? How does the performance change?

  3. Replace the LSTM layers with 1D convolutional layers (layers.Conv1D(...)) using the following parameters. How does the number of parameters of the model change? How does the performance change?

    • filters=32, kernel_size=5, strides=1, padding="same", activation="relu"

    • filters=64, kernel_size=5, strides=2, padding="same", activation="relu"

    • filters=10, kernel_size=5, strides=2, padding="same", activation="relu"

  4. How can you modify the architecture and data to perform signal trace extraction?

  5. (BONUS) Implement signal trace extraction.
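For exercise 1, one possible starting point is sketched below (an illustrative sketch that keeps the layer sizes of the model above, not a reference solution). layers.Bidirectional runs a second LSTM backward over the sequence and concatenates both outputs, so the parameter count grows; quantifying by how much is part of the exercise.

# sketch for exercise 1: bidirectional variant of the model above
model_bi = keras.models.Sequential(name="sequential_bidirectional")
model_bi.add(layers.Bidirectional(layers.LSTM(32, return_sequences=True), input_shape=(500, 1)))
model_bi.add(layers.Bidirectional(layers.LSTM(64, return_sequences=True)))
model_bi.add(layers.Bidirectional(layers.LSTM(10, return_sequences=True)))
model_bi.add(layers.Flatten())
model_bi.add(layers.Dropout(0.3))
model_bi.add(layers.Dense(1, activation="sigmoid"))
model_bi.summary()

The compile and fit calls from above can be reused unchanged.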