Hands-on 05: Time series data and RNNs: Identifying cosmic rays in radio signals#

Credit: Adapted from deeplearningphysics.org

Large arrays of radio antennas can be used to measure cosmic rays by recording the electromagnetic radiation generated in the atmosphere. These radio signals are strongly contaminated by galactic noise as well as by signals of human origin. Because the cosmic-ray signals look similar to this background, detecting cosmic-ray events is challenging.

Identification of signals#

In this exercise, we design an RNN to classify whether a recorded radio signal contains a cosmic-ray event or only noise.

The signal-to-noise ratio (SNR) of a measured trace \(S(t)\) is defined as follows:

\[\mathrm{SNR}=\frac{S^{\mathrm{signal}}(t)_\mathrm{max}}{\mathrm{RMS}[S(t)]},\]

where \(S^{\mathrm{signal}}(t)_\mathrm{max}\) denotes the maximum amplitude of the (true) signal.

Typical cosmic-ray observatories enable a precise reconstruction at an SNR of roughly 3.
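As a concrete illustration of this definition, the SNR of a single simulated event could be computed as follows (a minimal sketch; trace and signal are assumed to be 1D NumPy arrays holding the measured trace and the true signal):

import numpy as np

def snr(trace, signal):
    # maximum amplitude of the true signal divided by the RMS of the measured trace
    return np.abs(signal).max() / np.sqrt(np.mean(trace**2))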

We choose a challenging setup in this task and try to identify cosmic-ray events in signal traces with an SNR of 2.
Training RNNs can be computationally demanding; we therefore recommend using a GPU for this task.
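Whether TensorFlow actually sees a GPU can be checked quickly (a small sanity check, not required for the rest of the notebook):

import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))  # an empty list means CPU only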

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

Load and prepare dataset#

In this task, we use a simulation of cosmic-ray-induced air showers that are measured by radio antennas.
For more information, see arXiv:1901.04079. The task is to design an RNN that can identify whether a measured signal trace (shortened to 500 time steps) contains a cosmic-ray signal or not.

import os
import gdown

url = "https://drive.google.com/u/0/uc?export=download&confirm=HgGH&id=1R-qfxO1jVh88TC9Gnm9JGMomSRg0Zpkx"
output = "data/radio_data.npz"

os.makedirs(os.path.dirname(output), exist_ok=True)  # make sure the data/ directory exists
if not os.path.exists(output):
    gdown.download(url, output, quiet=True)

# there are 50k traces total, but let's just look at the first 5k
f = np.load(output)
n_train = 4000
n_test = 1000

x_train, x_test = f["traces"][:n_train], f["traces"][n_train : n_train + n_test]  # measured traces (signal + colored noise)
signal_train, signal_test = (
    f["signals"][:n_train],
    f["signals"][n_train : n_train + n_test],
)  # signal part (only available for cosmic-ray events)

# define training label (1=cosmic event, 0=noise)
y_train = (signal_train.std(axis=-1) != 0).astype(float)
y_test = (signal_test.std(axis=-1) != 0).astype(float)
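A quick sanity check (not part of the original notebook) of the array shapes and the class balance of the labels just defined:

print("train/test traces:", x_train.shape, x_test.shape)
print("signal fraction (train): %.2f" % y_train.mean())
print("signal fraction (test):  %.2f" % y_test.mean())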

Plot example signal traces#

Left: signal trace containing a cosmic-ray event. The underlying cosmic-ray signal is shown in red, the background plus signal is shown in blue. Right: background noise.


fs = 180e6  # Sampling frequency of antenna setup 180 MHz
t = np.arange(500) / fs * 1e6

plt.figure(1, (12, 4))

plt.subplot(1, 2, 1)
plt.plot(t, np.real(x_train[y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.plot(t, np.real(signal_train[y_train.astype(bool)][0]), linewidth=1, color="r", label="CR signal")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Cosmic-ray event")
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(t, np.real(x_train[~y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Noise event")
plt.grid(True)

plt.tight_layout()
# normalize all traces with the standard deviation of the training set;
# the same factor is applied to the test set to avoid information leakage
sigma = x_train.std()
x_train /= sigma
x_test /= sigma

Define RNN model#

In the following, we design an RNN-based classifier to identify cosmic-ray events.

from tensorflow.keras import layers

model = keras.models.Sequential(name="sequential_1")
# stack of LSTM layers; return_sequences=True keeps the output for every time step
model.add(layers.LSTM(32, return_sequences=True, input_shape=(500, 1), name="lstm_1"))
model.add(layers.LSTM(64, return_sequences=True, name="lstm_2"))
model.add(layers.LSTM(10, return_sequences=True, name="lstm_3"))
# flatten the (500, 10) sequence output and classify with a single sigmoid unit
model.add(layers.Flatten(name="flatten_1"))
model.add(layers.Dropout(0.3, name="dropout_1"))
model.add(layers.Dense(1, activation="sigmoid", name="dense_1"))

model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_1 (LSTM)               (None, 500, 32)           4352      
                                                                 
 lstm_2 (LSTM)               (None, 500, 64)           24832     
                                                                 
 lstm_3 (LSTM)               (None, 500, 10)           3000      
                                                                 
 flatten_1 (Flatten)         (None, 5000)              0         
                                                                 
 dropout_1 (Dropout)         (None, 5000)              0         
                                                                 
 dense_1 (Dense)             (None, 1)                 5001      
                                                                 
=================================================================
Total params: 37185 (145.25 KB)
Trainable params: 37185 (145.25 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
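As a cross-check of the summary above: an LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias, giving 4 * n * (i + n + 1) parameters for i input features and n units (a small illustrative helper, not part of the original notebook):

def lstm_params(i, n):
    # 4 gates, each with an (i x n) input kernel, an (n x n) recurrent kernel, and n biases
    return 4 * n * (i + n + 1)

print(lstm_params(1, 32), lstm_params(32, 64), lstm_params(64, 10))  # 4352 24832 3000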

Pre-processing of data and RNN training#

# decay the learning rate continuously: lr = 1e-3 * 0.8^(step / 100)
lr_schedule = keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=1e-3, decay_steps=100, decay_rate=0.8)

model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=lr_schedule), metrics=["accuracy"])


results = model.fit(
    x_train[..., np.newaxis],
    y_train,
    batch_size=2048,
    epochs=20,
    verbose=0,
    validation_split=0.1,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5, verbose=0, min_lr=1e-5),
        keras.callbacks.EarlyStopping(patience=15, verbose=0),
    ],
)
y_pred = model.predict(x_test[..., np.newaxis], verbose=0)

Plot loss and accuracy#

from plotting import plot_model_history

plot_model_history(results)
from plotting import make_roc
from sklearn.metrics import accuracy_score

make_roc(y_test, y_pred, ["CR signal"])
accuracy_score(y_test, y_pred > 0.5)
0.714
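As a complement to the ROC plot, the area under the curve can be computed directly (a sketch using scikit-learn's roc_auc_score):

from sklearn.metrics import roc_auc_score

print("ROC AUC: %.3f" % roc_auc_score(y_test, y_pred.ravel()))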

Exercises#

  1. Replace the LSTM layers with bidirectional LSTM layers: layers.Bidirectional(layers.LSTM(...)). How does the number of parameters of the model change? How does the performance change? (A minimal sketch of this replacement is given after this list.)

  2. Replace the LSTM layers with GRU layers. How does the number of parameters of the model change? How does the performance change?

  3. Replace the LSTM layers with 1D convolutional layers (layers.Conv1D(...)) using the following parameters. How does the number of parameters of the model change? How does the performance change?

    • filters=32, kernel_size=5, strides=1, padding="same", activation="relu"

    • filters=64, kernel_size=5, strides=2, padding="same", activation="relu"

    • filters=10, kernel_size=5, strides=2, padding="same", activation="relu"

  4. How can you modify the architecture and data to perform signal trace extraction?

  5. (BONUS) Implement signal trace extraction.
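For exercise 1, one possible starting point is sketched below (an illustrative sketch that keeps the layer sizes of the model above, not a reference solution). layers.Bidirectional runs a second LSTM backward over the sequence and concatenates both outputs, so the parameter count grows; quantifying by how much is part of the exercise.

# sketch for exercise 1: bidirectional variant of the model above
model_bi = keras.models.Sequential(name="sequential_bidirectional")
model_bi.add(layers.Bidirectional(layers.LSTM(32, return_sequences=True), input_shape=(500, 1)))
model_bi.add(layers.Bidirectional(layers.LSTM(64, return_sequences=True)))
model_bi.add(layers.Bidirectional(layers.LSTM(10, return_sequences=True)))
model_bi.add(layers.Flatten())
model_bi.add(layers.Dropout(0.3))
model_bi.add(layers.Dense(1, activation="sigmoid"))
model_bi.summary()

The compile and fit calls from above can be reused unchanged.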