Hands-on 05: Time series data and RNNs: Identifying cosmic rays in radio signals#
Credit: Adapted from deeplearningphysics.org
Large arrays of radio antennas can be used to measure cosmic rays by recording the electromagnetic radiation generated in the atmosphere. These radio signals are strongly contaminated by galactic noise as well as signals from human origin. Because these signals appear to be similar to the background, the discovery of cosmic-ray events can be challenging.
Identification of signals#
In this exercise, we design an RNN to classify if the recorded radio signals contain a cosmic-ray event or only noise.
The signal-to-noise ratio (SNR) of a measured trace \(S(t)\) is defined as follows:
where \(S^{\mathrm{signal}}(t)_\mathrm{max}\) denotes the maximum amplitude of the (true) signal.
Typical cosmic-ray observatories enable a precise reconstruction at an SNR of roughly 3.
We choose a challenging setup in this task and try to identify cosmic-ray events in signal traces with an SNR of 2.
Training RNNs can be computationally demanding, thus, we recommend to use a GPU for this task.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
2023-04-07 02:37:27.286814: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Load and prepare dataset#
In this task, we use a simulation of cosmic-ray-induced air showers that are measured by radio antennas.
For more information, see arXiv:1901.04079.
The task is to design an RNN which is able to identify if the measured signal traces (shortened to 500 time steps) contains a signal or not.
import os
import gdown
url = "https://drive.google.com/u/0/uc?export=download&confirm=HgGH&id=1R-qfxO1jVh88TC9Gnm9JGMomSRg0Zpkx"
output = "data/radio_data.npz"
if not os.path.exists(output):
gdown.download(url, output, quiet=True)
# there are 50k traces total, but let's just look at the first 5k
f = np.load(output)
n_train = 4000
n_test = 1000
x_train, x_test = f["traces"][:n_train], f["traces"][n_train : n_train + n_test] # measured traces (signal + colored noise)
signal_train, signal_test = (
f["signals"][:n_train],
f["signals"][n_train : n_train + n_test],
) # signal part (only available for cosmic-ray events)
# define training label (1=cosmic event, 0=noise)
y_train = (signal_train.std(axis=-1) != 0).astype(float)
y_test = (signal_test.std(axis=-1) != 0).astype(float)
Plot example signal traces#
Left: signal trace containing a cosmic-ray event. The underlying cosmic-ray signal is shown in red, the backgrounds + signal is shown in blue. Right: background noise.
from matplotlib import pyplot as plt
fs = 180e6 # Sampling frequency of antenna setup 180 MHz
t = np.arange(500) / fs * 1e6
plt.figure(1, (12, 4))
plt.subplot(1, 2, 1)
plt.plot(t, np.real(x_train[y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.plot(t, np.real(signal_train[y_train.astype(bool)][0]), linewidth=1, color="r", label="CR signal")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Cosmic-ray event")
plt.grid(True)
plt.subplot(1, 2, 2)
plt.plot(t, np.real(x_train[~y_train.astype(bool)][0]), linewidth=1, color="b", label="Measured trace")
plt.ylabel("Amplitude [mV]")
plt.xlabel("Time [$\mu$s]")
plt.legend()
plt.title("Noise event")
plt.grid(True)
plt.tight_layout()

sigma = x_train.std()
x_train /= sigma
x_test /= sigma
Define RNN model#
In the following, design a cosmic-ray model to identify cosmic-ray events using an RNN-based classifier.
from tensorflow.keras import layers
model = keras.models.Sequential(name="sequential_1")
model.add(layers.LSTM(32, return_sequences=True, input_shape=(500, 1), name="lstm_1"))
model.add(layers.LSTM(64, return_sequences=True, name="lstm_2"))
model.add(layers.LSTM(10, return_sequences=True, name="lstm_3"))
model.add(layers.Flatten(name="flatten_1"))
model.add(layers.Dropout(0.3, name="dropout_1"))
model.add(layers.Dense(1, activation="sigmoid", name="dense_1"))
model.summary()
2023-04-07 02:37:33.640410: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 500, 32) 4352
lstm_2 (LSTM) (None, 500, 64) 24832
lstm_3 (LSTM) (None, 500, 10) 3000
flatten_1 (Flatten) (None, 5000) 0
dropout_1 (Dropout) (None, 5000) 0
dense_1 (Dense) (None, 1) 5001
=================================================================
Total params: 37,185
Trainable params: 37,185
Non-trainable params: 0
_________________________________________________________________
Pre-processing of data and RNN training#
lr_schedule = keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=1e-3, decay_steps=100, decay_rate=0.8)
model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=lr_schedule), metrics=["accuracy"])
results = model.fit(
x_train[..., np.newaxis],
y_train,
batch_size=2048,
epochs=20,
verbose=0,
validation_split=0.1,
callbacks=[
keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5, verbose=0, min_lr=1e-5),
keras.callbacks.EarlyStopping(patience=15, verbose=0),
],
)
2023-04-07 02:37:43.449723: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 262144000 exceeds 10% of free system memory.
2023-04-07 02:37:43.626660: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 262144000 exceeds 10% of free system memory.
2023-04-07 02:37:43.946381: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 262144000 exceeds 10% of free system memory.
2023-04-07 02:37:45.634844: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 262144000 exceeds 10% of free system memory.
2023-04-07 02:37:45.802368: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 262144000 exceeds 10% of free system memory.
y_pred = model.predict(x_test[..., np.newaxis], verbose=0)
Plot loss and accuracy#
from plotting import plot_model_history
plot_model_history(results)

from plotting import make_roc
from sklearn.metrics import accuracy_score
make_roc(y_test, y_pred, ["CR signal"])
accuracy_score(y_test, y_pred > 0.5)
0.738

Exercises#
Replace the LSTM layers with bidirectional LSTM layers:
layers.Bidirectional(layers.LSTM(...))
. How does the number of parameters of the model change? How does the performance change?Replace the LSTM layers with GRU layers. How does the number of parameters of the model change? How does the performance change?
Replace the LSTM layers with 1D convolutional layers
layers.Conv1D(...)
with the following parameters. How does the number of parameters of the model change? How does the performance change?filters=32, kernel_size=5, strides=1, padding="same", activation="relu"
filters=64, kernel_size=5, strides=2, padding="same", activation="relu"
filters=10, kernel_size=5, strides=2, padding="same", activation-"relu"
How can you modify the architecture and data to perform signal trace extraction?
(BONUS) Implement signal trace extraction.