Interpreting convolutional neural network decision for earthquake detection with feature map visualization, backward optimization and layer-wise relevance propagation methods - Archive ouverte HAL Access content directly
Journal Articles Geophysical Journal International Year : 2023

Interpreting convolutional neural network decision for earthquake detection with feature map visualization, backward optimization and layer-wise relevance propagation methods

(1) , (2) , (1)
1
2

Abstract

SUMMARY In the recent years, the seismological community has adopted deep learning (DL) models for many diverse tasks such as discrimination and classification of seismic events, identification of P- and S-phase wave arrivals or earthquake early warning systems. Numerous models recently developed are showing high accuracy values, and it has been attested for several tasks that DL models perform better than the classical seismological state-of-art models. However, their performances strongly depend on the DL architecture, the training hyperparameters, and the training data sets. Moreover, due to their complex nature, we are unable to understand how the model is learning and therefore how it is making a prediction. Thus, DL models are usually referred to as a ‘black-box’. In this study, we propose to apply three complementary techniques to address the interpretability of a convolutional neural network (CNN) model for the earthquake detection. The implemented techniques are: feature map visualization, backward optimization and layer-wise relevance propagation. Since our model reaches a good accuracy performance (97%), we can suppose that the CNN detector model extracts relevant characteristics from the data, however a question remains: can we identify these characteristics? The proposed techniques help to answer the following questions: How is an earthquake processed by a CNN model? What is the optimal earthquake signal according to a CNN? Which parts of the earthquake signal are more relevant for the model to correctly classify an earthquake sample? The answer to these questions help understand why the model works and where it might fail, and whether the model is designed well for the predefined task. The CNN used in this study had been trained for single-station detection, where an input sample is a 25 s three-component waveform. The model outputs a binary target: earthquake (positive) or noise (negative) class. The training database contains a balanced number of samples from both classes. Our results shows that the CNN model correctly learned to recognize where is the earthquake within the sample window, even though the position of the earthquake in the window is not explicitly given during the training. Moreover, we give insights on how a neural network builds its decision process: while some aspects can be linked to clear physical characteristics, such as the frequency content and the P and S waves, we also see how different a DL detection is compared to a visual expertise or an STA/LTA detection. On top of improving our model designs, we also think that understanding how such models work, how they perceive an earthquake, can be useful for the comprehension of events that are not fully understood yet such as tremors or low frequency earthquakes.
Fichier principal
Vignette du fichier
Mastorovic_Giffard_Poli_2022.pdf (19.46 Mo) Télécharger le fichier
Origin : Files produced by the author(s)

Dates and versions

hal-03832942 , version 1 (28-10-2022)

Identifiers

Cite

Josipa Majstorović, Sophie Giffard-Roisin, Piero Poli. Interpreting convolutional neural network decision for earthquake detection with feature map visualization, backward optimization and layer-wise relevance propagation methods. Geophysical Journal International, 2023, 232 (2), pp.923-939. ⟨10.1093/gji/ggac369⟩. ⟨hal-03832942⟩
0 View
0 Download

Altmetric

Share

Gmail Facebook Twitter LinkedIn More