This custom transformer processes signal files to create features used by DriverlessAI to solve a regression problem. This recipe was created in the context of the LANL Earthquake Prediction Challenge on Kaggle: https://www.kaggle.com/c/LANL-Earthquake-Prediction
To use the recipe, you have to transform the original data into the following form:
- The signal data related to one label/target is stored in a separate file
- The dataset submitted to DAI is of the form: ID, signalFilePath, Target
Please make sure to set the file_path feature as text in DAI.
To do so, click on the dataset in the dataset panel and choose DETAILS.
Then, in the detail panel, hover over the file_path feature and choose text as the logical type.
You may also want to disable the Text DAI Recipes.
signalFilePath: file location storing signal information
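The layout described above (one file per signal, plus a dataset of ID, signalFilePath, Target) can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name `write_signal_dataset`, the `seg_<n>.csv` file naming, the `index.csv` name and the one-value-per-line signal layout are all assumptions made for the example, not something the recipe prescribes.

```python
import csv
from pathlib import Path


def write_signal_dataset(segments, out_dir):
    """Write one file per signal plus an index CSV (ID, signalFilePath, Target).

    `segments` is an iterable of (signal_values, target) pairs. File names
    and the signal file layout are illustrative assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    index_path = out_dir / "index.csv"

    with index_path.open("w", newline="") as index_file:
        writer = csv.writer(index_file)
        writer.writerow(["ID", "signalFilePath", "Target"])
        for seg_id, (signal, target) in enumerate(segments):
            signal_path = out_dir / f"seg_{seg_id}.csv"
            # One signal value per line, no header (assumed layout).
            signal_path.write_text("\n".join(str(v) for v in signal) + "\n")
            # Store an absolute path so the file can be located later.
            writer.writerow([seg_id, str(signal_path.resolve()), target])
    return index_path
```

The resulting `index.csv` is the file that would be uploaded to DAI; the per-signal files must remain readable at the paths it lists.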
The custom recipe outputs the following features:
- Statistics: mean, median, min/max, standard deviation, skewness, kurtosis
- Mel-frequency cepstral coefficients (MFCC)
- Autocorrelation at different lags
- Trend information
- Number of events having an amplitude greater than a threshold
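To make the feature list above more concrete, here is a rough standard-library sketch of how several of these quantities could be computed for a single signal. It is not the recipe's implementation: the function name `basic_signal_features`, the default lags and the default threshold are hypothetical, "events" are simplified to individual samples whose absolute amplitude exceeds the threshold, and MFCCs are left out because they need an audio library.

```python
import statistics


def basic_signal_features(signal, lags=(1, 5), threshold=3.0):
    """Compute summary features for one signal (a list of numbers).

    Assumes the signal is not constant and is longer than the largest lag.
    """
    n = len(signal)
    mean = statistics.mean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    centered = [x - mean for x in signal]

    features = {
        "mean": mean,
        "median": statistics.median(signal),
        "min": min(signal),
        "max": max(signal),
        "std": std,
        # Standardized third and fourth moments (kurtosis is not "excess").
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        "kurtosis": sum(c ** 4 for c in centered) / (n * std ** 4),
    }

    # Autocorrelation at each lag: lagged covariance over total variance.
    var_sum = sum(c * c for c in centered)
    for lag in lags:
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        features[f"autocorr_lag_{lag}"] = cov / var_sum

    # Trend: least-squares slope of the signal against its sample index.
    t_mean = (n - 1) / 2
    num = sum((i - t_mean) * c for i, c in enumerate(centered))
    den = sum((i - t_mean) ** 2 for i in range(n))
    features["trend_slope"] = num / den

    # Count samples whose absolute amplitude exceeds the threshold.
    features["n_above_threshold"] = sum(abs(x) > threshold for x in signal)
    return features
```

Calling `basic_signal_features(values)` on the contents of one signal file returns a dictionary with one entry per feature, which corresponds to one row of engineered features per ID.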
Python 3.6, DAI 1.7.0 and above
- librosa: used for music and audio analysis
- tsfresh: time series and signal processing package
- pywavelets: used for wavelet transforms
- numba: used to accelerate heavy computations
- progressbar2: used to display progress as signals are processed