Kaggle Playground Competition (Link)
Project for Deep Learning at CU Boulder
My solution is based on convolutional neural networks. The final model was selected through cross-validation and is built on a pre-trained ResNet34: larger architectures take much longer to train, while smaller ones yield poor results. I chose to balance the classes through augmentation: negative samples are augmented by a factor of 15, positive ones by a factor of 10 (see DataGenerator in tools.py).
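As a rough illustration of the balancing scheme described above, here is a minimal sketch of augmentation-based oversampling. The actual logic lives in DataGenerator in tools.py; the `augment` function and `balance_by_augmentation` helper here are hypothetical stand-ins for the real transforms.

```python
import numpy as np

# Class label -> augmentation factor, as described above
# (negatives x15, positives x10).
AUG_FACTOR = {0: 15, 1: 10}

def augment(image, rng):
    """Placeholder transform: a random horizontal flip.
    The real generator would apply richer image augmentations."""
    return image[:, ::-1] if rng.random() < 0.5 else image

def balance_by_augmentation(images, labels, seed=0):
    """Replicate each sample according to its class's augmentation
    factor, applying a (randomized) transform to every copy."""
    rng = np.random.default_rng(seed)
    out_images, out_labels = [], []
    for img, lbl in zip(images, labels):
        for _ in range(AUG_FACTOR[lbl]):
            out_images.append(augment(img, rng))
            out_labels.append(lbl)
    return np.stack(out_images), np.array(out_labels)

# Tiny demo: 2 negatives and 3 positives -> 2*15 + 3*10 = 60 samples.
imgs = np.zeros((5, 8, 8))
lbls = [0, 0, 1, 1, 1]
X, y = balance_by_augmentation(imgs, lbls)
print(X.shape[0])  # 60
```

With these factors, a dataset whose positive class is roughly 1.5x rarer than the negative class ends up approximately balanced after augmentation.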
- exploration.ipynb: EDA, used only to check class balance.
- crossvalidation.ipynb: Code to run cross-validation. Note that the entire process involves only a subset of the given data.
- train.ipynb: Code to train the final model. Training is performed using the full train dataset (as provided).
- inference.ipynb: Code to make predictions using the chosen model.
- tools.py: All the helper functions used by the notebooks.
- Scientific paper on exactly the same topic, using neural networks: Link
- Article on preprocessing for this competition: Link
- Article from one of the best submission authors: Link
- Code for the previous article: Link. Note: some functions in this project were adapted from that repository; its code is quite old, and many functions no longer work with current package versions.
- Article from another one of the best submission authors: Link
- Article about test-time augmentation: Link
- Article about test-time augmentation: Link
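Since test-time augmentation (TTA) comes up in the references above, here is a minimal sketch of the idea: average a model's predictions over several augmented copies of the same input. The `model_fn` and flip-only augmentation below are illustrative placeholders, not the code used in inference.ipynb.

```python
import numpy as np

def predict_with_tta(model_fn, image, n_aug=8, seed=0):
    """Average predictions over several randomly augmented copies
    of the input (here: random horizontal flips only)."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_aug):
        aug = image[:, ::-1] if rng.random() < 0.5 else image
        preds.append(model_fn(aug))
    return float(np.mean(preds))

# Demo with a toy "model": mean pixel intensity. A flip does not
# change the mean, so TTA matches a single prediction here.
img = np.arange(16.0).reshape(4, 4)
p = predict_with_tta(lambda x: x.mean(), img)
print(p)  # 7.5
```

For a real classifier, the averaged probability is typically more stable than a single forward pass, at the cost of running inference once per augmented copy.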