23.1 #1 Solution - hybrid
The winning team was from Bosch Research; they presented their solution in a Kaggle discussion post.
The team consisted of:
- Two Bosch research groups
  - Bosch Corporate Research
  - Bosch Center for AI (BCAI, Pittsburgh)
- Domain experts
- ML experts
23.1.1 Overall architecture
The winning team used a neural network
- Wrote NN model from scratch
- Model processes an entire molecule at once, simultaneously making a prediction for each of the scalar couplings in the molecule
23.1.1 Input features and embeddings
- Embeddings
- Plus two scalar constants
23.1.2 Ensembling
Ensembling several models often gives better performance than any single model. It is good practice to combine models that are dissimilar, because the variance between their predictions helps to improve the overall performance.
- Trained 13 models, all iterations and versions of the same basic structure
- Best single model: -3.08
- Straight median across predictions: ~-3.22
- More involved blending: -3.24522
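The straight median, and a blend in the spirit of the more involved procedure (rank the models against the all-model median, keep the best few, then average several medians), can be sketched in plain Python. This is a minimal sketch, not the team's code: the function names, the toy predictions, the agreement measure (MAE to the all-model median), and the choice of subsets are all assumptions made for illustration.

```python
from statistics import mean, median


def median_blend(preds):
    """Element-wise median across models; preds is a list of equal-length lists."""
    return [median(column) for column in zip(*preds)]


def mae(a, b):
    """Mean absolute error between two prediction vectors."""
    return mean(abs(x - y) for x, y in zip(a, b))


def rank_by_agreement(preds):
    """Model indices ordered by MAE to the all-model median, closest first.

    Assumption: agreement with the median stands in for model quality.
    """
    reference = median_blend(preds)
    return sorted(range(len(preds)), key=lambda i: mae(preds[i], reference))


def blend_of_medians(preds, keep, subset_sizes):
    """Keep the `keep` best-ranked models, then average the medians of the
    top-k subsets for each k in subset_sizes (a hypothetical choice of subsets)."""
    best = rank_by_agreement(preds)[:keep]
    medians = [median_blend([preds[i] for i in best[:k]]) for k in subset_sizes]
    return [mean(column) for column in zip(*medians)]


# Toy data: 5 hypothetical models, 3 scalar couplings each.
toy = [
    [1.0, 2.0, 3.0],
    [1.75, 2.75, 3.75],
    [0.0, 1.0, 2.0],
    [9.0, 9.0, 9.0],     # an outlier model
    [1.25, 2.25, 3.25],
]

straight = median_blend(toy)
blended = blend_of_medians(toy, keep=3, subset_sizes=(3, 2))
print(straight)
print(blended)
```

The median is a natural first choice because a single badly-off model, like the outlier row in the toy data, does not drag the ensemble prediction the way a plain mean would.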
23.1.3 Hardware
The models were trained on a variety of machines, each running Linux:
- 5 machines had 4 GPUs, each an NVIDIA GeForce RTX 2080 Ti
- 2 machines had 1 NVIDIA Tesla V100 GPU with 32 GB memory
- 6 machines had 1 NVIDIA Tesla V100 GPU with 16 GB memory
23.1.4 Software
The team did not rely on an existing model implementation but coded their models from scratch, using the following software:
- Python 3.5+
- PyTorch
- CUDA 10.1
- NVIDIA APEX (only available through the repo at the time)
23.1.5 Code on GitHub
A detailed explanation of the basic setup of the code, for both pre-processing and the models, is given at https://github.com/boschresearch/BCAI_kaggle_CHAMPS
Note on the more involved blending: the team used the median of all 13 models to determine which 9 models seemed best, then took the mean of a few different medians of the different model predictions.