23.1 #1 Solution - hybrid

The winning team was from Bosch Research; they presented their solution in a Kaggle discussion post.

The team consisted of

  • Two Bosch research groups
    • Bosch Corporate Research
    • Bosch Center for AI (BCAI, Pittsburgh)
  • Domain experts
  • ML experts

23.1.1 Overall architecture

The winning team used a neural network

  • Wrote NN model from scratch
  • Model processes an entire molecule at once
    • simultaneously making a prediction for each of the scalar couplings in the molecule

23.1.2 Input features and embeddings

  • Embeddings
  • Plus two scalar constants

23.1.3 Ensembling

Better performance can often be achieved by ensembling several models. It is good practice to combine models that are dissimilar, because the diversity of their errors helps to improve the overall performance.

  • Trained 13 models
    • iterations and versions of same basic structure
  • Best single model: -3.08
  • Straight median across predictions: ~-3.22
  • More involved blending: -3.24522
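The "straight median" blend listed above can be illustrated with a few lines of NumPy. This is only a minimal sketch on toy numbers, not the team's code: the function name `median_ensemble` and the array layout (one row per model, one column per scalar coupling) are assumptions made for illustration.

```python
import numpy as np


def median_ensemble(predictions):
    """Blend model outputs by taking the element-wise median.

    predictions: array-like of shape (n_models, n_couplings);
    this layout is an assumption of the sketch.
    """
    stacked = np.asarray(predictions, dtype=float)
    if stacked.ndim != 2:
        raise ValueError("expected shape (n_models, n_couplings)")
    # The median over the model axis ignores a single outlying model.
    return np.median(stacked, axis=0)


# Toy data: 3 hypothetical models predicting 4 scalar couplings.
toy_predictions = [
    [1.0, 2.0, 3.0, 4.0],
    [1.2, 1.8, 3.5, 4.1],
    [0.9, 2.4, 2.9, 9.0],  # third model is far off on the last coupling
]
blended = median_ensemble(toy_predictions)
print(blended)
```

Unlike a mean, the median is not pulled towards the outlying value 9.0 on the last coupling, which is one reason it is a common simple choice for blending.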

23.1.4 Hardware

The models were trained on a variety of machines, each running Linux:

  • 5 machines had 4 GPUs, each an NVIDIA GeForce RTX 2080 Ti
  • 2 machines had 1 NVIDIA Tesla V100 GPU with 32 GB memory
  • 6 machines had 1 NVIDIA Tesla V100 GPU with 16 GB memory

23.1.5 Software

The team did not rely on off-the-shelf model implementations but coded their models from scratch, using the following software stack:

  • Python 3.5+
  • PyTorch
  • CUDA 10.1
  • NVIDIA APEX (at that time only available through its repository)

23.1.6 Code on GitHub

A detailed explanation of the general setup of the code for pre-processing and for the models is given at https://github.com/boschresearch/BCAI_kaggle_CHAMPS

  1. using the median of all 13 models to determine which 9 models seemed best, then taking the mean of a few different medians of the different model predictions↩︎
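The footnote leaves open how the 9 models were judged "best" and which medians were averaged. The sketch below is one possible reading, not the team's procedure. It assumes that models are ranked by their mean absolute distance to the all-model median, and that the "different medians" are medians over the top-k kept models for a few values of k. The function `select_and_blend` and its parameters are hypothetical.

```python
import numpy as np


def select_and_blend(predictions, n_keep, subset_sizes):
    """Select models near the all-model median, then average several medians.

    predictions: array-like of shape (n_models, n_couplings).
    n_keep: number of models to keep (9 in the footnote).
    subset_sizes: sizes k of the top-k subsets whose medians are averaged
                  (an assumption of this sketch).
    """
    preds = np.asarray(predictions, dtype=float)
    if max(subset_sizes) > n_keep or n_keep > preds.shape[0]:
        raise ValueError("need max(subset_sizes) <= n_keep <= n_models")
    # Reference prediction: median across all models.
    reference = np.median(preds, axis=0)
    # Assumption: a model "seems best" if it is close to the reference.
    distance = np.mean(np.abs(preds - reference), axis=1)
    order = np.argsort(distance, kind="stable")
    kept = preds[order[:n_keep]]
    # Median over each top-k subset, then the mean of those medians.
    medians = [np.median(kept[:k], axis=0) for k in subset_sizes]
    return np.mean(medians, axis=0)


# Toy data: 13 hypothetical models, 5 couplings, noisy copies of one target.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5, 3.0, 10.0])
toy = target + rng.normal(scale=0.1, size=(13, 5))
blend = select_and_blend(toy, n_keep=9, subset_sizes=(5, 7, 9))
```

In practice the selection would more plausibly be driven by validation scores than by agreement with the median; the distance criterion is used here only to keep the example self-contained.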