Machine Learning orientation
1 Introduction
I Machine learning: Shall we?
2 What is machine learning?
2.1 What is intelligence?
2.1.1 Definition of artificial intelligence subdomains
2.2 Is AI smarter than humans?
2.2.1 Thinking, fast and slow (Kahneman 2011)
2.3 Comparisons between AI and humans
2.3.1 Breast cancer detection
2.3.2 Working together: Lung cancer detection
2.3.3 ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
2.3.4 AlphaGo Zero
2.4 ML models with bias
2.5 Attacks on ML models
2.5.1 Adding noise to an image leads to misclassification
2.5.2 But what about attacks on human perception?
2.5.3 Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles
3 Outlook
3.1 Development of life
3.1.1 When will superhuman AI come, and will it be good?
3.1.2 AI aftermath scenario
3.2 Data religion: Dataism
4 Discussion points
4.1 Researcher stops his work due to ethical concerns
4.2 Career: Oxford seeks AI ethics professor
4.3 Shaping Europe’s digital future: Commission presents strategies for data and Artificial Intelligence
II Machine learning fundamentals
5 Machine learning fundamentals
6 ML project process
6.1 Identify if ML is suited to fulfill the need
6.2 Gather data
6.2.1 How much data is necessary?
6.2.2 Which data is useful?
6.3 Exploratory and quantitative data analysis
6.3.1 Example of exploratory and quantitative data analysis
6.3.2 Visualizations for Categorical Data: Exploring the OkCupid Data
6.4 Feature engineering
6.4.1 Encoding Categorical Predictors
6.4.2 Engineering numeric features
6.4.3 Feature importance
6.5 Model fit
6.6 Model tuning
6.6.1 Metrics
7 Machine learning types
7.1 Supervised learning
7.1.1 Self-supervised learning
7.2 Unsupervised learning
7.2.1 Discovering clusters
7.2.2 Discovering latent factors
7.3 Reinforcement learning
7.3.1 Elements of reinforcement learning
7.3.2 RL algorithms
7.3.3 Example: MIT self-driving car
8 ML algorithms
8.1 Linear regression
8.1.1 Example of linear regression
8.2 Logistic regression
8.2.1 Python example: logistic regression
8.3 Tree-based methods
8.3.1 Splitting metrics
8.3.2 Ensembles
8.3.3 Random forest
8.3.4 Boosted trees
8.4 Support Vector Machine (SVM) TBD
8.4.1 Kernels
8.4.2 Python example for SVM
8.5 Neural networks
8.5.1 Convolutional Neural Network (CNN) TBD
8.5.2 RNN TBD
8.5.3 GANs
8.6 A Gentle Introduction to CycleGAN for Image Translation
8.6.1 Examples of GANs
8.7 Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more
8.8 Transformers TBD
8.8.1 Example: transformer for time series forecasting
9 Food for the algorithms: Data
9.1 NLP
9.2 Dataset search from Google
III Explainable ML
10 Explainable ML TBD
10.1 Method: Layer-Wise Relevance Propagation
10.2 Method: SpRay
10.3 Method: LIME TBD
10.4 alibi TBD
10.5 tf-explain TBD
10.6 keras-salient-object-visualization
10.6.1 Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car
10.6.2 VisualBackProp: efficient visualization of CNNs
IV ML online resources
11 ML online courses
11.1 Coursera
11.2 Udemy
11.3 DataCamp
11.4 Udacity
11.4.1 Example of a self-driving car course project
11.5 Fast.ai
11.6 FastAI
11.7 Kaggle Courses
12 ML online resources
12.1 In-depth introduction to machine learning in 15 hours of expert videos
12.1.1 An Introduction to Statistical Learning
12.2 The learning machine
12.3 DeepAI: The front page of A.I.
12.4 TensorFlow tutorials
12.4.1 MIT 6.S191 Introduction to Deep Learning
12.5 Embedding Projector
12.6 TensorBoard playground
12.7 Empowering companies to jumpstart AI and generate real-world value
13 ML online books
13.1 Neural Networks and Deep Learning
13.2 Deep Learning
V Examples from Kaggle
14 Examples in Kaggle
15 Melbourne University AES/MathWorks/NIH Seizure Prediction
15.1 Winning solution (1st place)
15.1.1 Alex / Gilberto models
15.1.2 Feng models
15.1.3 Andriy models
15.1.4 Code on GitHub
15.2 Solution (4th place)
15.2.1 Pre-processing
15.2.2 Features
15.2.3 Model
15.2.4 Code on GitHub
16 Bosch Production Line Performance
16.1 1st place solution
16.1.1 Data exploration
16.1.2 Hand-crafted features
16.1.3 Hardware
16.2 3rd place solution TBD
16.3 8th place solution with GitHub
16.3.1 Overall architecture
16.3.2 Input data sets
16.3.3 Ensembling
16.3.4 Features
16.3.5 Validation method
16.3.6 Software
16.3.7 Code on GitHub
17 Corporación Favorita Grocery Sales Forecasting
17.1 1st place solution
17.2 4th place solution overview
17.3 5th place solution
18 Severstal: Steel Defect Detection
19 Lyft 3D Object Detection for Autonomous Vehicles
19.1 3rd place solution
20 APTOS 2019 Blindness Detection
20.1 1st place solution summary
21 Predicting Molecular Properties
21.1 #1 solution - hybrid
21.1.1 Overall architecture
21.1.2 Ensembling
21.1.3 Hardware
21.1.4 Software
21.1.5 Code on GitHub
21.2 #2 solution 🤖 Quantum Uncertainty 🤖
21.2.1 Overall architecture
21.2.2 Input features and embeddings
21.2.3 Data augmentation
21.2.4 Ensembling
21.2.5 Hardware
21.2.6 Software
21.2.7 Code on GitHub
22 Local examples
22.1 Master Autonomous Driving
22.2 University of Stuttgart: Indoor localization with cellular networks
22.3 Bionic Learning Network
VI Real world example
23 Real world example
23.1 Subject of the project
Depending on the perspective from which you look:
Looking from the perspective of a machine learning expert
23.2 Project phases
The main project phases are:
After data gathering, iteration is key
23.2.1 Feature engineering
23.3 Algorithm selection
23.3.1 Logistic regression
23.3.2 Tree-based
23.3.3 Support Vector Machine (SVM) TBD
23.4 Performance measurement
23.4.1 Sensitivity and specificity
23.4.2 Receiver operating characteristic (ROC)
23.5 Confusion matrix and ROC for pulse
23.5.1 R plots
23.6 Create augmented labeled data
23.6.1 Features of time signals
23.7 Features generated
23.7.1 Analysis of generated features
23.7.2 Dynamic time warping (DTW) for signals
23.8 Algorithm
23.9 Confusion matrix results of logistic regression for measured data
23.9.1 ROC results for measured data
23.10 Results of several algorithms for SNR = 18 dB
23.10.1 ROC results for SNR = 18 dB
23.11 Compare models for SNR = 18 dB
23.12 Optimize ML hyperparameters
VII Cloud-based machine learning
24 Cloud-based machine learning
VIII Kaggle Survey
25 Kaggle survey introduction
25.1 Kaggle survey details
25.2 Purpose
25.3 Navigation and handling
26 Results
26.1 Survey participants’ education level
26.2 Who uses which algorithm
26.3 Machine learning experience and algorithms
26.4 Experience and new algorithms
26.5 Role of participants
26.6 Company size
26.7 Company incorporation of machine learning
26.8 Favourite media sources on data science topics
26.9 Favourite online course platform
26.10 Favourite data analysis tool
26.11 Experience in data analysis coding
26.12 Favourite integrated development environments (IDEs)
26.13 Favourite hosted notebook products
26.14 Favourite programming languages
26.15 Recommended entry programming language
26.16 Favourite data visualization libraries or tools
26.17 Favourite specialized hardware
26.18 Favourite machine learning frameworks
26.19 Favourite cloud computing platforms
26.20 Favourite big data / analytics products
26.21 Favourite automated machine learning tools (or partial AutoML tools)
References