Machine Learning orientation
1
Introduction
I Machine learning: Shall we?
2
What is machine learning?
2.1
What is intelligence?
2.1.1
Definition of artificial intelligence subdomains
2.2
Is AI smarter than humans?
2.2.1
Thinking, fast and slow (Kahneman 2011)
2.3
Comparisons between AI and humans
2.3.1
Breast cancer detection
2.3.2
Working together: Lung cancer detection
2.3.3
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
2.3.4
AlphaGo Zero
2.4
ML models with bias
2.5
Attacks on ML models
2.5.1
Adding noise to image leads to misclassification
2.5.2
But what about attacks on human perception?
2.5.3
Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles
2.6
Measuring the Algorithmic Efficiency of Neural Networks TBD
3
Data Ethics
3.1
Topics in data ethics
3.1.1
Recourse and accountability
3.1.2
Feedback loops
3.1.3
Bias
3.2
Identifying and addressing ethical issues
3.2.1
Analyze a planned project
3.2.2
Process to implement
3.2.3
Diversity
3.3
Data Ethics consequences
3.3.1
Researcher stops his work due to ethical concerns
3.3.2
Career: Oxford seeks AI ethics professor
3.3.3
Shaping Europe’s digital future: Commission presents strategies for data and Artificial Intelligence
4
Strategies for machine learning
4.1
Management ML strategy: 7 steps for a successful ML project
4.2
Management ML strategy: Data Project Checklist
4.3
Project management ML strategy: The Drivetrain Approach
4.3.1
Recommendation system
4.3.2
Exercise: Optimizing lifetime customer value
4.4
Developer ML strategy TBD
5
Outlook on the future of ML
5.1
Development of life
5.1.1
When will superhuman AI come, and will it be good?
5.1.2
AI aftermath scenario
5.2
Data religion: Dataism
6
(PART) Machine learning fundamentals
7
Machine learning fundamentals
8
ML project process
8.1
Identify whether ML is suited to fulfill the need
8.2
Gather data
8.2.1
What kind of data can be used?
8.2.2
How much data is necessary?
8.2.3
Which data is useful?
8.3
Data analysis
8.3.1
Example for exploratory and quantitative data analysis
8.3.2
Visualizations for Categorical Data: Exploring the OkCupid Data
8.4
Feature engineering
8.4.1
Encoding Categorical Predictors
8.4.2
Engineering numeric features
8.4.3
Feature importance
8.5
Model fit
8.6
Model tuning
8.6.1
Metrics
8.7
Model deployment
9
Machine learning types
9.1
Supervised learning
9.1.1
Self-supervised learning
9.2
Unsupervised learning
9.2.1
Discovering clusters
9.2.2
Discovering latent factors
9.3
Reinforcement learning
9.3.1
Elements of reinforcement learning
9.3.2
RL algorithms
9.3.3
Example: self-driving car (MIT)
10
ML algorithms
10.1
Linear regression
10.1.1
Example for linear regression
10.2
Logistic regression
10.2.1
Python example logistic regression
10.3
Tree-based methods
10.3.1
Splitting metrics
10.3.2
Ensembles
10.3.3
Random forest
10.3.4
Boosted trees
10.4
Support Vector Machine (SVM) TBD
10.4.1
Kernels
10.4.2
Python example for SVM
10.5
Neural networks
10.5.1
Geometric Intuition for Training Neural Networks
10.5.2
Convolutional Neural Network (CNN) TBD
10.5.3
RNN TBD
10.5.4
GANs
10.6
A Gentle Introduction to CycleGAN for Image Translation
10.6.1
Examples for GANs
10.7
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.
10.8
Transformers TBD
10.8.1
Example transformer for time series forecasting
11
Food for the algorithms: Data
11.1
NLP
11.2
Dataset search from Google
12
ML application examples
12.1
AI translation by DeepL
12.2
Artificial intelligence detects myocardial infarctions in the ECG more reliably than cardiologists
12.2.1
First in Germany: Artificial intelligence recognizes COVID-19 in clinical routine
12.3
A Deep Learning Approach to Antibiotic Discovery
12.4
Fundamental limits from chaos on instability time predictions in compact planetary systems
12.4.1
Comparison between SPOCK and previous models
12.5
Machine Learning Algorithms and Global Optimization Methods for SPICE Model Parameter Extraction
12.6
How many yards will an NFL player gain after receiving a handoff?
12.7
Predictive Maintenance for the elevator and escalator industry TBD
12.8
Disease outbreak risk software
12.9
Neural networks enable autonomous navigation of catheters
12.10
Bosch FLEXIDOME IP starlight 8000i
12.11
Master Autonomous Driving
12.12
University of Stuttgart: Indoor positioning with cellular networks
II Explainable machine learning
13
Explainable machine learning
13.1
Method: Layer-Wise Relevance Propagation
13.2
Method: SpRay
13.3
Method: Salient-object-visualization
13.3.1
Tool: keras-salient-object-visualization
13.3.2
Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car
13.4
Tool: Lime
13.5
Tool: tf-explain TBD
13.6
Tool: alibi
III ML online resources
14
ML online courses
14.1
Coursera
14.2
Udemy
14.3
DataCamp
14.4
Udacity
14.4.1
Example for self-driving car course project
14.5
fast.ai
14.5.1
fast.ai Book
14.6
Kaggle Courses
14.7
Full Stack Deep Learning
15
ML online resources
15.1
In-depth introduction to machine learning in 15 hours of expert videos
15.1.1
An Introduction to Statistical Learning
15.2
The learning machine
15.3
DeepAI: The front page of A.I.
15.4
TensorFlow tutorials
15.4.1
MIT 6.S191 Introduction to Deep Learning
15.5
Embedding Projector
15.6
Tensorboard playground
15.7
Empowering companies to jumpstart AI and generate real-world value
15.8
TensorFlow, Keras and deep learning, without a PhD
15.9
Neural Networks and Deep Learning
15.10
Platform.ai: produce high-quality labels
16
ML online books
16.1
Neural Networks and Deep Learning
16.2
Deep Learning
IV Examples from Kaggle
17
Examples in Kaggle
18
Melbourne University AES/MathWorks/NIH Seizure Prediction
18.1
Winning solution (1st)
18.1.1
Alex / Gilberto models
18.1.2
Feng models
18.1.3
Andriy models
18.1.4
Code on GitHub
18.2
Solution (4th place)
18.2.1
Pre-processing
18.2.2
Features
18.2.3
Model
18.2.4
GitHub code
19
Bosch Production Line Performance
19.1
1st place solution
19.1.1
Data exploration
19.1.2
Hand-crafted features
19.1.3
Hardware
19.2
3rd place solution TBD
19.3
8th place solution with GitHub
19.3.1
Overall architecture
19.3.2
Input data sets
19.3.3
Ensembling
19.3.4
Features
19.3.5
Validation method
19.3.6
Software
19.3.7
Code on GitHub
20
Corporación Favorita Grocery Sales Forecasting
20.1
1st place solution
20.2
4th-Place Solution Overview
20.3
5th Place Solution
21
Rossmann Store Sales
21.1
1st place solution
21.2
3rd place solution
22
Severstal: Steel Defect Detection
23
Lyft 3D Object Detection for Autonomous Vehicles
23.1
3rd place solution
24
APTOS 2019 Blindness Detection
24.1
1st place solution summary
25
Predicting Molecular Properties
25.1
#1 Solution - hybrid
25.1.1
Overall architecture
25.1.2
Ensembling
25.1.3
Hardware
25.1.4
Software
25.1.5
Code on GitHub
25.2
#2 solution 🤖 Quantum Uncertainty 🤖
25.2.1
Overall architecture
25.2.2
Input features and embeddings
25.2.3
Data augmentation
25.2.4
Ensembling
25.2.5
Hardware
25.2.6
Software
25.2.7
Code on GitHub
V Real world example
26
Real world example
26.1
Subject of the project
Depending on where you are looking from:
Looking from the perspective of a machine learning expert
26.2
Project phases
The main project phases are:
After data gathering, iteration is key
26.2.1
Feature engineering
26.3
Algorithm selection
26.3.1
Logistic regression
26.3.2
Tree-based
26.3.3
Support Vector Machine (SVM) TBD
26.4
Performance measurement
26.4.1
Sensitivity and specificity
26.4.2
Receiver operating characteristic (ROC)
26.5
Confusion matrix and receiver operating characteristic (ROC) for pulse
26.5.1
Receiver operating characteristic (ROC) and probability density plots
26.6
Create augmented labeled data
26.6.1
Features of time signals
26.7
Features generated
26.7.1
Analysis of generated features
26.7.2
Dynamic time warp (DTW) for signal
26.8
Confusion matrix results logistic regression for measured data
26.8.1
ROC results for measured data
26.9
Results of several algorithms for SNR = 18 dB
26.9.1
ROC results for SNR = 18 dB
26.10
Calculation of return on investment (ROI)
26.10.1
Calculation of ML project investment
26.10.2
Calculation of ML profit
26.10.3
Resulting ROI
VI Cloud-based machine learning
27
Cloud-based machine learning
VII Kaggle Survey
28
Kaggle survey introduction
28.1
Kaggle survey details
28.2
Purpose
28.3
Navigation and handling
29
Results
29.1
Survey participants education level
29.2
Who uses which algorithm
29.3
Machine learning experience and algorithms
29.4
Experience and new algorithms
29.5
Role of participants
29.6
Company size
29.7
Company incorporation of machine learning
29.8
Favourite media sources on data science topics
29.9
Favourite online course platform
29.10
Favourite data analysis tool
29.11
Experience in data analysis coding
29.12
Favourite integrated development environments (IDEs)
29.13
Favourite hosted notebook products
29.14
Favourite programming languages
29.15
Recommended entry programming language
29.16
Favourite data visualization libraries or tools
29.17
Favourite specialized hardware
29.18
Favourite machine learning frameworks
29.19
Favourite cloud computing platforms
29.20
Favourite big data / analytics products
29.21
Favourite automated machine learning tools (or partial AutoML tools)
References
Chapter 6
(PART) Machine learning fundamentals