ABCs of data science
Categories
All (26)
AI (8)
anomaly_detection (1)
bias (2)
clustering (2)
data_cleaning (2)
data_exploration (1)
data_science (26)
deep_learning (2)
deepfakes (1)
dimension_reduction (1)
distance_measures (1)
embedding (1)
embeddings (2)
ensembles (1)
gans (1)
interpretability (1)
metrics (1)
nlp (1)
optimization (1)
plotting (1)
pretrained_models (1)
random_forest (1)
reinforcement_learning (1)
reproducibility (1)
supervised_learning (9)
synthetic_media (1)
text_processing (1)
unsupervised_learning (4)
visualization (1)
xgboost (1)

Introduction

ABCs of data science is intended for anyone who wants to learn more about data science, regardless of skill level. It aims to give readers a high-level overview of a range of data science concepts so that they can explore these topics further. Note that these posts were written before the explosion of LLMs, but they should hopefully still provide some intuition for other data science techniques.

 

A is for Artificial Intelligence

data_science
AI
Jan 15, 2020

B is for Bias

data_science
bias
interpretability
Feb 1, 2020

C is for Clustering

data_science
clustering
unsupervised_learning
Mar 1, 2020

D is for Deep Learning

data_science
deep_learning
supervised_learning
AI
Apr 8, 2020

E is for Embeddings

data_science
embeddings
unsupervised_learning
AI
May 3, 2020

F is for F1 score

data_science
metrics
supervised_learning
AI
May 20, 2020

G is for Gradient Descent

data_science
optimization
supervised_learning
AI
May 30, 2020

H is for HDBSCAN

data_science
clustering
unsupervised_learning
Jul 1, 2020
 

I is for Interpretability

data_science
supervised_learning
AI
bias
Sep 27, 2020

J is for Jaccard metric

data_science
embeddings
distance_measures
Sep 27, 2020

K is for K-fold cross-validation

data_science
supervised_learning
AI
Oct 5, 2020

L is for Labelling Data

data_science
supervised_learning
AI
Oct 9, 2020
 

M is for Munging Data

data_science
data_cleaning
Oct 10, 2020
 

N is for Natural Language Processing (NLP)

data_science
nlp
text_processing
Oct 18, 2020

O is for Outlier Detection

data_science
unsupervised_learning
anomaly_detection
Oct 23, 2020

P is for Pandas

data_science
data_cleaning
data_exploration
Oct 25, 2020

Q is for Q-learning

data_science
reinforcement_learning
Dec 27, 2020

R is for Reproducibility

data_science
reproducibility
Dec 28, 2020

S is for Supervised Learning

data_science
supervised_learning
random_forest
deep_learning
Jan 17, 2021

T is for Transfer Learning

data_science
supervised_learning
pretrained_models
Jan 31, 2021

U is for UMAP

data_science
embedding
dimension_reduction
Feb 1, 2021

V is for Visualization

data_science
visualization
plotting
Feb 3, 2021

W is for Wasserstein GANs

data_science
synthetic_media
gans
deepfakes
Feb 4, 2021

Y is for You Should Talk to Your Clients

data_science
Feb 5, 2021
 

X is for XGBoost

data_science
supervised_learning
xgboost
ensembles
Feb 5, 2021
 

Z is for Zero to Done

data_science
Feb 6, 2021