DATA 3464: Fundamentals of Data Processing

Missing and weird data

Charlotte Curtis
February 5, 2026

Topic overview

  • What to do with missing data
  • Detecting and handling outliers

The problem

  • As you've seen, real-world data is messy
  • Missing values are common, other values don't make sense
  • We need to decide how to deal with these problems

What examples have we seen so far?
Why might data be missing or weird*?

*I'm using "weird" as an informal catch-all for unexpected or outlier values

Missing data

When data are missing in the features, we have a few options:

  1. Do nothing! Some algorithms (e.g. decision trees) can handle missing values
  2. Remove features with missing values
  3. Remove samples with missing values
  4. Invent a new value to represent "missingness"
  5. Impute a value based on other data

Most important: understand why data are missing (more EDA!)
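Before picking one of the options above, it helps to quantify the missingness. A minimal sketch with pandas, using a small hypothetical frame:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with scattered missing values
df = pd.DataFrame({
    "income": [52000, np.nan, 61000, np.nan, 48000],
    "age": [34, 29, np.nan, 41, 38],
    "city": ["Calgary", "Calgary", None, "Edmonton", "Calgary"],
})

# Fraction of missing values per feature -- a first step toward
# understanding *why* data are missing
missing_frac = df.isna().mean().sort_values(ascending=False)
print(missing_frac)
```

Features near the top of this list are candidates for removal; features with only a few missing values may be better candidates for imputation.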

Option 1: removing features

  • If a feature has:
    • A high proportion of missing values, and
    • Little apparent relationship to the target, or
    • Information that is redundant with other features
  • It may be reasonable to remove it entirely. You can drop it, e.g.:
    df.drop(columns=['feature_name'], inplace=True)
    
    or (probably more reliable) just not select it when building your pipeline
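The "just don't select it" approach can be sketched with a `ColumnTransformer`, which keeps the original frame untouched and drops unselected columns by default (column names here are made up for illustration):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical frame: one clean feature, one mostly-missing feature
df = pd.DataFrame({
    "keep_me": [1.0, 2.0, 3.0],
    "mostly_missing": [None, None, 5.0],
})

# Only the listed columns are transformed; everything else is
# dropped (remainder="drop" is the default)
ct = ColumnTransformer([("num", StandardScaler(), ["keep_me"])])
out = ct.fit_transform(df)
```

This avoids mutating the dataset in place, so the dropped feature is still available if you change your mind during EDA.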

Option 2: removing samples

  • If only a small number of samples have missing values, you can drop them from the training data:
train.dropna(subset=["features", "we", "care", "about"], inplace=True)
    
  • Good idea if the same samples have missing values from multiple features
  • Still useful to explore why data are missing

What should we do for inference?

Option 3: invent a new value

  • Categorical features: add a new category for "missing"
  • Add a new binary feature indicating whether the value was missing
  • I have seen advice to use extreme values for numerical features, like the -1 income in the OKCupid dataset, but I'm not convinced this is a good idea
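The first two bullets can be sketched in pandas (toy column names, hypothetical data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a categorical and a numeric feature
df = pd.DataFrame({
    "job": ["teacher", None, "nurse", None],
    "income": [52000.0, np.nan, 61000.0, 48000.0],
})

# Categorical: make "missing" its own category
df["job"] = df["job"].fillna("missing")

# Numeric: add a binary indicator *before* any imputation,
# so the model can still see which values were missing
df["income_missing"] = df["income"].isna().astype(int)
```

The indicator column preserves the "missingness" signal even if you later impute `income`.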

Case study: where missingness is informative

Option 4: impute missing values

  • Fill in the missing values with an "educated guess"
  • Replace missing value with:
    • constant
    • mean, median, or mode (most_frequent)
  • Use other features to infer missing value:
    • K-nearest neighbours
    • simple models to predict missing values
  • Can be combined with option 3 to indicate missing features
  • How much to impute? Feature Engineering and Selection suggests no more than 20%
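A sketch of simple and KNN imputation with scikit-learn, on a small made-up array; `add_indicator=True` appends the missingness-indicator columns from option 3:

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan],
              [4.0, 5.0]])

# Median imputation plus a missingness indicator per affected
# feature (combines options 3 and 4)
imp = SimpleImputer(strategy="median", add_indicator=True)
X_med = imp.fit_transform(X)

# KNN imputation: fill each gap from the nearest rows in
# feature space (using the features that are present)
knn = KNNImputer(n_neighbors=2)
X_knn = knn.fit_transform(X)
```

Both imputers are fit on training data only; at inference the stored medians (or neighbours) are reused, so the same transform applies to new samples.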

Choosing an imputation strategy

  Strategy                When to use
  ----------------------  -------------------------------------------
  Constant                When there is a reasonable default value
  Mean                    Numeric features with a normal distribution
  Median                  Numeric features with extreme outliers
  KNN                     Relationship with other features
  Missingness indicator   If missingness seems informative

Outliers

An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism. -- D. M. Hawkins

  • There are many entire books dedicated to outlier detection
  • Useful for anomaly detection, e.g. fraud or network intrusion detection
  • Our focus is on dealing with outliers in preprocessing

Where we left off on February 5

Detecting outliers

  • Visually as part of EDA
  • Statistically, e.g. z-scores or the 1.5 × IQR rule
  • Algorithmically, e.g. Isolation Forests
  • As usual, context and domain knowledge are essential
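A sketch of the statistical and algorithmic approaches side by side, on synthetic data with one planted outlier:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(50, 5, 100), [150.0]])  # planted outlier

# Statistical: the 1.5 * IQR rule (the same fences box plots use)
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
iqr_outliers = (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)

# Algorithmic: Isolation Forest labels outliers with -1
iso = IsolationForest(random_state=0).fit(x.reshape(-1, 1))
iso_outliers = iso.predict(x.reshape(-1, 1)) == -1
```

Both methods flag the planted point; they can disagree on borderline cases, which is where domain knowledge comes in.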

Context matters!

  • Look at the relationship between outliers and target (training dataset, of course)

What do the dots on box plots represent?

What to do about outliers?

  1. Data transformations
  2. Drop the samples
  3. Encode them somehow
  4. Leave them alone

As usual, very data- and model-dependent. Tree-based methods are relatively robust to outliers, but linear and distance-based methods are particularly impacted!

Any other ideas?

Nonlinear transformations

  • Transforming the data does not actually remove the outlier
  • Can help make the relationship less extreme
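A small numeric illustration with a log transform on hypothetical income values: the outlier is still the largest value afterwards, but the gap to the rest of the data shrinks dramatically.

```python
import numpy as np

# Right-skewed hypothetical feature with one extreme value
income = np.array([30_000, 45_000, 52_000, 61_000, 2_000_000])

# log1p doesn't remove the outlier, it just compresses the scale
log_income = np.log1p(income)

ratio_raw = income.max() / income.min()          # roughly 67x spread
ratio_log = log_income.max() / log_income.min()  # roughly 1.4x spread
```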

Dropping the samples

  • In general, not a thing I love to do
  • If you drop a sample from training, you need to decide what to do at inference
  • My opinion: Only do it if you're confident it's an error in the dataset

    Can you think of an example?

  • What can you do at inference time when outliers are encountered?

Encoding outliers

A few other options that might fall into "encoding":

  • Just like with missing values, binary column indicating outlier/inlier
  • Remove the values and convert them to missing
  • Bin or impose a cap (floor/ceiling) on the value
  • Replace the numeric value with a rank or quantile bucket
  • Probably other things!
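Two of the options above, capping and quantile bucketing, can be sketched in pandas on a toy series:

```python
import pandas as pd

s = pd.Series([10, 12, 11, 13, 12, 11, 10, 13, 12, 500])  # 500 is an outlier

# Cap at the 5th/95th percentiles (often called winsorizing)
lo, hi = s.quantile([0.05, 0.95])
capped = s.clip(lower=lo, upper=hi)

# Replace the numeric value with a quantile bucket (0 = lowest quartile)
buckets = pd.qcut(s, q=4, labels=False)
```

Capping limits the outlier's leverage while keeping it at the extreme end; bucketing throws away magnitude entirely and keeps only rank information.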

Leave them alone

If your outliers are:

  • Real values (not data entry or other errors)
  • Representative of things that might happen during inference

Then you probably want to keep them!

Consider using a RobustScaler to standardize if you have lots of outliers
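A quick comparison on a made-up column: `StandardScaler` uses the mean and standard deviation, which a single extreme value drags around, while `RobustScaler` centers on the median and scales by the IQR, leaving the inliers on a sensible scale.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

# Median/IQR scaling: inliers land near 0 with unit-ish spread
robust = RobustScaler().fit_transform(X)

# Mean/std scaling: the outlier inflates std, squashing the inliers
standard = StandardScaler().fit_transform(X)
```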

Missing values and outliers in the target

  • If you have missing values in the target, you probably want to avoid imputing
  • This is a good case for dropping samples from training!
  • Outliers are trickier -- again, check if they're real or mistakes
  • There may be a case for transforming the target
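One way to transform the target without juggling the inverse by hand is scikit-learn's `TransformedTargetRegressor`, sketched here on synthetic skewed data:

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (100, 1))
y = np.exp(0.3 * X.ravel() + rng.normal(0, 0.1, 100))  # skewed target

# Fits on log(y) but predicts on the original scale automatically
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log, inverse_func=np.exp,
)
model.fit(X, y)
pred = model.predict(X[:3])
```

The wrapper applies `func` before fitting and `inverse_func` after predicting, so downstream code never sees the transformed scale.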

Coming up next

  • Interactions between variables
  • Feature selection
  • Midterm stuff
  • Reading week!!!!

  • OKCupid dataset: negative values for income, placeholders in categories
  • Property assessments: very low property values
  • Traffic dataset: some streets have no speed limit

Reminder of rows vs columns