My top 50 scikit-learn tips

If you already know the basics of scikit-learn, but you want to be more efficient and get up-to-date with the latest features, then THIS is the video for you.

My name is Kevin Markham, and I've been teaching Machine Learning in Python with scikit-learn for more than 8 years. Over the next 3 hours, I'm going to share with you my top 50 scikit-learn tips.

Each tip runs from 2 to 8 minutes, and you can use the timestamp links below to skip ahead if you're already familiar with a particular tip.

👩‍💻 Code: https://github.com/justmarkham/scikit-learn-tips
🤖 Learn ML from me: https://courses.dataschool.io/ml-courses
💌 Weekly Data Science tips: https://tuesday.tips/

50 TIPS:
0:00 - Introduction
1:03 - 1. Transform data with ColumnTransformer
4:19 - 2. Seven ways to select columns
8:18 - 3. "fit" vs "transform"
10:53 - 4. Don't use "fit" on new data!
15:05 - 5. Don't use pandas for preprocessing!
19:00 - 6. Encode categorical features
24:07 - 7. Handle new categories in testing data
27:16 - 8. Chain steps with Pipeline
30:19 - 9. Encode "missingness" as a feature
33:12 - 10. Why set a random state?
35:40 - 11. Better ways to impute missing values
41:22 - 12. Pipeline vs make_pipeline
44:08 - 13. Inspect a Pipeline
47:03 - 14. Handle missing values automatically
49:47 - 15. Don't drop the first categorical level
54:15 - 16. Tune a Pipeline
1:01:09 - 17. Randomized search vs grid search
1:05:42 - 18. Examine grid search results
1:08:10 - 19. Logistic regression tuning parameters
1:12:41 - 20. Plot a confusion matrix
1:15:37 - 21. Plot multiple ROC curves
1:17:21 - 22. Use the correct Pipeline methods
1:18:59 - 23. Access model coefficients
1:20:11 - 24. Visualize a decision tree
1:23:57 - 25. Improve a decision tree by pruning it
1:25:23 - 26. Use stratified sampling when splitting data
1:29:40 - 27. Impute missing values for categoricals
1:32:10 - 28. Save a model or Pipeline
1:33:47 - 29. Add multiple text columns to a model
1:35:35 - 30. More ways to inspect a Pipeline
1:37:28 - 31. Know when shuffling is required
1:42:32 - 32. Use AUC with multiclass problems
1:46:04 - 33. Create custom features with scikit-learn
1:50:03 - 34. Automate feature selection
1:52:24 - 35. Use pandas objects with scikit-learn
1:53:37 - 36. Pass parameters as keyword arguments
1:55:23 - 37. Create an interactive Pipeline diagram
1:57:22 - 38. Get the names of transformed features
1:59:32 - 39. Load a toy dataset into pandas
2:01:33 - 40. View all model parameters
2:03:00 - 41. Encode binary features
2:06:59 - 42. Column selection tricks
2:10:02 - 43. Save time when encoding categoricals
2:16:53 - 44. Speed up a grid search
2:19:01 - 45. Create feature interactions
2:23:00 - 46. Ensemble multiple models
2:27:23 - 47. Tune an ensemble
2:31:22 - 48. Run part of a Pipeline
2:34:52 - 49. Tune multiple models at once
2:39:50 - 50. Solve many ML problems with one solution

Video: My top 50 scikit-learn tips, from the Data School channel
Video information
Date: April 20, 2023, 19:56:43
Duration: 02:47:31