
The Data Side of an ML/AI Project

Published: at 08:51 AM


1. Data Preparation

Data is usually split into three parts: training, validation, and test.

Best practice is to keep a test set that is separate from the training set. The training set is then divided further into training and validation subsets. Over time, the training set keeps growing. Keeping the test set separate and fixed from start to finish makes comparisons between models more accurate, because every model is scored on the same examples.

So at this preparation step, the only concern is splitting the data into these sets.
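A minimal sketch of this split, using only the Python standard library. The function names `hold_out_test` and `split_train_val`, the fractions, and the seeds are illustrative assumptions, not something prescribed by this post. The test set is carved out once and never touched again, while the training pool is allowed to grow and is re-split into training and validation as needed.

```python
import random


def hold_out_test(records, test_fraction=0.2, seed=0):
    """Shuffle once and carve out a test set that stays fixed afterwards."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Returns (training pool, test set).
    return shuffled[n_test:], shuffled[:n_test]


def split_train_val(pool, val_fraction=0.25, seed=0):
    """Split the (possibly growing) training pool into train and validation."""
    rng = random.Random(seed)
    shuffled = list(pool)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    # Returns (training subset, validation subset).
    return shuffled[n_val:], shuffled[:n_val]


# Initial data: 100 hypothetical record ids.
pool, test = hold_out_test(range(100))
train, val = split_train_val(pool)

# Later, new data arrives: it is added to the training pool only.
pool_v2 = pool + list(range(100, 150))
train_v2, val_v2 = split_train_val(pool_v2)
```

Because `test` is computed once and new records only ever join the pool, a model trained on `train_v2` can be compared fairly with one trained on `train`.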

2. Data Exploration

Exploratory data analysis (EDA) is a method that helps us understand the meaning of the data we are working with.

Reasons to use EDA:
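As one small illustration of what a first EDA pass might look like, the sketch below summarizes a single numeric column. The helper name `summarize_column`, the sample `ages` values, and the use of `None` to mark missing entries are assumptions made for this example.

```python
import statistics


def summarize_column(values):
    """Basic summary of a numeric column where None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }


# Hypothetical column with one missing entry.
ages = [23, 35, None, 41, 29]
summary = summarize_column(ages)
print(summary)
```

Even a summary this small surfaces missing values and the rough range of a feature before any model is trained.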

3. Data Preprocessing

Data preprocessing consists of two kinds of work: preparation and transformation.

Preparation covers organizing and cleaning the data. Transformation covers feature encoding and feature engineering.

In more detail, the transformation step involves the following operations:

3.1. Scaling

A few things to know:
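One point that is generally worth keeping in mind: scaling statistics are usually computed on the training set only and then reused for validation and test data, so that no information leaks from the held-out sets. The sketch below shows standardization and min-max scaling under that assumption. All function names and sample values are illustrative.

```python
import statistics


def fit_standardizer(train_values):
    """Compute mean and population standard deviation from training data only."""
    return statistics.mean(train_values), statistics.pstdev(train_values)


def standardize(values, mean, std):
    """Rescale values to zero mean and unit variance using fitted statistics."""
    if std == 0:
        # A constant feature carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def fit_min_max(train_values):
    """Record the training minimum and maximum."""
    return min(train_values), max(train_values)


def min_max_scale(values, lo, hi):
    """Map values so that the training range [lo, hi] becomes [0, 1]."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


train_feature = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical training values
test_feature = [5, 9]                       # hypothetical held-out values

mean, std = fit_standardizer(train_feature)
test_standardized = standardize(test_feature, mean, std)

lo, hi = fit_min_max(train_feature)
test_min_max = min_max_scale(test_feature, lo, hi)
```

Note that held-out values outside the training range will fall outside [0, 1] under min-max scaling; that is expected when the statistics come from training data alone.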

3.2. Encoding

Encoding lets us represent data efficiently while preserving its signal, so that models can learn patterns from it. Typical methods include:
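Two widely used examples, picked here for illustration, are one-hot encoding and integer (label) encoding of a categorical feature. A minimal sketch follows. The function names, the sample `colors` data, and the choice to map unseen categories to an all-zero vector or to `-1` are assumptions of this example.

```python
def fit_categories(train_values):
    """Collect the distinct categories seen in training, in a stable order."""
    return sorted(set(train_values))


def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of the category.

    A category not seen during training yields an all-zero vector.
    """
    return [1 if value == c else 0 for c in categories]


def integer_encode(value, categories):
    """Return the category's index, or -1 if it was not seen in training."""
    return categories.index(value) if value in categories else -1


colors = ["red", "green", "blue", "green"]   # hypothetical training column
categories = fit_categories(colors)          # ['blue', 'green', 'red']

encoded_green = one_hot("green", categories)
encoded_unseen = one_hot("purple", categories)
index_red = integer_encode("red", categories)
```

One-hot vectors avoid implying an order between categories, at the cost of one column per category. Integer codes are compact but suggest an ordering that may not exist in the data.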

∞. Questions

