diff --git a/README.md b/README.md
index bf53bd5a6e429560b6cfb24ba5e235a41597c712..e597a9ccb68128031dbf2305356977caa40ebe65 100644
--- a/README.md
+++ b/README.md
@@ -65,4 +65,11 @@ This layout appears user-friendly and efficient for retrieving passenger detials
 
 # Data
 
-**Data Source** - We have use [Airline Dataset](https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset/data?select=Airline+Dataset.csv)
+**Data Source** - We use the [Airline Dataset](https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset/data?select=Airline+Dataset.csv) from Kaggle.
+
+**Data Handling** -
+1. _Outlier Detection_ - We use the Isolation Forest algorithm, which can handle both numerical and categorical data effectively.
+2. _Fake Data_ - Added 25% realistic fake data, balanced across the target variable.
+
+More details about the data are available in the Wiki.
+
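
Review note: the outlier-detection step described in the added README text could look roughly like the sketch below. It is not taken from the repository. It assumes scikit-learn's `IsolationForest`, which needs numeric input, so the categorical column is one-hot encoded first. The toy columns (`ages`, `classes`), the `contamination` value, and the injected anomalous row are all hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)

# Hypothetical toy data: one numerical and one categorical column.
ages = rng.normal(40, 10, size=200)
classes = rng.choice(["economy", "business", "first"], size=200)

# Append one obviously anomalous row (implausible age, rare category).
ages = np.append(ages, 400.0)
classes = np.append(classes, "cargo")

# IsolationForest expects numeric features, so encode the category first.
encoded = OneHotEncoder().fit_transform(classes.reshape(-1, 1)).toarray()
X = np.column_stack([ages, encoded])

# contamination is an assumed share of outliers, not a value from the project.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
labels = model.fit_predict(X)  # -1 marks an outlier, 1 marks an inlier

outlier_rows = np.flatnonzero(labels == -1)
print(outlier_rows)
```

Rows listed in `outlier_rows` are candidates for manual inspection rather than automatic deletion. The encoding choice (one-hot versus ordinal) affects how categorical values influence the splits, so it is worth documenting in the Wiki alongside the chosen `contamination`.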