diff --git a/README.md b/README.md
index 1739f3cb0ed459328bc1383b639da70e70572add..4342bac2819aa68caa80ba4d283aaf44e8a6f285 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 | Mutote, Michael | 22202956 |
 | Gattousi, Fadi | 22211572 |
 
-Recommendation Systems
+Recommendation Systems Project 2
 
 https://mygit.th-deg.de/mm13956/ws-23-sas-02
 -------------------
@@ -23,18 +23,26 @@ pip install -r requirements.txt
 ## Usage
 1. **Running the Application**: Execute `Main.py` to launch the application. This script uses the cleaned data from `diamonds.csv` and a pre-trained Random Forest Regressor model to predict diamond prices.
 2. **Interacting with the GUI**:
-    - The GUI allows you to visualize data and predictions. Use the provided controls to interact with the data and view results.
-    - Main Window: Enter values for the diamond attributes and click on the "Calculate" button to predict the price of the diamond. The predicted price will be displayed in the "Predicted Price" field at the bottom. In addition, the main window displays a scatter plot of the carat weight vs. the price of the diamonds in the dataset.
-
-## Files in the Project
+    - The GUI allows you to visualize data and predict diamond prices. It consists of two tabs:
+        - **Main Tab**: Enter values for the diamond attributes and click the `Calculate Price` button to predict the price of the diamond. The predicted price is displayed in the `Predicted Price` field at the bottom. This tab also displays a scatter plot of carat weight vs. price for the diamonds in the dataset.
+        - **Advanced Tab**: This tab lets you visualize the relationship between the price and the other attributes of the diamonds. Select an attribute from the dropdown menu and click the `Plot` button to display the scatter plot. You can also select the model to use for prediction from the dropdown menu and choose which attributes to use for prediction.
The unselected attributes' fields will be disabled in the main tab. *The `Calculate Price` button will be disabled until the model is re-trained.*
+
+
+## Model Selection
+The following models are available for prediction:
+- **Linear Regression**: This model assumes a linear relationship between the dependent and independent variables. We used the `LinearRegression` class from scikit-learn to train this model. The model achieved an R2 score of **0.88** and a mean squared error of **1,896,296.20**.
+- **XGBoost Regressor**: This model is a gradient-boosted ensemble of decision trees. We used the `XGBRegressor` class from the `xgboost` library, which follows the scikit-learn estimator API, to train this model. The model achieved an R2 score of **0.98** and a mean squared error of **303,969.39**.
+- **Multi-layer Perceptron Regressor**: This model is a feedforward neural network with multiple hidden layers. We used the `MLPRegressor` class from scikit-learn to train this model. The model achieved an R2 score of **0.91** and a mean squared error of **31,366,191.38**. It takes the longest to train.
+- **Random Forest Regressor**: This model is an ensemble of decision trees whose predictions are averaged. We used the `RandomForestRegressor` class from scikit-learn to train this model. The model achieved an R2 score of **0.98** and a mean squared error of **292,726.76**, the lowest of the four models, <u>**making it the best model for this dataset.**</u>
+
+
+## Project Structure
 - `diamonds.csv`: Dataset containing diamond attributes.
 - `Main.py`: Main Python script with GUI for prediction and visualization.
 - `requirements.txt`: List of Python packages required for the project.
 - `training.ipynb`: Jupyter notebook for data preprocessing and analysis.
+- `src/`: Folder containing images used in this README.
+- `models/`: Folder containing the trained models.
 
-## Contributing
-Feel free to contribute to this project by submitting pull requests or suggesting improvements.
-
----
-
-This README provides a basic overview and guidance for your project. 
You might want to customize it further to include more specific details about the machine learning models, the GUI features, or any other unique aspects of your project.
\ No newline at end of file
+## References
+- [Diamonds Dataset link from Kaggle](https://www.kaggle.com/shivam2503/diamonds)
\ No newline at end of file
diff --git a/src/advanced_tab.png b/src/advanced_tab.png
new file mode 100644
index 0000000000000000000000000000000000000000..52e0b6cd025c751c58732149ce20b7607017baff
Binary files /dev/null and b/src/advanced_tab.png differ
diff --git a/src/main_tab.png b/src/main_tab.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f552316b2383eec28d6111a4629dee88cd10f78
Binary files /dev/null and b/src/main_tab.png differ
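
The Model Selection section added in this diff reports an R2 score and a mean squared error for each model but does not show how the two metrics are computed. The sketch below is one plain-Python way to compute them, using only the standard library. The function names mirror `sklearn.metrics.mean_squared_error` and `sklearn.metrics.r2_score`, but they are re-implemented here for illustration; whether the project's `training.ipynb` uses those scikit-learn functions is an assumption. The price lists are hypothetical values invented for the example and are not taken from `diamonds.csv`.

```python
from statistics import fmean


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical diamond prices (not from diamonds.csv), for illustration only.
actual = [500.0, 1500.0, 3000.0, 7000.0]
predicted = [600.0, 1400.0, 3300.0, 6800.0]

mse = mean_squared_error(actual, predicted)
r2 = r2_score(actual, predicted)

print(f"MSE: {mse:,.2f}")
print(f"R2:  {r2:.4f}")
```

On a fixed test set the two metrics are linked: R2 equals 1 minus the MSE divided by the variance of the actual prices, so a model with a lower MSE also has a higher R2. MSE is expressed in squared price units, which is why its values are so much larger than the R2 scores.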