The **Netflix Dataset Model Application** was evaluated using two pairs of models: **Linear Regression** and **Lasso Regression** for predicting the release year of Netflix titles, and **Logistic Regression** and **Random Forest Classifier** for predicting their genre.

## Evaluation Metrics

- **Regression Metrics:**
  - **R² (Coefficient of Determination, Score Function):** Used as the score function for the regression models.
- **Classification Metrics:**
  - **Accuracy (Score Function):** Measures the proportion of correctly predicted instances out of all instances.
  - **Classification Report:** Includes Precision, Recall, and F1-Score for each genre, providing a detailed performance overview.
- **Cross-Validation Score:**
  - Utilized Scikit-learn's cross-validation to assess model performance across multiple data splits, as a check on the reliability and robustness of the results.
  - See [Scikit-learn: Cross Validation](https://scikit-learn.org/stable/modules/cross_validation.html)

## Comparison Results

The **R² Score** and **Cross-Validation Score** were analyzed for both regression models:

- **Linear Regression** generally performed better in terms of both metrics.
- **Lasso Regression**, while effective, showed slightly lower performance due to its regularization, which reduces overfitting but may compromise predictive accuracy for this dataset.

Based on this evaluation, **Linear Regression** is the better model for predicting the release year of Netflix titles using the dataset, as it outperforms Lasso Regression in terms of both R² Score and Cross-Validation Score.

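As a rough illustration of the R² and cross-validation scoring described above, the sketch below fits both regression models and reports a validation R² alongside a mean cross-validated R². The synthetic data from `make_regression`, the `alpha=1.0` setting, the 5 folds, and the helper name `evaluate_regressors` are all assumptions made for the example, not the project's actual data or configuration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the Netflix features and release-year target (assumption).
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def evaluate_regressors(models, cv=5):
    """Return {name: (validation R², mean cross-validated R²)} (hypothetical helper)."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # For scikit-learn regressors, .score() returns R².
        r2 = model.score(X_val, y_val)
        # cross_val_score clones the model and refits it on each fold.
        cv_r2 = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
        results[name] = (r2, cv_r2)
    return results


results = evaluate_regressors({
    "Linear Regression": LinearRegression(),
    "Lasso Regression": Lasso(alpha=1.0),  # alpha chosen only for illustration
})
for name, (r2, cv_r2) in results.items():
    print(f"{name}: R2={r2:.3f}, CV R2={cv_r2:.3f}")
```

On real data the gap between the two models depends on how strongly Lasso's penalty shrinks the coefficients, so `alpha` would normally be tuned before drawing a conclusion.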

The performance of **Logistic Regression** and **Random Forest Classifier** was analyzed based on their **Accuracy Score** and **Cross-Validation Score**:

- **Logistic Regression:**
  - **Accuracy Score:** Achieved higher accuracy on the validation set compared to Random Forest.
  - **Cross-Validation Score:** Demonstrated consistent and superior performance across all cross-validation folds, indicating robust generalization capabilities.
- **Random Forest Classifier:**
  - **Accuracy Score:** While effective, it showed slightly lower accuracy than Logistic Regression in this specific application.
  - **Cross-Validation Score:** Exhibited more variability across folds, suggesting potential overfitting issues despite its ensemble nature.

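A minimal sketch of how the accuracy and cross-validation comparison above might be run with Scikit-learn follows. The synthetic three-class data from `make_classification`, the hyperparameters, and the 5-fold setting are assumptions for illustration only. The standard deviation of the fold scores is one simple way to quantify variability across folds.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the Netflix features and genre labels (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
}

summary = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    acc = accuracy_score(y_val, y_pred)
    # One accuracy value per fold; the spread indicates stability across splits.
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    summary[name] = {
        "accuracy": acc,
        "cv_mean": fold_scores.mean(),
        "cv_std": fold_scores.std(),
    }
    print(f"{name}: accuracy={acc:.3f}, "
          f"CV={fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")
    # Per-class precision, recall, and F1-score on the validation set.
    print(classification_report(y_val, y_pred))
```

Which model comes out ahead on synthetic data says nothing about the Netflix dataset; the sketch only shows how the two scores in the comparison can be produced side by side.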
## Conclusion
Based on the evaluation, **Logistic Regression** is the better model for predicting the genre of Netflix titles using the dataset. It consistently outperforms the Random Forest Classifier in terms of both **Accuracy Score** and **Cross-Validation Score**. Additionally, Logistic Regression offers greater interpretability and computational efficiency, making it more suitable for integration into the application. While Random Forest remains a powerful classifier, the simplicity and reliability of Logistic Regression align more closely with the project's objectives of delivering accurate and understandable genre predictions.