@@ -185,4 +185,37 @@ Finally, the synthetic data is concatenated with the original dataset to create
Purpose: This step combines the original data and synthetic data, creating a larger, augmented dataset that will be used for model training.
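
The concatenation step can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses plain Python lists of dictionaries, and the `augment` helper, the `is_synthetic` flag, and the `close` and `volume` column names are all assumptions made for the example. With pandas DataFrames the same step is typically a `pd.concat([original_df, synthetic_df], ignore_index=True)` call.

```python
from typing import Dict, List

Row = Dict[str, float]


def augment(original: List[Row], synthetic: List[Row]) -> List[dict]:
    """Concatenate original and synthetic rows, tagging each row with its origin.

    The `is_synthetic` flag is a hypothetical addition: it lets later steps
    filter or compare the two sources after they have been merged.
    """
    tagged_original = [{**row, "is_synthetic": False} for row in original]
    tagged_synthetic = [{**row, "is_synthetic": True} for row in synthetic]
    # Original rows come first; synthetic rows are appended after them.
    return tagged_original + tagged_synthetic


# Hypothetical example rows; the column names are placeholders.
original_rows = [
    {"close": 101.2, "volume": 1500.0},
    {"close": 102.0, "volume": 1720.0},
]
synthetic_rows = [{"close": 101.6, "volume": 4100.0}]

augmented = augment(original_rows, synthetic_rows)
print(len(augmented))
```

Keeping an origin flag is optional, but it makes it easy to train or validate on the real rows only, which matters when the two sources are compared later.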
#### Conclusion
The synthetic data generation process adds rows to the dataset, giving the model more examples to learn from. Generating realistic noise for each feature and simulating market behavior through volume spikes is intended to help the model generalize to unseen data, but this only works if the synthetic data closely follows the real-world data distribution. In practice, after running multiple prediction models, the models trained on the real dataset performed better. The detailed analysis of the prediction models can be found under model selection, and the validation results below show the effect numerically.
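
The results in this document are reported as validation MSE (mean squared error): the average of the squared differences between the true and predicted values, where lower is better. The sketch below shows that computation in plain Python. The function name and the sample values are illustrative assumptions, not the project's code or data.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return the mean of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical validation targets and model predictions.
y_val = [10.0, 12.0, 14.0]
predictions = [11.0, 12.0, 12.0]

mse = mean_squared_error(y_val, predictions)
print(f"Validation MSE: {mse:.4f}")
```

Because the errors are squared, MSE is expressed in squared units of the target, and a few large errors can dominate the value.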
#### With real data, filtered after inspection in Data Wrangler

Model Validation MSEs:

- Linear Regression MSE: 0.0000
- Support Vector Regressor MSE: 158.9195
- Random Forest Regressor MSE: 40.6625
#### With synthetic data, filtered after inspection in Data Wrangler

Model Validation MSEs:

- Linear Regression MSE: 548.2481
- Support Vector Regressor MSE: 445.3477
- Random Forest Regressor MSE: 453.4784
#### With real data, filtered using the standard deviation method to remove outliers

Model Validation MSEs:

- Linear Regression MSE: 0.0000
- Support Vector Regressor MSE: 367.5236
- Random Forest Regressor MSE: 41.7337
#### With synthetic data, filtered using the standard deviation method to remove outliers

Model Validation MSEs:

- Linear Regression MSE: 485.6242
- Support Vector Regressor MSE: 383.9664
- Random Forest Regressor MSE: 400.4829
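
Two of the result sets above are filtered with a standard deviation rule for removing outliers. The document does not state the threshold or the columns used, so the following is only a sketch under assumptions: it drops rows whose value in one column lies more than `k` sample standard deviations from that column's mean. The `remove_outliers` helper, the `close` column name, and the example values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List

Row = Dict[str, float]


def remove_outliers(rows: List[Row], column: str, k: float = 3.0) -> List[Row]:
    """Keep rows whose `column` value is within k standard deviations of the mean."""
    values = [row[column] for row in rows]
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two rows
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(rows)
    return [row for row in rows if abs(row[column] - mu) <= k * sigma]


# Hypothetical prices with one obvious outlier (100.0).
prices = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 11.0, 10.0, 100.0]
rows = [{"close": p} for p in prices]

# k=2.0 is an assumed threshold chosen for this small sample.
cleaned = remove_outliers(rows, "close", k=2.0)
print(len(rows), len(cleaned))
```

Note that an extreme value inflates both the mean and the standard deviation, so on small samples a threshold of 3 can fail to flag it. The choice of `k` should be checked against the data.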