Folder Preprocessing:
- Shows, step by step, how the real data is preprocessed
- The 10 downloaded CSE-CIC-IDS2018 datasets are loaded and then combined
- Unnecessary columns are dropped
- The following preprocessing steps are carried out:
- dropping NaN, inf, -inf, and negative values
- converting the time to UNIX format
- aggregating the classes and saving them back into separate datasets, using this mapping:
- "BruteForce": ["FTP-BruteForce", "SSH-Bruteforce", "Brute Force -Web", "Brute Force -XSS"],
- "DoS": ["DoS attacks-GoldenEye", "DoS attacks-Slowloris", "DoS attacks-Hulk", "DoS attacks-SlowHTTPTest", "DDoS attacks-LOIC-HTTP", "DDOS attack-HOIC", "DDOS attack-LOIC-UDP"],
- "Infiltration": ["Infilteration"],
- "Bot": ["Bot"],
- "Benign": ["Benign"]
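
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess`, the column names `Label` and `Timestamp`, and the timestamp format `%d/%m/%Y %H:%M:%S` are assumptions and may need to be adjusted to match the loaded CSV files. Only the class mapping is taken directly from the list above.

```python
import numpy as np
import pandas as pd

# Class aggregation, copied from the mapping listed above.
CLASS_GROUPS = {
    "BruteForce": ["FTP-BruteForce", "SSH-Bruteforce", "Brute Force -Web", "Brute Force -XSS"],
    "DoS": ["DoS attacks-GoldenEye", "DoS attacks-Slowloris", "DoS attacks-Hulk",
            "DoS attacks-SlowHTTPTest", "DDoS attacks-LOIC-HTTP", "DDOS attack-HOIC",
            "DDOS attack-LOIC-UDP"],
    "Infiltration": ["Infilteration"],
    "Bot": ["Bot"],
    "Benign": ["Benign"],
}
# Invert the mapping: original label -> aggregated class name.
LABEL_TO_GROUP = {raw: group for group, raws in CLASS_GROUPS.items() for raw in raws}


def preprocess(df, label_col="Label", time_col="Timestamp",
               time_format="%d/%m/%Y %H:%M:%S"):
    """Clean a combined DataFrame and return one DataFrame per aggregated class.

    Column names and the time format are assumptions for this sketch.
    """
    # Parse timestamps; rows that cannot be parsed become NaT and are dropped.
    ts = pd.to_datetime(df[time_col], format=time_format, errors="coerce")
    valid = ts.notna()
    df = df[valid].copy()
    # Convert to UNIX time (whole seconds since 1970-01-01).
    df[time_col] = (ts[valid] - pd.Timestamp("1970-01-01")) // pd.Timedelta(seconds=1)

    # Treat inf and -inf as missing, then drop every row with a missing value.
    df = df.replace([np.inf, -np.inf], np.nan).dropna()

    # Drop rows that contain a negative value in any numeric column.
    numeric = df.select_dtypes(include="number")
    df = df[(numeric >= 0).all(axis=1)].copy()

    # Aggregate the original labels into the five classes; drop unknown labels.
    df[label_col] = df[label_col].map(LABEL_TO_GROUP)
    df = df.dropna(subset=[label_col])

    # One DataFrame per aggregated class, ready to be saved separately.
    return {name: group.reset_index(drop=True) for name, group in df.groupby(label_col)}
```

Each returned DataFrame could then be written out with `to_csv`, one file per class.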
Folder Models:
- contains 4 notebooks
- The SDV (GitHub repo name) notebook contains the CTGAN, CopulaGAN, and TVAE models; more info at:
- The Synthcity (GitHub repo name) notebook contains the RTVAE and ADSGAN models; more info at:
- The TabFairGAN notebook contains the TabFairGAN models; more info at:
- Combining_everthing contains the code that combines the synthetically generated datasets into a single dataset
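
As a rough illustration of the combining step, the sketch below concatenates several synthetic DataFrames into one. The function name `combine_synthetic` and the optional shuffle are assumptions made for this example; the notebook itself may combine the datasets differently.

```python
import pandas as pd


def combine_synthetic(frames, shuffle=True, seed=0):
    """Concatenate synthetic DataFrames (same columns) into a single DataFrame.

    Hypothetical helper for illustration; shuffling is an assumption,
    not something the notebook is known to do.
    """
    # Stack the frames row-wise and rebuild a clean 0..n-1 index.
    combined = pd.concat(frames, ignore_index=True)
    if shuffle:
        # Shuffle the rows reproducibly so the classes are interleaved.
        combined = combined.sample(frac=1, random_state=seed).reset_index(drop=True)
    return combined
```

For example, `combine_synthetic([synthetic_dos, synthetic_bot])` would return one DataFrame holding the rows of both inputs, where `synthetic_dos` and `synthetic_bot` are placeholder names for per-class generator outputs.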
Folder Classifiers:
- Contains the code for both the Random Forest and XGBoost classifiers, along with the preprocessing steps
- Each cell in the notebook is explained separately
- SDMetrics (used for evaluation) can be found at:
- table-evaluator (used for evaluation) can be found at:
- Both are found at: