diff --git a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf index 3b9c13139e0167b1cb2d9701865c1e35687f15a7..118fa2583762296419ec8e0cf8c4f55921008ad6 100644 Binary files a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf and b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf differ diff --git a/Thesis_Docs/main.tex b/Thesis_Docs/main.tex index 080f35f4ce7b42a51b04e8105107b3f696c14f18..a4dbdf8b16b793fe1daf871277fe22cc3b97b18e 100644 --- a/Thesis_Docs/main.tex +++ b/Thesis_Docs/main.tex @@ -768,11 +768,11 @@ The BAYWATCH extensions significantly enhance beacon detection accuracy by incor The enhancements in the DBAYWATCH framework, as detailed in this chapter, offer substantial improvements over the original BAYWATCH implementation. By reimplementing the base framework in Python and extending it with an advanced signal analysis pipeline, DBAYWATCH achieves improved accuracy and scalability in beacon detection. The comprehensive evaluation with both real and synthetic data underscores the critical impact of jitter on detection performance and provides clear guidelines for optimal parameter settings in practical network security applications. \chapter{Experiments and Discussions} -This chapter presents a comprehensive evaluation of the BAYWATCH framework to validate its efficacy in detecting malicious beaconing behavior in large-scale networks. The experiments are designed to address two objectives: first assessing the framework's robustness and accuracy under controlled noise conditions using synthetic datasets, and second evaluating its practical performance in real-world enterprise network environments. Synthetic data, generated with programmable noise levels and periodic patterns, enables systematic testing of BAYWATCH's core algorithms, such as the Fast Fourier Transform (FFT) and autocorrelation-based verification. Subsequently, the framework is deployed on a real-world dataset. 
This dual approach not only validates the theoretical soundness of the methodology but also demonstrates its scalability and operational feasibility. By synthesizing findings from both artificial and real-world scenarios, this chapter provides insights into BAYWATCH's strengths, limitations, and applicability in modern cybersecurity defense systems. +This chapter presents a comprehensive evaluation of the framework to validate its efficacy in detecting malicious beaconing behavior in large-scale networks. The experiments are designed to address two objectives: first, assessing the framework's robustness and accuracy under controlled noise conditions using synthetic datasets, and second, evaluating its practical performance in real-world enterprise network environments. Synthetic data, generated with programmable noise levels and periodic patterns, enables systematic testing of the framework's core algorithms, such as the Fast Fourier Transform (FFT) and autocorrelation-based verification. Subsequently, the framework is deployed on a real-world dataset. This dual approach not only validates the theoretical soundness of the methodology but also demonstrates its scalability and operational feasibility. By synthesizing findings from both artificial and real-world scenarios, this chapter provides insights into the framework's strengths, limitations, and applicability in modern cybersecurity defense systems. \section{Validation Steps} -The validation process in the BAYWATCH framework consists of three steps designed to identify malicious beaconing behavior. These steps ensure that only truly periodic and suspicious communication patterns are flagged, while minimizing false positives caused by noise or legitimate periodic traffic. The validation steps are as follows: +The validation process in the framework consists of three steps designed to identify malicious beaconing behavior. 
These steps ensure that only truly periodic and suspicious communication patterns are flagged, while minimizing false positives caused by noise or legitimate periodic traffic. The validation steps are as follows: \begin{enumerate} \item \textbf{FFT Candidate Detection with Power Threshold}: The first step involves applying the Fast Fourier Transform (FFT) to the time series of connection timestamps. The FFT converts the time-domain data into the frequency domain, revealing potential periodic patterns. A power threshold is then applied to filter out insignificant frequencies caused by noise. Only frequencies with amplitudes exceeding this threshold are retained as FFT candidates. @@ -796,29 +796,20 @@ After identifying candidate frequencies using the FFT, the BAYWATCH framework ve \subsection{Combination of FFT and ACF Results} -Figure \ref{fig:combinedall} shows the combined Frequency Spectrum with FFT \& ACF Candidates. The figure visualizes the results of the BAYWATCH framework’s frequency analysis for three domains: \texttt{fpc.msedge.net}, which is a URL from the real data but non-malicious; \texttt{m4v4r4c5.stackpathcdn.com}, which is a URL from the real data but exhibits malicious beaconing behavior; and \texttt{beacon8.com}, which is a synthetic URL with malicious beaconing behavior. - -The \textbf{x-axis} represents the frequency in Hertz (Hz), ranging from 0.000 to 0.21 Hz. The \textbf{y-axis} represents the amplitude of the frequency components, indicating the strength of the periodic signal at each frequency. - -For each domain, the \textbf{thinner line} represents the FFT spectrum, which highlights potential periodic behaviors in the frequency domain. The \textbf{dots} indicate the FFT candidates, which are frequencies with amplitudes exceeding the power threshold. Additionally, the \textbf{thicker lines} represent the autocorrelation function (ACF) candidates, which are frequencies confirmed by the ACF as having strong temporal consistency. 
Finally, the \textbf{combined candidates} are frequencies identified by both the FFT and ACF, indicating high-confidence beaconing behavior. The combined candidates are highlighted in the figure by 'X' markers, showing the agreement between the two methods in detecting malicious beaconing. This cross-validation ensures that only the most suspicious URLs are flagged for further investigation, reducing false positives and enhancing the framework’s accuracy. +The final step combines the results from the FFT and ACF steps to confirm malicious beaconing behavior. A URL is flagged as a beaconing candidate only if it is identified by both the FFT and ACF analyses. This cross-validation ensures that only high-confidence signals are detected, minimizing false positives and enhancing the framework's accuracy. \begin{figure} \centering - \includegraphics[width=\textwidth]{../Thesis_Docs/media/output.png} - \caption{Frequency Spectrum with FFT \& ACF Candidates. The x-axis represents frequency (Hz), and the y-axis represents amplitude. The figure shows FFT candidates, ACF candidates, and combined candidates for the domains \texttt{fpc.mesedge.net}, \texttt{m4v4+fc5.stackpathcdn.com}, and \texttt{beacon8.com}} + \includegraphics[width=\textwidth]{../Thesis_Docs/media/candidates.png} + \caption{Frequency Spectrum with FFT \& ACF Candidates. The x-axis represents frequency (Hz), and the y-axis represents amplitude. The figure shows candidates for the domains \texttt{fpc.mesedge.net}, \texttt{m4v4+fc5.stackpathcdn.com}, and \texttt{beacon7.example.com}.} \label{fig:combinedall} \end{figure} -Figure \ref{fig:combined} illustrates the combined candidate frequencies identified by the BAYWATCH framework for two domains: \texttt{m4v474c5.stackpathedn.com} and \texttt{beacon8.com}. The figure highlights the frequencies that were confirmed as beaconing candidates by both the Fast Fourier Transform (FFT) and the autocorrelation function (ACF). 
The \textbf{x-axis} represents the frequency in Hertz (Hz), while the \textbf{y-axis} represents the amplitude of the frequency components, indicating the strength of the periodic signal. +Figure \ref{fig:combinedall} presents the analysis of three selected URLs, \texttt{fpc.mesedge.net}, \texttt{m4v4+fc5.stackpathcdn.com}, and \texttt{beacon7.example.com}, derived from both real and synthetic data. The first URL represents non-beaconing behavior observed in real data, meaning that no periodic transmission pattern is present. The second URL, also extracted from real data, exhibits clear beaconing behavior. The third URL corresponds to a synthetic beacon, artificially generated to simulate a periodic transmission pattern. -For each domain, the \textbf{connected lines} represent the combined candidate frequencies, which are frequencies identified by both the FFT and ACF as exhibiting strong periodic behavior. These combined candidates are for detecting malicious beaconing, as they represent high-confidence signals that are unlikely to be caused by noise or legitimate traffic. As shown in the figure \ref{fig:combined}, the URL \texttt{fpc.mesedge.net} does not exhibit beaconing behavior, which is why it was not marked in the figure \ref{fig:combinedall} and is also absent as a candidate in figure \ref{fig:combined}. In contrast, the URLs \texttt{m4v4+fc5.stackpathcdn.com} and \texttt{beacon8.com} show clear periodic patterns, as evidenced by the combined candidates identified by the BAYWATCH framework. These results demonstrate the framework's ability to accurately detect malicious beaconing behavior in network traffic data. +The x-axis represents the frequency range, corresponding to different time intervals, while the y-axis indicates the amplitude of the detected signals. The results show that for the first URL, which does not exhibit beaconing behavior, very few significant points appear in the output, confirming the absence of strong periodic patterns. 
In contrast, the second URL, which originates from real data, displays periodic behavior with a transmission interval of 10 seconds. Similarly, the synthetic beacon demonstrates a periodicity of 20 seconds. -\begin{figure} - \centering - \includegraphics[width=\textwidth]{../Thesis_Docs/media/combined_output.png} - \caption{Combined Frequency Spectrum with FFT \& ACF Candidates. The x-axis represents frequency (Hz), and the y-axis represents amplitude. The figure shows combined candidates for the domains \texttt{m4v4+fc5.stackpathcdn.com}, and \texttt{beacon8.com}} - \label{fig:combined} -\end{figure} +Applying the detection algorithm to this dataset shows that it identifies periodic signals in both real and synthetic beaconing traffic. The results highlight the robustness of the method, demonstrating its ability to distinguish between beaconing and non-beaconing activity while accurately capturing different periodic transmission intervals. \section{Discussion} @@ -826,9 +817,13 @@ The BAYWATCH framework’s combination of Fast Fourier Transform (FFT) and autoc However, the framework has certain limitations. First, its reliance on historical data means it cannot detect zero-day beaconing behavior, as it requires a sufficient time window to analyze periodicity. Second, while the framework effectively filters out most noise, it occasionally flags legitimate periodic traffic (e.g., news feeds) as suspicious. This issue could be mitigated by integrating adaptive whitelisting mechanisms that dynamically update based on observed traffic patterns and threat intelligence feeds. -Future work could explore several directions to enhance the framework’s capabilities. Real-time streaming analysis could enable the detection of beaconing behavior as it occurs, rather than relying on historical data. 
Additionally, machine learning techniques could be integrated to improve the classification of legitimate and malicious periodic traffic, further reducing false positives. Finally, extending the framework to analyze other types of network traffic (e.g., DNS, NetFlow) could provide a more comprehensive approach to detecting advanced threats like APTs and botnets. +One critical aspect of evaluating the effectiveness of the proposed algorithm is its execution time. After applying the necessary preprocessing and filtering steps on the real data, the algorithm was executed, and the results were obtained in less than 10 seconds. This rapid response time demonstrates the efficiency of the implemented pipeline. + +The fast execution is largely attributed to the effectiveness of the preprocessing steps. By applying various filtering techniques beforehand, the data was already refined and structured, reducing computational complexity in the subsequent analysis. As a result, the algorithm was able to process the data efficiently, extracting periodic patterns without significant delays. + +Compared to traditional methods that may require extensive computational resources or longer processing times due to noise and redundant data, the proposed approach provides a streamlined and optimized solution. The ability to generate results within such a short timeframe highlights the algorithm’s suitability for real-time or near-real-time applications in network traffic analysis and beacon detection. -In conclusion, the BAYWATCH framework represents a significant step forward in the detection of malicious beaconing behavior. Its modular design, scalability, and high accuracy make it a practical tool for enterprise threat detection. By addressing its current limitations and exploring future enhancements, the framework could become an even more powerful component of modern cybersecurity defense systems. 
+In conclusion, the framework represents a significant step forward in the detection of malicious beaconing behavior. Its modular design, scalability, and high accuracy make it a practical tool for enterprise threat detection. By addressing its current limitations and exploring future enhancements, the framework could become an even more powerful component of modern cybersecurity defense systems. \chapter{Conclusion and Future Work} diff --git a/Thesis_Docs/media/FFT3.png b/Thesis_Docs/media/FFT3.png deleted file mode 100644 index 075636b5e7c178581d900e563a1e224b4b501f00..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/FFT3.png and /dev/null differ diff --git a/Thesis_Docs/media/auto3.png b/Thesis_Docs/media/auto3.png deleted file mode 100644 index b579b4b6982bf9b913036b7034f445628c7f9833..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/auto3.png and /dev/null differ diff --git a/Thesis_Docs/media/auto3_100.png b/Thesis_Docs/media/auto3_100.png deleted file mode 100644 index 233beb3cbfc589ae0c7c5669c5c958264248e1cc..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/auto3_100.png and /dev/null differ diff --git a/Thesis_Docs/media/auto3_1000.png b/Thesis_Docs/media/auto3_1000.png deleted file mode 100644 index 598d33828895e385a47c825faae0740fa3b430ed..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/auto3_1000.png and /dev/null differ diff --git a/Thesis_Docs/media/candidates.png b/Thesis_Docs/media/candidates.png new file mode 100644 index 0000000000000000000000000000000000000000..83724b8dc3052f072f4738beca45e2a7acb3f568 Binary files /dev/null and b/Thesis_Docs/media/candidates.png differ diff --git a/Thesis_Docs/media/combined_output.png b/Thesis_Docs/media/combined_output.png deleted file mode 100644 index b2a4250c56bfbb7d79232f2b7b61c0abc18b2e5d..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/combined_output.png and /dev/null differ 
diff --git a/Thesis_Docs/media/output.png b/Thesis_Docs/media/output.png deleted file mode 100644 index f9cd1c8c71e5d7f97724b054116e49d0d8d8d8b9..0000000000000000000000000000000000000000 Binary files a/Thesis_Docs/media/output.png and /dev/null differ