diff --git a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf
index 7e3a5a94b2e735b443278789e6c728b2b5ca72d9..0e3295d241468b265aae1a849fbdc8596ff2586e 100644
Binary files a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf and b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.pdf differ
diff --git a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.tex b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.tex
index a6d0fc2319e24a81e125bb98968a1130546ea184..3d18bfbb1a916999b5f01775ecd6a8458bcd0db8 100644
--- a/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.tex
+++ b/Thesis_Docs/Nikkhah_Nasab-Aida-Mastersthesis.tex
@@ -102,11 +102,13 @@
 \chapter*{Abstract}
 \addcontentsline{toc}{chapter}{Abstract}
 
-In today’s interconnected digital landscape, Advanced Persistent Threats (APTs) exploit stealthy beaconing behavior to evade detection, posing significant risks to enterprise networks. This thesis investigates the performance of the BAYWATCH framework in identifying APTs by analyzing periodic communication patterns within extensive network log data.
+In today’s interconnected digital landscape, Advanced Persistent Threats (APTs) exploit stealthy beaconing behavior to evade detection, posing significant risks to enterprise networks. These sophisticated cyber threats can infiltrate systems, remain undetected for extended periods, and exfiltrate sensitive data, making them a formidable challenge for cybersecurity professionals. This thesis investigates the performance of the BAYWATCH framework in identifying APTs by analyzing periodic communication patterns within extensive network log data, aiming to enhance early detection and mitigation strategies.
 
-The study employs a signal analysis pipeline that combines Fast Fourier Transform (FFT) for frequency-domain detection with autocorrelation function (ACF) for time-domain verification. This dual approach ensures robust identification of periodicities, even under noisy conditions. To systematically evaluate resilience, synthetic datasets with programmable jitter (2–150 seconds) and beacon intervals (10–300 seconds) are generated, alongside validation using real-world enterprise network traces. Key innovations include permutation-based FFT thresholding, bandpass filtering, and frequency-lag correlation, collectively improving detection accuracy while minimizing false positives.
+This thesis offers a comprehensive examination of the BAYWATCH framework, a system designed to detect stealthy beaconing behavior by monitoring and analyzing communication patterns in network log data, applied to both real-world and synthetic datasets. The research presents the theoretical underpinnings of BAYWATCH, outlining its algorithmic architecture, essential components, and the methods it uses for anomaly detection and periodic pattern recognition. Through a systematic evaluation, the study assesses the framework’s performance in controlled experimental settings and its effectiveness in complex, real-world scenarios.
 
-This work contributes a scalable, efficient solution for early APT detection, validated in both controlled and operational environments. Future directions include real-time streaming analysis, machine learning integration for anomaly detection, and extension to IoT and cloud infrastructures. The thesis advances proactive cybersecurity strategies, offering a practical tool to safeguard large-scale networks against evolving threats.
+The study employs a comprehensive signal analysis pipeline that combines Fast Fourier Transform (FFT) for frequency-domain detection with autocorrelation function (ACF) for time-domain verification. This dual approach ensures robust identification of periodicities, even under noisy conditions. To systematically evaluate the resilience and effectiveness of the BAYWATCH framework, synthetic datasets with programmable jitter (ranging from 2 to 150 seconds) and beacon intervals (spanning 10 to 300 seconds) are generated. These synthetic datasets are complemented by validation using real-world enterprise network traces, providing a thorough assessment of the framework's capabilities in diverse operational environments.
+
+The insights gained from this research contribute to a deeper understanding of data monitoring systems and offer practical recommendations for future improvements, thereby advancing the application of intelligent data analysis techniques in both academic research and industry practice.
 
 \tableofcontents
 
diff --git a/Thesis_Docs/main.tex b/Thesis_Docs/main.tex
index 684331cf1d5a2f527d7100731987686236a62b05..c88398f547dc2d0f54c12d88090d97c30f300001 100644
--- a/Thesis_Docs/main.tex
+++ b/Thesis_Docs/main.tex
@@ -497,7 +497,13 @@ Analyzing the time intervals between URL requests is important for identifying p
     \label{fig:timeintervallog}
 \end{figure}
 
-Figure \ref{fig:timeintervallog} illustrates the distribution of time intervals between URL requests, with the Y-axis displayed on a logarithmic scale. The X-axis represents time intervals in seconds, divided into 65 bins, where each bin corresponds to a one-second interval ranging from 0 to 65 seconds. The use of a logarithmic scale on the Y-axis is particularly useful for visualizing the wide range of request counts. By compressing the scale for higher values and expanding it for lower values, the logarithmic scale enables a clearer and more detailed comparison of the frequency of requests across different time intervals. The visualization reveals a consistent pattern where the number of requests decreases as the time interval between them increases. However, there is a noticeable spike in the number of requests at every 10-second interval, suggesting periodicity in user behavior. This periodicity could be indicative of regular user activities, such as polling mechanisms, automated updates, or recurring checks for new information. These behaviors are common in legitimate network traffic and can help establish a baseline for normal activity. The identification of such periodic patterns is important in network traffic analysis, as it helps differentiate between regular activity and potential malicious behavior. For instance, if a URL exhibits similar periodic patterns but with irregular or unexpected intervals, it could be a sign of beaconing—a technique often used by malware to maintain communication with a command-and-control (C2) server. In this case, the analysis could reveal anomalies in the intervals that deviate from expected patterns, potentially indicating a botnet or other malicious activity. By comparing these patterns against known baselines of legitimate traffic, it becomes easier to identify and flag suspicious requests for further investigation.
+Figure \ref{fig:timeintervallog} illustrates the distribution of time intervals between URL requests, with the Y-axis displayed on a logarithmic scale. The X-axis represents time intervals in seconds, divided into 65 bins, where each bin corresponds to a one-second interval ranging from 0 to 65 seconds. 
+
+A logarithmic scale on the Y-axis is particularly useful for visualizing the wide range of request counts. By compressing higher values and expanding lower ones, it enables a clearer and more detailed comparison of request frequencies across different time intervals.
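+
+To make the construction of this histogram concrete, the following minimal Python sketch computes the per-URL inter-request intervals and plots them as a histogram with one-second bins and a logarithmic Y-axis. The file name and column names (\texttt{timestamp}, \texttt{url}) are illustrative assumptions rather than the exact schema of the dataset analyzed here.
+
+\begin{verbatim}
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+
+# Illustrative schema: one log row per request, with a timestamp and
+# the URL that was contacted.
+log = pd.read_csv("proxy_log.csv", parse_dates=["timestamp"])
+
+# Inter-request intervals in seconds, computed separately for each URL.
+log = log.sort_values(["url", "timestamp"])
+intervals = (
+    log.groupby("url")["timestamp"]
+       .diff()
+       .dt.total_seconds()
+       .dropna()
+)
+
+# 65 one-second bins from 0 to 65 seconds, logarithmic Y-axis.
+plt.hist(intervals, bins=np.arange(0, 66))
+plt.yscale("log")
+plt.xlabel("Time interval between requests (seconds)")
+plt.ylabel("Number of requests (log scale)")
+plt.show()
+\end{verbatim}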
+
+The visualization reveals a consistent pattern where the number of requests decreases as the time interval between them increases. However, there is a noticeable spike in the number of requests at every 10-second interval, suggesting periodicity in user behavior. This periodicity could be indicative of regular user activities, such as polling mechanisms, automated updates, or recurring checks for new information. These behaviors are common in legitimate network traffic and can help establish a baseline for normal activity. 
+
+The identification of such periodic patterns is important in network traffic analysis, as it helps differentiate between regular activity and potential malicious behavior. For instance, if a URL exhibits similar periodic patterns but with irregular or unexpected intervals, it could be a sign of beaconing—a technique often used by malware to maintain communication with a command-and-control (C2) server. In this case, the analysis could reveal anomalies in the intervals that deviate from expected patterns, potentially indicating a botnet or other malicious activity. By comparing these patterns against known baselines of legitimate traffic, it becomes easier to identify and flag suspicious requests for further investigation.
 
 \begin{figure}
     \centering
@@ -506,7 +512,13 @@ Figure \ref{fig:timeintervallog} illustrates the distribution of time intervals
     \label{fig:timeintervallogmin}
 \end{figure}
 
-Figure \ref{fig:timeintervallogmin} extends the analysis of time intervals between URL requests to a larger time scale, with the X-axis each representing a one-minute interval, except for the last bin, which aggregates data from intervals longer than 31 minutes. To avoid losing beaconing data at the edges, each bin spans ±30 seconds; for example, the 1-minute bin represents data from 30 to 90 seconds. The Y-axis remains on a logarithmic scale, ensuring that both high-frequency and low-frequency intervals are visible and can be compared effectively. This use of a logarithmic scale enables the identification of trends across various time scales, making it a powerful tool for understanding patterns in network traffic. Similar to the analysis presented in Figure \ref{fig:timeintervallog}, the visualization reveals a decreasing trend in the number of requests as the time interval between them increases. This suggests that user interactions are typically clustered within shorter time intervals, with longer gaps between requests. However, a notable spike in request frequency appears every 5 minutes, indicating a periodic pattern at a larger time scale. This periodicity is consistent across all URLs in the dataset, suggesting that it represents a common behavior such as scheduled tasks, automated updates, or regular user interactions. These spikes could correspond to routine activities in many systems or applications that are configured to perform tasks at fixed intervals—such as background data synchronization, refresh cycles, or regular system health checks. The observed periodic behavior is particularly significant in the context of detecting malicious beaconing activity. Malicious software, including botnets and malware, often utilizes similar periodic behavior to maintain communication with command-and-control (C2) servers, operating at regular intervals. By identifying these regular spikes in request frequency, organizations can establish a baseline for normal network behavior and detect any deviations that might indicate unauthorized or suspicious activities. The consistent periodicity observed across the dataset could thus serve as a key indicator for detecting potential threats and taking proactive security measures. The logarithmic scale is important for effectively visualizing the wide range of time intervals and request counts. The logarithmic scale compresses the scale for higher values and expands it for lower values, allowing for a more balanced view of both common and rare events. This enhanced visualization capability enables a clearer understanding of the temporal dynamics of user interactions and supports the identification of periodic patterns, which are important for detecting stealthy beaconing behavior in network traffic. Ultimately, this approach aids in distinguishing between normal and abnormal patterns, enhancing the framework’s ability to identify potential security threats.
+Figure \ref{fig:timeintervallogmin} extends the analysis of time intervals between URL requests to a larger time scale, with each bin on the X-axis representing a one-minute interval, except for the last bin, which aggregates all intervals longer than 31 minutes. To avoid losing beaconing data at the bin edges, each bin spans ±30 seconds; for example, the 1-minute bin covers intervals from 30 to 90 seconds. The Y-axis remains on a logarithmic scale, ensuring that both high-frequency and low-frequency intervals are visible and can be compared effectively. This use of a logarithmic scale enables the identification of trends across various time scales, making it a powerful tool for understanding patterns in network traffic.
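+
+As an illustration of the minute-scale binning, the sketch below reuses the \texttt{intervals} series from the previous snippet and places bin edges 30 seconds on either side of each whole minute, with a final catch-all bin; the exact edge placement is an assumption based on the description above.
+
+\begin{verbatim}
+import numpy as np
+import matplotlib.pyplot as plt
+
+# "intervals" holds the per-URL inter-request gaps in seconds (see the
+# previous sketch). Edges at 0.5, 1.5, ..., 31.5 minutes centre one bin
+# on each whole minute from 1 to 31 (the 1-minute bin covers 30-90
+# seconds); a final edge collects all longer intervals.
+edges = np.append(np.arange(0.5, 32.0) * 60,
+                  max(intervals.max() + 1, 32 * 60))
+counts, _ = np.histogram(intervals, bins=edges)
+
+labels = [str(m) for m in range(1, 32)] + [">31"]
+plt.bar(labels, counts)
+plt.yscale("log")
+plt.xlabel("Time interval between requests (minutes)")
+plt.ylabel("Number of requests (log scale)")
+plt.show()
+\end{verbatim}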
+
+Similar to the analysis presented in Figure \ref{fig:timeintervallog}, the visualization reveals a decreasing trend in the number of requests as the time interval between them increases. This suggests that user interactions are typically clustered within shorter time intervals, with longer gaps between requests. However, a notable spike in request frequency appears every 5 minutes, indicating a periodic pattern at a larger time scale. This periodicity is consistent across all URLs in the dataset, suggesting that it represents a common behavior such as scheduled tasks, automated updates, or regular user interactions. These spikes could correspond to routine activities in many systems or applications that are configured to perform tasks at fixed intervals—such as background data synchronization, refresh cycles, or regular system health checks. 
+
+The observed periodic behavior is particularly significant in the context of detecting malicious beaconing activity. Malicious software, such as botnet clients, often relies on similar periodic behavior to maintain communication with command-and-control (C2) servers, operating at regular intervals. By identifying these regular spikes in request frequency, organizations can establish a baseline for normal network behavior and detect any deviations that might indicate unauthorized or suspicious activities. The consistent periodicity observed across the dataset could thus serve as a key indicator for detecting potential threats and taking proactive security measures.
+
+The logarithmic scale is also important for effectively visualizing the wide range of time intervals and request counts: by compressing higher values and expanding lower ones, it allows a more balanced view of both common and rare events. This enhanced visualization capability enables a clearer understanding of the temporal dynamics of user interactions and supports the identification of periodic patterns, which are important for detecting stealthy beaconing behavior in network traffic. Ultimately, this approach aids in distinguishing between normal and abnormal patterns, enhancing the framework’s ability to identify potential security threats.
 
 \section{Distribution of Hosts Based on Unique URLs Contacted}
 Understanding the interaction patterns of hosts within the network is important for identifying key services, detecting anomalies, and optimizing network performance. By analyzing the distribution of hosts based on the number of unique URLs they contacted, insights can be gained into the concentration of network activity and the diversity of services being accessed. This analysis helps highlight the most active hosts and their browsing behaviors, providing valuable information for pinpointing critical network resources, determining high-traffic users, and identifying potential security concerns. For example, an unusually high number of unique URL requests from a single host may indicate an abnormal pattern, which could suggest automated processes or even malicious behavior. By focusing on the number of unique URLs accessed by each host, this section offers a clear understanding of how traffic is distributed across the network and how hosts interact with various services. Additionally, this analysis aids in understanding the level of engagement with different network segments, assisting network administrators in optimizing resource allocation and managing network load during peak times.
@@ -518,7 +530,11 @@ Understanding the interaction patterns of hosts within the network is important
     \label{fig:ip}
 \end{figure}
 
-Figure \ref{fig:ip} illustrates the distribution of hosts (IP addresses) based on the number of unique URLs they contacted. The X-axis represents the number of unique URLs, ranging from 1 to 15, while the Y-axis shows the count of hosts within each category. The visualization highlights that the majority of hosts interact with only a small number of unique URLs. Specifically, approximately 17,500 hosts contacted exactly two unique URLs, while around 15,000 hosts interacted with only one unique URL. As the number of unique URLs increases, the number of hosts decreases significantly, although there are still many hosts contacting more than a few URLs. This pattern suggests that network activity is highly concentrated around a small set of destinations, with most hosts accessing only a limited range of resources. For example, hosts that contact only one or two unique URLs are likely interacting with essential services such as internal tools, authentication servers, or frequently accessed websites. In contrast, hosts contacting a larger number of unique URLs may represent more diverse or specialized activities, such as administrators, developers, or automated systems performing a variety of tasks across the network. This distribution of host behavior emphasizes the importance of leveraging whitelists to filter out known legitimate traffic, ensuring that analysis can focus on detecting potentially suspicious activities. The concentration of network traffic on a limited set of URLs also carries significant implications for network monitoring and security. By identifying the most frequently accessed URLs, organizations can prioritize security measures for resources that are most likely to be targeted by malicious actors. URLs that experience high traffic are often the focal points of cyberattacks, such as phishing schemes, malware distribution, or command-and-control (C2) communication. By directing attention to these critical resources, organizations can enhance their ability to detect and mitigate emerging threats. Additionally, monitoring the distribution of hosts based on the number of unique URLs they access can help identify anomalous behavior. For instance, a host that unexpectedly begins contacting a large number of unique URLs could indicate suspicious activity, such as a compromised device engaged in reconnaissance or data exfiltration. Establishing a baseline for normal host behavior allows organizations to more effectively identify deviations that may require further investigation, enhancing overall network security.
+Figure \ref{fig:ip} illustrates the distribution of hosts (IP addresses) based on the number of unique URLs they contacted. The X-axis represents the number of unique URLs, ranging from 1 to 15, while the Y-axis shows the count of hosts within each category. The visualization highlights that the majority of hosts interact with only a small number of unique URLs. Specifically, approximately 17,500 hosts contacted exactly two unique URLs, while around 15,000 hosts interacted with only one unique URL. As the number of unique URLs increases, the number of hosts decreases significantly, although there are still many hosts contacting more than a few URLs. 
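+
+As an illustration of how this distribution can be derived, the following minimal sketch counts the distinct URLs contacted by each host and plots how many hosts fall into each category; the \texttt{host} and \texttt{url} column names, like the file name, are assumptions about the log schema rather than the actual fields of the dataset.
+
+\begin{verbatim}
+import pandas as pd
+import matplotlib.pyplot as plt
+
+# Illustrative schema: one log row per request, with the requesting
+# host (source IP) and the URL it contacted.
+log = pd.read_csv("proxy_log.csv")
+
+# Number of distinct URLs contacted by each host.
+urls_per_host = log.groupby("host")["url"].nunique()
+
+# How many hosts fall into each category from 1 to 15 unique URLs.
+distribution = urls_per_host[urls_per_host <= 15].value_counts().sort_index()
+
+distribution.plot(kind="bar")
+plt.xlabel("Number of unique URLs contacted")
+plt.ylabel("Number of hosts")
+plt.show()
+\end{verbatim}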
+
+This pattern suggests that network activity is highly concentrated around a small set of destinations, with most hosts accessing only a limited range of resources. For example, hosts that contact only one or two unique URLs are likely interacting with essential services such as internal tools, authentication servers, or frequently accessed websites. In contrast, hosts contacting a larger number of unique URLs may represent more diverse or specialized activities, such as administrators, developers, or automated systems performing a variety of tasks across the network. This distribution of host behavior emphasizes the importance of leveraging whitelists to filter out known legitimate traffic, ensuring that analysis can focus on detecting potentially suspicious activities. The concentration of network traffic on a limited set of URLs also carries significant implications for network monitoring and security. 
+
+By identifying the most frequently accessed URLs, organizations can prioritize security measures for resources that are most likely to be targeted by malicious actors. URLs that experience high traffic are often the focal points of cyberattacks, such as phishing schemes, malware distribution, or command-and-control (C2) communication. By directing attention to these critical resources, organizations can enhance their ability to detect and mitigate emerging threats. Additionally, monitoring the distribution of hosts based on the number of unique URLs they access can help identify anomalous behavior. For instance, a host that unexpectedly begins contacting a large number of unique URLs could indicate suspicious activity, such as a compromised device engaged in reconnaissance or data exfiltration. Establishing a baseline for normal host behavior allows organizations to more effectively identify deviations that may require further investigation, enhancing overall network security.
 
 \textbf{Analysis of URL Connections}
 
@@ -555,7 +571,7 @@ All the hosts in one day are 208,516; however, until now, only 61,207 hosts have
 \end{itemize}
 
 \section{Summary}
-The data analysis presented in this chapter offers a detailed and comprehensive examination of the dataset's structure, user behavior, and network interactions. By utilizing a variety of visualization tools and statistical methods, the chapter identifies and uncovers key patterns that not only contribute to a better understanding of the data but also provide actionable insights for optimizing network performance and enhancing security measures. The analysis begins with a focus on URL request counts, offering a clear view of the frequency and distribution of web traffic. This helps highlight which URLs are most frequently accessed by hosts within the network, shedding light on the overall popularity of various resources. Understanding the distribution of these request counts is for determining which URLs should be prioritized in network monitoring and security management. The high-traffic URLs, in particular, are often more susceptible to attacks, such as phishing, malware distribution, or even DDoS attacks. By recognizing these hotspots, network administrators can more effectively allocate resources to ensure that these critical URLs are properly secured and monitored. Further investigation into the 24-hour visit patterns of hosts reveals how user activity is distributed across time. By analyzing these temporal patterns, the chapter sheds light on peak usage times, user behavior trends, and possible anomalies. A close examination of these patterns provides a deeper understanding of when the network is most active and helps detect deviations that might indicate unusual or malicious behavior. For instance, atypical spikes in activity at specific hours of the day could signal security incidents such as bot traffic or unauthorized access attempts. This aspect of the analysis is for optimizing network resources and managing traffic loads during high-usage periods, ensuring the network's stability and performance. Another aspect of the analysis involves the time intervals between requests. This segment of the study reveals how hosts interact with the network, providing insights into the frequency of user requests and the temporal gaps between them. This can help identify periodic or repetitive behavior, which may indicate underlying issues such as inefficient resource usage or even intentional attempts at evading detection. The analysis of time intervals is for identifying malicious activities, such as beaconing—a pattern in which an infected device sends regular, seemingly benign requests to a specific URL to maintain communication with a command-and-control server. Detecting such behaviors can play an important role in early-stage threat detection, as it allows for the identification of compromised devices or ongoing cyberattacks before they escalate. The distribution of hosts based on the number of unique URLs they contact provides a further layer of insight into user and network behavior. This analysis highlights the concentration of network activity and reveals how different hosts interact with various resources. For example, some hosts may only contact a limited number of URLs, often related to essential services, while others might interact with a broader set of resources. The latter group may represent specialized functions or more complex network activities. By understanding the distribution of hosts across different sets of URLs, organizations can better prioritize their security efforts and ensure that high-risk activities are closely monitored. 
-This distribution can also help distinguish between normal and anomalous behaviors, offering clues about potential security threats or misconfigurations within the network. Collectively, these findings emphasize the importance of focusing on high-traffic URLs and understanding the temporal patterns in user activity. By identifying periodic behaviors or unusual request intervals, it becomes possible to detect anomalies that could indicate malicious intent or system vulnerabilities. The insights provided by this analysis are important for creating more effective detection mechanisms within the BAYWATCH framework, laying a strong foundation for the development of robust network security tools and strategies. The use of advanced visualization techniques and statistical analysis in this chapter is instrumental in uncovering these patterns. These tools provide a clear and intuitive way to visualize complex data sets, helping to identify trends and outliers that may otherwise go unnoticed. This approach not only contributes to a deeper understanding of the dataset but also facilitates the identification of areas that require further investigation or intervention. By offering a comprehensive view of the network's structure and behavior, this chapter provides a solid foundation for enhancing network security, improving performance, and developing more effective detection and mitigation mechanisms for potential threats. In conclusion, the data analysis conducted in this chapter offers a thorough understanding of network dynamics, highlighting key areas for improvement in both security and performance optimization. By examining the dataset's structure, user behavior, and network interactions through various lenses, this chapter delivers valuable insights that can guide future research and the implementation of more sophisticated network management strategies. These findings are for building a proactive security posture, ensuring the network remains resilient against evolving threats while maintaining optimal performance.
+The data analysis presented in this chapter provides a comprehensive understanding of the dataset’s structure, user behavior, and network interactions. By visualizing URL request counts, analyzing 24-hour visit patterns, examining time intervals between requests, and studying the distribution of hosts, this chapter uncovers key insights that can inform network optimization and security strategies. The findings highlight the importance of focusing on high-traffic URLs, understanding temporal patterns in user activity, and detecting periodic behavior that may indicate malicious beaconing. These insights lay the foundation for further analysis and the development of effective detection mechanisms in the BAYWATCH framework, providing a solid basis for enhancing both network security and performance.
 
 \chapter{Implementation}