Commit f4af2b3a authored by Aida Nikkhah Nasab

add new Mastersthesis.pdf and Mastersthesis.blg files; update references.bib with additional sources
parent 86b96777
Pipeline #57877 failed
This is BibTeX, Version 0.99d (TeX Live 2022/dev/Debian)
Capacity: max_strings=200000, hash_size=200000, hash_prime=170003
The top-level auxiliary file: Nikkhah_Nasab-Aida-Mastersthesis.aux
The style file: IEEEtran.bst
Reallocated singl_function (elt_size=4) to 100 items from 50.
Reallocated singl_function (elt_size=4) to 100 items from 50.
Reallocated singl_function (elt_size=4) to 100 items from 50.
Reallocated wiz_functions (elt_size=4) to 6000 items from 3000.
Reallocated singl_function (elt_size=4) to 100 items from 50.
Database file #1: ../Thesis_Docs/sources/references.bib
-- IEEEtran.bst version 1.14 (2015/08/26) by Michael Shell.
-- http://www.michaelshell.org/tex/ieeetran/bibtex/
-- See the "IEEEtran_bst_HOWTO.pdf" manual for usage information.
Done.
You've used 25 entries,
4087 wiz_defined-function locations,
962 strings with 13639 characters,
and the built_in function-call counts, 22307 in all, are:
= -- 1697
> -- 639
< -- 184
+ -- 347
- -- 123
* -- 1077
:= -- 3093
add.period$ -- 62
call.type$ -- 25
change.case$ -- 27
chr.to.int$ -- 447
cite$ -- 25
duplicate$ -- 1561
empty$ -- 1857
format.name$ -- 140
if$ -- 5278
int.to.chr$ -- 0
int.to.str$ -- 25
missing$ -- 289
newline$ -- 100
num.names$ -- 25
pop$ -- 727
preamble$ -- 1
purify$ -- 0
quote$ -- 2
skip$ -- 1693
stack$ -- 0
substring$ -- 1113
swap$ -- 1292
text.length$ -- 35
text.prefix$ -- 0
top$ -- 5
type$ -- 25
warning$ -- 0
while$ -- 107
width$ -- 27
write$ -- 259
File added
......@@ -13,77 +13,101 @@ The research is guided by several key questions, including: How can beaconing be
The thesis begins by establishing the background and core concepts needed to understand network security and periodicity detection. A review of related work then places the research within the broader field, and the methodology chapter details the techniques introduced in the framework. Chapter 5, Data Analysis, explores real-world network log data to uncover patterns and insights related to beaconing behavior, setting the stage for subsequent evaluations. Chapter 6 describes the procedures and techniques used to generate synthetic beaconing data, which is used to validate the detection framework under controlled conditions. Chapter 7, Evaluation and Results, investigates and compares the framework’s performance on both real and synthetic data, summarizes key findings and contributions, and discusses potential improvements. Finally, Chapter 8, Conclusions and Future Work, presents the overall conclusions of the research, outlines the contributions made, and proposes directions for future research in network security.
\chapter{Background}
This chapter provides the foundational knowledge for understanding the context and significance of this research. It begins with an overview of the cybersecurity landscape and Advanced Persistent Threats (APTs), followed by enterprise network vulnerabilities. It then explains periodicity detection techniques, time-series databases like InfluxDB, and concludes with the BAYWATCH framework. These concepts are critical for detecting beaconing behavior in enterprise networks.
This chapter provides the foundational knowledge necessary for understanding the context and significance of this research. It begins with an overview of the cybersecurity landscape, emphasizing the current state, emerging trends, and persistent challenges faced by organizations. It then explores Advanced Persistent Threats (APTs) and their sophisticated, covert tactics that pose significant risks to enterprise networks. The discussion also covers the concept of periodicity in network communication, which is essential for detecting anomalies in cybersecurity contexts. In addition, the chapter presents the role of time series databases, with a specific focus on InfluxDB, in managing and analyzing the vast amounts of data generated in cybersecurity operations. Finally, the chapter introduces the BAYWATCH framework, which serves as the foundation for the research by providing a structured approach to detecting beaconing behavior in network traffic.
The field of cybersecurity is continually evolving, with new threats emerging as technology advances. Understanding these threats and the strategies to counter them is essential for protecting sensitive information, ensuring the continuity of operations, and maintaining the integrity of enterprise networks. This chapter lays the foundation for the research by discussing key concepts and technologies relevant to cybersecurity, setting the stage for the detailed analysis and solutions proposed in subsequent chapters.
\section{Cybersecurity Landscape}
The cybersecurity landscape is characterized by dynamic and evolving threats, including malware, ransomware, and APTs. Organizations face challenges in protecting networks due to increasing digitization, cloud adoption, and IoT proliferation. Figure \ref{fig:maps} illustrates the global distribution of cyber threats.
The cybersecurity landscape is characterized by a dynamic and increasingly complex environment where various types of cyber threats continually evolve. Organizations across the globe face numerous challenges in protecting their networks, data, and systems from these threats, which range from malware and ransomware to sophisticated nation-state attacks.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{../Thesis_Docs/media/maps.png}
\caption{Global cybersecurity threat map \cite{bitdefender}.}
\label{fig:maps}
\end{figure}
Cybersecurity encompasses a wide range of practices, technologies, and strategies aimed at safeguarding information and systems from unauthorized access, damage, or disruption. It involves both proactive measures, such as implementing robust security architectures and practices, and reactive measures, such as incident response and recovery strategies. The cybersecurity landscape is shaped by various factors, including the rapid digitization of industries, the proliferation of cloud services, the Internet of Things (IoT), and the increasing sophistication of cyber attacks.
\subsection{Emerging Trends and Challenges}
Key challenges include:
\begin{itemize}
\item \textbf{Ransomware-as-a-Service (RaaS):} Lowering the barrier for attackers \cite{ransomware2022}.
\item \textbf{Skills Gap:} Shortage of skilled professionals \cite{cybersecurity_skills_gap}.
\end{itemize}
The rapid digitization of industries, the increasing reliance on cloud services, and the proliferation of Internet of Things (IoT) devices have significantly expanded the attack surface for cyber threats. These developments, while beneficial, have introduced new vulnerabilities that attackers are quick to exploit. Additionally, the rise of ransomware as a service (RaaS) and the growing sophistication of phishing attacks reflect the evolving threat landscape.
\section{Advanced Persistent Threats (APTs)}
APTs are prolonged, stealthy attacks often state-sponsored. Figure \ref{fig:apt_attack_lifecycle} shows their lifecycle.
Another significant challenge is the shortage of skilled cybersecurity professionals, which hampers the ability of organizations to effectively defend against these threats. This gap is exacerbated by the complexity of modern networks and the need for advanced tools and techniques to detect and mitigate sophisticated attacks.
\section{Advanced Persistent Threats (APTs) and Covert Tactics}
Advanced Persistent Threats (APTs) represent one of the most sophisticated and dangerous forms of cyber attacks. APTs involve prolonged, targeted efforts by attackers, typically state-sponsored or highly organized criminal groups, aimed at stealing sensitive information, disrupting operations, or compromising infrastructure. Unlike traditional cyber attacks, which may be opportunistic and short-lived, APTs are characterized by their stealth, persistence, and the significant resources devoted to them.
\begin{figure}[htbp]
\begin{figure}
\centering
\includegraphics[width=\textwidth]{../Thesis_Docs/media/apt_attack_lifecycle.png}
\caption{APT attack lifecycle \cite{charan2021dmpt}.}
\caption{APT attack lifecycle \cite{charan2021dmapt}}
\label{fig:apt_attack_lifecycle}
\end{itemize}
\end{figure}
Figure \ref{fig:apt_attack_lifecycle} illustrates the lifecycle of an APT attack, highlighting the various stages involved, from initial reconnaissance to exfiltration of data. Understanding these stages is crucial for developing effective detection and mitigation strategies.
APT actors employ various covert tactics to remain undetected and achieve their objectives. Some of these tactics include:
\subsection{Covert Tactics}
\begin{itemize}
\item Spear phishing \cite{spear_phishing}.
\item Zero-day exploits \cite{zero_day}.
\item Command-and-Control (C2) communication \cite{c2_communication}.
\item \textbf{Spear Phishing:} Crafting highly personalized email messages that appear legitimate to the recipient. These emails are designed to trick recipients into clicking on malicious links or attachments, leading to the compromise of their credentials or systems \cite{caputo2013going}.
\item \textbf{Zero-Day Exploits:} Exploiting previously unknown vulnerabilities in software or hardware, which have not yet been patched by the vendor. This allows attackers to gain unauthorized access to systems without triggering existing security defenses \cite{bilge2012before}.
\item \textbf{Lateral Movement:} After gaining initial access, attackers move within the compromised network, exploring and compromising additional systems to find and exfiltrate valuable data. This tactic often involves the use of legitimate administrative tools to avoid detection.
\item \textbf{Command and Control (C2):} Establishing a secure communication channel with the compromised systems to remotely control them, issue commands, and exfiltrate data \cite{eisenberg2018network}.
\end{itemize}
\section{Enterprise Networks}
Enterprise networks (Figure \ref{fig:enterprise_network_diagram}) are vulnerable to insider threats, misconfigurations, and supply chain attacks \cite{supply_chain_attacks}.
Enterprise networks are the backbone of modern organizations, providing the necessary infrastructure for communication, data sharing, and operational efficiency. However, their complexity and scale make them attractive targets for cyber attackers. Understanding the architecture, components, and vulnerabilities of enterprise networks is crucial for developing effective cybersecurity strategies.
\begin{figure}[htbp]
\centering
\includegraphics[width=0.7\textwidth]{../Thesis_Docs/media/enterprise_network_diagram.png}
\caption{Enterprise network architecture.}
\caption{Enterprise network diagram}
\label{fig:enterprise_network_diagram}
\end{figure}
\section{Periodicity in Network Communication}
Periodic patterns (e.g., beaconing) are detected using:
Figure \ref{fig:enterprise_network_diagram} provides a visual representation of an enterprise network, illustrating the various components such as servers, workstations, routers, and communication links, as well as potential points of vulnerability.
\subsection{Key Aspects of Enterprise Networks}
Enterprise networks typically consist of multiple interconnected subsystems, including:
\begin{itemize}
\item \textbf{Fast Fourier Transform (FFT):} Converts time-domain data to frequency components.
\item \textbf{Autocorrelation:} Measures self-similarity at different time lags.
\item \textbf{Network Architecture:} The physical and logical design of the network, including the layout and interconnection of routers, switches, firewalls, and other network devices. A well-designed architecture enhances security by segmenting the network and controlling traffic flow.
\item \textbf{Security Protocols:} Protocols such as TLS (Transport Layer Security) and IPSec (Internet Protocol Security) protect data in transit. Additionally, firewalls, intrusion detection/prevention systems (IDS/IPS), and encryption mechanisms are employed to safeguard data and systems.
\item \textbf{Access Controls:} Policies and technologies that regulate who can access specific data and resources within the network. This includes user authentication, role-based access control (RBAC), and multi-factor authentication (MFA) to ensure that only authorized personnel can access sensitive information.
\item \textbf{Network Monitoring and Management:} Tools and practices for monitoring network traffic, identifying anomalies, and managing network resources to maintain performance and security.
\end{itemize}
To simulate real-world conditions, artificial datasets often introduce \textbf{jitter}—random delays in beacon intervals—to mimic network irregularities \cite{jitter_analysis}.
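As an illustrative sketch (the notation here is an assumption introduced for exposition, not taken from the BAYWATCH paper), let $x_t$, $t = 0, \dots, N-1$, denote the number of requests from a host to a destination in the $t$-th time bin of width $\Delta$. The discrete Fourier transform and the normalized autocorrelation at lag $\tau$ can then be written as
\begin{equation}
X_k = \sum_{t=0}^{N-1} x_t \, e^{-2\pi i k t / N}, \qquad
r(\tau) = \frac{\sum_{t=0}^{N-1-\tau} (x_t - \bar{x})(x_{t+\tau} - \bar{x})}{\sum_{t=0}^{N-1} (x_t - \bar{x})^2},
\end{equation}
where $\bar{x}$ is the mean of the series. A peak of $|X_k|^2$ at index $k$ suggests a candidate period $T = N\Delta / k$, which can be cross-checked against a peak of $r(\tau)$ at $\tau \approx T / \Delta$. For example, a beacon sent every $T = 60$~s and observed for one hour with $\Delta = 1$~s ($N = 3600$) would be expected to produce a spectral peak near $k = N\Delta / T = 60$ and autocorrelation peaks near lags that are multiples of $60$. Jitter can be modeled, again as an assumption, by beacon times $t_n = t_0 + nT + \varepsilon_n$ with $\varepsilon_n$ drawn uniformly from $[-J, J]$; larger $J$ tends to broaden and lower both peaks.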
\subsection{Vulnerabilities in Enterprise Networks}
Despite the implementation of robust security measures, enterprise networks remain vulnerable to a variety of threats, including:
\begin{itemize}
\item \textbf{Insider Threats:} Employees or contractors with legitimate access who misuse their privileges, either maliciously or negligently.
\item \textbf{Advanced Malware:} Malware designed to bypass traditional security measures, often delivered through phishing attacks or drive-by downloads.
\item \textbf{Misconfigurations:} Incorrectly configured devices or systems that leave the network open to exploitation.
\item \textbf{Supply Chain Attacks:} Attacks that target the software or hardware supply chain, introducing vulnerabilities that can be exploited after deployment.
\end{itemize}
\section{Time Series Databases and InfluxDB}
Time-series databases (TSDBs) like InfluxDB (Figure \ref{fig:influxdb_architecture}) manage temporal data for cybersecurity analytics.
Time-series databases (TSDBs) are optimized for storing and querying temporal data. In cybersecurity, they enable efficient analysis of network traffic patterns over time.
InfluxDB is a popular TSDB known for its high throughput and its two query languages: the SQL-like InfluxQL and the functional scripting language Flux. Key features include:
\begin{itemize}
\item Time-optimized storage for efficient data retrieval.
\item Retention policies for automated data lifecycle management.
\item Integration with visualization tools like Grafana.
\end{itemize}
InfluxDB supports real-time monitoring and historical analysis of network traffic, making it ideal for detecting anomalies like beaconing. For example, its ability to handle high-frequency timestamped data aligns with the BAYWATCH framework's requirements for processing large-scale network logs.
\subsubsection{Applications in Cybersecurity}
InfluxDB can be employed in cybersecurity for:
\begin{itemize}
\item \textbf{Real-Time Monitoring:} Capturing and analyzing live data to detect anomalies and potential threats.
\item \textbf{Historical Analysis:} Storing historical data for trend analysis and forensic investigations.
\item \textbf{Alerting:} Setting up alerts based on specific criteria to notify administrators of suspicious activities.
\item \textbf{Visualization:} Integrating with visualization tools like Grafana to create dashboards that display network metrics and security insights.
\end{itemize}
\begin{figure}
\centering
\includegraphics[width=\textwidth]{../Thesis_Docs/media/influxdb_architecture.png}
\caption{InfluxDB architecture \cite{influxdb2023}.}
\caption{InfluxDB Architecture \cite{influxdb2023}}
\label{fig:influxdb_architecture}
\end{figure}
\subsection{InfluxDB Features}
\begin{itemize}
\item High-throughput data ingestion \cite{influxdb_throughput}.
\item Retention policies \cite{influxdb_retention}.
\item Flux query language \cite{influxdb_flux}.
\end{itemize}
Figure \ref{fig:influxdb_architecture} illustrates the architecture of InfluxDB and how data flows through the system, from ingestion to querying and visualization.
\section{Overview of the BAYWATCH Framework}
The BAYWATCH framework consists of four main phases, each involving one or more filtering steps. These phases are:
......@@ -209,7 +233,7 @@ To minimize the manual investigation workload, the BAYWATCH framework employs a
reducing the number of cases that require manual investigation.
\section{Summary}
This chapter covered cybersecurity threats, APTs, periodicity detection, InfluxDB, and the BAYWATCH framework. These concepts underpin the methodology for detecting beaconing behavior in enterprise networks.
This chapter has provided a comprehensive overview of the cybersecurity landscape, APTs and their covert tactics, enterprise networks, periodicity in network communication, and time series databases, with a detailed focus on InfluxDB. These foundational topics are crucial for understanding the subsequent chapters, which present related work, methodology, implementation, experiments, and results. The knowledge gained from this background will inform the development and evaluation of advanced techniques for detecting and mitigating cyber threats in enterprise networks.
\chapter{Related Work}
......@@ -374,7 +398,7 @@ The artificial data was used in conjunction with real-world network traffic to p
The BAYWATCH framework is a robust and scalable methodology designed to detect stealthy beaconing behavior in large-scale enterprise networks. It operates in four main phases: \textbf{Whitelist Analysis}, which eliminates known legitimate traffic using universal and local whitelists; \textbf{Time Series Analysis}, which identifies periodic communication patterns using advanced signal processing techniques such as Fast Fourier Transform (FFT), autocorrelation, and bandpass filtering; \textbf{Suspicious Indicator Analysis}, which further filters out legitimate behavior by analyzing domain-specific indicators like URL tokens and novelty; and \textbf{Investigation and Verification}, where remaining suspicious cases are manually reviewed using a bootstrapping process to minimize workload. The framework was evaluated using both \textbf{real-world data}, collected from a large-scale enterprise network, and \textbf{artificial data}, which simulated various beaconing scenarios with controlled jitter ranges (2, 5, 10, 30, and 60 seconds) and noise levels. The integration of real-world and artificial data ensures a comprehensive evaluation, demonstrating the framework's ability to reliably detect malicious beaconing behavior while remaining robust to real-world perturbations and noise. This makes BAYWATCH a valuable tool for securing enterprise networks against advanced cyber threats.
\chapter{Data Analysis}
This chapter delves into the detailed analysis of the dataset, focusing on understanding user behavior, temporal patterns, and network interactions. By employing advanced visualization techniques and statistical methods, this chapter aims to uncover meaningful insights into the dataset's structure, identify patterns, and detect potential anomalies. The analysis is divided into four main sections: \textbf{Visualization of URL Request Counts}, \textbf{24-Hour URL Visit Analysis}, \textbf{Time Interval Analysis of URL Requests}, and \textbf{Distribution of Hosts Based on Unique URLs Contacted}. Each section provides a comprehensive exploration of the data, supported by visualizations and detailed interpretations.
This chapter presents the detailed analysis of the dataset, focusing on understanding user behavior, temporal patterns, and network interactions. By employing advanced visualization techniques and statistical methods, this chapter aims to uncover meaningful insights into the dataset's structure, identify patterns, and detect potential anomalies. The analysis is divided into four main sections: \textbf{Visualization of URL Request Counts}, \textbf{24-Hour URL Visit Analysis}, \textbf{Time Interval Analysis of URL Requests}, and \textbf{Distribution of Hosts Based on Unique URLs Contacted}. Each section provides a comprehensive exploration of the data, supported by visualizations and detailed interpretations.
\section{Visualization of URL Request Counts}
Understanding the distribution and frequency of URL requests is essential for identifying patterns and anomalies in user behavior. This section presents visualizations of URL request counts using both logarithmic and linear scales, enabling a detailed comparison of visit frequencies across different URLs.
......
Thesis_Docs/media/apt_attack_lifecycle.png: image changed (95.6 KiB → 68.4 KiB)
......@@ -71,19 +71,39 @@
note = {Accessed: 2024-08-13}
}
@inproceedings{bilge2012before,
title={Before we knew it: an empirical study of zero-day attacks in the real world},
author={Bilge, Leyla and Dumitra{\c{s}}, Tudor},
booktitle={Proceedings of the 2012 ACM conference on Computer and communications security},
pages={833--844},
year={2012}
}
@article{caputo2013going,
title={Going spear phishing: Exploring embedded training and awareness},
author={Caputo, Deanna D and Pfleeger, Shari Lawrence and Freeman, Jesse D and Johnson, M Eric},
journal={IEEE security \& privacy},
volume={12},
number={1},
pages={28--38},
year={2013},
publisher={IEEE}
}
@Misc{bitdefender,
title = {Global Cybersecurity Threat Map},
author = "{Bitdefender}",
url = {https://threatmap.bitdefender.com/},
note = {Accessed: 2024-08-13}
@article{eisenberg2018network,
title={Network foundation for command and control (C2) systems: literature review},
author={Eisenberg, Daniel A and Alderson, David L and Kitsak, Maksim and Ganin, Alexander and Linkov, Igor},
journal={IEEE Access},
volume={6},
pages={68782--68794},
year={2018},
publisher={IEEE}
}
@article{charan2021dmapt,
@incollection{charan2021dmapt,
title={Dmapt: Study of data mining and machine learning techniques in advanced persistent threat attribution and detection},
author={Charan, PV Sai and Anand, P Mohan and Shukla, Sandeep K},
journal={Data Mining-Concepts and Applications},
pages={63},
booktitle={Data Mining-Concepts and Applications},
year={2021},
publisher={IntechOpen}
}
......