In this project, we focused on analyzing and comparing network traffic from several common applications. Our goal was to understand traffic characteristics across multiple layers and determine how patterns can be distinguished between different applications.
The analysis included:
- Capturing traffic using Wireshark.
- Decoding traffic using saved TLS keys.
- Comparing packet counts and sizes.
- Drawing conclusions about whether an attacker can identify the application the user accessed, depending on whether a hash of the 4-tuple flow ID is available.
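To make the last point concrete, here is a minimal sketch of what an observer could do with only timestamps, packet sizes, and a hashed flow ID. The function name `flow_id_hash`, the use of SHA-256, and the sample packets (documentation-range IP addresses) are assumptions for illustration and are not taken from the project scripts.

```python
import hashlib

def flow_id_hash(src_ip, src_port, dst_ip, dst_port):
    """Return a short hex digest standing in for the hashed 4-tuple flow ID."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return hashlib.sha256(key).hexdigest()[:16]

# Hypothetical packets: (timestamp in seconds, size in bytes, 4-tuple)
packets = [
    (0.00, 517, ("192.0.2.5", 50123, "198.51.100.1", 443)),
    (0.02, 1460, ("192.0.2.5", 50123, "198.51.100.1", 443)),
    (0.05, 120, ("192.0.2.5", 50200, "203.0.113.2", 443)),
]

# Group (timestamp, size) pairs by hashed flow ID. The observer never
# sees addresses or ports, yet can still separate the flows.
flows = {}
for ts, size, four_tuple in packets:
    flows.setdefault(flow_id_hash(*four_tuple), []).append((ts, size))

for fid, pkts in flows.items():
    print(fid, len(pkts), "packets")
```

With the hash available, per-flow statistics (packet counts, sizes, timing) can be computed; without it, only aggregate statistics over all packets remain.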
Please take a look at the attached PDF.
Before running the scripts, ensure you have the following installed:
- Python Version: Python 3.13.1 (or similar)
- TShark Installation (Required for pyshark):
  - Linux: Run
        sudo apt install tshark
  - Windows: Install Wireshark and ensure tshark is in the system PATH.
If you haven't already, clone the repository:
    git clone https://github.com/Raz99/CN_final_project.git
    cd CN_final_project
- Ensure pip is Installed (For some Python versions):
    python -m ensurepip --default-pip
- Create and Activate a Virtual Environment (Recommended):
    python -m venv venv
    source venv/bin/activate   # Linux/Mac
    venv\Scripts\activate      # Windows
- Install Required Libraries:
    pip install pandas matplotlib pyshark numpy
These libraries are used as follows:
- pandas for data manipulation.
- matplotlib.pyplot for generating graphs.
- pyshark for reading network traffic from PCAP files.
- numpy for numerical operations.
- collections.Counter for counting occurrences of elements in datasets (part of the Python standard library, so it needs no installation).
This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs three types of analyses:
- IP header fields
- TCP header fields
- TLS header fields
This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs a unique analysis based on packet sizes and generates graphs for:
- Small packets (size < 200 bytes)
- Medium packets (200 bytes <= size <= 1000 bytes)
- Large packets (size > 1000 bytes)
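The three size classes above can be sketched as a small classifier. This is an illustrative sketch, not the project's code: the names `SMALL_MAX`, `LARGE_MIN`, and `size_class` and the sample sizes are assumptions; only the thresholds come from the list above.

```python
from collections import Counter

SMALL_MAX = 200    # sizes below this are "small"
LARGE_MIN = 1000   # sizes above this are "large"

def size_class(size):
    """Map a packet size in bytes to 'small', 'medium', or 'large'."""
    if size < SMALL_MAX:
        return "small"
    if size <= LARGE_MIN:
        return "medium"
    return "large"

# Hypothetical packet sizes, including the boundary values 200 and 1000
sizes = [60, 199, 200, 1000, 1001, 1500]
counts = Counter(size_class(s) for s in sizes)
print(counts)
```

Note that both boundary values, 200 and 1000 bytes, fall into the medium class.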
Both scripts use the pyshark library to read the recorded files and matplotlib to generate graphs.
To run the scripts, set the paths and filenames of the recordings in the apps dict defined at the beginning of each script. Current structure:
apps = {
    "filtered_chrome.pcap": "Chrome",
    "filtered_firefox.pcap": "Firefox",
    "filtered_spotify.pcap": "Spotify",
    "filtered_youtube.pcap": "YouTube",
    "filtered_zoom.pcap": "Zoom"
}
Note: The scripts plot_network_traffic.py and plot_packet_sizes.py process entire PCAP files, so execution time grows with file size and large recordings can take a while.
This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs five types of analyses:
- Packet size distribution
- Time differences between packets
- Flow volume (Bytes per flow)
- Flow size (Number of packets per flow)
- Common destination IP addresses
The script uses the pyshark library to read the recorded file and matplotlib to generate graphs.
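As a rough sketch of how the non-graphical part of these analyses could be computed once packets have been read, the snippet below works on plain tuples. The record layout, the flow labels "A" and "B", and the sample values are assumptions for illustration; the actual script reads its data from the PCAP file via pyshark.

```python
from collections import Counter, defaultdict

# Hypothetical records: (timestamp in s, size in bytes, flow key, destination IP)
packets = [
    (0.00, 100, "A", "192.0.2.10"),
    (0.25, 1400, "A", "192.0.2.10"),
    (0.75, 60, "B", "198.51.100.7"),
    (1.00, 1400, "A", "192.0.2.10"),
]

# Time differences between consecutive packets
timestamps = [p[0] for p in packets]
inter_arrival = [b - a for a, b in zip(timestamps, timestamps[1:])]

flow_volume = defaultdict(int)  # bytes per flow
flow_size = Counter()           # packets per flow
dst_counts = Counter()          # packets per destination IP
for ts, size, flow, dst in packets:
    flow_volume[flow] += size
    flow_size[flow] += 1
    dst_counts[dst] += 1

print(inter_arrival)
print(dict(flow_volume), dict(flow_size))
print(dst_counts.most_common(1))
```

Each of the resulting collections maps directly onto one of the graphs listed above, for example `dst_counts.most_common(n)` for the most common destination addresses.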
To run the script, set the path and filename of the recording in the pcap_file variable defined at the beginning of the script. Current structure:
pcap_file = 'filtered_spotify_and_emails.pcap'
This section discusses core challenges in transport and network layers, such as diagnosing slow file transfers, handling TCP flow control, optimizing routing decisions, improving performance with MPTCP, and identifying sources of packet loss.
This section summarizes three research papers that present advanced methods for classifying encrypted internet traffic.
- The first study introduces FlowPic, which converts flow data into images and uses CNNs for accurate traffic classification.
- The second study presents hRFTC, a hybrid method combining TLS handshake features with flow statistics for early classification.
- The third study shows how machine learning can infer OS, browser, and application from traffic patterns without accessing payload content.
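To illustrate the idea behind the first study, the sketch below turns a flow's (arrival time, packet size) pairs into a small 2D histogram that could serve as an image input to a CNN. It is a simplified illustration only: the function name `flow_to_image` and the values of `dim`, `window`, and `max_size` are assumptions and are not claimed to match the paper's settings.

```python
def flow_to_image(packets, dim=32, window=60.0, max_size=1500):
    """Build a dim x dim histogram from (timestamp, size) pairs.

    Columns index the arrival time within the window, rows index packet size.
    """
    image = [[0] * dim for _ in range(dim)]
    if not packets:
        return image
    t0 = packets[0][0]
    for ts, size in packets:
        rel = ts - t0
        # Skip packets outside the time window or above the size limit
        if rel < 0 or rel >= window or size > max_size:
            continue
        col = min(int(rel / window * dim), dim - 1)
        row = min(int(size / max_size * dim), dim - 1)
        image[row][col] += 1
    return image

# Hypothetical flow: two small packets at t=0, one full-size packet at t=30,
# and one packet that falls outside the 60-second window
flow = [(0.0, 100), (0.0, 100), (30.0, 1500), (61.0, 100)]
img = flow_to_image(flow, dim=4)
for row in img:
    print(row)
```

The resulting grid contains only sizes and timing, so it can be built from encrypted traffic without access to payload content.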
For the remaining parts (Part 3 & Bonus), there is a detailed explanation below.
For browser recordings, we performed the following actions:
- Opened the applications and reached the homepage.
- Searched for "Ariel University" and accessed the university's website.
- Accessed the Spotify website via Chrome and played a podcast (audio only).
- Accessed the YouTube website via Chrome and played a video (with both video and audio).
- Opened the Zoom desktop application and conducted a video call between two computers (including camera, microphone and chat).
- Accessed the Spotify website via Chrome and played a podcast (audio only).
- Simultaneously, we opened Gmail in Chrome and occasionally sent emails.
- Shows the number of packets per second at the IP layer.
- Provides insight into the volume of data transmitted in each application per second.
- Displays the number of packets per second at the TCP layer.
- Shows the number of connections established and the amount of traffic generated per second.
- Displays the number of packets per second at the TLS layer.
- Shows the number of encrypted packets per second.
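The per-second packet counts described above can be sketched as follows. This is a minimal illustration that assumes the packet timestamps for one layer have already been extracted; the function name `packets_per_second` and the sample timestamps are hypothetical.

```python
from collections import Counter

def packets_per_second(timestamps):
    """Count packets in each one-second bin, relative to the first packet."""
    if not timestamps:
        return Counter()
    start = min(timestamps)
    return Counter(int(ts - start) for ts in timestamps)

# Hypothetical capture timestamps in seconds
ts = [10.0, 10.4, 10.9, 11.2, 13.5]
pps = packets_per_second(ts)
for second in range(max(pps) + 1):
    print(second, pps[second])
```

Seconds with no traffic simply report zero, since a Counter returns 0 for missing keys. Applying the same function to the timestamps of IP, TCP, or TLS packets gives one series per layer.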
Displays three different graphs:
- Small packets
- Medium packets
- Large packets
These graphs are useful for understanding the frequency and size of packets in each application.
Displays five different graphs:
- Packet size distribution
- Time differences between packets
- Flow volume (Bytes per flow)
- Flow size (Number of packets per flow)
- Common destination IP addresses
- YouTube
- ChatGPT
- Wikipedia
The full and detailed list is included in the attached PDF (link above).
- Raz Cohen - GitHub, LinkedIn
- Ronen Chereshnya - GitHub, LinkedIn
- Shir Bismuth - GitHub
- Clara Franco - GitHub
Course Lecturer: Professor Amit Zeev Dvir