Communication Networks - Final Project

Project Description

In this project, we focused on analyzing and comparing network traffic from several common applications. Our goal was to understand traffic characteristics across multiple layers and determine how patterns can be distinguished between different applications.

The analysis included:

Capturing traffic using Wireshark.
Decrypting traffic using saved TLS keys.
Comparing packet counts and sizes.
Drawing conclusions about an attacker's ability to identify which application the user accessed, depending on whether the hash of the 4-tuple flow ID is available to them.
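
To make the last point concrete, here is a minimal sketch of what a hashed 4-tuple flow ID could look like. The function name `flow_id_hash`, the use of SHA-256, and the 16-character truncation are assumptions for illustration; the project does not state which hash is used. The idea is that an observer who sees only the hash can still group packets into flows without learning the addresses or ports.

```python
import hashlib


def flow_id_hash(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Return a short, stable identifier for a 4-tuple flow.

    SHA-256 and the 16-character truncation are illustrative choices,
    not the hash used in the project.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:16]


# Example addresses come from the documentation ranges (RFC 5737).
flow_a = flow_id_hash("192.0.2.1", 50000, "198.51.100.7", 443)
flow_a_again = flow_id_hash("192.0.2.1", 50000, "198.51.100.7", 443)
flow_b = flow_id_hash("192.0.2.1", 50001, "198.51.100.7", 443)

# Packets of the same flow map to the same ID; a different source port gives a new ID.
print(flow_a == flow_a_again, flow_a == flow_b)
```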

Please take a look at the attached PDF.

Installation

Prerequisites

Before running the scripts, ensure you have the following installed:

Python Version: Python 3.13.1 (or similar)
TShark Installation (Required for pyshark):
- Linux: Run sudo apt install tshark
- Windows: Install Wireshark and ensure tshark is in the system PATH.

Cloning the Repository

If you haven't already, clone the repository:

git clone https://github.com/Raz99/CN_final_project.git
cd CN_final_project

Setting Up the Environment

Ensure pip is installed (needed for some Python versions):
```
python -m ensurepip --default-pip
```

Create and Activate a Virtual Environment (Recommended):

python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

Install Required Libraries:
```
pip install pandas matplotlib pyshark numpy
```
These libraries are used as follows:
- pandas for data manipulation.
- matplotlib.pyplot for generating graphs.
- pyshark for reading network traffic from PCAP files.
- numpy for numerical operations.
- collections.Counter (part of the Python standard library, so it needs no installation) for counting occurrences of elements in datasets.

Scripts

plot_network_traffic.py

This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs three types of analyses:

IP header fields
TCP header fields
TLS header fields

plot_packet_sizes.py

This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs a unique analysis based on packet sizes and generates graphs for:

Small packets (size < 200 bytes)
Medium packets (200 bytes <= size <= 1000 bytes)
Large packets (size > 1000 bytes)
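
The three size buckets above can be sketched as follows. The function names `size_category` and `count_by_category` are hypothetical and are not taken from plot_packet_sizes.py; only the thresholds come from the list above.

```python
from collections import Counter


def size_category(size_bytes: int) -> str:
    """Map a packet length in bytes to the bucket names used above."""
    if size_bytes < 200:
        return "small"
    if size_bytes <= 1000:
        return "medium"
    return "large"


def count_by_category(sizes):
    """Count how many packets fall into each bucket."""
    return Counter(size_category(s) for s in sizes)


# Made-up packet lengths that exercise both boundaries (200 and 1000 bytes).
example_sizes = [60, 199, 200, 1000, 1001, 1500]
print(count_by_category(example_sizes))
```

Note that 200 and 1000 bytes both count as medium, matching the inclusive bounds in the list.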

Both scripts use the pyshark library to read the recorded files and matplotlib to generate graphs. Before running them, set the paths and filenames of the recordings in the apps dict defined at the beginning of each script. Each key is a recording filename and each value is the label shown in the graphs. Current structure:

apps = {
    "filtered_chrome.pcap": "Chrome",
    "filtered_firefox.pcap": "Firefox",
    "filtered_spotify.pcap": "Spotify",
    "filtered_youtube.pcap": "YouTube",
    "filtered_zoom.pcap": "Zoom"
}

Note: plot_network_traffic.py and plot_packet_sizes.py parse every packet in each PCAP file, so execution time grows with the size of the recordings and can be long for large files.

plots_bonus.py

This script analyzes the recorded network traffic (PCAP file) by extracting relevant information and presenting it graphically. It performs five types of analyses:

Packet size distribution
Time differences between packets
Flow volume (Bytes per flow)
Flow size (Number of packets per flow)
Common destination IP addresses
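
As a rough illustration of how these metrics can be computed, the sketch below works on plain tuples rather than on pyshark packets. The record layout `(timestamp, src_ip, src_port, dst_ip, dst_port, length)` and all function names are assumptions, not the structure used in plots_bonus.py, and the sample packets are made up.

```python
from collections import Counter

# Hypothetical record layout:
# (timestamp_s, src_ip, src_port, dst_ip, dst_port, length_bytes)
packets = [
    (0.0, "10.0.0.2", 50000, "192.0.2.10", 443, 120),
    (0.5, "10.0.0.2", 50000, "192.0.2.10", 443, 1400),
    (1.5, "10.0.0.2", 50001, "198.51.100.7", 443, 80),
    (2.0, "10.0.0.2", 50000, "192.0.2.10", 443, 60),
]


def flow_stats(records):
    """Return (bytes per flow, packets per flow), keyed by the 4-tuple."""
    bytes_per_flow = Counter()
    packets_per_flow = Counter()
    for _, src_ip, src_port, dst_ip, dst_port, length in records:
        flow_id = (src_ip, src_port, dst_ip, dst_port)
        bytes_per_flow[flow_id] += length
        packets_per_flow[flow_id] += 1
    return bytes_per_flow, packets_per_flow


def inter_arrival_times(records):
    """Return the time gaps between consecutive packets, in seconds."""
    times = sorted(r[0] for r in records)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def top_destinations(records, n=3):
    """Return the n most common destination IPs with their packet counts."""
    return Counter(r[3] for r in records).most_common(n)


flow_bytes, flow_packets = flow_stats(packets)
print(flow_bytes.most_common(1))
print(inter_arrival_times(packets))
print(top_destinations(packets, n=1))
```

This sketch keys flows by direction, so the two directions of one connection count as separate flows. Whether the script merges them is not stated here.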

The script uses the pyshark library to read the recorded file and matplotlib to generate graphs. Before running it, set the path and filename of the recording in the pcap_file variable defined at the beginning of the script. Current structure:

pcap_file = 'filtered_spotify_and_emails.pcap'

Review of the first two parts

Part 1:

This section discusses core challenges in transport and network layers, such as diagnosing slow file transfers, handling TCP flow control, optimizing routing decisions, improving performance with MPTCP, and identifying sources of packet loss.

Part 2:

This section summarizes three research papers that present advanced methods for classifying encrypted internet traffic.

The first study introduces FlowPic, which converts flow data into images and uses CNNs for accurate traffic classification.
The second study presents hRFTC, a hybrid method combining TLS handshake features with flow statistics for early classification.
The third study shows how machine learning can infer OS, browser, and application from traffic patterns without accessing payload content.

The remaining parts (Part 3 and Bonus) are explained in detail below.

Recordings

Part 3:

Chrome & Firefox

For browser recordings, we performed the following actions:

Opened the applications and reached the homepage.
Searched for "Ariel University" and accessed the university's website.

Spotify

Accessed the Spotify website via Chrome and played a podcast (audio only).

YouTube

Accessed the YouTube website via Chrome and played a video (with both video and audio).

Zoom

Opened the Zoom desktop application and conducted a video call between two computers (including camera, microphone and chat).

Bonus:

Spotify & Gmail

Accessed the Spotify website via Chrome and played a podcast (audio only).
Simultaneously, we opened Gmail in Chrome and occasionally sent emails.

Graphs

Part 3:

Graph A: IP Header Fields

Shows the number of packets per second at the IP layer.
Provides insight into the volume of data transmitted in each application per second.

Graph B: TCP Header Fields

Displays the number of packets per second at the TCP layer.
Shows the number of connections established and the amount of traffic generated per second.

Graph C: TLS Header Fields

Displays the number of packets per second at the TLS layer.
Shows the number of encrypted packets per second.
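
Graphs A to C all plot a packets-per-second count for one protocol layer. A minimal sketch of that counting step is shown below, assuming the packet timestamps have already been extracted as floats in seconds. The function name `packets_per_second` is hypothetical and not taken from plot_network_traffic.py.

```python
from collections import Counter


def packets_per_second(timestamps):
    """Count packets in each one-second bin, relative to the first packet.

    Seconds in which no packet arrived are simply absent from the result.
    """
    if not timestamps:
        return {}
    start = min(timestamps)
    counts = Counter(int(t - start) for t in timestamps)
    return dict(sorted(counts.items()))


# Made-up capture timestamps (seconds).
example_times = [10.0, 10.2, 10.9, 11.1, 13.5]
print(packets_per_second(example_times))
```

Plotting the resulting keys against the values with matplotlib gives one line per application. Filtering the timestamps to packets that carry an IP, TCP, or TLS layer yields the three variants.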

Graph D: Packet Size

Displays three different graphs:

Small packets
Medium packets
Large packets

These graphs are useful for understanding the frequency and size of packets in each application.

Bonus:

Displays five different graphs:

Packet size distribution
Time differences between packets
Flow volume (Bytes per flow)
Flow size (Number of packets per flow)
Common destination IP addresses

Sources

YouTube
ChatGPT
Wikipedia

The full and detailed list is included in the attached PDF (link above).

Authors

Raz Cohen - GitHub, LinkedIn
Ronen Chereshnya - GitHub, LinkedIn
Shir Bismuth - GitHub
Clara Franco - GitHub

Acknowledgments

Course Lecturer: Professor Amit Zeev Dvir
