This repository contains the complete implementation and analysis pipeline for an EEG-based lie detection experiment. Inspired by the paper “Neural processes underlying faking and concealing a personal identity: An EEG study”, the project aims to classify truthful versus deceptive responses to identity-related prompts using neural network models, traditional machine learning algorithms, and dedicated preprocessing and feature extraction steps.
Developed by the Lie-Detector Team, the experiment and analyses included here cover the full workflow: raw EEG data collection, preprocessing, and exploratory analysis, followed by feature engineering, model training, and evaluation.
- Project Description
- Project Structure
- Data and Experiment Setup
- Preprocessing, Feature Extraction & ICA
- Machine Learning & Neural Networks
- EDA & Results Visualization
- References and Related Work
This project revolves around detecting deception in identity statements using EEG signals. Participants were instructed to respond “yes” or “no” to identity-related information in multiple experimental blocks, sometimes honestly and sometimes deceptively. By analyzing the resulting EEG signals, the objective was to uncover neural markers of truth-telling and deception.
Key Highlights:
- Inspired Research: Built upon previous EEG studies exploring neural correlates of deception and identity masking.
- Robust Experimentation: Controlled EEG experiment with well-defined blocks of trials for truthful and deceptive responses to real or fake personal identities.
- Comprehensive Data Pipeline: From EEG headset recordings to final classification models, including preprocessing, ICA, feature extraction, and hyperparameter optimization.
- Multi-Model Approach: Includes Random Forest, SVM, KNN, Logistic Regression, and advanced neural network architectures (DGCNN, FBCNet, LSTM).
.
├── README.md (You are here - Main Project Introduction)
├── experiment
│   ├── README.md
│   ├── eeg_data
│   │   └── ...EEG files per participant
│   └── src
│       ├── assets
│       ├── eeg_headset (EEG acquisition code)
│       │   └── README.md
│       ├── gui (Graphical User Interface for experiment)
│       │   └── README.md
│       └── personal_data (Identity generation and management)
│           └── README.md
└── classificators_and_data
    ├── README.md
    ├── data (Processed data folders)
    ├── data_extractor (Data loading and formatting)
    │   └── README.md
    ├── data_preprocessing (Preprocessing scripts)
    ├── final_models (Scripts and results of best models)
    │   ├── README.md
    │   ├── neural_networks
    │   │   └── README.md
    │   └── random_forest
    │       └── README.md
    ├── machine_learning (Classical ML pipelines and results)
    │   ├── README.md
    │   ├── ica (Independent Component Analysis)
    │   │   └── README.md
    │   ├── results (Evaluation metrics, confusion matrices)
    │   └── training (Hyperparameter search, feature selection)
    │       └── README.md
    └── neural_networks (Deep learning models and logs)
        ├── README.md
        └── ai (Core training scripts, dataset management)
For detailed descriptions, please see the individual README.md files in the corresponding directories.
Participants responded to identity-related prompts (their own, fake, celebrity, and random identities) under instructions to either tell the truth or lie. The experiment directory contains code for:
- Personal Data Generation: Real, fake, celebrity, and random identity details managed by a personal data module.
- GUI: A Pygame-based interface presenting stimuli and recording participant responses.
- EEG Headset Integration: Data acquisition scripts utilizing MNE and BrainAccess libraries, with annotated trials.
Before model training, EEG signals were preprocessed to remove noise and artifacts. Techniques included band-pass filtering, notch filters, and ICA for artifact removal. Additional feature sets were engineered (mean, std, variance, skewness, kurtosis, frequency band powers) to boost classification performance.
Relevant Links:
- Data Extractor README
- Machine Learning ICA README
- Data Preprocessing & Feature Engineering Notebooks
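The feature set named above (mean, std, variance, skewness, kurtosis, frequency band powers) can be illustrated with a minimal NumPy sketch. It is not the project's actual feature extractor: the function name `channel_features`, the band edges in `BANDS`, and the 250 Hz sampling rate are assumptions made for illustration, and band power is estimated with a plain unnormalized periodogram rather than whatever estimator the notebooks use.

```python
import numpy as np

# Conventional EEG band edges in Hz (an assumption; the project may use other bounds).
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def channel_features(signal, sfreq):
    """Return statistical and band-power features for one 1-D EEG channel.

    Hypothetical helper. Assumes a non-constant signal (std > 0).
    """
    x = np.asarray(signal, dtype=float)
    mean = x.mean()
    std = x.std()
    centered = x - mean
    feats = {
        "mean": mean,
        "std": std,
        "variance": x.var(),
        # Population skewness and excess kurtosis (both 0 for a Gaussian).
        "skewness": np.mean(centered**3) / std**3,
        "kurtosis": np.mean(centered**4) / std**4 - 3.0,
    }
    # Unnormalized periodogram: squared FFT magnitude at each non-negative frequency.
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sfreq)
    power = np.abs(np.fft.rfft(centered)) ** 2 / x.size
    for name, (low, high) in BANDS.items():
        in_band = (freqs >= low) & (freqs < high)
        feats[f"{name}_power"] = power[in_band].sum()
    return feats


# Usage: a synthetic 10 Hz sine (alpha range), 500 samples at an assumed 250 Hz.
sfreq = 250.0
t = np.arange(500) / sfreq
features = channel_features(np.sin(2 * np.pi * 10.0 * t), sfreq)
dominant_band = max(BANDS, key=lambda band: features[f"{band}_power"])
```

In a real pipeline this would be applied per channel and per trial after filtering and artifact removal, and the resulting dictionaries stacked into a feature matrix for the classifiers.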
Multiple classifiers were tested:
- Traditional ML: Random Forest, SVM, KNN, Logistic Regression.
- Neural Networks: DGCNN, FBCNet, and LSTM architectures trained on EEG timeseries and extracted features.
Grid searches and further hyperparameter tuning refined model performance. Subject-based and random train/test splits were compared, which highlighted how difficult it is to generalize across individuals.
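The difference between the two splitting strategies can be sketched with the standard library alone. This is an illustrative sketch, not the project's code: the trial dictionaries, the `subject` key, and both function names are hypothetical. With scikit-learn, `GroupKFold` or `GroupShuffleSplit` fill the same role as the subject-based split.

```python
import random


def random_split(trials, test_fraction=0.25, seed=0):
    """Shuffle individual trials; one subject can land in both sets."""
    shuffled = list(trials)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_split(trials, test_fraction=0.25, seed=0):
    """Hold out whole subjects so no participant appears in both sets."""
    subjects = sorted({trial["subject"] for trial in trials})
    random.Random(seed).shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    held_out = set(subjects[:n_test])
    train = [trial for trial in trials if trial["subject"] not in held_out]
    test = [trial for trial in trials if trial["subject"] in held_out]
    return train, test


def shared_subjects(train, test):
    """Return the subjects that occur in both the train and the test set."""
    return {t["subject"] for t in train} & {t["subject"] for t in test}


# Hypothetical toy dataset: 4 subjects x 10 trials with alternating labels.
trials = [
    {"subject": f"S{s}", "trial": i, "label": i % 2}
    for s in range(1, 5)
    for i in range(10)
]

rand_train, rand_test = random_split(trials)
subj_train, subj_test = subject_split(trials)

random_overlap = shared_subjects(rand_train, rand_test)
subject_overlap = shared_subjects(subj_train, subj_test)  # empty by construction
```

A random split lets a model see other trials from the same participant during training, so its score can reflect subject-specific signal characteristics. The subject-based split measures performance on people the model has never seen, which is the harder and usually more relevant case.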
Extensive Exploratory Data Analysis (EDA) provided insights into response times, event-related potentials (ERPs), and participant consistency. While EDA findings did not directly influence model training, they offered a deeper understanding of the dataset.
- Original paper: Neural processes underlying faking and concealing a personal identity: An EEG study
- EEG and MNE documentation: MNE-Python
- TorchEEG: TorchEEG Documentation



