
🚀 Imitation Learning on LunarLander

📌 Overview

This project explores Imitation Learning in the LunarLander environment from Gymnasium.
Instead of learning through trial and error, as in traditional Reinforcement Learning, the agent learns to act by mimicking expert behavior with supervised learning.

The goal is to train a model that can replicate expert decisions and successfully land the spacecraft.


🧠 Key Idea

Imitation Learning bridges the gap between:

  • Supervised Learning (learning from labeled data)
  • Reinforcement Learning (learning from rewards)

In this project, we apply Behavior Cloning, where the model learns a direct mapping: state -> action.


🎮 Environment

  • Environment: LunarLander (Gymnasium)
  • State Space: 8-dimensional continuous vector
  • Action Space: 4 discrete actions:
    • Do nothing
    • Fire left engine
    • Fire main engine
    • Fire right engine

The objective is to land safely between the flags with minimal velocity and correct orientation.
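A quick sanity check of these spaces (a minimal sketch; the registered environment id may be `LunarLander-v2` or `LunarLander-v3` depending on your Gymnasium version, and `gymnasium[box2d]` must be installed):

```python
import gymnasium as gym

env = gym.make("LunarLander-v3")      # or "LunarLander-v2" on older Gymnasium versions
print(env.observation_space)          # Box(8,)     -> 8-dimensional continuous state
print(env.action_space)               # Discrete(4) -> the four actions listed above

state, info = env.reset(seed=0)
print(state.shape)                    # (8,)
env.close()
```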


⚙️ Methodology

1. Data Collection

  • Generated expert trajectories using a heuristic / rule-based policy
  • Collected (state, action) pairs
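A rough sketch of this collection step, assuming `gymnasium[box2d]` is installed and using Gymnasium's built-in LunarLander heuristic controller as the expert (the notebook's actual expert policy and episode count may differ):

```python
import numpy as np
import gymnasium as gym
from gymnasium.envs.box2d.lunar_lander import heuristic  # rule-based controller used here as the "expert"

def collect_trajectories(num_episodes=100, env_id="LunarLander-v3"):
    env = gym.make(env_id)
    states, actions = [], []
    for ep in range(num_episodes):
        state, _ = env.reset(seed=ep)
        done = False
        while not done:
            action = heuristic(env.unwrapped, state)   # expert label for this state
            states.append(state)
            actions.append(action)
            state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
    env.close()
    return np.array(states, dtype=np.float32), np.array(actions, dtype=np.int64)

X, y = collect_trajectories()
print(X.shape, y.shape)   # (N, 8), (N,)
```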

2. Behavior Cloning

  • Trained a neural network using supervised learning
  • Input: state vector
  • Output: predicted action
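A minimal Keras policy network for this state-to-action mapping (layer sizes are illustrative, not necessarily the notebook's exact architecture):

```python
import tensorflow as tf

def build_policy(state_dim=8, num_actions=4):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_actions, activation="softmax"),  # action probabilities
    ])

policy = build_policy()
policy.summary()
```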

3. Model Training

  • Loss function: Cross-Entropy
  • Optimization: Gradient-based optimization
  • Goal: Minimize difference between predicted and expert actions
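Continuing from the sketches above (`policy`, `X`, `y`), training reduces to a standard supervised fit with sparse categorical cross-entropy on the integer action labels; the hyperparameters below are placeholders rather than the repository's actual values:

```python
# Sparse categorical cross-entropy compares the predicted action distribution
# against the expert's integer action labels; Adam performs the gradient steps.
policy.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
history = policy.fit(X, y, batch_size=64, epochs=20, validation_split=0.1)
```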

📊 Results

  • The trained agent is able to imitate expert-like behavior
  • Successfully performs controlled descent in many scenarios
  • Demonstrates stable policy learning without reward-based training

📁 Project Structure

```
├── notebook.ipynb   # Main training and evaluation notebook
├── data/            # Collected expert data (if present)
├── models/          # Saved models (optional)
└── README.md
```


▶️ How to Run

  1. Open the notebook in Google Colab / Jupyter
  2. Run all cells
  3. The agent will:
    • Train on expert data
    • Evaluate performance in the environment
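A sketch of what the evaluation step can look like, reusing `policy` and the environment id from the sketches above (the notebook's own evaluation loop may differ):

```python
import numpy as np
import gymnasium as gym

def evaluate(policy, env_id="LunarLander-v3", episodes=10):
    env = gym.make(env_id)
    returns = []
    for ep in range(episodes):
        state, _ = env.reset(seed=1000 + ep)
        done, total = False, 0.0
        while not done:
            probs = policy.predict(state[None, :], verbose=0)[0]
            action = int(np.argmax(probs))   # greedy action from the cloned policy
            state, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    env.close()
    return float(np.mean(returns))

print("Mean return:", evaluate(policy))
```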

🛠️ Technologies

  • Python
  • TensorFlow / Keras
  • Gymnasium
  • NumPy
  • Matplotlib

💡 Key Takeaways

  • Imitation Learning can simplify RL problems by removing the need for reward design
  • Behavior Cloning is simple but sensitive to distribution shift
  • Demonstrates how supervised learning can be applied to sequential decision-making

✨ Future Improvements

  • Implement DAgger for better generalization (a rough loop is sketched after this list)
  • Compare with RL methods (DQN / Policy Gradient)
  • Improve robustness to unseen states
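For reference, a rough DAgger-style loop (not part of the current repository) would roll out the learned policy, relabel the visited states with the expert, aggregate the data, and retrain. It reuses `heuristic`, `build_policy`, `X`, and `y` from the sketches above; all hyperparameters are placeholders:

```python
def dagger(X, y, iterations=5, rollouts_per_iter=20, env_id="LunarLander-v3"):
    env = gym.make(env_id)
    policy = build_policy()
    policy.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    for it in range(iterations):
        policy.fit(X, y, epochs=5, batch_size=64, verbose=0)
        new_states, new_actions = [], []
        for ep in range(rollouts_per_iter):
            state, _ = env.reset()
            done = False
            while not done:
                # Act with the *learner* so we visit its own state distribution...
                probs = policy.predict(state[None, :], verbose=0)[0]
                learner_action = int(np.argmax(probs))
                # ...but store the *expert's* label for each visited state.
                new_states.append(state)
                new_actions.append(heuristic(env.unwrapped, state))
                state, _, terminated, truncated, _ = env.step(learner_action)
                done = terminated or truncated
        X = np.concatenate([X, np.array(new_states, dtype=np.float32)])
        y = np.concatenate([y, np.array(new_actions, dtype=np.int64)])
    env.close()
    return policy
```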
