Intelligent Pac-Man Agent

An autonomous AI agent designed to play Pac-Man using real-time computer vision and hybrid intelligence. This project combines local high-speed processing for immediate reflexes with cloud-based AI (Google Gemini) for high-level strategic planning.

🚀 Features

Real-time Screen Capture: Uses mss for low-latency, pixel-perfect game state acquisition.
Computer Vision Pipeline:
- Custom MapExtractor to build a grid representation of the game level.
- ObjectDetectorCV using template matching to track Pac-Man, ghosts, and pellets.
Hybrid AI Architecture:
- Local Agent: Fast, deterministic policy for collision avoidance and pathfinding (30 FPS).
- Cloud Strategist: Asynchronous integration with Google Gemini to analyze game state and provide strategic advice.
Modular Design: Decoupled modules for Capture, Vision, Agent, and Control, allowing for easy experimentation with different algorithms (e.g., RL vs. Heuristic).
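
The split between the fast local agent and the slower cloud strategist can be illustrated with a small standard-library sketch. It is not the project's code: `local_policy`, `strategist_worker`, and `fake_model` are hypothetical names, the state dictionary is an assumed shape, and `fake_model` stands in for a Gemini request. The point is only that the game loop polls for advice without ever waiting on it.

```python
import queue
import threading
import time


def local_policy(state, advice):
    """Fast, deterministic move choice; `advice` is an optional strategic hint."""
    # Reflex rule: drop any move that leads toward immediate danger.
    safe_moves = [m for m in state["legal_moves"] if m not in state["danger_moves"]]
    if not safe_moves:
        return state["legal_moves"][0]
    # Follow the strategist's hint only when it is currently safe.
    if advice in safe_moves:
        return advice
    return safe_moves[0]


def strategist_worker(state_queue, advice_queue, ask_model):
    """Background thread: turns state snapshots into advice off the game loop."""
    while True:
        state = state_queue.get()
        if state is None:  # sentinel value: shut down
            break
        advice_queue.put(ask_model(state))


def fake_model(state):
    """Hypothetical stand-in for a cloud call that is slower than one frame."""
    time.sleep(0.05)
    return "LEFT"


state_queue, advice_queue = queue.Queue(maxsize=1), queue.Queue()
worker = threading.Thread(
    target=strategist_worker,
    args=(state_queue, advice_queue, fake_model),
    daemon=True,
)
worker.start()

state = {"legal_moves": ["UP", "RIGHT", "LEFT"], "danger_moves": ["UP"]}
advice = None
moves = []
state_queue.put(state)  # ask for strategic advice once
for _ in range(10):  # ten simulated frames
    try:
        advice = advice_queue.get_nowait()  # never blocks the frame loop
    except queue.Empty:
        pass
    moves.append(local_policy(state, advice))
    time.sleep(0.02)
state_queue.put(None)
worker.join()
print(moves[0], moves[-1])
```

The first frames fall back to the local choice, and later frames follow the advice once it arrives, provided the reflex rule still considers it safe.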

🛠️ Tech Stack

Language: Python 3.x
Computer Vision: OpenCV (cv2), NumPy
Input/Output: mss (Screen Capture), pyautogui/keyboard (Control)
AI Integration: Google Generative AI SDK (Gemini)

📂 Project Structure

├── agent/          # Decision making logic (Pathfinding, Policies)
├── ai_google/      # Google Gemini integration for strategic advice
├── capture/        # Screen capture implementation
├── control/        # Keyboard input simulation
├── vision/         # Computer vision pipeline (Detection, Mapping)
├── docs/           # Documentation and Architecture details
├── tools/          # Calibration and utility scripts
└── main.py         # Application entry point

⚡ Getting Started

Clone the repository

```
git clone https://github.com/snowholt/Inteligent-PacMan.git
cd Inteligent-PacMan
```

Install Dependencies
```
pip install -r requirements.txt
```
(Note: Ensure that opencv-python, numpy, mss, google-generativeai, and the other dependencies are installed.)

Configuration
- Update config.py with your screen region coordinates.
- Set your Google API Key in .env:
```
GOOGLE_API_KEY=your_api_key_here
```
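
How the project reads the .env file is not stated here, so the following is only a minimal standard-library sketch of one way to do it. The helper names `load_env_file` and `get_api_key` are hypothetical; only the `GOOGLE_API_KEY` variable name and the placeholder value come from the configuration step above.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a dotenv-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    if not key or key == "your_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env")
    return key
```

Failing early on a missing or placeholder key gives a clearer error than a rejected request later. The python-dotenv package is a common alternative to a hand-written parser; whether this repository uses it is not stated.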
Run the Agent
Open your Pac-Man game window and run:
```
python main.py
```

🧠 Architecture

The system follows a Sense-Plan-Act loop, as used in robotics:

Sense: Capture screen frame -> Detect objects -> Update World Model.
Plan: Calculate costs -> Consult Policy/Gemini -> Determine next move.
Act: Send keystroke to OS.
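
The three stages above can be sketched on a toy text grid. This is an illustration under stated assumptions, not the project's implementation: `sense` reads a hard-coded grid instead of a captured frame, `plan` uses plain breadth-first search toward the nearest pellet (the source does not say which pathfinding algorithm or cost function is used, and ghosts and the Gemini consultation are left out), and `act` edits the grid instead of sending a keystroke. All names are hypothetical.

```python
from collections import deque

# Hypothetical level: '#' wall, '.' pellet, ' ' empty, 'P' Pac-Man.
# The outer ring of walls keeps every neighbour lookup inside the grid.
LEVEL = [
    "#######",
    "#P  #.#",
    "# #   #",
    "#######",
]

MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


def sense(grid):
    """Stand-in for capture + detection: read positions from the grid."""
    pacman, pellets = None, set()
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == "P":
                pacman = (r, c)
            elif cell == ".":
                pellets.add((r, c))
    return {"pacman": pacman, "pellets": pellets}


def plan(grid, world):
    """Breadth-first search: first move on a shortest path to any pellet."""
    start = world["pacman"]
    frontier = deque([(start, None)])
    seen = {start}
    while frontier:
        (r, c), first_move = frontier.popleft()
        if (r, c) in world["pellets"]:
            return first_move
        for name, (dr, dc) in MOVES.items():
            nxt = (r + dr, c + dc)
            if grid[nxt[0]][nxt[1]] != "#" and nxt not in seen:
                seen.add(nxt)
                # Remember which first step led to this cell.
                frontier.append((nxt, first_move or name))
    return None  # no pellet reachable


def act(grid, world, move):
    """Stand-in for a keystroke: apply the move directly to the grid."""
    dr, dc = MOVES[move]
    r, c = world["pacman"]
    grid[r][c] = " "
    grid[r + dr][c + dc] = "P"  # stepping onto a pellet eats it


grid = [list(row) for row in LEVEL]
history = []
while True:
    world = sense(grid)        # Sense
    move = plan(grid, world)   # Plan
    if move is None:
        break
    act(grid, world, move)     # Act
    history.append(move)
print(history)
```

Re-sensing on every iteration, rather than following a precomputed path, mirrors the loop structure described above: each move is chosen from the latest observed state.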

For more details, see ARCHITECTURE.md.
