Skip to content

Latest commit

 

History

History
433 lines (277 loc) · 8.32 KB

File metadata and controls

433 lines (277 loc) · 8.32 KB

A Brief Overview - PostgreSQL

1. Introduction

PostgreSQL (often called Postgres) is an advanced, open-source, object-relational database management system (ORDBMS) known for:

  • High reliability
  • Strong SQL compliance
  • Extensibility
  • ACID transactions
  • Support for both relational + semi-structured data

PostgreSQL is widely used in financial systems, analytics, SaaS platforms, microservices, geospatial applications, and more.

It is maintained by a large global community and is considered one of the most powerful open-source databases.


2. Key Features of PostgreSQL

2.1 ACID Compliance

Provides strong guarantees for:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Ensures safe, reliable transactions—ideal for financial and mission-critical applications.

2.2 MVCC (Multi-Version Concurrency Control)

Allows multiple readers and writers without locking, preventing read-write conflicts.

Benefits:

  • High concurrency
  • No dirty reads
  • Non-blocking SELECT queries

This makes PostgreSQL highly scalable under heavy workloads.

2.3 Extensibility

PostgreSQL is known as the “developer’s database” because it is fully extensible.

You can add:

  • Custom data types
  • Custom operators
  • Custom indexing methods
  • Stored procedures (in PL/pgSQL, Python, Java, Perl, C, etc.)
  • Extensions (PostGIS, pg_partman, TimescaleDB, etc.)

2.4 Advanced SQL Support

PostgreSQL supports nearly all SQL standards plus many extensions:

  • FULL, LEFT, RIGHT OUTER JOIN
  • Window Functions
  • CTEs (WITH queries)
  • Recursive queries
  • Partial indexes
  • Array & JSON operations
  • Upsert (INSERT ... ON CONFLICT)

2.5 JSON and Semi-Structured Data

PostgreSQL provides powerful support for JSON:

  • json and jsonb (binary JSON — faster and compressed)
  • Indexing on JSON fields
  • Filtering inside JSON (@>, ->, #>>)
  • Perfect alternative to NoSQL for most workloads

Example:

SELECT data->'name' FROM users;

2.6 Strong Indexing Options

Index types:

Index Type Use Case
B-Tree Default, most queries
Hash Equality lookups
GIN Full-text search, JSONB
GiST Spatial/geometric data
BRIN Big analytical tables
SP-GiST Hierarchical data

PostgreSQL has one of the most advanced indexing systems in the database world.

2.7 Replication & High Availability

Supports:

  • Streaming replication (asynchronous/synchronous)
  • Logical replication
  • WAL archiving (Write-Ahead Logging)
  • Failover protection (Patroni, repmgr)
  • Sharding (Citus extension)

2.8 Security Features

  • Row-level security
  • Role-based access control
  • SSL/TLS support
  • Data encryption (client-side, extension-based for server-side)
  • Audit logging

3. PostgreSQL Architecture

PostgreSQL uses a process-based architecture (not thread-based).

3.1 Components

Postmaster / Postgres Server

Parent process that:

  • Starts PostgreSQL
  • Listens for connections
  • Forks new backend processes

Backend Process

Each client connection gets its own backend process.

Shared Buffers

A shared memory area used for caching pages.

WAL Buffer

Temporary storage for write-ahead logs before flushing to disk.

Background Worker Processes

Process Purpose
WAL Writer Writes WAL records to disk
Background Writer Writes dirty pages to disk
Checkpointer Flushes data periodically
Autovacuum Worker Removes dead tuples
Stats Collector Maintains query statistics

4. Transaction Management

PostgreSQL uses:

4.1 Write-Ahead Logging (WAL)

Before modifying data, PostgreSQL writes logs to disk ensuring crash safety.

4.2 MVCC (Multi-Version Concurrency Control)

Provides snapshots for queries → avoids locks.

4.3 Vacuum & Autovacuum

Removes dead tuples generated by MVCC.

VACUUM;
VACUUM ANALYZE;

5. Data Types in PostgreSQL

PostgreSQL has rich data types beyond typical SQL systems:

Numeric types

  • INTEGER, BIGINT, SMALLINT
  • NUMERIC (exact)
  • REAL, DOUBLE PRECISION

Character types

  • CHAR, VARCHAR, TEXT

Date/Time

  • DATE, TIME, TIMESTAMP
  • INTERVAL

Boolean

  • true, false

JSON types

  • json
  • jsonb

Array

int[]  → {1,2,3}

HSTORE

Key-value store.

Geometric types

Points, polygons, circles, etc.

Network types

IP address, MAC, subnet.


6. PostgreSQL SQL Features

6.1 Joins

Supports all join types:

SELECT *
FROM orders o
LEFT JOIN customers c
ON o.customer_id = c.id;

6.2 Window Functions

SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC)
FROM employees;

6.3 CTE (WITH Clause)

WITH recursive nums AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n+1 FROM nums WHERE n < 10
)
SELECT * FROM nums;

6.4 Partitioning

Supports:

  • Range partitioning
  • List partitioning
  • Hash partitioning
CREATE TABLE sales (
    id INT,
    amount INT,
    sale_date DATE
) PARTITION BY RANGE (sale_date);

7. PostgreSQL Tools

7.1 psql (CLI tool)

Most commonly used CLI interface.

psql -U postgres -d mydb

7.2 pgAdmin

GUI tool to manage databases.

7.3 Extensions Ecosystem

Popular extensions:

Extension Purpose
PostGIS GIS & geographic data
pg_partman Partition automation
uuid-ossp UUID generation
pg_stat_statements Query monitoring
Citus Distributed PostgreSQL

8. Performance Optimization

8.1 Use Indexes Wisely

CREATE INDEX idx_name ON users(name);

8.2 ANALYZE and EXPLAIN

EXPLAIN ANALYZE SELECT * FROM orders WHERE id = 10;

8.3 Query Optimization

  • Avoid SELECT *
  • Use prepared statements
  • Prefer jsonb over json
  • Normalize or partition large tables

8.4 Tune PostgreSQL Configuratiom

Key settings in postgresql.conf:

  • shared_buffers
  • work_mem
  • maintenance_work_mem
  • effective_cache_size
  • wal_buffers

9. Replication and Failover

9.1 Streaming Replication

Primary → Standby servers.

pg_basebackup -h primary -D data -U replicator -P

9.2 Logical Replication

Supports table-level replication.

9.3 Sharding using Citus

Distributes tables across multiple nodes → horizontally scalable DB.


10. Backup & Restore

10.1 SQL Dump

pg_dump -U postgres mydb > backup.sql
psql -U postgres mydb < backup.sql

10.2 Physical Backup

Using pg_basebackup.


11. Security and Roles

11.1 Roles and Permissions

CREATE ROLE appuser LOGIN PASSWORD 'pass';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO appuser;

11.2 Row-Level Security

ALTER TABLE employees ENABLE ROW LEVEL SECURITY;

12. PostgreSQL vs MySQL

Feature PostgreSQL MySQL
Type ORDBMS RDBMS
JSON support Excellent (jsonb) Moderate
Concurrency MVCC, non-blocking MVCC but more locks
Extensibility Very high Limited
Index types Many Few
ACID strictness Strong Varies by engine
Use Cases Complex apps, analytics Web apps, CRUD APIs

13. Use Cases of PostgreSQL

  • Banking & financial systems
  • Large-scale e-commerce
  • Data warehousing
  • SaaS platforms
  • Geospatial applications (PostGIS)
  • Event logging systems
  • Microservices (with JSONB)
  • Real-time analytics (Citus, TimescaleDB)

Key Takeaways

  • PostgreSQL is one of the most powerful open-source databases.

  • Offers advanced features such as MVCC, JSONB, GIS, window functions, CTEs.

  • Highly extensible and customizable.

  • Strong SQL compliance and ACID guarantees.

  • Ideal for scalable, reliable, enterprise-level applications.