From d056e4dfd5789bbacd0a756b822cc9d89d8c8340 Mon Sep 17 00:00:00 2001 From: Abhishek Chatterjee Date: Sun, 8 Feb 2026 14:34:27 +0530 Subject: [PATCH 1/2] IMDEEPMIND-24: Add completion status to database system notes and apply minor text and formatting refinements. --- .../database-systems/database-storage.md | 14 +++++--- .../index-organized-storage.md | 15 +++++--- .../database-systems/introduction.md | 6 ++++ docs/databases/database-systems/lsm-tree.md | 18 ++++++---- .../database-systems/relational-algebra.md | 6 +++- .../tuple-oriented-storage.md | 34 +++++++++++++------ 6 files changed, 65 insertions(+), 28 deletions(-) diff --git a/docs/databases/database-systems/database-storage.md b/docs/databases/database-systems/database-storage.md index 93ebf6fa..950c354a 100644 --- a/docs/databases/database-systems/database-storage.md +++ b/docs/databases/database-systems/database-storage.md @@ -4,6 +4,12 @@ sidebar_position: 3 # Database Storage +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: + **Database storage** is the physical representation of data within a database system. It's typically organized into **files** and **pages**. ## Storage Hierarchy @@ -114,7 +120,7 @@ Understanding the storage hierarchy is crucial for designing efficient and cost- - **Example:** Jumping directly to a specific page in a book using the table of contents. > Random access on **non-volatile** storage is almost always **much slower** than sequential access. -> DBMS will want to maximize sequential access. 
+> DBMS will want to maximize sequential access. ## Database Storage Layers @@ -124,9 +130,9 @@ A database storage system can be thought of as three stacked layers, each respon ```mermaid flowchart TB - Logical["Logical Layer\n(schema, tables, queries, indexes)"] - StorageEngine["Storage Engine\n(buffer manager, page manager, access methods, recovery)"] - Physical["Physical Layer\n(file system, block device, disk/SSD, cloud storage)"] + Logical["Logical Layer (schema, tables, queries, indexes)"] + StorageEngine["Storage Engine (buffer manager, page manager, access methods, recovery)"] + Physical["Physical Layer (file system, block device, disk/SSD, cloud storage)"] Logical -->|requests| StorageEngine StorageEngine -->|I/O| Physical ``` diff --git a/docs/databases/database-systems/index-organized-storage.md b/docs/databases/database-systems/index-organized-storage.md index 0691a991..7a4c53e6 100644 --- a/docs/databases/database-systems/index-organized-storage.md +++ b/docs/databases/database-systems/index-organized-storage.md @@ -4,6 +4,12 @@ sidebar_position: 5 # Index Organized Storage +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: + Index-Organized Storage (IOS) is a storage technique in databases where data is stored directly in the index structure itself. Unlike traditional tables where data and indexes are stored separately, an index-organized table (IOT) combines both the data and index, allowing for efficient access patterns and performance benefits in specific use cases.
@@ -13,10 +19,10 @@ flowchart LR root((Root)) internal1((Internal Node)) internal2((Internal Node)) - leafA("[Leaf: PK=1\nRowData]") - leafB("[Leaf: PK=2\nRowData]") - leafC("[Leaf: PK=100\nRowData]") - secIdx("[Secondary Index\n(Non-clustered)]") + leafA("[Leaf: PK=1 RowData]") + leafB("[Leaf: PK=2 RowData]") + leafC("[Leaf: PK=100 RowData]") + secIdx("[Secondary Index (Non-clustered)]") root --> internal1 root --> internal2 @@ -25,7 +31,6 @@ flowchart LR internal2 --> leafC secIdx --> leafB secIdx --> leafC - ```
diff --git a/docs/databases/database-systems/introduction.md b/docs/databases/database-systems/introduction.md index fac7b711..fe10997b 100644 --- a/docs/databases/database-systems/introduction.md +++ b/docs/databases/database-systems/introduction.md @@ -4,6 +4,12 @@ sidebar_position: 1 # Introduction +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: + **Database Systems** are sophisticated software systems designed to store, manage, retrieve, and protect data efficiently. Understanding the internals and architecture of database systems is crucial for building scalable and reliable applications. ## Core Concepts diff --git a/docs/databases/database-systems/lsm-tree.md b/docs/databases/database-systems/lsm-tree.md index 9999dfa6..c8a4efb8 100644 --- a/docs/databases/database-systems/lsm-tree.md +++ b/docs/databases/database-systems/lsm-tree.md @@ -4,7 +4,11 @@ sidebar_position: 6 # Log-Structured Merge Tree - +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: Log-Structured Merge (LSM) trees are a fundamental data structure used in database storage, particularly for handling high-write workloads efficiently. Here’s a breakdown of key concepts and considerations when working with LSM storage: @@ -261,7 +265,7 @@ An SSTable, a file stored on disk, is an immutable and sorted structure optimize - _Type_: Distributed SQL database. - _Usage_: TiDB uses an LSM-based storage layer (with RocksDB or TiKV) for high-performance and distributed data management, balancing SQL compatibility with NoSQL performance. -## **3. Time-Series Databases** +## Time-Series Databases ### InfluxDB @@ -273,7 +277,7 @@ An SSTable, a file stored on disk, is an immutable and sorted structure optimize - _Type_: Time-series database built on PostgreSQL. - _Usage_: While PostgreSQL uses a B-tree structure by default, TimescaleDB includes LSM options and optimizations for handling high-frequency data insertions in time-series data. 
-### **4. Search and Logging Databases** +### Search and Logging Databases ### Elasticsearch @@ -292,9 +296,9 @@ An SSTable, a file stored on disk, is an immutable and sorted structure optimize ## FAQ: LSM Tree Sizing and Tuning -### How do I select the number of levels? +### How do we select the number of levels? -- The number of levels is determined primarily by your total on-disk dataset size (S), the size of your base level or memtable flush size (M), and the growth factor between levels (T). As a rule-of-thumb for leveling compaction: +- The number of levels is determined primarily by our total on-disk dataset size (S), the size of our base level or memtable flush size (M), and the growth factor between levels (T). As a rule-of-thumb for leveling compaction: L ≈ ceil(log_T(S / M)) @@ -304,7 +308,7 @@ An SSTable, a file stored on disk, is an immutable and sorted structure optimize - Choose the memtable / base SSTable size small enough to avoid large L0 write stalls. - Use a T value (commonly 8–10 for leveling) to make level sizes grow geometrically and keep levels manageable. -### How do I choose level sizes and the growth factor (T)? +### How do we choose level sizes and the growth factor (T)? - Strategy: @@ -312,7 +316,7 @@ An SSTable, a file stored on disk, is an immutable and sorted structure optimize - Select a growth factor T where each level is roughly T times the previous level. - Larger T reduces the number of levels (and total compaction passes) but increases individual level sizes and can increase read costs for certain patterns. -- Example: If your base level (L1) target is 1GB and T=10, then L2 target is 10GB, L3 is 100GB and so on. +- Example: If our base level (L1) target is 1GB and T=10, then L2 target is 10GB, L3 is 100GB and so on. - Tuning trade-offs: - Larger T -> fewer levels -> potentially lower write amplification -> larger compaction work per event -> potentially higher per-compaction latency impact. 
diff --git a/docs/databases/database-systems/relational-algebra.md b/docs/databases/database-systems/relational-algebra.md index ade720b6..65d9de44 100644 --- a/docs/databases/database-systems/relational-algebra.md +++ b/docs/databases/database-systems/relational-algebra.md @@ -4,7 +4,11 @@ sidebar_position: 2 # Relational Algebra - +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: **Relational Algebra** is a formal system of query operations used on relational databases. It provides a mathematical foundation for understanding how database queries work and serves as the theoretical basis for SQL. Relational algebra operations allow us to manipulate relations (tables) and extract meaningful information from data. diff --git a/docs/databases/database-systems/tuple-oriented-storage.md b/docs/databases/database-systems/tuple-oriented-storage.md index 9a5b44ff..3718544a 100644 --- a/docs/databases/database-systems/tuple-oriented-storage.md +++ b/docs/databases/database-systems/tuple-oriented-storage.md @@ -4,6 +4,12 @@ sidebar_position: 4 # Tuple Oriented Storage +:::tip[Status] + +This note is complete, reviewed, and considered stable. + +::: + ## Storage Manager The Storage Manager is a critical component of a Database Management System (DBMS) responsible for managing the physical storage and retrieval of data. It acts as an interface between the DBMS and the underlying storage devices, such as hard disk drives (HDDs) or solid-state drives (SSDs). @@ -49,7 +55,7 @@ flowchart LR ### Database Page Components -Each database page can be thought of as four main components: the Header, the Slot Directory, the Data Area, and the Free Space. Together these components let the DBMS store variable-length tuples efficiently and handle inserts, updates, and deletes without layout changes to the schema. +Each database page can be thought of as four main components: the Header, the Slot Directory, the Data Area, and the Free Space. 
Together, these components let the DBMS store variable-length tuples efficiently and handle inserts, updates, and deletes without layout changes to the schema. #### Header @@ -104,7 +110,7 @@ This layout — the slot directory growing forward and the Data Area growing bac ### Considerations -- **Page Size:** The choice of page size can impact performance and storage efficiency. Larger pages may reduce the number of I/O operations but can also lead to wasted space if pages are not fully utilized. Also write operations can be slow if the page size is too large. +- **Page Size:** The choice of page size can impact performance and storage efficiency. Larger pages may reduce the number of I/O operations but can also lead to wasted space if pages are not fully utilized. Write operations can also be slow if the page size is too large. - **Page Organization:** The way data is organized within a page can affect retrieval efficiency. Techniques like B-trees, hash tables, and heap files are commonly used. - **Page Compression:** Compressing data within pages can reduce storage requirements and improve I/O performance. @@ -379,20 +385,26 @@ A tuple is essentially a sequence of bytes (these bytes do not have to be contig - Bit Map for NULL values. - Note that the DBMS does not need to store meta-data about the schema of the database here. - **Tuple Data:** Actual data for attributes. - - Attributes are typically stored in the order that you specify them when you create the table. + - Attributes are typically stored in the order that we specify them when we create the table. - Most DBMSs do not allow a tuple to exceed the size of a page. -- **Unique Identifier:** - - Each tuple in the database is assigned a unique identifier. - - Most common: page id + (offset or slot). - - An application cannot rely on these ids to mean anything. - -**Denormalized Tuple Data**: If two tables are related, the DBMS can “pre-join” them, so the tables end up on the same page. 
This makes reads faster since the DBMS only has to load in one page rather than two separate pages. However, it makes updates more expensive since the DBMS needs more space for each tuple. +- **Tuple Header:** Contains meta-data about the tuple. + - Visibility information for the DBMS’s concurrency control protocol (i.e., information about which transaction created/modified that tuple). + - Bit Map for NULL values. + - Note that the DBMS does not need to store meta-data about the schema of the database here. +- **Tuple Data:** Actual data for attributes. + - Attributes are typically stored in the order that we specify them when we create the table. + - Most DBMSs do not allow a tuple to exceed the size of a page. +- **Unique Identifier:** + - Each tuple in the database is assigned a unique identifier. + - Most common: page id + (offset or slot). + - An application cannot rely on these IDs to mean anything. + +**Denormalized Tuple Data**: If two tables are related, the DBMS can “pre-join” them, so the tables end up on the same page. This makes reads faster since the DBMS only has to load one page rather than two separate pages. However, it makes updates more expensive since the DBMS needs more space for each tuple. ## Large Attribute Storage (Overflow & External Storage) When an attribute value (for example, a large TEXT or BLOB column) cannot fit comfortably into the remaining free space on a page, DBMSs use a few common techniques to store it without violating page-size constraints: -- Inline first, overflow later: The DBMS tries to store a small prefix of the attribute inline and stores the remaining portion in overflow pages. The main tuple contains a pointer or descriptor to the overflow chain so the DBMS can reconstruct the full attribute when needed. - External storage (TOAST / LOB table): Some systems (e.g., PostgreSQL) store large attributes in a separate storage object (TOAST table or LOB store). 
The tuple stores a compact reference (pointer) to the external storage. - Overflow pages / chained pages: The DBMS stores the large attribute across multiple overflow pages and links them so the attribute can be streamed or reassembled by following pointers. - Compression & partial inline: The DBMS may compress the attribute or save a compressed chunk inline and use external storage for the rest. @@ -433,6 +445,6 @@ In MVCC systems, multiple versions of a tuple can exist concurrently. A tuple is ### Notes & trade-offs -- VACUUM is necessary to reclaim space and update statistics—the more frequently you vacuum, the less bloat and the better the optimizer statistics. +- VACUUM is necessary to reclaim space and update statistics—the more frequently we vacuum, the less bloat and the better the optimizer statistics. - Reclaiming/deleting tuples requires careful management of concurrency and durability (WAL) to ensure other transactions cannot read partially-deleted states. - Some reclamation processes only mark slots available for reuse (so physical layout doesn't change), while table-rewrite operations will physically compact data and change tuple offsets. From 3f536fed72cf652273d5424c92e434846973d351 Mon Sep 17 00:00:00 2001 From: Abhishek Chatterjee Date: Sun, 8 Feb 2026 14:39:18 +0530 Subject: [PATCH 2/2] IMDEEPMIND: 24: Lint fixes --- .../tuple-oriented-storage.md | 22 +++++++++---------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/databases/database-systems/tuple-oriented-storage.md b/docs/databases/database-systems/tuple-oriented-storage.md index 3718544a..e5ba7543 100644 --- a/docs/databases/database-systems/tuple-oriented-storage.md +++ b/docs/databases/database-systems/tuple-oriented-storage.md @@ -387,17 +387,17 @@ A tuple is essentially a sequence of bytes (these bytes do not have to be contig - **Tuple Data:** Actual data for attributes. - Attributes are typically stored in the order that we specify them when we create the table. 
- Most DBMSs do not allow a tuple to exceed the size of a page. -- **Tuple Header:** Contains meta-data about the tuple. - - Visibility information for the DBMS’s concurrency control protocol (i.e., information about which transaction created/modified that tuple). - - Bit Map for NULL values. - - Note that the DBMS does not need to store meta-data about the schema of the database here. -- **Tuple Data:** Actual data for attributes. - - Attributes are typically stored in the order that we specify them when we create the table. - - Most DBMSs do not allow a tuple to exceed the size of a page. -- **Unique Identifier:** - - Each tuple in the database is assigned a unique identifier. - - Most common: page id + (offset or slot). - - An application cannot rely on these IDs to mean anything. +- **Tuple Header:** Contains meta-data about the tuple. + - Visibility information for the DBMS’s concurrency control protocol (i.e., information about which transaction created/modified that tuple). + - Bit Map for NULL values. + - Note that the DBMS does not need to store meta-data about the schema of the database here. +- **Tuple Data:** Actual data for attributes. + - Attributes are typically stored in the order that we specify them when we create the table. + - Most DBMSs do not allow a tuple to exceed the size of a page. +- **Unique Identifier:** + - Each tuple in the database is assigned a unique identifier. + - Most common: page id + (offset or slot). + - An application cannot rely on these IDs to mean anything. **Denormalized Tuple Data**: If two tables are related, the DBMS can “pre-join” them, so the tables end up on the same page. This makes reads faster since the DBMS only has to load one page rather than two separate pages. However, it makes updates more expensive since the DBMS needs more space for each tuple.