rust-lang
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
Lines changed: 12 additions & 1 deletion
diff --git a/src/debuginfo/CodeView.pdf b/src/debuginfo/CodeView.pdf
209 KB
diff --git a/src/debuginfo/debugger-internals.md b/src/debuginfo/debugger-internals.md
Lines changed: 14 additions & 0 deletions
diff --git a/src/debuginfo/debugger-visualizers.md b/src/debuginfo/debugger-visualizers.md
Lines changed: 111 additions & 0 deletions
diff --git a/src/debuginfo/gdb-internals.md b/src/debuginfo/gdb-internals.md
Lines changed: 4 additions & 0 deletions
diff --git a/src/debuginfo/gdb-visualizers.md b/src/debuginfo/gdb-visualizers.md
Lines changed: 9 additions & 0 deletions
diff --git a/src/debuginfo/intro.md b/src/debuginfo/intro.md
Lines changed: 114 additions & 0 deletions
@@ -232,11 +232,22 @@
     - [Debugging LLVM](./backend/debugging.md)
     - [Backend Agnostic Codegen](./backend/backend-agnostic.md)
     - [Implicit caller location](./backend/implicit-caller-location.md)
+- [Debug Info](./debuginfo/intro.md)
+    - [Rust Codegen](./debuginfo/rust-codegen.md)
+    - [LLVM Codegen](./debuginfo/llvm-codegen.md)
+    - [Debugger Internals](./debuginfo/debugger-internals.md)
+        - [LLDB Internals](./debuginfo/lldb-internals.md)
+        - [GDB Internals](./debuginfo/gdb-internals.md)
+    - [Debugger Visualizers](./debuginfo/debugger-visualizers.md)
+        - [LLDB - Python Providers](./debuginfo/lldb-visualizers.md)
+        - [GDB - Python Providers](./debuginfo/gdb-visualizers.md)
+        - [CDB - Natvis](./debuginfo/natvis-visualizers.md)
+    - [Testing](./debuginfo/testing.md)
+    - [(Lecture Notes) Debugging support in the Rust compiler](./debugging-support-in-rustc.md)
 - [Libraries and metadata](./backend/libs-and-metadata.md)
 - [Profile-guided optimization](./profile-guided-optimization.md)
 - [LLVM source-based code coverage](./llvm-coverage-instrumentation.md)
 - [Sanitizers support](./sanitizers.md)
-- [Debugging support in the Rust compiler](./debugging-support-in-rustc.md)
 
 ---
 
 
@@ -0,0 +1,14 @@
+# Debugger Internals
+
+It is the debugger's job to convert the debug info into an in-memory representation. Both the
+interpretation of the debug info and the in-memory representation are arbitrary; anything will do
+so long as meaningful information can be reconstructed while the program is running. The pipeline
+from raw debug info to usable types can be quite complicated.
+
+Once the information is in a workable format, the debugger front-end then must provide a way to
+interpret and display the data, a way for users to interact with it, and an API for extensibility.
+
+Debuggers are vast systems and cannot be covered completely here. This section will provide a brief
+overview of the subsystems directly relevant to the Rust debugging experience.
+
+Microsoft's debugging engine is closed source, so it will not be covered here.
@@ -0,0 +1,111 @@
+# Debugger Visualizers
+
+Visualizers are typically the last step before the debugger displays the information, but the
+results may be piped through a debug adapter such as an IDE's debugger API.
+
+The term "Visualizer" is a bit of a misnomer. The real goal isn't just to prettify the output, but
+to provide an interface for the user to interact with that is as useful as possible. In many cases
+this means reconstructing the original type as closely as possible to its Rust representation, but
+not always.
+
+The visualizer interface allows generating "synthetic children" - fields that don't exist in the
+debug info, but can be derived from invariants about the language and the type itself. A simple
+example is allowing one to interact with the elements of a `Vec<T>` instead of just its `*mut u8`
+heap pointer, length, and capacity.
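
As a rough illustration of synthetic children, the sketch below derives one child per element from a pointer and a length. It is a self-contained mock, not Rust's actual visualizer code: `FakeVec` and its byte buffer are hypothetical stand-ins for the debuggee's memory, and the element type is assumed to be `u32`. Only the method names `num_children` and `get_child_at_index` are borrowed from LLDB's synthetic provider interface.

```python
import struct


class FakeVec:
    """Hypothetical stand-in for the raw fields a debugger sees for a `Vec<u32>`."""

    __slots__ = ("memory", "ptr", "len", "cap")

    def __init__(self, memory: bytes, ptr: int, length: int, cap: int):
        self.memory = memory  # pretend this is the debuggee's heap
        self.ptr = ptr  # offset of the first element within `memory`
        self.len = length
        self.cap = cap


class VecSyntheticProvider:
    """Derives one synthetic child per element from pointer + length."""

    ELEM_SIZE = 4  # assumed element type: u32

    def __init__(self, valobj: FakeVec):
        self.valobj = valobj

    def num_children(self) -> int:
        # Only `len` elements are initialized; spare capacity is not shown.
        return self.valobj.len

    def get_child_at_index(self, index: int):
        if not 0 <= index < self.valobj.len:
            return None
        offset = self.valobj.ptr + index * self.ELEM_SIZE
        (value,) = struct.unpack_from("<I", self.valobj.memory, offset)
        return (f"[{index}]", value)


# Eight bytes of unrelated data, then a buffer with capacity 4 and length 3.
heap = bytes(8) + struct.pack("<4I", 10, 20, 30, 0)
vec = FakeVec(heap, ptr=8, length=3, cap=4)
provider = VecSyntheticProvider(vec)
children = [provider.get_child_at_index(i) for i in range(provider.num_children())]
```

None of the three children exist as fields in the debug info; they are reconstructed from the invariant that `ptr` points at `len` initialized elements.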
+
+## `rust-lldb`, `rust-gdb`, and `rust-windbg.cmd`
+
+These support scripts are distributed with Rust toolchains. They locate the appropriate debugger and
+the toolchain's visualizer scripts, then launch the debugger with the appropriate arguments to load
+the visualizer scripts before a debuggee is launched or attached to.
+
+## `#![debugger_visualizer]`
+
+[This attribute][dbg_vis_attr] allows Rust library authors to include pretty printers for their
+types within the library itself. These pretty printers are of the same format as typical
+visualizers, but are embedded directly into the compiled binary. These scripts are loaded
+automatically by the debugger, allowing a seamless experience for users. This attribute currently
+works for GDB and natvis scripts.
+
+[dbg_vis_attr]: https://doc.rust-lang.org/reference/attributes/debugger.html#the-debugger_visualizer-attribute
+
+GDB Python scripts are embedded in the `.debug_gdb_scripts` section of the binary. More information
+can be found in the [GDB documentation](https://sourceware.org/gdb/current/onlinedocs/gdb.html/dotdebug_005fgdb_005fscripts-section.html). Rustc accomplishes this in [`rustc_codegen_llvm/src/debuginfo/gdb.rs`][gdb_rs].
+
+[gdb_rs]: https://github.com/rust-lang/rust/blob/main/compiler/rustc_codegen_llvm/src/debuginfo/gdb.rs
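
To make the section layout more concrete, here is a minimal sketch of a parser for the raw bytes of `.debug_gdb_scripts`, based on the entry format described in the GDB documentation: a prefix byte giving the entry kind, followed by a NUL-terminated body. Only the two Python entry kinds are handled. The function name, the sample script names, and the choice to skip stray NUL bytes as padding are assumptions made for illustration; this is not how rustc or GDB implement it.

```python
PYTHON_FILE = 1  # entry names a script file for GDB to locate and load
PYTHON_TEXT = 4  # entry carries the script text inline


def parse_gdb_scripts(section: bytes):
    """Return a list of (kind, name, text) tuples from raw section bytes."""
    entries = []
    pos = 0
    end = len(section)
    while pos < end:
        kind = section[pos]
        pos += 1
        if kind == 0:
            continue  # assumption: treat stray NUL bytes as padding
        terminator = section.find(b"\0", pos)
        if terminator == -1:
            raise ValueError("unterminated entry")
        body = section[pos:terminator].decode("utf-8")
        pos = terminator + 1
        if kind == PYTHON_FILE:
            # The body is just the script's file name.
            entries.append((kind, body, None))
        elif kind == PYTHON_TEXT:
            # The body is "<name>\n<script text>".
            name, _, text = body.partition("\n")
            entries.append((kind, name, text))
        else:
            raise ValueError(f"unsupported entry kind {kind}")
    return entries


# Hypothetical section contents: one file reference and one inline script.
sample = (
    b"\x01example_loader.py\0"
    b"\x04example_printers.py\nimport gdb\nprint('loaded')\n\0"
)
entries = parse_gdb_scripts(sample)
```

The inline form is what makes it possible to ship a pretty printer inside the binary itself, since GDB does not need to find a separate file on disk.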
+
+Natvis files can be embedded in the PDB debug info using the [`/NATVIS` linker option][linker_opt],
+and have the [highest priority][priority] when the debugger resolves which visualizer to use for a
+type. The files specified by the attribute are collected into
+[`CrateInfo::natvis_debugger_visualizers`][natvis], which are then added as linker arguments in
+[`rustc_codegen_ssa/src/back/linker.rs`][linker_rs].
+
+[linker_opt]: https://learn.microsoft.com/en-us/cpp/build/reference/natvis-add-natvis-to-pdb?view=msvc-170
+[priority]: https://learn.microsoft.com/en-us/visualstudio/debugger/create-custom-views-of-native-objects?view=visualstudio#BKMK_natvis_location
+[natvis]: https://github.com/rust-lang/rust/blob/e0e204f3e97ad5f79524b9c259dc38df606ed82c/compiler/rustc_codegen_ssa/src/lib.rs#L212
+[linker_rs]: https://github.com/rust-lang/rust/blob/main/compiler/rustc_codegen_ssa/src/back/linker.rs#L1106
+
+LLDB is not currently supported, but there are a few methods that could potentially allow support in
+the future. Officially, the intended method is via a [formatter bytecode][bytecode]. This was
+created to offer a comparable experience to GDB's, but without the safety concerns associated with
+embedding an entire Python script. The opcodes are limited, but it works with `SBValue` and `SBType`
+in roughly the same way as Python visualizer scripts. Implementing this would require writing some
+sort of DSL/mini compiler.
+
+[bytecode]: https://lldb.llvm.org/resources/formatterbytecode.html
+
+Alternatively, it might be possible to copy GDB's strategy entirely: create a bespoke section in the
+binary and embed a Python script in it. LLDB will not load it automatically, but the Python API does
+allow one to access the [raw sections of the debug info][SBSection]. With this, it may be possible
+to extract the Python script from our bespoke section and then load it during the startup of
+Rust's visualizer scripts.
+
+[SBSection]: https://lldb.llvm.org/python_api/lldb.SBSection.html#sbsection
+
+## Performance
+
+Before tackling the visualizers themselves, it's important to note that these are part of a
+performance-sensitive system. Please excuse the break in formality, but: if I have to spend
+significant time debugging, I'm annoyed. If I have to *wait on my debugger*, I'm pissed.
+
+Every millisecond spent in these visualizers is a millisecond longer for the user to see output.
+This can be especially painful for large stack frames that contain many or large container types.
+Debugger GUIs such as VSCode will request the whole stack frame at once, and this can result in
+delays of tens of seconds (or even minutes) before being able to interact with any variables in the
+frame.
+
+There is a tendency to balk at the idea of optimizing Python code, but it really can have a
+substantial impact. Remember, there is no compiler to help keep the code fast. Even simple
+transformations are not done for you. It can be difficult to find Python performance tips through
+all the noise of people suggesting you don't bother optimizing Python, so here are some things to
+keep in mind that are relevant to these scripts:
+
+* Everything allocates, even `int`.
+* Use tuples when possible. `list` is effectively `Vec<Box<[Any]>>`, whereas tuples are equivalent
+to `Box<[Any]>`. They have one less layer of indirection, don't carry extra capacity and can't
+grow/shrink which can be advantageous in many cases. An additional benefit is that Python caches and
+recycles the underlying allocations of all tuples up to size 20.
+* Regexes are slow and should be avoided when simple string manipulation will do.
+* Strings are immutable, thus many string operations implicitly copy the contents.
+* When concatenating large lists of strings, `"".join(iterable_of_strings)` is typically the fastest
+way to do it.
+* f-strings are generally the fastest way to do small, simple string transformations such as
+surrounding a string with parentheses.
+* The act of calling a function is somewhat slow (even if the function is completely empty). If the
+code section is very hot, consider inlining the function manually.
+* Local variable access is significantly faster than global and built-in function access.
+* Member/method access via the `.` operator is also slow. Consider assigning deeply nested values
+to local variables to avoid this cost (e.g. `h = a.b.c.d.e.f.g.h`).
+* Accessing inherited methods and fields is about 2x slower than accessing methods and fields
+defined directly on the class. Avoid inheritance whenever possible.
+* Use [`__slots__`](https://wiki.python.org/moin/UsingSlots) wherever possible. `__slots__` is a way
+to indicate to Python that your class's fields won't change, and it speeds up field access by a
+noticeable amount. This does require you to name your fields in advance and initialize them in
+`__init__`, but it's a small price to pay for the benefits.
+* Match statements and `if`/`elif`/`else` chains are not optimized in any way. The conditions are
+checked in order, one by one. If possible, use an alternative such as dictionary dispatch or a table
+of values.
+* Compute lazily when possible.
+* List comprehensions are typically faster than loops. Generator comprehensions are a bit slower
+than list comprehensions, but use less memory. You can think of comprehensions as equivalent to
+Rust's `iter.map()`. List comprehensions effectively call `collect::<Vec<_>>()` at the end, whereas
+generator comprehensions do not.
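
A few of the tips above can be combined in one small sketch: `__slots__`, tuples for fixed data, dictionary dispatch, hoisting lookups into locals, f-strings, and a single `join`. Every name here (`ChildInfo`, `SUMMARIZERS`, `summarize_fields`, and the type names) is hypothetical and chosen only for illustration; none of it is taken from Rust's actual visualizer scripts, and no timing figures are claimed.

```python
class ChildInfo:
    """Fixed field layout via `__slots__`; no per-instance `__dict__`."""

    __slots__ = ("name", "offset")

    def __init__(self, name: str, offset: int):
        self.name = name
        self.offset = offset


def summarize_int(raw: bytes) -> str:
    return str(int.from_bytes(raw, "little"))


def summarize_bool(raw: bytes) -> str:
    return "true" if raw[0] else "false"


def summarize_unknown(raw: bytes) -> str:
    return f"<{len(raw)} bytes>"


# Dictionary dispatch: one hash lookup instead of a chain of comparisons.
SUMMARIZERS = {
    "u32": summarize_int,
    "u64": summarize_int,
    "bool": summarize_bool,
}


def summarize_fields(fields, memory: bytes) -> str:
    # Hoist global and attribute lookups into locals before the hot loop.
    get_summarizer = SUMMARIZERS.get
    fallback = summarize_unknown
    parts = []
    append = parts.append
    for info, type_name, size in fields:
        start = info.offset
        summary = get_summarizer(type_name, fallback)(memory[start:start + size])
        append(f"{info.name}: {summary}")
    # Join once rather than concatenating strings inside the loop.
    return f"{{{', '.join(parts)}}}"


# Tuples are used for the fixed-size field table.
memory = (7).to_bytes(4, "little") + b"\x01" + b"\xff\xff"
fields = (
    (ChildInfo("count", 0), "u32", 4),
    (ChildInfo("ready", 4), "bool", 1),
    (ChildInfo("extra", 5), "f16", 2),
)
result = summarize_fields(fields, memory)
```

Whether any single change pays off depends on how hot the code path is, so it is worth measuring with `timeit` or a profiler before and after.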
@@ -0,0 +1,4 @@
+# (WIP) GDB Internals
+
+GDB's Rust support lives at `gdb/rust-lang.h` and `gdb/rust-lang.c`. The expression parsing support
+can be found in `gdb/rust-exp.h` and `gdb/rust-parse.c`.
@@ -0,0 +1,9 @@
+# (WIP) GDB - Python Providers
+
+Below are links to relevant parts of the GDB documentation:
+
+* [Overview on writing a pretty printer](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Writing-a-Pretty_002dPrinter.html#Writing-a-Pretty_002dPrinter)
+* [Pretty Printer API](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Pretty-Printing-API.html#Pretty-Printing-API) (equivalent to LLDB's `SyntheticProvider`)
+* [Value API](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Values-From-Inferior.html#Values-From-Inferior) (equivalent to LLDB's `SBValue`)
+* [Type API](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Types-In-Python.html#Types-In-Python) (equivalent to LLDB's `SBType`)
+* [Type Printing API](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Type-Printing-API.html#Type-Printing-API) (equivalent to LLDB's `SyntheticProvider.get_type_name`)
@@ -0,0 +1,114 @@
+# Debug Info
+
+Debug info is a collection of information generated by the compiler that allows debuggers to
+correctly interpret the state of a program while it is running. That includes things like mapping
+instruction addresses to lines of code in the source file, and type layout information so that
+bytes in memory can be read and displayed in a meaningful way.
+
+Debug info can be a slightly overloaded term, covering all the layers between Rust MIR and the
+end-user seeing the output of their debugger onscreen. In brief, the stack from beginning to end is
+as follows:
+
+1. Rustc inspects the MIR and communicates the relevant source, symbol, and type information to LLVM
+2. LLVM translates this information into a target-specific debug info format during compilation
+3. A debugger reads and interprets the debug info, mapping source-lines and allowing the debuggee's
+variables in memory to be located and read with the correct layout
+4. Built-in debugger formatting and styling is applied to variables
+5. User-defined scripts are run, formatting and styling the variables further
+6. The debugger frontend displays the variable to the user, possibly through the means of additional
+API layers (e.g. VSCode extension by way of the
+[Debug Adapter Protocol](https://microsoft.github.io/debug-adapter-protocol/))
+
+
+> NOTE: This subsection of the dev guide is perhaps more detailed than necessary. It aims to collect
+> a large amount of scattered information into one place and equip the reader with as firm a grasp of
+> the entire debug stack as possible.
+>
+> If you are only interested in working on the visualizer
+> scripts, the information in [debugger-visualizers](./debugger-visualizers.md) and
+> [testing](./testing.md) will suffice. If you need to make changes to Rust's debug node generation,
+> please see [rust-codegen](./rust-codegen.md). All other sections are supplementary, but can be
+> vital to understanding some of the compromises the visualizers or codegen need to make. It can
+> also be valuable to know when a problem might be better solved in LLVM or the debugger itself.
+
+# DWARF
+
+This is the primary debug info format for `*-gnu` targets. It is typically bundled in with the
+binary, but it [can be generated as a separate file](https://gcc.gnu.org/wiki/DebugFission). The
+DWARF standard is available [here](https://dwarfstd.org/).
+
+> NOTE: To inspect DWARF debug info, [gimli](https://crates.io/crates/gimli) can be used
+> programmatically. If you prefer a GUI, the author recommends [DWEX](https://github.com/sevaa/dwex).
+
+# PDB/CodeView
+
+The primary debug info format for `*-msvc` targets. PDB is a proprietary container format created by
+Microsoft that, unfortunately,
+[has multiple meanings](https://docs.rs/ms-pdb/0.1.10/ms_pdb/taster/enum.Flavor.html).
+We are concerned with ordinary PDB files, as Portable PDB is used mainly for .NET applications. PDB
+files are separate from the compiled binary and use the `.pdb` extension.
+
+PDB files contain CodeView objects, equivalent to DWARF's tags. CodeView, the debugger that
+consumed CodeView objects, was originally released in 1985. It was originally intended for C
+debugging and was later extended to support Visual C++. Minor alterations are still made to the
+format to support modern architectures and languages, but many of these changes are undocumented
+and/or sparsely used.
+
+It is important to keep this context in mind when working with CodeView objects. Due to its origins,
+the "feature-set" of these objects is very limited, and focused around the core features of C. It
+does not have many of the conveniences or features of modern DWARF standards. A fair number of
+workarounds exist within the debug info stack to compensate for CodeView's shortcomings.
+
+Due to its proprietary nature, it is very difficult to find information about PDB and CodeView. Many
+of the sources were made at vastly different times and contain incomplete or somewhat contradictory
+information. As such, this page will aim to collect as many sources as possible.
+
+* [CodeView 1.0 specification](./CodeView.pdf)
+* LLVM
+    * [CodeView Overview](https://llvm.org/docs/SourceLevelDebugging.html#codeview-debug-info-format)
+    * [PDB Overview and technical details](https://llvm.org/docs/PDB/index.html)
+* Microsoft
+    * [microsoft-pdb](https://github.com/microsoft/microsoft-pdb) - A C/C++ implementation of a PDB
+    reader. The implementation does not contain the full PDB or CodeView specification, but does
+    contain enough information for other PDB consumers to be written. At time of writing (Nov 2025),
+    this repo has been archived for several years.
+    * [pdb-rs](https://github.com/microsoft/pdb-rs/) - A Rust-based PDB reader and writer based on
+    other publicly-available information. Does not guarantee stability or spec compliance. Also
+    contains `pdbtool`, which can dump PDB files (`cargo install pdbtool`)
+    * [Debug Interface Access SDK](https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/getting-started-debug-interface-access-sdk).
+    While it does not document the PDB format directly, details can be gleaned from the interface
+    itself.
+
+# Debuggers
+
+Rust supports 3 major debuggers: GDB, LLDB, and CDB. Each has its own set of requirements,
+limitations, and quirks. This unfortunately creates a large surface area to account for.
+
+> NOTE: CDB is a proprietary debugger created by Microsoft. The underlying engine also powers
+> WinDbg, KD, the Microsoft C/C++ extension for VSCode, and part of the Visual Studio Debugger. In
+> these docs, it will be referred to as CDB for consistency.
+
+While GDB and LLDB do offer facilities to natively support Rust's value layout, this isn't
+completely necessary. Rust currently outputs debug info very similar to that of C++, allowing
+debuggers without Rust support to work with a slightly degraded experience. More detail will be
+included in later sections, but here is a quick reference for the capabilities of each debugger:
+
+| Debugger | Debug Info Format | Native Rust support | Expression Style | Visualizer Scripts |
+| --- | --- | --- | --- | --- |
+| GDB | DWARF | Full | Rust | Python |
+| LLDB | DWARF and PDB | Partial | C/C++ | Python |
+| CDB | PDB | None | C/C++ | Natvis |
+
+> IMPORTANT: CDB can be assumed to run only on Windows. No assumptions can be made about the OS
+> running GDB or LLDB.
+
+## Unsupported
+
+Below are several unsupported debuggers that are of particular note due to their potential impact
+in the future.
+
+* [BugStalker](https://github.com/godzie44/BugStalker) is an x86-64 Linux debugger written in Rust,
+specifically to debug Rust programs. While promising, it is still in early development.
+* [RAD Debugger](https://github.com/EpicGamesExt/raddebugger) is a Windows-only GUI debugger. It has
+a custom debug info format that PDB is translated into. The project also includes a linker that can
+generate their new debug info format during the linking phase.