Skip to content

RFC: Redesign of Lean FS API #13638

@algebraic-dev

Description

@algebraic-dev

Summary

This proposal redesigns Lean's FS and Path API using LibUV, replacing a lot of types, making them more complete and removing the FILE* dependency in the C++ side. It also adds directory traversals, metadata inspection, some useful functions like copying without reading the entire file into memory and a way to integrate with Std.Async in a clean way. This design changes FS.Handle and FS.Stream to a layer approach with more low-level changes that requires the user to do synchronization and buffering on the Lean side.

Since these things will depend on Std abstractions, it lives under Std.FS, Std.Async.FS, and Std.Path.

Migration

A lot of functions are just going to be moved to other namespaces like IO.FS.readFile that goes to Std.FS.readFile. The Handle type will just be split into multiple low level ones like File, Dir, Handle (a smaller version with uv_pipe_t and uv_tty_t).

Old New Notes
System.FilePath Std.Path
IO.FS.Handle Std.FS.File for regular files opened by path
IO.FS.Handle Std.FS.Dir for regular directories opened by path
IO.FS.Handle IO.Handle for stdin/stdout/stderr and IPC pipes
IO.FS.Stream IO.Stream Keep the abstraction for LSP capture of the Stdout
IO.FS.Handle.putStr / IO.FS.Handle.putStrLn Std.FS.File.putStr / Std.FS.File.putStrLn same semantics
IO.FS.Handle.isEof no direct equivalent; File.readAt returns empty at EOF
IO.FS.readFile Std.FS.readFile same semantics
IO.FS.writeFile Std.FS.writeFile same semantics
IO.FS.readBinFile Std.FS.readBinFile same semantics
IO.FS.writeBinFile Std.FS.writeBinFile same semantics
IO.FS.lines Std.FS.lines same semantics
IO.FS.hardLink Std.FS.hardLink same semantics

Async Integration

uv_fs_* operations can be used asynchronously and synchronously (by specifying a NULL loop and callback), so with a Std/Async/FS we can just add operations in the namespace like .readAsync that will return a Promise instead of blocking. But, the exception is file locking with flock (POSIX) and LockFileEx (Windows) that are both blocking syscalls with no libuv equivalents, they SHOULDN'T run on the event loop thread so we schedule another work thread just for them using uv_queue_work and after that it resumes a Promise for Async Operations. As flock is advisory it does not interfere with any of the operations of libuv and thus, is safe to use with another flocks.

Concurrency Model

All raw IO types (File, Handle, Pipe, Dir) are not thread-safe by default. Concurrency and parallelism safety is achieved through explicit wrappers like Mutex α and RecursiveMutex α.

Core Abstractions

Paths and Filesystem Entries

  • Path: Replacement for FilePath, supporting: startsWith, stripPrefix, relativeTo, and glob.
  • Dir: An open directory handle.
  • DirEntry: A single filesystem entry produced during directory iteration.
  • Metadata: Filesystem metadata for files, directories, or special entries.
  • FileType: Enumeration of entry kinds: file, dir, symlink, blockDevice, charDevice, fifo, socket, unknown.
  • File: A thin wrapper around uv_file. Not thread-safe by default, concurrent access must be explicitly synchronized using Mutex.
  • BufferedReader α: Buffered wrapper around any readable type.
  • BufferedWriter: Buffered writer for File.
  • LineWriter: A writer wrapper that flushes automatically on newline characters (\n). Used by Stdout.

Handles and Streams

  • Handle: Abstract representation of a system stream endpoint (uv_tty_t or uv_pipe_t).
  • Stream: A record of closures that abstracts over any readable/writable endpoint.
  • Pipe: Wrapper around uv_pipe_t for anonymous and named pipes.

Type Classes

  • Read: Typeclass for types that support reading bytes; provides readAt : α → (n : USize) → ByteArray → IO ByteArray. Implemented by File, Handle, and Pipe.
  • Write: Typeclass for types that support writing bytes; provides write : α → ByteArray → IO Unit. Implemented by File, Handle, and Pipe.

Detailed Explanation

Iterators

Some operations return IterM (defined in Std/Data/Iterators) rather than eagerly collected Arrays.

Iterator State Type Element Used by
DirIterator DirEntry Dir.iter
WalkIterator DirEntry FS.walk
GlobIterator Path FS.glob

FileType

inductive FileType where
  | file        -- regular file
  | dir         -- directory
  | symlink     -- symbolic link
  | blockDevice -- block device (e.g. disk)
  | charDevice  -- character device (e.g. /dev/null)
  | fifo        -- named pipe (FIFO)
  | socket      -- Unix domain socket
  | unknown     -- type not reported by the OS (e.g. some network filesystems)

OpenMode

OpenMode is a struct specifying how a file is opened.

Fields

Field Type Description
OpenMode.read Bool Allow reads
OpenMode.write Bool Allow writes
OpenMode.append Bool All writes go to end-of-file; incompatible with truncate
OpenMode.truncate Bool Truncate to zero on open; requires write
OpenMode.create Bool Create the file if it does not exist (O_CREAT)
OpenMode.createNew Bool Create the file, failing if it already exists (O_CREAT | O_EXCL); guarantees exclusive creation
OpenMode.custom Option USize Pass raw OS-level flags directly; merged with the flags derived from the other fields. Use when no predefined field covers the required behavior.

Permissions

AccessRight and FileRight keep the same shape as IO.AccessRight/IO.FileRight in the current API, moved to Std.FS.

structure AccessRight where
  /-- The file can be read. -/
  read      : Bool := false
  /-- The file can be written to. -/
  write     : Bool := false
  /-- The file can be executed. -/
  execution : Bool := false

structure FileRight where
  /-- The owner's permissions to access the file. -/
  user  : AccessRight := {}
  /-- The assigned group's permissions to access the file. -/
  group : AccessRight := {}
  /-- The permissions that all others have to access the file. -/
  other : AccessRight := {}
Name Type Description
FileRight.flags FileRight → UInt32 Convert to a raw POSIX bit field (for chmod, etc.)
FileRight.default FileRight 0o644 — owner read/write; group and other read
FileRight.defaultDir FileRight 0o755 — owner read/write/execute; group and other read/execute

Path Type

Std.Path replaces FilePath in the new API. It's almost the same API with some functionalities added like checking for relative paths, removing prefix paths, and glob matching.

Function Type Description
Path.mk String → Path Construct from raw string
Path.ofList List String → Path Build from component list
Path.join / / Path → Path → Path Join two path segments
Path.normalize Path → Path Normalize separators; uppercase drive on Windows
Path.isAbsolute Path → Bool True if path starts at root or a drive letter
Path.isRelative Path → Bool True if path is not absolute
Path.parent Path → Option Path Parent directory; none for root
Path.fileName Path → Option String Last component; none for ., .., or root
Path.fileStem Path → Option String Filename without the last extension
Path.extension Path → Option String Last extension without the leading .
Path.withFileName Path → String → Path Replace the last component
Path.withExtension Path → String → Path Replace the last extension
Path.addExtension Path → String → Path Append an extension without removing existing ones
Path.components Path → List String Split into path components
Path.startsWith Path → Path → Bool True if the path has the given prefix
Path.stripPrefix Path → Path → Option Path Remove a prefix; none if not present
Path.endsWith Path → Path → Bool True if path ends with the given relative path or filename
Path.hasExtension Path → Bool True if the filename has any extension
Path.relativeTo (base target : Path) → Option Path Lexical relative path from base to target; none if on different volumes (Windows)
Path.matchesGlob Path → String → Bool True if the path's file name matches a shell glob pattern (*, ?, [...]); * does not cross separator

File Type

File is a wrapper around uv_file with no buffering and no built-in lock. If buffering or locking is needed, wrap with Mutex (BufferedWriter File) or call File.lock.

Function Type Description Operation
File.open Path → (mode : OpenMode := .read) → IO File Open an existing file. Fails if the file does not exist. Default mode is read-only; pass .readWrite to open for reading and writing without truncation. uv_fs_open
File.create Path → (mode : OpenMode := .create) → (perm : FileRight := .default) → IO File Create or open a file. uv_fs_open
File.withFile Path → (mode : OpenMode := .read) → (File → IO α) → IO α Open a file, run an action, close in a finally block.
File.close File → IO Unit Explicitly close the file. Prefer withFile or explicit close. Closing does not call fsync; use syncAll before closing for durability. uv_fs_close
File.syncAll File → IO Unit Flush data and metadata to the device (fsync). uv_fs_fsync
File.syncData File → IO Unit Flush data only, skipping metadata (fdatasync). Cheaper when durability of timestamps/size is not required. uv_fs_fdatasync
File.lock File → (exclusive : Bool := true) → IO Unit Acquire a shared or exclusive lock, blocking until available. uv_queue_work + (LockFileEx on Windows, flock on POSIX)
File.tryLock File → (exclusive : Bool := true) → IO Bool Try to acquire a lock without blocking. Returns false immediately if held by another process. (LockFileEx on Windows, flock on POSIX)
File.unlock File → IO Unit Release the lock. Idempotent; succeeds even if no lock is held. UnlockFileEx on Windows, flock on POSIX
File.atomically File → (exclusive : Bool := true) → IO α → IO α Lock, run action, unlock in finally. Uses File.lock/File.unlock.
File.readAt File → (offset : UInt64) → (n : USize) → (buf : ByteArray) → IO ByteArray Read up to n bytes at offset into buf without moving the cursor (pread). Returns the filled slice; use the return value, not buf, after the call. uv_fs_read
File.writeAt File → (offset : UInt64) → ByteArray → IO Unit Write at offset uv_fs_write
File.write File → ByteArray → IO Unit Write at the current cursor position.
File.setLength File → (len : UInt64) → IO Unit Truncate or extend the file to exactly len bytes. uv_fs_ftruncate
File.metadata File → IO Metadata Return metadata for the open file. Avoids TOCTOU vs Path.metadata. uv_fs_fstat
File.setPermissions File → FileRight → IO Unit Set the file's permission bits. uv_fs_fchmod
File.setTimes File → (accessed : Timestamp) → (modified : Timestamp) → IO Unit Set access and modification timestamps. uv_fs_futime
File.chown File → (uid gid : UInt32) → IO Unit Change the owner and group of the open file. On Windows this is a noop. uv_fs_fchown

Handle Type

Handle wraps uv_tty_t or uv_pipe_t.

Distinction from File. uv_guess_handle can return UV_FILE when stdio is redirected to a regular file (e.g. ./program > out.txt). In that case, Handle dispatches reads and writes via uv_fs_read/uv_fs_write internally, but it is still typed as a Handle, not a File.

The standard handles can wrap Handle as follows:

  • Stdin: Mutex (BufferedReader Handle)
  • Stdout: RecursiveMutex (LineWriter Handle)
  • Stderr: Mutex Handle
Function Type Description libuv
Handle.read Handle → (n : USize) → IO ByteArray Read up to n bytes. Returns empty at EOF. uv_read_start / uv_fs_read
Handle.write Handle → ByteArray → IO Unit Write bytes. uv_write / uv_fs_write
Handle.flush Handle → IO Unit Flush any buffered output. No-op for unbuffered handles.
Handle.close Handle → IO Unit Close the handle and release resources. uv_close / uv_fs_close
Handle.isTty Handle → BaseIO Bool Return true if the handle refers to a terminal. uv_guess_handle
Handle.isPipe Handle → BaseIO Bool Return true if the handle refers to a pipe. uv_guess_handle
Handle.isFile Handle → BaseIO Bool Return true if the handle refers to a file. uv_guess_handle

Dir

Dir holds a uv_dir_t.

Function Type Description libuv
Dir.open Path → IO Dir Open a directory for iteration. uv_fs_opendir
Dir.withDir Path → (Dir → IO α) → IO α Open a directory, run an action, close in a finally block. uv_fs_opendir + uv_fs_closedir
Dir.close Dir → IO Unit Explicitly close the directory handle. uv_fs_closedir
Dir.next Dir → IO (Option DirEntry) Return the next entry, or none when exhausted. Order is filesystem-defined. uv_fs_readdir
Dir.iter Dir → IO (IterM (α := DirIterator) IO DirEntry) Lazy iterator over directory entries. Each step calls readdir. Works with for entry in dir.iter do and all IterM combinators. uv_fs_readdir
Dir.metadata Dir → IO Metadata Return metadata for the directory itself. uv_fs_fstat

DirEntry

DirEntry is produced by Dir.next. It holds the parent Dir so its open methods can construct full paths as dir.path / entry.fileName. It already exists as IO.FS.DirEntry so it's included here for completeness.

Function Type Description libuv
DirEntry.fileName DirEntry → String The entry name within its parent directory.
DirEntry.path DirEntry → Path Full path, constructed as dir.path / entry.fileName.
DirEntry.fileType DirEntry → IO FileType Return the file type. backed by uv_dirent_t
DirEntry.isDir DirEntry → IO Bool Return true if the entry is a directory. Convenience wrapper around fileType. uv_fs_readdir / uv_fs_stat
DirEntry.metadata DirEntry → IO Metadata Return full metadata for the entry. Always issues a stat call; use fileType when only the type is needed.

FS Operations

These functions operate on the filesystem by path. They live in the FS namespace rather than Path because Path is a pure value type for path manipulation; IO operations belong in FS.

Function Type Description libuv
FS.copyFile Path → Path → IO Unit Copy a file. uv_fs_copyfile
FS.removeFile Path → IO Unit Delete a file. uv_fs_unlink
FS.removeDir Path → IO Unit Remove an empty directory. Fails if the directory is not empty. uv_fs_rmdir
FS.removeDirAll Path → IO Unit Remove a directory and all its contents recursively. uv_fs_scandir + uv_fs_scandir_next + uv_fs_unlink + uv_fs_rmdir
FS.createDir Path → (perm : FileRight := .defaultDir) → IO Unit Create a directory. Parent must exist. perm sets the initial mode bits (default 0o755). uv_fs_mkdir
FS.createDirAll Path → (perm : FileRight := .defaultDir) → IO Unit Create a directory and all missing parent directories. No-op if the directory already exists. perm is applied to newly created directories only. uv_fs_mkdir (repeated)
FS.rename Path → Path → IO Unit Rename or move a file or directory. uv_fs_rename
FS.hardLink (orig link : Path) → IO Unit Create a hard link at link pointing to orig. Both paths must be on the same filesystem. uv_fs_link
FS.copyDir (src dst : Path) → IO Unit Recursively copy a directory tree from src to dst. dst must not exist; creates it with the same permission bits as src. Files are copied via uv_fs_copyfile. Symlinks are recreated verbatim rather than followed. uv_fs_copyfile + uv_fs_scandir
FS.chown Path → (uid gid : UInt32) → IO Unit Change the owner and group of the file or directory at path. Follows symlinks. On Windows this is a no-op. uv_fs_chown
FS.lchown Path → (uid gid : UInt32) → IO Unit Like FS.chown but operates on the symlink itself rather than its target. On Windows this is a no-op. uv_fs_lchown
FS.truncate Path → (len : UInt64) → IO Unit Truncate or extend the file at path to exactly len bytes without opening it first. Follows symlinks. Complement to File.setLength for callers that do not have an open fd. uv_fs_truncate

Convenience

Function Type Description libuv
FS.readFile Path → IO String Read an entire UTF-8 file into a string. Fails on invalid UTF-8. uv_fs_open + uv_fs_read + uv_fs_close
FS.readBinFile Path → IO ByteArray Read an entire file into a ByteArray. uv_fs_open + uv_fs_read + uv_fs_close
FS.lines Path → IO (Array String) Read all lines of a UTF-8 file into an array. Implemented via BufferedReader. uv_fs_open + uv_fs_read + uv_fs_close
FS.writeFile Path → String → IO Unit Write a UTF-8 string to a file, creating or truncating it. uv_fs_open + uv_fs_write + uv_fs_close
FS.writeBinFile Path → ByteArray → IO Unit Write bytes to a file, creating or truncating it. uv_fs_open + uv_fs_write + uv_fs_close
FS.appendFile Path → ByteArray → IO Unit Append bytes to a file, creating it if it does not exist. uv_fs_open + uv_fs_write + uv_fs_close
FS.appendTextFile Path → String → IO Unit Append a UTF-8 string to a file, creating it if it does not exist. uv_fs_open + uv_fs_write + uv_fs_close
FS.readDir Path → IO (Array DirEntry) List all entries in a directory. Order is filesystem-defined; use FS.readDirSorted for stable ordering. uv_fs_scandir + uv_fs_scandir_next
FS.readDirSorted Path → IO (Array DirEntry) Like FS.readDir but sorted by name. uv_fs_scandir + uv_fs_scandir_next

Temporary Files

The default dir is resolved via IO.tempDir, not hardcoded to "/tmp", so these functions work correctly on Windows (%TEMP%) and platforms where $TMPDIR is set.

Function Type Description libuv
FS.createTempFile (dir : Option Path := none) → IO (File × Path) Create a secure temporary file. none resolves dir to IO.tempDir. Caller is responsible for deleting it. uv_fs_mkstemp
FS.createTempDir (dir : Option Path := none) → IO Path Create a secure temporary directory. none resolves dir to IO.tempDir. Caller is responsible for deleting it. uv_fs_mkdtemp
FS.withTempFile (dir : Option Path := none) → (File → Path → IO α) → IO α Create a temporary file, run an action, delete it in a finally block. uv_fs_mkstemp + uv_fs_unlink
FS.withTempDir (dir : Option Path := none) → (Path → IO α) → IO α Create a temporary directory, run an action, delete it recursively in a finally block. uv_fs_mkdtemp + FS.removeDirAll

Symlinks

The current API has symlinkMetadata (reads metadata without following the link), but no way to create symlinks or read their targets. hardLink is in FS Operations.

Function Type Description libuv
FS.createSymlink (target : Path) → (link : Path) → (dir : Bool := false) → IO Unit Create a symbolic link at link pointing to target. target is stored verbatim and need not exist at creation time. The dir flag is required on Windows (UV_FS_SYMLINK_DIR) when the target is a directory; on POSIX it is ignored. uv_fs_symlink
FS.readSymlink Path → IO Path Read the raw target of a symbolic link without resolving it. Contrast with FS.realPath which follows the full chain. uv_fs_readlink

Metadata

Timestamps use Std.Time.Timestamp. creationTime is Option Timestamp because Linux does not expose file creation time.

structure Metadata where
  accessed     : Timestamp
  modified     : Timestamp
  creationTime : Option Timestamp
  byteSize     : UInt64
  type         : FileType
  numLinks     : UInt64
  permissions  : FileRight
  inode        : Option UInt64  -- none on FAT32 and some network filesystems
  device       : Option UInt64  -- none on FAT32 and some network filesystems
  uid          : Option UInt32  -- owner user ID; none on Windows
  gid          : Option UInt32  -- owner group ID; none on Windows
Function Type Description libuv
FS.metadata Path → IO Metadata Return metadata for a path, following symlinks. uv_fs_stat
FS.symlinkMetadata Path → IO Metadata Return metadata for a path without following the final symlink. uv_fs_lstat
FS.isDir Path → BaseIO Bool Return true if the path exists and is a directory. Returns false on any error. uv_fs_stat
FS.isFile Path → BaseIO Bool Return true if the path exists and is a regular file. Returns false on any error. uv_fs_stat
FS.isSymlink Path → BaseIO Bool Return true if the path is a symbolic link without following it. Returns false on any error. uv_fs_lstat
FS.pathExists Path → BaseIO Bool Return true if the path exists (as any file type). Returns false on any error. uv_fs_stat
FS.getPermissions Path → IO FileRight Return permission bits by path. Follows symlinks. uv_fs_stat
FS.setPermissions Path → FileRight → IO Unit Set permission bits by path. Follows symlinks. uv_fs_chmod
FS.setTimes Path → (accessed : Timestamp) → (modified : Timestamp) → IO Unit Set both access and modification timestamps by path. uv_fs_utime
FS.realPath Path → IO Path Resolve a path to its absolute canonical form, following all symlinks and collapsing ./... uv_fs_realpath
File.getPermissions File → IO FileRight Return the open file's current permission bits. uv_fs_fstat
Metadata.sameFile Metadata → Metadata → Bool Return true if two Metadata values refer to the same underlying file, compared by inode and device. Returns false if either has no inode (e.g. FAT32).

Directory Utilities

Function Type Description libuv
FS.glob Path → String → IO (IterM (α := GlobIterator) IO Path) Lazy iterator over all paths under path matching a glob pattern. Supports *, **, ?. Symlinks matching the pattern are yielded; ** does not descend into symlinked directories. uv_fs_scandir + uv_fs_scandir_next
FS.walk Path → IO (IterM (α := WalkIterator) IO DirEntry) Lazy recursive directory walk. uv_fs_scandir + uv_fs_scandir_next

Metadata

Metadata

Assignees

No one assigned

    Labels

    RFCRequest for comments

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions