3 changes: 1 addition & 2 deletions Documentation/filesystems/iomap/operations.rst
@@ -55,7 +55,6 @@ The following address space operations can be wrapped easily:
* ``readahead``
* ``writepages``
* ``bmap``
* ``swap_activate``

``struct iomap_write_ops``
--------------------------
@@ -747,7 +746,7 @@ function.
Swap File Activation
====================

The ``iomap_swapfile_activate`` function finds all the base-page aligned
The ``iomap_swap_activate`` function finds all the base-page aligned
regions in a file and sets them up as swap space.
The file will be ``fsync()``'d before activation.
``IOMAP_REPORT`` will be passed as the ``flags`` argument to
35 changes: 16 additions & 19 deletions Documentation/filesystems/locking.rst
@@ -264,9 +264,6 @@ prototypes::
int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
int (*error_remove_folio)(struct address_space *, struct folio *);
int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
int (*swap_deactivate)(struct file *);
int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);

locking rules:
All except dirty_folio and free_folio may block
@@ -289,9 +286,6 @@ migrate_folio: yes (both)
launder_folio: yes
is_partially_uptodate: yes
error_remove_folio: yes
swap_activate: no
swap_deactivate: no
swap_rw: yes, unlocks
====================== ======================== ========= ===============

->write_begin(), ->write_end() and ->read_folio() may be called from
@@ -350,19 +344,6 @@ cleaned, or an error value if not. Note that in order to prevent the folio
getting mapped back in and redirtied, it needs to be kept locked
across the entire operation.

->swap_activate() will be called to prepare the given file for swap. It
should perform any validation and preparation necessary to ensure that
writes can be performed with minimal memory allocation. It should call
add_swap_extent(), or the helper iomap_swapfile_activate(), and return
the number of extents added. If IO should be submitted through
->swap_rw(), it should set SWP_FS_OPS, otherwise IO will be submitted
directly to the block device ``sis->bdev``.

->swap_deactivate() will be called in the sys_swapoff()
path after ->swap_activate() returned success.

->swap_rw will be called for swap IO if SWP_FS_OPS was set by ->swap_activate().

file_lock_operations
====================

@@ -503,6 +484,9 @@ prototypes::
struct file *file_out, loff_t pos_out,
loff_t len, unsigned int remap_flags);
int (*fadvise)(struct file *, loff_t, loff_t, int);
int (*swap_activate)(struct file *file, struct swap_info_struct *sis);
int (*swap_deactivate)(struct file *);
int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);

locking rules:
All may block.
@@ -555,6 +539,19 @@ used. To block changes to file contents via a memory mapping during the
operation, the filesystem must take mapping->invalidate_lock to coordinate
with ->page_mkwrite.

->swap_activate() is called to prepare the given file for swap. It should
perform any validation and preparation necessary to ensure that writes can be
performed with minimal memory allocation. It should call add_swap_extent(),
or the helper iomap_swap_activate(), and return the number of extents added.
If IO should be submitted through ->swap_rw(), the file system must set
SWP_FS_OPS from ->swap_activate(), otherwise IO will be submitted directly to
the block device ``sis->bdev``.

->swap_deactivate() is called from the swapoff path to disable a swapfile
successfully activated using ->swap_activate().

->swap_rw will be called for swap IO if SWP_FS_OPS was set by ->swap_activate().

dquot_operations
================

40 changes: 20 additions & 20 deletions Documentation/filesystems/vfs.rst
@@ -774,9 +774,6 @@ cache in your filesystem. The following members are defined:
size_t count);
void (*is_dirty_writeback)(struct folio *, bool *, bool *);
int (*error_remove_folio)(struct mapping *mapping, struct folio *);
int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
int (*swap_deactivate)(struct file *);
int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
};

``read_folio``
@@ -970,23 +967,6 @@ cache in your filesystem. The following members are defined:
Setting this implies you deal with pages going away under you,
unless you have them locked or reference counts increased.

``swap_activate``

Called to prepare the given file for swap. It should perform
any validation and preparation necessary to ensure that writes
can be performed with minimal memory allocation. It should call
add_swap_extent(), or the helper iomap_swapfile_activate(), and
return the number of extents added. If IO should be submitted
through ->swap_rw(), it should set SWP_FS_OPS, otherwise IO will
be submitted directly to the block device ``sis->bdev``.

``swap_deactivate``
Called during swapoff on files where swap_activate was
successful.

``swap_rw``
Called to read or write swap pages when SWP_FS_OPS is set.

The File Object
===============

@@ -1046,6 +1026,9 @@ This describes how the VFS can manipulate an open file. As of kernel
int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
unsigned int poll_flags);
int (*mmap_prepare)(struct vm_area_desc *);
int (*swap_activate)(struct file *file, struct swap_info_struct *sis);
int (*swap_deactivate)(struct file *);
int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
};

Again, all methods are called without any locks being held, unless
@@ -1175,6 +1158,23 @@ otherwise noted.
this can be specified by the vm_area_desc->action field and related
parameters.

``swap_activate``

Called to prepare the given file for swap. It should perform
any validation and preparation necessary to ensure that writes
can be performed with minimal memory allocation. It should call
add_swap_extent(), or the helper iomap_swap_activate(), and
return the number of extents added. If IO should be submitted
through ->swap_rw(), it should set SWP_FS_OPS, otherwise IO will
be submitted directly to the block device ``sis->bdev``.

``swap_deactivate``
Called during swapoff on files where swap_activate was
successful.

``swap_rw``
Called to read or write swap pages when SWP_FS_OPS is set.

Note that the file operations are implemented by the specific
filesystem in which the inode resides. When opening a device node
(character or block special) most filesystems will call special
15 changes: 15 additions & 0 deletions block/fops.c
@@ -949,6 +949,20 @@ static int blkdev_mmap_prepare(struct vm_area_desc *desc)
return generic_file_mmap_prepare(desc);
}

static int blkdev_swap_activate(struct file *file, struct swap_info_struct *sis)
{
struct block_device *bdev = I_BDEV(file->f_mapping->host);
loff_t isize = i_size_read(bdev_file_inode(file));

/*
* The swap code performs arbitrary overwrites, which are not supported
* on zones with sequential write constraints.
*/
if (bdev_is_zoned(bdev))
return -EINVAL;
return add_swap_extent(sis, div_u64(isize, PAGE_SIZE), bdev, 0);
}

const struct file_operations def_blk_fops = {
.open = blkdev_open,
.release = blkdev_release,
@@ -965,6 +979,7 @@ const struct file_operations def_blk_fops = {
.splice_read = filemap_splice_read,
.splice_write = iter_file_splice_write,
.fallocate = blkdev_fallocate,
.swap_activate = blkdev_swap_activate,
.uring_cmd = blkdev_uring_cmd,
.fop_flags = FOP_BUFFER_RASYNC,
};
3 changes: 3 additions & 0 deletions fs/btrfs/btrfs_inode.h
@@ -670,4 +670,7 @@ struct extent_map *btrfs_create_io_em(struct btrfs_inode *inode, u64 start,
const struct btrfs_file_extent *file_extent,
int type);

int btrfs_swap_activate(struct file *file, struct swap_info_struct *sis);
void btrfs_swap_deactivate(struct file *file);

#endif
4 changes: 4 additions & 0 deletions fs/btrfs/file.c
@@ -3867,6 +3867,10 @@ const struct file_operations btrfs_file_operations = {
.uring_cmd = btrfs_uring_cmd,
.fop_flags = FOP_BUFFER_RASYNC | FOP_BUFFER_WASYNC,
.setlease = generic_setlease,
#ifdef CONFIG_SWAP
.swap_activate = btrfs_swap_activate,
.swap_deactivate = btrfs_swap_deactivate,
#endif
};

int btrfs_fdatawrite_range(struct btrfs_inode *inode, loff_t start, loff_t end)
72 changes: 9 additions & 63 deletions fs/btrfs/inode.c
@@ -10201,66 +10201,33 @@ static void btrfs_free_swapfile_pins(struct inode *inode)
}

struct btrfs_swap_info {
struct btrfs_device *device;
u64 start;
u64 block_start;
u64 block_len;
u64 lowest_ppage;
u64 highest_ppage;
unsigned long nr_pages;
int nr_extents;
};

static int btrfs_add_swap_extent(struct swap_info_struct *sis,
struct btrfs_swap_info *bsi)
{
unsigned long nr_pages;
unsigned long max_pages;
u64 first_ppage, first_ppage_reported, next_ppage;
int ret;
u64 first_ppage, next_ppage;

/*
* Our swapfile may have had its size extended after the swap header was
* written. In that case activating the swapfile should not go beyond
* the max size set in the swap header.
*/
if (bsi->nr_pages >= sis->max)
return 0;

max_pages = sis->max - bsi->nr_pages;
first_ppage = PAGE_ALIGN(bsi->block_start) >> PAGE_SHIFT;
next_ppage = PAGE_ALIGN_DOWN(bsi->block_start + bsi->block_len) >> PAGE_SHIFT;

if (first_ppage >= next_ppage)
return 0;
nr_pages = next_ppage - first_ppage;
nr_pages = min(nr_pages, max_pages);

first_ppage_reported = first_ppage;
if (bsi->start == 0)
first_ppage_reported++;
if (bsi->lowest_ppage > first_ppage_reported)
bsi->lowest_ppage = first_ppage_reported;
if (bsi->highest_ppage < (next_ppage - 1))
bsi->highest_ppage = next_ppage - 1;

ret = add_swap_extent(sis, bsi->nr_pages, nr_pages, first_ppage);
if (ret < 0)
return ret;
bsi->nr_extents += ret;
bsi->nr_pages += nr_pages;
return 0;
return add_swap_extent(sis, next_ppage - first_ppage, bsi->device->bdev,
first_ppage);
}
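
The page arithmetic that survives in btrfs_add_swap_extent() can be modelled in userspace. The following is only a sketch, not the kernel code: the `sketch_*` names are hypothetical, and a 4 KiB base page is assumed (the kernel's PAGE_SHIFT is architecture dependent). It rounds the start of a physical extent up and its end down, mirroring PAGE_ALIGN and PAGE_ALIGN_DOWN, so that only whole pages are counted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Round a byte address up to the next page boundary (models PAGE_ALIGN). */
static uint64_t sketch_page_align(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Round a byte address down to a page boundary (models PAGE_ALIGN_DOWN). */
static uint64_t sketch_page_align_down(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Count the whole pages inside [block_start, block_start + block_len) and
 * store the index of the first such page in *first_ppage. Returns 0 when
 * the extent does not contain a single complete page.
 */
static uint64_t sketch_usable_pages(uint64_t block_start, uint64_t block_len,
				    uint64_t *first_ppage)
{
	uint64_t first = sketch_page_align(block_start) >> SKETCH_PAGE_SHIFT;
	uint64_t next = sketch_page_align_down(block_start + block_len) >>
			SKETCH_PAGE_SHIFT;

	assert(first_ppage);
	*first_ppage = first;
	if (first >= next)
		return 0;
	return next - first;
}
```

Under these assumptions, an extent starting at byte 100 with length 8192 contains one whole page (page index 1), while the same start with length 4096 contains none, which corresponds to the early `return 0` in the hunk above.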

static void btrfs_swap_deactivate(struct file *file)
void btrfs_swap_deactivate(struct file *file)
{
struct inode *inode = file_inode(file);

btrfs_free_swapfile_pins(inode);
atomic_dec(&BTRFS_I(inode)->root->nr_swapfiles);
}

static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
sector_t *span)
int btrfs_swap_activate(struct file *file, struct swap_info_struct *sis)
{
struct inode *inode = file_inode(file);
struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -10269,9 +10236,7 @@ static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
struct extent_state *cached_state = NULL;
struct btrfs_chunk_map *map = NULL;
struct btrfs_device *device = NULL;
struct btrfs_swap_info bsi = {
.lowest_ppage = (sector_t)-1ULL,
};
struct btrfs_swap_info bsi = {};
struct btrfs_backref_share_check_ctx *backref_ctx = NULL;
struct btrfs_path *path = NULL;
int ret = 0;
@@ -10540,6 +10505,7 @@ static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
bsi.start = key.offset;
bsi.block_start = physical_block_start;
bsi.block_len = len;
bsi.device = device;
}

if (fatal_signal_pending(current)) {
@@ -10570,25 +10536,7 @@ static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
up_write(&BTRFS_I(inode)->i_mmap_lock);
btrfs_free_backref_share_ctx(backref_ctx);
btrfs_free_path(path);
if (ret)
return ret;

if (device)
sis->bdev = device->bdev;
*span = bsi.highest_ppage - bsi.lowest_ppage + 1;
sis->max = bsi.nr_pages;
sis->pages = bsi.nr_pages - 1;
return bsi.nr_extents;
}
#else
static void btrfs_swap_deactivate(struct file *file)
{
}

static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
sector_t *span)
{
return -EOPNOTSUPP;
return ret;
}
#endif

@@ -10736,8 +10684,6 @@ static const struct address_space_operations btrfs_aops = {
.migrate_folio = btrfs_migrate_folio,
.dirty_folio = filemap_dirty_folio,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = btrfs_swap_activate,
.swap_deactivate = btrfs_swap_deactivate,
};

static const struct inode_operations btrfs_file_inode_operations = {
6 changes: 6 additions & 0 deletions fs/ext4/file.c
@@ -971,6 +971,11 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence)
return vfs_setpos(file, offset, maxbytes);
}

static int ext4_swap_activate(struct file *file, struct swap_info_struct *sis)
{
return iomap_swap_activate(file, sis, &ext4_iomap_report_ops);
}

const struct file_operations ext4_file_operations = {
.llseek = ext4_llseek,
.read_iter = ext4_file_read_iter,
@@ -992,6 +997,7 @@ const struct file_operations ext4_file_operations = {
FOP_DIO_PARALLEL_WRITE |
FOP_DONTCACHE,
.setlease = generic_setlease,
.swap_activate = ext4_swap_activate,
};

const struct inode_operations ext4_file_inode_operations = {
11 changes: 0 additions & 11 deletions fs/ext4/inode.c
@@ -3939,13 +3939,6 @@ static bool ext4_dirty_folio(struct address_space *mapping, struct folio *folio)
return block_dirty_folio(mapping, folio);
}

static int ext4_iomap_swap_activate(struct swap_info_struct *sis,
struct file *file, sector_t *span)
{
return iomap_swapfile_activate(sis, file, span,
&ext4_iomap_report_ops);
}

static const struct address_space_operations ext4_aops = {
.read_folio = ext4_read_folio,
.readahead = ext4_readahead,
@@ -3959,7 +3952,6 @@ static const struct address_space_operations ext4_aops = {
.migrate_folio = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_journalled_aops = {
@@ -3975,7 +3967,6 @@ static const struct address_space_operations ext4_journalled_aops = {
.migrate_folio = buffer_migrate_folio_norefs,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_da_aops = {
@@ -3991,14 +3982,12 @@ static const struct address_space_operations ext4_da_aops = {
.migrate_folio = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_dax_aops = {
.writepages = ext4_dax_writepages,
.dirty_folio = noop_dirty_folio,
.bmap = ext4_bmap,
.swap_activate = ext4_iomap_swap_activate,
};

void ext4_set_aops(struct inode *inode)