btrfs: add fscrypt support#828
Open
blktests-ci[bot] wants to merge 43 commits into
Open
Conversation
This adds the code necessary for per-extent encryption. We will store a nonce for every extent we create, and then use the inode's policy and the extents nonce to derive a per-extent key. This is meant to be flexible, if we choose to expand the on-disk extent information in the future we have a version number we can use to change what exists on disk. The file system indicates it wants to use per-extent encryption by setting s_cop->has_per_extent_encryption. This also requires the use of inline block encryption. The support is relatively straightforward, the only "extra" bit is we're deriving a per-extent key to use for the encryption, the inode still controls the policy and access to the master key. Since extent based encryption uses a lot of keys, we're requiring the use of inline block crypto if you use extent-based encryption. This enables us to take advantage of the built in pooling and reclamation of the crypto structures that underpin all of the encryption. The different encryption related options for fscrypt are too numerous to support for extent based encryption. Support for a few of these options could possibly be added, but since they're niche options simply reject them for file systems using extent based encryption. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Instead of requiring -o inlinecrypt to enable inline encryption, allow having s_cop->has_per_extent_encryption to indicate that this file system supports inline encryption. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We have fscrypt_file_open() which is meant to be called on files being opened so that their key is loaded when we start reading data from them. However for btrfs send we are opening the inode directly without a filp, so we need a different helper to make sure we can load the fscrypt context for the inode before reading its contents. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
…r is done Previously we were wiping the master key secret when we do FS_IOC_REMOVE_ENCRYPTION_KEY, and then using the fact that it was cleared as the mechanism from keeping new users from being setup. This works with inode based encryption, as the per-inode key is derived at setup time, so the secret disappearing doesn't affect any currently open files from being able to continue working. However for extent based encryption we do our key derivation at page writeout and readpage time, which means we need the master key secret to be available while we still have our file open. Since the master key lifetime is controlled by a flag, move the clearing of the secret to the mk_active_users cleanup stage if we have extent based encryption enabled on this super block. This counter represents the actively open files that still exist on the file system, and thus should still be able to operate normally. Once the last user is closed we can clear the secret. Until then no new users are allowed, and this allows currently open files to continue to operate until they're closed. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Btrfs does checksumming, and the checksums need to match the bytes on disk. In order to facilitate this add a process bio callback for the blk-crypto layer. This allows the file system to specify a callback and then can process the encrypted bio as necessary. For btrfs, writes will have the checksums calculated and saved into our relevant data structures for storage once the write completes. For reads we will validate the checksums match what is on disk and error out if there is a mismatch. This is incompatible with native encryption obviously, so make sure we don't use native encryption if this callback is set. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
This will allow file systems to set a process_bio hook for inline encryption. This will be utilized by btrfs in order to make sure the checksumming work is done on the encrypted bio's. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
btrfs stores its data structures, including filenames in directories, in its own buffer implementation, struct extent_buffer, composed of several non-contiguous pages. We could copy filenames into a temporary buffer and use fscrypt_match_name() against that buffer, such extensive memcpying would be expensive. Instead, exposing fscrypt_nokey_name as in this change allows btrfs to recapitulate fscrypt_match_name() using methods on struct extent_buffer instead of dealing with a raw byte array. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Add a couple of sections to the fscrypt documentation about per-extent encryption. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
When we add fscrypt support we're going to have fscrypt objects hanging off of extent_maps. This includes a block key, which if we're the last one freeing the key we may have to unregister it from the block layer. This requires taking a semaphore in the block layer, which means we can't free em's under the extent map tree lock. Thankfully we only do this in two places, one where we're dropping a range of extent maps, and when we're freeing logged extents. Add a free_extent_map_safe() which will add the em to a list in the em_tree if we free'd the object. Currently this is unconditional but will be changed to conditional on the fscrypt object we will add in a later patch. To process these delayed objects add a free_pending_extent_maps() that is called after the lock has been dropped on the em_tree. This will process the extent maps on the freed list and do the appropriate freeing work in a safe manner. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Daniel Vacek <neelx@suse.com>
In order to appropriately encrypt, create, open, rename, and various symlink operations must call fscrypt hooks. These determine whether the inode should be encrypted and do other preparatory actions. The superblock must have fscrypt operations registered, so implement the minimal set also, and introduce the new fscrypt.[ch] files to hold the fscrypt-specific functionality. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
fscrypt stores a context item with encrypted inodes that contains the related encryption information. fscrypt provides an arbitrary blob for the filesystem to store, and it does not clearly fit into an existing structure, so this goes in a new item type. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
As encrypted files will be incompatible with older filesystem versions, new filesystems should be created with an incompat flag for fscrypt, which will gate access to the encryption ioctls. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Deleting an encrypted file must always be permitted, even if the user does not have the appropriate key. Therefore, for listing an encrypted directory, so-called 'nokey' names are provided, and these nokey names must be sufficient to look up and delete the appropriate encrypted files. See 'struct fscrypt_nokey_name' for more information on the format of these names. The first part of supporting nokey names is allowing lookups by nokey name. Only a few entry points need to support these: deleting a directory, file, or subvolume -- each of these call fscrypt_setup_filename() with a '1' argument, indicating that the key is not required and therefore a nokey name may be provided. If a nokey name is provided, the fscrypt_name returned by fscrypt_setup_filename() will not have its disk_name field populated, but will have various other fields set. This change alters the relevant codepaths to pass a complete fscrypt_name anywhere that it might contain a nokey name. When it does contain a nokey name, the first time the name is successfully matched to a stored name populates the disk name field of the fscrypt_name, allowing the caller to use the normal disk name codepaths afterward. Otherwise, the matching functionality is in close analogue to the function fscrypt_match_name(). Functions where most callers are providing a fscrypt_str are duplicated and adapted for a fscrypt_name, and functions where most callers are providing a fscrypt_name are changed to so require at all callsites. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
For encrypted or unencrypted names, we calculate the offset for the dir item by hashing the name for the dir item. However, this doesn't work for a long nokey name, where we do not have the complete ciphertext. Instead, fscrypt stores the filesystem-provided hash in the nokey name, and we can extract it from the fscrypt_name structure in such a case. Additionally, for nokey names, if we find the nokey name on disk we can update the fscrypt_name with the disk name, so add that to searching for diritems. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
These ioctls allow encryption to actually be used. The set_encryption_policy ioctl is the thing which actually turns on encryption, and therefore sets the ENCRYPT flag in the superblock. This prevents the filesystem from being loaded on older kernels. fscrypt provides CONFIG_FS_ENCRYPTION-disabled versions of all these functions which just return -EOPNOTSUPP, so the ioctls don't need to be compiled out if CONFIG_FS_ENCRYPTION isn't enabled. We could instead gate this ioctl on the superblock having the flag set, if we wanted to require mkfs with the encrypt flag in order to have a filesystem with any encryption. Since this is a disk format change and is currently in development, gate the fscrypt ioctls for btrfs behind BTRFS_EXPERIMENTAL. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We need this to make sure the appropriate encryption algorithms are turned on in our config if we have FS_ENCRYPTION enabled, and additionally we only support inline encryption with the fallback block crypto, so we need to make sure we pull in those dependencies. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Since extent encryption requires inline encryption, even though we expect to use the inlinecrypt software fallback most of the time, we need to enumerate all the devices in use by btrfs. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
This puts the long-preserved 1-byte encryption field to work, storing whether the extent is encrypted. Update the tree-checker to allow for the encryption bit to be set to our valid types. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Each extent_map will end up with a pointer to its associated fscrypt_info if any, which should have the same lifetime as the extent_map. We are also going to need to track the encryption_type for the file extent items. Add the fscrypt_info to the extent_map and the subsequent code for transferring it in the split and merge cases as well as the code necessary to free them. A future patch will add the code to load them as appropriate. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We're going to need these to update the file extent items once the writes are complete. Add them and add the pieces necessary to assign them and free everything. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We're going to be getting fscrypt_info from the extent maps. Pass the fscrypt_info to helpers using the file_extent structure and use that to set the encryption type on the ordered extent. For prealloc extents we create an em, since we already have the context loaded from the original prealloc extent creation we need to pre- populate the extent map fscrypt info so it can be read properly later if the pages are evicted. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
The fscrypt_extent_info will be tied to the extent_map lifetime, so it will be created when we create the IO em, or it'll already exist in the NOCOW case. Use this fscrypt_info when creating the ordered extent to make sure everything is passed through properly. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We keep track of this information in the ordered extent for writes, but we need it for reads as well. Add fscrypt_extent_info and orig_start to the dio_data so we can populate this on reads. This will be used later when we attach the fscrypt context to the bios. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
The fscrypt encryption context will be stored as a new tree item type. This gives us flexibility to include different things in the future. Also update the tree-checker to validate the new item type. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Now that we have the fscrypt_extnet_info in all of the supporting structures, pass this through and set the file extent encryption bit accordingly from the supporting structures. In subsequent patches code will be added to populate these appropriately. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
This patch implements the necessary hooks from fscrypt to support per-extent encryption. There's two main entry points btrfs_fscrypt_load_extent_info btrfs_fscrypt_save_extent_info btrfs_fscrypt_load_extent_info gets called when we create the extent maps from the file extent item at btrfs_get_extent() time. We read the extent context, and pass it into fscrypt to create the appropriate fscrypt_extent_info structure. This is then used on the bio's to make sure the encryption is done properly. btrfs_fscrypt_save_extent_info is used to generate the fscrypt context from fscrypt and save it into the tree item when we create a new file extent item. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
New extents for encrypted inodes must have a fscrypt_extent_info, which has the necessary keys and does all the registration at the block layer for them. This is passed through all of the infrastructure we've previously added to make sure the context gets saved properly with the file extents. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
For extent encryption we have to use a logical block nr as input for the IV. For btrfs we're using the offset into the extent we're operating on. For most ordered extents this is the same as the file_offset, however for prealloc and NOCOW we have to use the original offset. Add this as an argument and plumb it through everywhere, this will be used when setting up the bio. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Now that we have the fscrypt_info plumbed through everywhere, add the code to setup the bio encryption context from the extent context. We use the per-extent fscrypt_extent_info for encryption/decryption. We use the offset into the extent as the lblk for fscrypt. So the start of the extent has the lblk of 0, 4k into the extent has the lblk of 4k, etc. This is done to allow things like relocation to continue to work properly. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We only ever needed the bbio in btrfs_csum_one_bio, since that has the bio embedded in it. However with encryption we'll have a different bio with the encrypted data in it, and the original bbio. Update btrfs_csum_one_bio to take the bio we're going to csum as an argument, which will allow us to csum the encrypted bio and stuff the csums into the corresponding bbio to be used later when the IO completes. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
For the fallback encrypted writes it allocates a bounce buffer to encrypt, and if the bio is larger than 256 segments it splits the bio we send down. This wreaks havoc on us because we need our actual original bio to be passed into the process_cb callback in order to get at our ordered extent. With the split we'll get some cloned bio that has none of our information. Handle this by returning the length we need to be truncated to and then splitting ourselves, which will handle giving us the correct btrfs_bio that we need. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We are going to be checksumming the encrypted data, so we have to implement the ->process_bio fscrypt callback. This will provide us with the original bio and the encrypted bio to do work on. For WRITE's this will happen after the encrypted bio has been encrypted. For READ's this will happen after the read has completed and before the decryption step is done. For write's this is straightforward, we can just pass in the encrypted bio to btrfs_csum_one_bio and then the csums will be added to the bbio as normal. For read's this is relatively straightforward, but requires some care. We assume (because that's how it works currently) that the encrypted bio match the original bio, this is important because we save the iter of the bio before we submit. If this changes in the future we'll need a hook to give us the bi_iter of the decryption bio before it's submitted. We check the csums before decryption. If it doesn't match we simply error out and we let the normal path handle the repair work. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
In order to do read repair we will allocate sectorsize bio's and read them one at a time, repairing any sectors that don't match their csum. In order to do this we re-submit the IO's after it's failed, and at this point we still need the fscrypt_extent_info for these new bio's. Add the fscrypt_extent_info to the read part of the union in the btrfs_bio, and then pass this through all the places where we do reads. Additionally add the orig_start, because we need to be able to put the correct extent offset for the encryption context. With these in place we can utilize the normal read repair path. The only exception is that the actual repair of the bad copies has to be triggered from the ->process_bio callback, because this is the encrypted data. If we waited until the end_io we would have the decrypted data and we don't want to write that to the disk. This is the only change to the normal read repair path, we trigger the fixup of the broken sectors in ->process_bio, and then we skip that part if we successfully repair the sector in ->process_bio once we get to the endio. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
In order to enable more thorough testing of fscrypt enable the test_dummy_encryption mount option. This is used by fscrypt users to easily enable fscrypt on the file system for testing without needing to do the key setup and everything. The only deviation from other file systems we make is we only support the fsparam_flag version of this mount option, as it defaults to v2. We don't want to have to bother with rejecting v1 related mount options. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We use this helper for inode-resolve and path resolution in send, so update this helper to properly decrypt any encrypted names it finds. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Send needs to send the decrypted value of the symlinks, handle the case where the inode is encrypted and decrypt the symlink name into a buffer and copy this buffer into our fs_path struct. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
In send we're going to be looking up file names from back references and putting them into the send stream. If we are encrypted use the helper for decrypting names and copy the decrypted name into the buffer. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
For send we will read the pages and copy them into our buffer. Use the fscrypt_inode_open helper to make sure the key is loaded properly before trying to read from the inode so the contents are properly decrypted. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
btrfs/330 uncovered a problem where we were accidentally turning off the free space tree when we do the transition from ro->rw. This happens because we don't update Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
Log replay needs a few tweaks in order to make sure everything works with encryption. 1. Copy in the fscrypt context if we find one in the log. 2. Set NEEDS_FULL_SYNC when we update the inode context so all of the items are copied into the log at fsync time. This makes replay of encrypted files work properly. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
We will drop the inode and re-look it up to do defrag with auto defrag, which means we could lose the encryption policy. Auto defrag needs to be reworked to just hold onto the inode for scheduling later so we don't lose the context. For now just disable it if the file is encrypted. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
The RAID5/6 code will re-arrange bios and submit them through a different mechanism. This is problematic with inline encryption as we have to get the bio and csum it after it's been encrypted, and the raid5/6 bio's don't have the btrfs_bio embedded, so we have no way to get the csums put on disk. This isn't an unsolvable problem, but would require a bit of reworking. Since we discourage users from using this code currently simply don't allow encryption on RAID5/6 setups. If there's sufficient demand in the future we can add the support for this. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com>
send needs to track the dir item values to see if files were renamed when doing an incremental send. There is code to decrypt the names, but this breaks the code that checks to see if something was overwritten. Until this gap is closed we need to disable send on encrypted file systems. Fixing this is straightforward, but a medium sized project. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Daniel Vacek <neelx@suse.com>
Author
|
Upstream branch: aa54b1d |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull request for series with
subject: btrfs: add fscrypt support
version: 7
url: https://patchwork.kernel.org/project/linux-block/list/?series=1094040