[Deepin-Kernel-SIG] [linux 6.18-y] [Upstream] arm64: mm: Add PTE_DIRTY back to PAGE_KERNEL* to fix kexec/hibernation #1557
Open
opsiff wants to merge 1 commit into deepin-community:linux-6.18.y
Commit 143937c ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()") changed pte_mkwrite_novma() to only clear PTE_RDONLY when PTE_DIRTY is set. This was to allow writable-clean PTEs for swap pages that haven't actually been written.

However, this broke kexec and hibernation for some platforms. Both go through trans_pgd_create_copy() -> _copy_pte(), which calls pte_mkwrite_novma() to make the temporary linear-map copy fully writable. With the updated pte_mkwrite_novma(), read-only kernel pages (without PTE_DIRTY) remain read-only in the temporary mapping. While such behaviour is fine for user pages where hardware DBM or trapping will make them writeable, subsequent in-kernel writes by the kexec relocation code will fault.

Add PTE_DIRTY back to all _PAGE_KERNEL* protection definitions. This was the case prior to 5.4, commit aa57157 ("arm64: Ensure VM_WRITE|VM_SHARED ptes are clean by default"). With the kernel linear-map PTEs always having PTE_DIRTY set, pte_mkwrite_novma() correctly clears PTE_RDONLY.

Fixes: 143937c ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: stable@vger.kernel.org
Reported-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Link: https://lore.kernel.org/r/20251204062722.3367201-1-jianpeng.chang.cn@windriver.com
Cc: Will Deacon <will@kernel.org>
Cc: Huang, Ying <ying.huang@linux.alibaba.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit c25c4aa3f79a488cc270507935a29c07dc6bddfc)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Reviewer's Guide

This PR updates the arm64 kernel page table protection macros so that all kernel linear-map PAGE_KERNEL* PTEs are created with the PTE_DIRTY bit set, restoring pre-5.4 behavior to ensure pte_mkwrite_novma() can correctly make temporary mappings writable for kexec/hibernation paths.

Sequence diagram for kexec temporary mapping with updated PAGE_KERNEL PTE_DIRTY behavior

sequenceDiagram
participant CPU
participant kexec_reloc as kexec_relocation_code
participant trans_pgd as trans_pgd_create_copy
participant copy_pte as _copy_pte
participant pte_mkwrite as pte_mkwrite_novma
CPU->>kexec_reloc: Start kexec or hibernation restore
kexec_reloc->>trans_pgd: Create temporary linear map
trans_pgd->>copy_pte: Copy kernel PTE into trans PGD
copy_pte->>pte_mkwrite: Make PTE writable
alt Before_fix_PAGE_KERNEL_without_PTE_DIRTY
pte_mkwrite-->>copy_pte: Sees PTE without PTE_DIRTY
note over pte_mkwrite: Only clears PTE_RDONLY if PTE_DIRTY is set
pte_mkwrite-->>copy_pte: PTE remains read_only
copy_pte-->>trans_pgd: Temporary PTE is still read_only
trans_pgd-->>kexec_reloc: Map used for relocation
kexec_reloc->>CPU: Attempt in_kernel write to temp mapping
CPU-->>kexec_reloc: Page fault due to read_only PTE
else After_fix_PAGE_KERNEL_with_PTE_DIRTY
pte_mkwrite-->>copy_pte: Sees PTE with PTE_DIRTY set
pte_mkwrite-->>copy_pte: Clears PTE_RDONLY, keeps PTE_DIRTY
copy_pte-->>trans_pgd: Temporary PTE is writable
trans_pgd-->>kexec_reloc: Map used for relocation
kexec_reloc->>CPU: In_kernel write to temp mapping
CPU-->>kexec_reloc: Write succeeds, no fault
end

Flow diagram for pte_mkwrite_novma behavior with PAGE_KERNEL PTE_DIRTY change

flowchart TD
A[Start: _copy_pte calls pte_mkwrite_novma] --> B[Input PTE from PAGE_KERNEL* mapping]
B --> C{Does PTE have PTE_DIRTY set?}
subgraph Before_fix_PAGE_KERNEL_without_PTE_DIRTY
direction TB
C -- No --> D[Leave PTE_RDONLY set]
D --> E[Result: PTE stays read_only]
E --> F[Temporary linear map may fault on in_kernel writes]
end
subgraph After_fix_PAGE_KERNEL_with_PTE_DIRTY
direction TB
C -- Yes --> G[Clear PTE_RDONLY bit]
G --> H[Keep PTE_DIRTY set]
H --> I[Result: PTE becomes writable]
I --> J[Temporary linear map allows kexec/hibernation writes]
end
F --> K[End]
J --> K[End]
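
The branch shown in the flow diagram can be reproduced with a small userspace model. This is only a sketch, not the kernel source: `pte_t` is reduced to a plain integer, the `model_*` helpers are hypothetical names, and the bit positions are assumptions chosen for illustration. It assumes pte_mkwrite_novma() sets the write bit unconditionally and clears PTE_RDONLY only when PTE_DIRTY is already present, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; treat them as assumptions, not as kernel ABI. */
#define PTE_RDONLY (UINT64_C(1) << 7)   /* hardware read-only permission */
#define PTE_WRITE  (UINT64_C(1) << 51)  /* software "writable" marker */
#define PTE_DIRTY  (UINT64_C(1) << 55)  /* software dirty bit */

/* Simplified stand-in for the kernel's pte_t. */
typedef uint64_t pte_t;

/* Model of the post-143937c behaviour described in the commit message. */
static pte_t model_pte_mkwrite_novma(pte_t pte)
{
    pte |= PTE_WRITE;
    if (pte & PTE_DIRTY)        /* only dirty PTEs lose PTE_RDONLY */
        pte &= ~PTE_RDONLY;
    return pte;
}

/* A store through the mapping succeeds only if PTE_RDONLY is clear. */
static bool model_hw_writable(pte_t pte)
{
    return !(pte & PTE_RDONLY);
}

/* Hypothetical self-check covering both branches of the diagram. */
static void model_selfcheck(void)
{
    pte_t ro_clean = PTE_RDONLY;              /* before the fix: no PTE_DIRTY */
    pte_t ro_dirty = PTE_RDONLY | PTE_DIRTY;  /* after the fix: PTE_DIRTY set */

    /* Without PTE_DIRTY the copy stays read-only, so a kernel write faults. */
    assert(!model_hw_writable(model_pte_mkwrite_novma(ro_clean)));
    /* With PTE_DIRTY the copy becomes writable. */
    assert(model_hw_writable(model_pte_mkwrite_novma(ro_dirty)));
}
```

Calling `model_selfcheck()` from a test harness exercises both cases. The model ignores hardware DBM and fault handling, which is why it matches the in-kernel write path (where nothing upgrades the PTE later) rather than the user-page path.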
[APPROVALNOTIFIER] This PR is NOT APPROVED