diff --git a/linux-tkg-patches/6.0/0010-lru_6.0.patch b/linux-tkg-patches/6.0/0010-lru_6.0.patch index 9b42954..b97022b 100644 --- a/linux-tkg-patches/6.0/0010-lru_6.0.patch +++ b/linux-tkg-patches/6.0/0010-lru_6.0.patch @@ -1,9 +1,13 @@ -* [PATCH v14 00/14] Multi-Gen LRU Framework -@ 2022-08-15 7:13 Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao - ` (13 more replies) - 0 siblings, 14 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) +linux-kernel.vger.kernel.org archive mirror + + help / color / mirror / Atom feed + +* [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework +@ 2022-09-18 7:59 Yu Zhao + 2022-09-18 7:59 ` [PATCH mm-unstable v15 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao + ` (14 more replies) + 0 siblings, 15 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 7:59 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -14,7 +18,16 @@ From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) What's new ========== -Retested on v6.0-rc1; rebased to the latest mm-unstable. +1. OpenWrt, in addition to Android, Arch Linux Zen, Armbian, ChromeOS, + Liquorix, post-factum and XanMod, is now shipping MGLRU on 5.15. +2. Fixed long-tailed direct reclaim latency seen on high-memory (TBs) + machines. The old direct reclaim backoff, which tries to enforce a + minimum fairness among all eligible memcgs, over-swapped by about + (total_mem>>DEF_PRIORITY)-nr_to_reclaim. The new backoff, which + pulls the plug on swapping once the target is met, trades some + fairness for curtailed latency: + https://lore.kernel.org/r/20220918080010.2920238-10-yuzhao@google.com/ +3. Fixed minior build warnings and conflicts. More comments and nits. TLDR ==== @@ -26,7 +39,7 @@ straightforward. Patchset overview ================= The design and implementation overview is in patch 14: -https://lore.kernel.org/r/20220815071332.627393-15-yuzhao@google.com/ +https://lore.kernel.org/r/20220918080010.2920238-15-yuzhao@google.com/ 01. mm: x86, arm64: add arch_has_hw_pte_young() 02. mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG @@ -208,7 +221,7 @@ Daniel from Michigan Tech reported [14]: Large-scale deployments ----------------------- -We've rolled out MGLRU to tens of millions of Chrome OS users and +We've rolled out MGLRU to tens of millions of ChromeOS users and about a million Android users. Google's fleetwide profiling [15] shows an overall 40% decrease in kswapd CPU usage, in addition to improvements in other UX metrics, e.g., an 85% decrease in the number @@ -219,10 +232,11 @@ The downstream kernels that have been using MGLRU include: 1. Android [16] 2. Arch Linux Zen [17] 3. Armbian [18] -4. Chrome OS [19] +4. ChromeOS [19] 5. Liquorix [20] -6. post-factum [21] -7. XanMod [22] +6. OpenWrt [21] +7. post-factum [22] +8. 
XanMod [23] [11] https://lore.kernel.org/r/140226722f2032c86301fbd326d91baefe3d7d23.camel@yandex.ru/ [12] https://lore.kernel.org/r/87czj3mux0.fsf@vajain21.in.ibm.com/ @@ -234,8 +248,9 @@ The downstream kernels that have been using MGLRU include: [18] https://armbian.com [19] https://chromium.org [20] https://liquorix.net -[21] https://codeberg.org/pf-kernel -[22] https://xanmod.org +[21] https://openwrt.org +[22] https://codeberg.org/pf-kernel +[23] https://xanmod.org Summery ======= @@ -275,7 +290,7 @@ Yu Zhao (14): mm: multi-gen LRU: design doc Documentation/admin-guide/mm/index.rst | 1 + - Documentation/admin-guide/mm/multigen_lru.rst | 156 + + Documentation/admin-guide/mm/multigen_lru.rst | 162 + Documentation/mm/index.rst | 1 + Documentation/mm/multigen_lru.rst | 159 + arch/Kconfig | 8 + @@ -289,7 +304,7 @@ Yu Zhao (14): include/linux/memcontrol.h | 36 + include/linux/mm.h | 5 + include/linux/mm_inline.h | 231 +- - include/linux/mm_types.h | 77 + + include/linux/mm_types.h | 76 + include/linux/mmzone.h | 214 ++ include/linux/nodemask.h | 1 + include/linux/page-flags-layout.h | 16 +- @@ -311,27 +326,27 @@ Yu Zhao (14): mm/mmzone.c | 2 + mm/rmap.c | 6 + mm/swap.c | 54 +- - mm/vmscan.c | 2972 ++++++++++++++++- + mm/vmscan.c | 2995 ++++++++++++++++- mm/workingset.c | 110 +- - 39 files changed, 4095 insertions(+), 155 deletions(-) + 39 files changed, 4122 insertions(+), 156 deletions(-) create mode 100644 Documentation/admin-guide/mm/multigen_lru.rst create mode 100644 Documentation/mm/multigen_lru.rst -base-commit: d2af7b221349ff6241e25fa8c67bcfae2b360700 +base-commit: 6cf215f1d5dac59a5a09514138ca37aed2719d0a -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 01/14] mm: x86, arm64: add arch_has_hw_pte_young() - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao - ` (12 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + +* [PATCH mm-unstable v15 01/14] mm: x86, arm64: add arch_has_hw_pte_young() + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao +@ 2022-09-18 7:59 ` Yu Zhao + 2022-09-18 7:59 ` [PATCH mm-unstable v15 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao + ` (13 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 7:59 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -431,7 +446,7 @@ index 44e2d6f1dbaa..dc5f7d8ef68a 100644 #ifdef CONFIG_PAGE_TABLE_CHECK diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h -index 014ee8f0fbaa..95f408df4695 100644 +index d13b4f7cc5be..375e8e7e64f4 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -260,6 +260,19 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma, @@ -455,7 +470,7 @@ index 014ee8f0fbaa..95f408df4695 100644 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c -index b994784158f5..46071cf00b47 100644 +index e38f9245470c..3a9b00c765c2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -126,18 +126,6 @@ int randomize_va_space __read_mostly = @@ -487,19 +502,19 @@ index b994784158f5..46071cf00b47 100644 vmf->pte = 
pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 03/14] mm/vmscan.c: refactor shrink_node() Yu Zhao - ` (11 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + +* [PATCH mm-unstable v15 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao + 2022-09-18 7:59 ` [PATCH mm-unstable v15 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao +@ 2022-09-18 7:59 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 03/14] mm/vmscan.c: refactor shrink_node() Yu Zhao + ` (12 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 7:59 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -556,10 +571,10 @@ Tested-by: Vaibhav Jain 5 files changed, 17 insertions(+), 4 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig -index f330410da63a..ebea10a4513e 100644 +index 5dbf11a5ba4e..1c2599618eeb 100644 --- a/arch/Kconfig +++ b/arch/Kconfig -@@ -1416,6 +1416,14 @@ config DYNAMIC_SIGFRAME +@@ -1415,6 +1415,14 @@ config DYNAMIC_SIGFRAME config HAVE_ARCH_NODE_DEV_GROUP bool @@ -624,7 +639,7 @@ index a932d7712d85..8525f2876fb4 100644 unsigned long addr, pud_t *pudp) { diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h -index 95f408df4695..d9095251bffd 100644 +index 375e8e7e64f4..a108b60a6962 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -213,7 +213,7 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, @@ -646,20 +661,20 @@ index 95f408df4695..d9095251bffd 100644 #ifndef __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 03/14] mm/vmscan.c: refactor shrink_node() - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" Yu Zhao - ` (10 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + +* [PATCH mm-unstable v15 03/14] mm/vmscan.c: refactor shrink_node() + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao + 2022-09-18 7:59 ` [PATCH mm-unstable v15 01/14] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao + 2022-09-18 7:59 ` [PATCH mm-unstable v15 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" Yu Zhao + ` (11 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC 
(permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -695,7 +710,7 @@ Tested-by: Vaibhav Jain 1 file changed, 104 insertions(+), 94 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c -index b84cacc83a1e..e12715202ca7 100644 +index 992ba6a0bf10..0869cee13a90 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2728,6 +2728,109 @@ enum scan_balance { @@ -920,20 +935,20 @@ index b84cacc83a1e..e12715202ca7 100644 shrink_node_memcgs(pgdat, sc); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (2 preceding siblings ...) - 2022-08-15 7:13 ` [PATCH v14 03/14] mm/vmscan.c: refactor shrink_node() Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 05/14] mm: multi-gen LRU: groundwork Yu Zhao - ` (9 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 03/14] mm/vmscan.c: refactor shrink_node() Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 05/14] mm: multi-gen LRU: groundwork Yu Zhao + ` (10 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -998,20 +1013,20 @@ index 7b25b53c474a..fb8aadb81cd6 100644 mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages); #endif -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 05/14] mm: multi-gen LRU: groundwork - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 05/14] mm: multi-gen LRU: groundwork + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (3 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 06/14] mm: multi-gen LRU: minimal implementation Yu Zhao - ` (8 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 04/14] Revert "include/linux/mm_inline.h: fold __update_lru_size() into its sole caller" Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 06/14] mm: multi-gen LRU: minimal implementation Yu Zhao + ` (9 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -1344,7 +1359,7 @@ index fb8aadb81cd6..2ff703900fd0 100644 list_del(&folio->lru); update_lru_size(lruvec, lru, folio_zonenum(folio), diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 025754b0bc09..86147f04bf76 100644 +index 18cf0fc5ce67..6f4ea078d90f 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -317,6 +317,102 @@ enum lruvec_flags { @@ -1564,10 +1579,10 @@ index 9795d75b09b2..5ee60777d8e4 100644 return 0; diff --git a/mm/Kconfig b/mm/Kconfig -index 0331f1461f81..d95f07cd6dcf 100644 +index e3fbd0788878..378306aee622 100644 --- a/mm/Kconfig +++ b/mm/Kconfig -@@ -1124,6 +1124,14 @@ config PTE_MARKER_UFFD_WP +@@ -1118,6 +1118,14 @@ config PTE_MARKER_UFFD_WP purposes. It is required to enable userfaultfd write protection on file-backed memory types like shmem and hugetlbfs. @@ -1583,10 +1598,10 @@ index 0331f1461f81..d95f07cd6dcf 100644 endmenu diff --git a/mm/huge_memory.c b/mm/huge_memory.c -index 9afc4c3b4d49..83c47a989260 100644 +index f4a656b279b1..949d7c325133 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c -@@ -2443,7 +2443,8 @@ static void __split_huge_page_tail(struct page *head, int tail, +@@ -2444,7 +2444,8 @@ static void __split_huge_page_tail(struct page *head, int tail, #ifdef CONFIG_64BIT (1L << PG_arch_2) | #endif @@ -1597,10 +1612,10 @@ index 9afc4c3b4d49..83c47a989260 100644 /* ->mapping in first tail page is compound_mapcount */ VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING, diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index b69979c9ced5..5fd38d12149c 100644 +index 403af5f7a2b9..937141d48221 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c -@@ -5170,6 +5170,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) +@@ -5175,6 +5175,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) static void mem_cgroup_free(struct mem_cgroup *memcg) { @@ -1608,7 +1623,7 @@ index b69979c9ced5..5fd38d12149c 100644 memcg_wb_domain_exit(memcg); __mem_cgroup_free(memcg); } -@@ -5228,6 +5229,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void) +@@ -5233,6 +5234,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void) memcg->deferred_split_queue.split_queue_len = 0; #endif idr_replace(&mem_cgroup_idr, memcg, memcg->id.id); @@ -1617,10 +1632,10 @@ index b69979c9ced5..5fd38d12149c 100644 fail: mem_cgroup_id_remove(memcg); diff --git a/mm/memory.c b/mm/memory.c -index 46071cf00b47..f9abc10ea7e2 100644 +index 3a9b00c765c2..63832dab15d3 100644 --- a/mm/memory.c +++ b/mm/memory.c -@@ -5111,6 +5111,27 @@ static inline void mm_account_fault(struct pt_regs *regs, +@@ -5117,6 +5117,27 @@ static inline void mm_account_fault(struct 
pt_regs *regs, perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address); } @@ -1648,7 +1663,7 @@ index 46071cf00b47..f9abc10ea7e2 100644 /* * By the time we get here, we already hold the mm semaphore * -@@ -5142,11 +5163,15 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, +@@ -5148,11 +5169,15 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address, if (flags & FAULT_FLAG_USER) mem_cgroup_enter_user_fault(); @@ -1737,7 +1752,7 @@ index 9cee7f6a3809..0e423b7d458b 100644 folio_get(folio); diff --git a/mm/vmscan.c b/mm/vmscan.c -index e12715202ca7..ed9e149b13c3 100644 +index 0869cee13a90..8d41c4ef430e 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3050,6 +3050,81 @@ static bool can_age_anon_pages(struct pglist_data *pgdat, @@ -1823,20 +1838,20 @@ index e12715202ca7..ed9e149b13c3 100644 { unsigned long nr[NR_LRU_LISTS]; -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 06/14] mm: multi-gen LRU: minimal implementation - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 06/14] mm: multi-gen LRU: minimal implementation + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (4 preceding siblings ...) - 2022-08-15 7:13 ` [PATCH v14 05/14] mm: multi-gen LRU: groundwork Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 07/14] mm: multi-gen LRU: exploit locality in rmap Yu Zhao - ` (7 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 05/14] mm: multi-gen LRU: groundwork Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 07/14] mm: multi-gen LRU: exploit locality in rmap Yu Zhao + ` (8 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -2000,7 +2015,7 @@ Client benchmark results: CPU: single Snapdragon 7c Mem: total 4G - Chrome OS MemoryPressure [1] + ChromeOS MemoryPressure [1] [1] https://chromium.googlesource.com/chromiumos/platform/tast-tests/ @@ -2024,9 +2039,9 @@ Tested-by: Vaibhav Jain kernel/bounds.c | 2 + mm/Kconfig | 11 + mm/swap.c | 39 ++ - mm/vmscan.c | 815 +++++++++++++++++++++++++++++- - mm/workingset.c | 110 +++- - 8 files changed, 1049 insertions(+), 10 deletions(-) + mm/vmscan.c | 792 +++++++++++++++++++++++++++++- + mm/workingset.c | 110 ++++- + 8 files changed, 1025 insertions(+), 11 deletions(-) diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 2ff703900fd0..f2b2296a42f9 100644 @@ -2083,7 +2098,7 @@ index 2ff703900fd0..f2b2296a42f9 100644 static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio, bool reclaiming) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 86147f04bf76..019d7c8ee834 100644 +index 6f4ea078d90f..7e343420bfb1 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -350,6 +350,28 @@ enum lruvec_flags { @@ -2120,7 +2135,7 @@ index 86147f04bf76..019d7c8ee834 100644 }; +#define MIN_LRU_BATCH BITS_PER_LONG -+#define MAX_LRU_BATCH (MIN_LRU_BATCH * 128) ++#define MAX_LRU_BATCH (MIN_LRU_BATCH * 64) + +/* whether to keep historical stats from evicted generations */ +#ifdef CONFIG_LRU_GEN_STATS @@ -2180,10 
+2195,10 @@ index 5ee60777d8e4..b529182e8b04 100644 /* End of constants */ diff --git a/mm/Kconfig b/mm/Kconfig -index d95f07cd6dcf..5101dca8f21c 100644 +index 378306aee622..5c5dcbdcfe34 100644 --- a/mm/Kconfig +++ b/mm/Kconfig -@@ -1124,6 +1124,7 @@ config PTE_MARKER_UFFD_WP +@@ -1118,6 +1118,7 @@ config PTE_MARKER_UFFD_WP purposes. It is required to enable userfaultfd write protection on file-backed memory types like shmem and hugetlbfs. @@ -2191,7 +2206,7 @@ index d95f07cd6dcf..5101dca8f21c 100644 config LRU_GEN bool "Multi-Gen LRU" depends on MMU -@@ -1132,6 +1133,16 @@ config LRU_GEN +@@ -1126,6 +1127,16 @@ config LRU_GEN help A high performance LRU implementation to overcommit memory. @@ -2266,7 +2281,7 @@ index 0e423b7d458b..f74fd51fa9e1 100644 folio_set_referenced(folio); } else if (folio_test_unevictable(folio)) { diff --git a/mm/vmscan.c b/mm/vmscan.c -index ed9e149b13c3..4c57fb749a74 100644 +index 8d41c4ef430e..d1e60feea8ab 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1334,9 +1334,11 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio, @@ -2310,7 +2325,7 @@ index ed9e149b13c3..4c57fb749a74 100644 #define for_each_gen_type_zone(gen, type, zone) \ for ((gen) = 0; (gen) < MAX_NR_GENS; (gen)++) \ for ((type) = 0; (type) < ANON_AND_FILE; (type)++) \ -@@ -3081,6 +3097,769 @@ static struct lruvec __maybe_unused *get_lruvec(struct mem_cgroup *memcg, int ni +@@ -3081,6 +3097,745 @@ static struct lruvec __maybe_unused *get_lruvec(struct mem_cgroup *memcg, int ni return pgdat ? &pgdat->__lruvec : NULL; } @@ -2525,7 +2540,7 @@ index ed9e149b13c3..4c57fb749a74 100644 + if (max_seq != lrugen->max_seq) + goto unlock; + -+ for (type = 0; type < ANON_AND_FILE; type++) { ++ for (type = ANON_AND_FILE - 1; type >= 0; type--) { + if (get_nr_gens(lruvec, type) != MAX_NR_GENS) + continue; + @@ -2566,14 +2581,15 @@ index ed9e149b13c3..4c57fb749a74 100644 + spin_unlock_irq(&lruvec->lru_lock); +} + -+static unsigned long get_nr_evictable(struct lruvec *lruvec, unsigned long max_seq, -+ unsigned long *min_seq, bool can_swap, bool *need_aging) ++static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, unsigned long *min_seq, ++ struct scan_control *sc, bool can_swap, unsigned long *nr_to_scan) +{ + int gen, type, zone; + unsigned long old = 0; + unsigned long young = 0; + unsigned long total = 0; + struct lru_gen_struct *lrugen = &lruvec->lrugen; ++ struct mem_cgroup *memcg = lruvec_memcg(lruvec); + + for (type = !can_swap; type < ANON_AND_FILE; type++) { + unsigned long seq; @@ -2589,35 +2605,37 @@ index ed9e149b13c3..4c57fb749a74 100644 + total += size; + if (seq == max_seq) + young += size; -+ if (seq + MIN_NR_GENS == max_seq) ++ else if (seq + MIN_NR_GENS == max_seq) + old += size; + } + } + ++ /* try to scrape all its memory if this memcg was deleted */ ++ *nr_to_scan = mem_cgroup_online(memcg) ? (total >> sc->priority) : total; ++ + /* -+ * The aging tries to be lazy to reduce the overhead. On the other hand, -+ * the eviction stalls when the number of generations reaches -+ * MIN_NR_GENS. So ideally, there should be MIN_NR_GENS+1 generations, -+ * hence the first two if's. -+ * -+ * Also it's ideal to spread pages out evenly, meaning 1/(MIN_NR_GENS+1) -+ * of the total number of pages for each generation. A reasonable range -+ * for this average portion is [1/MIN_NR_GENS, 1/(MIN_NR_GENS+2)]. The -+ * eviction cares about the lower bound of cold pages, whereas the aging -+ * cares about the upper bound of hot pages. 
++ * The aging tries to be lazy to reduce the overhead, while the eviction ++ * stalls when the number of generations reaches MIN_NR_GENS. Hence, the ++ * ideal number of generations is MIN_NR_GENS+1. + */ + if (min_seq[!can_swap] + MIN_NR_GENS > max_seq) -+ *need_aging = true; -+ else if (min_seq[!can_swap] + MIN_NR_GENS < max_seq) -+ *need_aging = false; -+ else if (young * MIN_NR_GENS > total) -+ *need_aging = true; -+ else if (old * (MIN_NR_GENS + 2) < total) -+ *need_aging = true; -+ else -+ *need_aging = false; ++ return true; ++ if (min_seq[!can_swap] + MIN_NR_GENS < max_seq) ++ return false; + -+ return total; ++ /* ++ * It's also ideal to spread pages out evenly, i.e., 1/(MIN_NR_GENS+1) ++ * of the total number of pages for each generation. A reasonable range ++ * for this average portion is [1/MIN_NR_GENS, 1/(MIN_NR_GENS+2)]. The ++ * aging cares about the upper bound of hot pages, while the eviction ++ * cares about the lower bound of cold pages. ++ */ ++ if (young * MIN_NR_GENS > total) ++ return true; ++ if (old * (MIN_NR_GENS + 2) < total) ++ return true; ++ ++ return false; +} + +static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) @@ -2636,13 +2654,8 @@ index ed9e149b13c3..4c57fb749a74 100644 + if (mem_cgroup_below_min(memcg)) + return; + -+ nr_to_scan = get_nr_evictable(lruvec, max_seq, min_seq, swappiness, &need_aging); -+ if (!nr_to_scan) -+ return; -+ -+ nr_to_scan >>= mem_cgroup_online(memcg) ? sc->priority : 0; -+ -+ if (nr_to_scan && need_aging) ++ need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, swappiness, &nr_to_scan); ++ if (need_aging) + inc_max_seq(lruvec, max_seq, swappiness); +} + @@ -2987,45 +3000,24 @@ index ed9e149b13c3..4c57fb749a74 100644 +} + +static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc, -+ bool can_swap, unsigned long reclaimed) ++ bool can_swap) +{ -+ int priority; + bool need_aging; + unsigned long nr_to_scan; + struct mem_cgroup *memcg = lruvec_memcg(lruvec); + DEFINE_MAX_SEQ(lruvec); + DEFINE_MIN_SEQ(lruvec); + -+ if (fatal_signal_pending(current)) { -+ sc->nr_reclaimed += MIN_LRU_BATCH; -+ return 0; -+ } -+ + if (mem_cgroup_below_min(memcg) || + (mem_cgroup_below_low(memcg) && !sc->memcg_low_reclaim)) + return 0; + -+ nr_to_scan = get_nr_evictable(lruvec, max_seq, min_seq, can_swap, &need_aging); -+ if (!nr_to_scan) -+ return 0; -+ -+ /* adjust priority if memcg is offline or the target is met */ -+ if (!mem_cgroup_online(memcg)) -+ priority = 0; -+ else if (sc->nr_reclaimed - reclaimed >= sc->nr_to_reclaim) -+ priority = DEF_PRIORITY; -+ else -+ priority = sc->priority; -+ -+ nr_to_scan >>= priority; -+ if (!nr_to_scan) -+ return 0; -+ ++ need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, can_swap, &nr_to_scan); + if (!need_aging) + return nr_to_scan; + + /* skip the aging path at the default priority */ -+ if (priority == DEF_PRIORITY) ++ if (sc->priority == DEF_PRIORITY) + goto done; + + /* leave the work to lru_gen_age_node() */ @@ -3041,7 +3033,6 @@ index ed9e149b13c3..4c57fb749a74 100644 +{ + struct blk_plug plug; + unsigned long scanned = 0; -+ unsigned long reclaimed = sc->nr_reclaimed; + + lru_add_drain(); + @@ -3059,7 +3050,7 @@ index ed9e149b13c3..4c57fb749a74 100644 + else + swappiness = 0; + -+ nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness, reclaimed); ++ nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness); + if (!nr_to_scan) + break; + @@ -3080,7 +3071,7 @@ index ed9e149b13c3..4c57fb749a74 100644 
/****************************************************************************** * initialization ******************************************************************************/ -@@ -3123,6 +3902,16 @@ static int __init init_lru_gen(void) +@@ -3123,6 +3878,16 @@ static int __init init_lru_gen(void) }; late_initcall(init_lru_gen); @@ -3097,7 +3088,7 @@ index ed9e149b13c3..4c57fb749a74 100644 #endif /* CONFIG_LRU_GEN */ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) -@@ -3136,6 +3925,11 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) +@@ -3136,6 +3901,11 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) bool proportional_reclaim; struct blk_plug plug; @@ -3109,7 +3100,7 @@ index ed9e149b13c3..4c57fb749a74 100644 get_scan_count(lruvec, sc, nr); /* Record the original scan target for proportional adjustments later */ -@@ -3640,6 +4434,9 @@ static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_data_t *pgdat) +@@ -3640,6 +4410,9 @@ static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_data_t *pgdat) struct lruvec *target_lruvec; unsigned long refaults; @@ -3119,13 +3110,13 @@ index ed9e149b13c3..4c57fb749a74 100644 target_lruvec = mem_cgroup_lruvec(target_memcg, pgdat); refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_ANON); target_lruvec->refaults[WORKINGSET_ANON] = refaults; -@@ -4006,12 +4803,17 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, +@@ -4006,12 +4779,16 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, } #endif -static void age_active_anon(struct pglist_data *pgdat, -+static void kswapd_age_node(struct pglist_data *pgdat, - struct scan_control *sc) +- struct scan_control *sc) ++static void kswapd_age_node(struct pglist_data *pgdat, struct scan_control *sc) { struct mem_cgroup *memcg; struct lruvec *lruvec; @@ -3138,7 +3129,7 @@ index ed9e149b13c3..4c57fb749a74 100644 if (!can_age_anon_pages(pgdat, sc)) return; -@@ -4331,12 +5133,11 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) +@@ -4331,12 +5108,11 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) sc.may_swap = !nr_boost_reclaim; /* @@ -3306,20 +3297,20 @@ index a5e84862fc86..ae7e984b23c6 100644 rcu_read_lock(); /* -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 07/14] mm: multi-gen LRU: exploit locality in rmap - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 07/14] mm: multi-gen LRU: exploit locality in rmap + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (5 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 06/14] mm: multi-gen LRU: minimal implementation Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 08/14] mm: multi-gen LRU: support page table walks Yu Zhao - ` (6 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 06/14] mm: multi-gen LRU: minimal implementation Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 08/14] mm: multi-gen LRU: support page table walks Yu Zhao + ` (7 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -3413,10 +3404,10 @@ Tested-by: Vaibhav Jain 8 files changed, 236 insertions(+), 2 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h -index 4d31ce55b1c0..47829f378fcb 100644 +index a2461f9a8738..9b8ab121d948 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h -@@ -444,6 +444,7 @@ static inline struct obj_cgroup *__folio_objcg(struct folio *folio) +@@ -445,6 +445,7 @@ static inline struct obj_cgroup *__folio_objcg(struct folio *folio) * - LRU isolation * - lock_page_memcg() * - exclusive reference @@ -3424,7 +3415,7 @@ index 4d31ce55b1c0..47829f378fcb 100644 * * For a kmem folio a caller should hold an rcu read lock to protect memcg * associated with a kmem folio from being released. -@@ -505,6 +506,7 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) +@@ -506,6 +507,7 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) * - LRU isolation * - lock_page_memcg() * - exclusive reference @@ -3432,7 +3423,7 @@ index 4d31ce55b1c0..47829f378fcb 100644 * * For a kmem page a caller should hold an rcu read lock to protect memcg * associated with a kmem page from being released. 
-@@ -959,6 +961,23 @@ void unlock_page_memcg(struct page *page); +@@ -960,6 +962,23 @@ void unlock_page_memcg(struct page *page); void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val); @@ -3456,7 +3447,7 @@ index 4d31ce55b1c0..47829f378fcb 100644 /* idx can be of type enum memcg_stat_item or node_stat_item */ static inline void mod_memcg_state(struct mem_cgroup *memcg, int idx, int val) -@@ -1422,6 +1441,18 @@ static inline void folio_memcg_unlock(struct folio *folio) +@@ -1434,6 +1453,18 @@ static inline void folio_memcg_unlock(struct folio *folio) { } @@ -3476,7 +3467,7 @@ index 4d31ce55b1c0..47829f378fcb 100644 { } diff --git a/include/linux/mm.h b/include/linux/mm.h -index fbe2e72e7bca..8ff7227c6cb1 100644 +index 8a5ad9d050bf..7cc9ffc19e7f 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1490,6 +1490,11 @@ static inline unsigned long folio_pfn(struct folio *folio) @@ -3492,7 +3483,7 @@ index fbe2e72e7bca..8ff7227c6cb1 100644 { return &folio_page(folio, 1)->compound_pincount; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 019d7c8ee834..850c6171af68 100644 +index 7e343420bfb1..9ef5aa37c60c 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -375,6 +375,7 @@ enum lruvec_flags { @@ -3535,7 +3526,7 @@ index 4df67b6b8cce..0082d5fdddac 100644 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, unsigned long floor, unsigned long ceiling); diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index 5fd38d12149c..882180866e31 100644 +index 937141d48221..4ea49113b0dd 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2789,6 +2789,7 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) @@ -3547,7 +3538,7 @@ index 5fd38d12149c..882180866e31 100644 folio->memcg_data = (unsigned long)memcg; } diff --git a/mm/rmap.c b/mm/rmap.c -index 28aef434ea41..7dc6d77ae865 100644 +index 131def40e4f0..2ff17b9aabd9 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -825,6 +825,12 @@ static bool folio_referenced_one(struct folio *folio, @@ -3586,7 +3577,7 @@ index f74fd51fa9e1..0a3871a70952 100644 struct lruvec *lruvec; diff --git a/mm/vmscan.c b/mm/vmscan.c -index 4c57fb749a74..f365386eb441 100644 +index d1e60feea8ab..33a1bdfc04bd 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1635,6 +1635,11 @@ static unsigned int shrink_page_list(struct list_head *page_list, @@ -3687,7 +3678,7 @@ index 4c57fb749a74..f365386eb441 100644 static void inc_min_seq(struct lruvec *lruvec, int type) { struct lru_gen_struct *lrugen = &lruvec->lrugen; -@@ -3445,6 +3515,114 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) +@@ -3443,6 +3513,114 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL))); } @@ -3750,7 +3741,7 @@ index 4c57fb749a74..f365386eb441 100644 + continue; + + if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i)) -+ continue; ++ VM_WARN_ON_ONCE(true); + + if (pte_dirty(pte[i]) && !folio_test_dirty(folio) && + !(folio_test_anon(folio) && folio_test_swapbacked(folio) && @@ -3802,7 +3793,7 @@ index 4c57fb749a74..f365386eb441 100644 /****************************************************************************** * the eviction ******************************************************************************/ -@@ -3481,6 +3659,12 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx) +@@ -3479,6 +3657,12 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx) 
return true; } @@ -3816,20 +3807,22 @@ index 4c57fb749a74..f365386eb441 100644 if (tier > tier_idx) { int hist = lru_hist_from_seq(lrugen->min_seq[type]); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 08/14] mm: multi-gen LRU: support page table walks - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 08/14] mm: multi-gen LRU: support page table walks + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (6 preceding siblings ...) - 2022-08-15 7:13 ` [PATCH v14 07/14] mm: multi-gen LRU: exploit locality in rmap Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 09/14] mm: multi-gen LRU: optimize multiple memcgs Yu Zhao - ` (5 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 07/14] mm: multi-gen LRU: exploit locality in rmap Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:17 ` Yu Zhao + 2022-09-28 19:36 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 09/14] mm: multi-gen LRU: optimize multiple memcgs Yu Zhao + ` (6 subsequent siblings) + 14 siblings, 2 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -3949,18 +3942,18 @@ Tested-by: Vaibhav Jain --- fs/exec.c | 2 + include/linux/memcontrol.h | 5 + - include/linux/mm_types.h | 77 +++ + include/linux/mm_types.h | 76 +++ include/linux/mmzone.h | 56 +- include/linux/swap.h | 4 + kernel/exit.c | 1 + kernel/fork.c | 9 + kernel/sched/core.c | 1 + mm/memcontrol.c | 25 + - mm/vmscan.c | 1012 +++++++++++++++++++++++++++++++++++- - 10 files changed, 1175 insertions(+), 17 deletions(-) + mm/vmscan.c | 1010 +++++++++++++++++++++++++++++++++++- + 10 files changed, 1172 insertions(+), 17 deletions(-) diff --git a/fs/exec.c b/fs/exec.c -index f793221f4eb6..5fd98ca569ff 100644 +index 9a5ca7b82bfc..507a317d54db 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1014,6 +1014,7 @@ static int exec_mmap(struct mm_struct *mm) @@ -3980,7 +3973,7 @@ index f793221f4eb6..5fd98ca569ff 100644 mmap_read_unlock(old_mm); BUG_ON(active_mm != old_mm); diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h -index 47829f378fcb..ea6a78bb896c 100644 +index 9b8ab121d948..344022f102c2 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -350,6 +350,11 @@ struct mem_cgroup { @@ -3996,26 +3989,10 @@ index 47829f378fcb..ea6a78bb896c 100644 }; diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h -index cf97f3884fda..1952e70ec099 100644 +index cf97f3884fda..e1797813cc2c 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h -@@ -3,6 +3,7 @@ - #define _LINUX_MM_TYPES_H - - #include -+#include - - #include - #include -@@ -17,6 +18,7 @@ - #include - #include - #include -+#include - - #include - -@@ -672,6 +674,22 @@ struct mm_struct { +@@ -672,6 +672,22 @@ struct mm_struct { */ unsigned long ksm_merging_pages; #endif @@ -4038,7 +4015,7 @@ index cf97f3884fda..1952e70ec099 100644 } __randomize_layout; /* -@@ -698,6 +716,65 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm) +@@ -698,6 +714,66 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm) return (struct cpumask *)&mm->cpu_bitmap; } @@ -4068,11 +4045,12 @@ index cf97f3884fda..1952e70ec099 
100644 + +static inline void lru_gen_use_mm(struct mm_struct *mm) +{ -+ /* unlikely but not a bug when racing with lru_gen_migrate_mm() */ -+ VM_WARN_ON_ONCE(list_empty(&mm->lru_gen.list)); -+ -+ if (!(current->flags & PF_KTHREAD)) -+ WRITE_ONCE(mm->lru_gen.bitmap, -1); ++ /* ++ * When the bitmap is set, page reclaim knows this mm_struct has been ++ * used since the last time it cleared the bitmap. So it might be worth ++ * walking the page tables of this mm_struct to clear the accessed bit. ++ */ ++ WRITE_ONCE(mm->lru_gen.bitmap, -1); +} + +#else /* !CONFIG_LRU_GEN */ @@ -4105,7 +4083,7 @@ index cf97f3884fda..1952e70ec099 100644 extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm); extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 850c6171af68..51e521465742 100644 +index 9ef5aa37c60c..b1635c4020dc 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -408,7 +408,7 @@ enum { @@ -4180,7 +4158,7 @@ index 850c6171af68..51e521465742 100644 #endif #ifdef CONFIG_MEMCG struct pglist_data *pgdat; -@@ -1174,6 +1223,11 @@ typedef struct pglist_data { +@@ -1176,6 +1225,11 @@ typedef struct pglist_data { unsigned long flags; @@ -4266,10 +4244,10 @@ index 8fccd8721bb8..2c605bdede47 100644 if (!prev->mm) { // from kernel /* will mmdrop() in finish_task_switch(). */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index 882180866e31..2121a9bcbb54 100644 +index 4ea49113b0dd..392b1fd1e8c4 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c -@@ -6199,6 +6199,30 @@ static void mem_cgroup_move_task(void) +@@ -6204,6 +6204,30 @@ static void mem_cgroup_move_task(void) } #endif @@ -4287,7 +4265,7 @@ index 882180866e31..2121a9bcbb54 100644 + return; + + task_lock(task); -+ if (task->mm && task->mm->owner == task) ++ if (task->mm && READ_ONCE(task->mm->owner) == task) + lru_gen_migrate_mm(task->mm); + task_unlock(task); +} @@ -4300,7 +4278,7 @@ index 882180866e31..2121a9bcbb54 100644 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value) { if (value == PAGE_COUNTER_MAX) -@@ -6604,6 +6628,7 @@ struct cgroup_subsys memory_cgrp_subsys = { +@@ -6609,6 +6633,7 @@ struct cgroup_subsys memory_cgrp_subsys = { .css_reset = mem_cgroup_css_reset, .css_rstat_flush = mem_cgroup_css_rstat_flush, .can_attach = mem_cgroup_can_attach, @@ -4309,7 +4287,7 @@ index 882180866e31..2121a9bcbb54 100644 .post_attach = mem_cgroup_move_task, .dfl_cftypes = memory_files, diff --git a/mm/vmscan.c b/mm/vmscan.c -index f365386eb441..d1dfc0a77b6f 100644 +index 33a1bdfc04bd..c579b254fec7 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -49,6 +49,8 @@ @@ -4330,7 +4308,7 @@ index f365386eb441..d1dfc0a77b6f 100644 { struct pglist_data *pgdat = NODE_DATA(nid); -@@ -3127,6 +3129,372 @@ static bool __maybe_unused seq_is_valid(struct lruvec *lruvec) +@@ -3127,6 +3129,371 @@ static bool __maybe_unused seq_is_valid(struct lruvec *lruvec) get_nr_gens(lruvec, LRU_GEN_ANON) <= MAX_NR_GENS; } @@ -4433,15 +4411,17 @@ index f365386eb441..d1dfc0a77b6f 100644 +void lru_gen_migrate_mm(struct mm_struct *mm) +{ + struct mem_cgroup *memcg; ++ struct task_struct *task = rcu_dereference_protected(mm->owner, true); + -+ lockdep_assert_held(&mm->owner->alloc_lock); ++ VM_WARN_ON_ONCE(task->mm != mm); ++ lockdep_assert_held(&task->alloc_lock); + + /* for mm_update_next_owner() */ + if (mem_cgroup_disabled()) + return; + + rcu_read_lock(); -+ memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); ++ memcg = 
mem_cgroup_from_task(task); + rcu_read_unlock(); + if (memcg == mm->lru_gen.memcg) + return; @@ -4588,9 +4568,6 @@ index f365386eb441..d1dfc0a77b6f 100644 + if (size < MIN_LRU_BATCH) + return true; + -+ if (mm_is_oom_victim(mm)) -+ return true; -+ + return !mmget_not_zero(mm); +} + @@ -4703,7 +4680,7 @@ index f365386eb441..d1dfc0a77b6f 100644 /****************************************************************************** * refault feedback loop ******************************************************************************/ -@@ -3277,6 +3645,118 @@ static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclai +@@ -3277,6 +3644,118 @@ static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclai return new_gen; } @@ -4822,7 +4799,7 @@ index f365386eb441..d1dfc0a77b6f 100644 static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned long addr) { unsigned long pfn = pte_pfn(pte); -@@ -3295,8 +3775,28 @@ static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned +@@ -3295,8 +3774,28 @@ static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned return pfn; } @@ -4852,7 +4829,7 @@ index f365386eb441..d1dfc0a77b6f 100644 { struct folio *folio; -@@ -3311,9 +3811,378 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg, +@@ -3311,9 +3810,375 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg, if (folio_memcg_rcu(folio) != memcg) return NULL; @@ -4916,7 +4893,7 @@ index f365386eb441..d1dfc0a77b6f 100644 + continue; + + if (!ptep_test_and_clear_young(args->vma, addr, pte + i)) -+ continue; ++ VM_WARN_ON_ONCE(true); + + young++; + walk->mm_stats[MM_LEAF_YOUNG]++; @@ -5132,9 +5109,6 @@ index f365386eb441..d1dfc0a77b6f 100644 + + walk_pmd_range(&val, addr, next, args); + -+ if (mm_is_oom_victim(args->mm)) -+ return 1; -+ + /* a racy check to curtail the waiting time */ + if (wq_has_sleeper(&walk->lruvec->mm_state.wait)) + return 1; @@ -5231,7 +5205,7 @@ index f365386eb441..d1dfc0a77b6f 100644 static void inc_min_seq(struct lruvec *lruvec, int type) { struct lru_gen_struct *lrugen = &lruvec->lrugen; -@@ -3365,7 +4234,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap) +@@ -3365,7 +4230,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap) return success; } @@ -5240,17 +5214,17 @@ index f365386eb441..d1dfc0a77b6f 100644 { int prev, next; int type, zone; -@@ -3375,9 +4244,6 @@ static void inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, bool can_s +@@ -3375,9 +4240,6 @@ static void inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, bool can_s VM_WARN_ON_ONCE(!seq_is_valid(lruvec)); - if (max_seq != lrugen->max_seq) - goto unlock; - - for (type = 0; type < ANON_AND_FILE; type++) { + for (type = ANON_AND_FILE - 1; type >= 0; type--) { if (get_nr_gens(lruvec, type) != MAX_NR_GENS) continue; -@@ -3415,10 +4281,76 @@ static void inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, bool can_s +@@ -3415,10 +4277,76 @@ static void inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, bool can_s /* make sure preceding modifications appear */ smp_store_release(&lrugen->max_seq, lrugen->max_seq + 1); @@ -5306,7 +5280,7 @@ index f365386eb441..d1dfc0a77b6f 100644 + } while (mm); +done: + if (!success) { -+ if (sc->priority < DEF_PRIORITY - 2) ++ if (sc->priority <= DEF_PRIORITY - 2) + wait_event_killable(lruvec->mm_state.wait, + max_seq < READ_ONCE(lrugen->max_seq)); + @@ -5325,19 +5299,19 @@ 
index f365386eb441..d1dfc0a77b6f 100644 + return true; +} + - static unsigned long get_nr_evictable(struct lruvec *lruvec, unsigned long max_seq, - unsigned long *min_seq, bool can_swap, bool *need_aging) + static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, unsigned long *min_seq, + struct scan_control *sc, bool can_swap, unsigned long *nr_to_scan) { -@@ -3496,7 +4428,7 @@ static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) - nr_to_scan >>= mem_cgroup_online(memcg) ? sc->priority : 0; +@@ -3494,7 +4422,7 @@ static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) - if (nr_to_scan && need_aging) + need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, swappiness, &nr_to_scan); + if (need_aging) - inc_max_seq(lruvec, max_seq, swappiness); + try_to_inc_max_seq(lruvec, max_seq, sc, swappiness); } static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) -@@ -3505,6 +4437,8 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) +@@ -3503,6 +4431,8 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) VM_WARN_ON_ONCE(!current_is_kswapd()); @@ -5346,7 +5320,7 @@ index f365386eb441..d1dfc0a77b6f 100644 memcg = mem_cgroup_iter(NULL, NULL, NULL); do { struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat); -@@ -3513,11 +4447,16 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) +@@ -3511,11 +4441,16 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) cond_resched(); } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL))); @@ -5364,7 +5338,7 @@ index f365386eb441..d1dfc0a77b6f 100644 */ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { -@@ -3526,6 +4465,8 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +@@ -3524,6 +4459,8 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) unsigned long start; unsigned long end; unsigned long addr; @@ -5373,15 +5347,17 @@ index f365386eb441..d1dfc0a77b6f 100644 unsigned long bitmap[BITS_TO_LONGS(MIN_LRU_BATCH)] = {}; struct folio *folio = pfn_folio(pvmw->pfn); struct mem_cgroup *memcg = folio_memcg(folio); -@@ -3555,6 +4496,7 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) - } +@@ -3538,6 +4475,9 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) + if (spin_is_contended(pvmw->ptl)) + return; - pte = pvmw->pte - (pvmw->address - start) / PAGE_SIZE; ++ /* avoid taking the LRU lock under the PTL when possible */ + walk = current->reclaim_state ? 
current->reclaim_state->mm_walk : NULL; ++ + start = max(pvmw->address & PMD_MASK, pvmw->vma->vm_start); + end = min(pvmw->address | ~PMD_MASK, pvmw->vma->vm_end - 1) + 1; - rcu_read_lock(); - arch_enter_lazy_mmu_mode(); -@@ -3569,13 +4511,15 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +@@ -3567,13 +4507,15 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) if (!pte_young(pte[i])) continue; @@ -5391,14 +5367,14 @@ index f365386eb441..d1dfc0a77b6f 100644 continue; if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i)) - continue; + VM_WARN_ON_ONCE(true); + young++; + if (pte_dirty(pte[i]) && !folio_test_dirty(folio) && !(folio_test_anon(folio) && folio_test_swapbacked(folio) && !folio_test_swapcache(folio))) -@@ -3591,7 +4535,11 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +@@ -3589,7 +4531,11 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) arch_leave_lazy_mmu_mode(); rcu_read_unlock(); @@ -5411,7 +5387,7 @@ index f365386eb441..d1dfc0a77b6f 100644 for_each_set_bit(i, bitmap, MIN_LRU_BATCH) { folio = pfn_folio(pte_pfn(pte[i])); folio_activate(folio); -@@ -3603,8 +4551,10 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +@@ -3601,8 +4547,10 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) if (!mem_cgroup_trylock_pages(memcg)) return; @@ -5424,7 +5400,7 @@ index f365386eb441..d1dfc0a77b6f 100644 for_each_set_bit(i, bitmap, MIN_LRU_BATCH) { folio = pfn_folio(pte_pfn(pte[i])); -@@ -3615,10 +4565,14 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +@@ -3613,10 +4561,14 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) if (old_gen < 0 || old_gen == new_gen) continue; @@ -5441,7 +5417,7 @@ index f365386eb441..d1dfc0a77b6f 100644 mem_cgroup_unlock_pages(); } -@@ -3901,6 +4855,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap +@@ -3899,6 +4851,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap struct folio *folio; enum vm_event_item item; struct reclaim_stat stat; @@ -5449,7 +5425,7 @@ index f365386eb441..d1dfc0a77b6f 100644 struct mem_cgroup *memcg = lruvec_memcg(lruvec); struct pglist_data *pgdat = lruvec_pgdat(lruvec); -@@ -3937,6 +4892,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap +@@ -3935,6 +4888,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap move_pages_to_lru(lruvec, &list); @@ -5460,7 +5436,7 @@ index f365386eb441..d1dfc0a77b6f 100644 item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT; if (!cgroup_reclaim(sc)) __count_vm_events(item, reclaimed); -@@ -3953,6 +4912,11 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap +@@ -3951,6 +4908,11 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap return scanned; } @@ -5470,9 +5446,9 @@ index f365386eb441..d1dfc0a77b6f 100644 + * reclaim. + */ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc, - bool can_swap, unsigned long reclaimed) + bool can_swap) { -@@ -3999,7 +4963,8 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * +@@ -3976,7 +4938,8 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * if (current_is_kswapd()) return 0; @@ -5482,7 +5458,7 @@ index f365386eb441..d1dfc0a77b6f 100644 done: return min_seq[!can_swap] + MIN_NR_GENS <= max_seq ? 
nr_to_scan : 0; } -@@ -4014,6 +4979,8 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc +@@ -3990,6 +4953,8 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc blk_start_plug(&plug); @@ -5491,7 +5467,7 @@ index f365386eb441..d1dfc0a77b6f 100644 while (true) { int delta; int swappiness; -@@ -4041,6 +5008,8 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc +@@ -4017,6 +4982,8 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc cond_resched(); } @@ -5500,7 +5476,7 @@ index f365386eb441..d1dfc0a77b6f 100644 blk_finish_plug(&plug); } -@@ -4057,15 +5026,21 @@ void lru_gen_init_lruvec(struct lruvec *lruvec) +@@ -4033,15 +5000,21 @@ void lru_gen_init_lruvec(struct lruvec *lruvec) for_each_gen_type_zone(gen, type, zone) INIT_LIST_HEAD(&lrugen->lists[gen][type][zone]); @@ -5522,7 +5498,7 @@ index f365386eb441..d1dfc0a77b6f 100644 int nid; for_each_node(nid) { -@@ -4073,6 +5048,11 @@ void lru_gen_exit_memcg(struct mem_cgroup *memcg) +@@ -4049,6 +5022,11 @@ void lru_gen_exit_memcg(struct mem_cgroup *memcg) VM_WARN_ON_ONCE(memchr_inv(lruvec->lrugen.nr_pages, 0, sizeof(lruvec->lrugen.nr_pages))); @@ -5535,20 +5511,21 @@ index f365386eb441..d1dfc0a77b6f 100644 } #endif -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 09/14] mm: multi-gen LRU: optimize multiple memcgs - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 09/14] mm: multi-gen LRU: optimize multiple memcgs + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (7 preceding siblings ...) - 2022-08-15 7:13 ` [PATCH v14 08/14] mm: multi-gen LRU: support page table walks Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 10/14] mm: multi-gen LRU: kill switch Yu Zhao - ` (4 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 08/14] mm: multi-gen LRU: support page table walks Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-28 18:46 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 10/14] mm: multi-gen LRU: kill switch Yu Zhao + ` (5 subsequent siblings) + 14 siblings, 1 reply; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -5561,12 +5538,12 @@ From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain -When multiple memcgs are available, it is possible to make better -choices based on generations and tiers and therefore improve the -overall performance under global memory pressure. This patch adds a -rudimentary optimization to select memcgs that can drop single-use -unmapped clean pages first. Doing so reduces the chance of going into -the aging path or swapping. These two decisions can be costly. +When multiple memcgs are available, it is possible to use generations +as a frame of reference to make better choices and improve overall +performance under global memory pressure. This patch adds a basic +optimization to select memcgs that can drop single-use unmapped clean +pages first. Doing so reduces the chance of going into the aging path +or swapping, which can be costly. 
A typical example that benefits from this optimization is a server running mixed types of workloads, e.g., heavy anon workload in one @@ -5574,7 +5551,17 @@ memcg and heavy buffered I/O workload in the other. Though this optimization can be applied to both kswapd and direct reclaim, it is only added to kswapd to keep the patchset manageable. -Later improvements will cover the direct reclaim path. +Later improvements may cover the direct reclaim path. + +While ensuring certain fairness to all eligible memcgs, proportional +scans of individual memcgs also require proper backoff to avoid +overshooting their aggregate reclaim target by too much. Otherwise it +can cause high direct reclaim latency. The conditions for backoff are: +1. At low priorities, for direct reclaim, if aging fairness or direct + reclaim latency is at risk, i.e., aging one memcg multiple times or + swapping after the target is met. +2. At high priorities, for global reclaim, if per-zone free pages are + above respective watermarks. Server benchmark results: Mixed workloads: @@ -5651,51 +5638,47 @@ Tested-by: Shuang Zhai Tested-by: Sofia Trinh Tested-by: Vaibhav Jain --- - mm/vmscan.c | 55 ++++++++++++++++++++++++++++++++++++++++++++--------- - 1 file changed, 46 insertions(+), 9 deletions(-) + mm/vmscan.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++----- + 1 file changed, 96 insertions(+), 9 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c -index d1dfc0a77b6f..ee51c752a3af 100644 +index c579b254fec7..3f83325fdc71 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c -@@ -131,6 +131,13 @@ struct scan_control { +@@ -131,6 +131,12 @@ struct scan_control { /* Always discard instead of demoting to lower tier memory */ unsigned int no_demotion:1; +#ifdef CONFIG_LRU_GEN -+ /* help make better choices when multiple memcgs are available */ ++ /* help kswapd make better choices among multiple memcgs */ + unsigned int memcgs_need_aging:1; -+ unsigned int memcgs_need_swapping:1; -+ unsigned int memcgs_avoid_swapping:1; ++ unsigned long last_reclaimed; +#endif + /* Allocation order */ s8 order; -@@ -4437,6 +4444,22 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) +@@ -4431,6 +4437,19 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) VM_WARN_ON_ONCE(!current_is_kswapd()); ++ sc->last_reclaimed = sc->nr_reclaimed; ++ + /* -+ * To reduce the chance of going into the aging path or swapping, which -+ * can be costly, optimistically skip them unless their corresponding -+ * flags were cleared in the eviction path. This improves the overall -+ * performance when multiple memcgs are available. ++ * To reduce the chance of going into the aging path, which can be ++ * costly, optimistically skip it if the flag below was cleared in the ++ * eviction path. This improves the overall performance when multiple ++ * memcgs are available. 
+ */ + if (!sc->memcgs_need_aging) { + sc->memcgs_need_aging = true; -+ sc->memcgs_avoid_swapping = !sc->memcgs_need_swapping; -+ sc->memcgs_need_swapping = true; + return; + } -+ -+ sc->memcgs_need_swapping = true; -+ sc->memcgs_avoid_swapping = true; + set_mm_walk(pgdat); memcg = mem_cgroup_iter(NULL, NULL, NULL); -@@ -4846,7 +4869,8 @@ static int isolate_folios(struct lruvec *lruvec, struct scan_control *sc, int sw +@@ -4842,7 +4861,8 @@ static int isolate_folios(struct lruvec *lruvec, struct scan_control *sc, int sw return scanned; } @@ -5705,61 +5688,113 @@ index d1dfc0a77b6f..ee51c752a3af 100644 { int type; int scanned; -@@ -4909,6 +4933,9 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap +@@ -4905,6 +4925,9 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap sc->nr_reclaimed += reclaimed; -+ if (type == LRU_GEN_ANON && need_swapping) ++ if (need_swapping && type == LRU_GEN_ANON) + *need_swapping = true; + return scanned; } -@@ -4918,10 +4945,9 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap +@@ -4914,9 +4937,8 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap * reclaim. */ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc, -- bool can_swap, unsigned long reclaimed) -+ bool can_swap, unsigned long reclaimed, bool *need_aging) +- bool can_swap) ++ bool can_swap, bool *need_aging) { - int priority; - bool need_aging; unsigned long nr_to_scan; struct mem_cgroup *memcg = lruvec_memcg(lruvec); DEFINE_MAX_SEQ(lruvec); -@@ -4936,7 +4962,7 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * +@@ -4926,8 +4948,8 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * (mem_cgroup_below_low(memcg) && !sc->memcg_low_reclaim)) return 0; -- nr_to_scan = get_nr_evictable(lruvec, max_seq, min_seq, can_swap, &need_aging); -+ nr_to_scan = get_nr_evictable(lruvec, max_seq, min_seq, can_swap, need_aging); - if (!nr_to_scan) - return 0; - -@@ -4952,7 +4978,7 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * - if (!nr_to_scan) - return 0; - +- need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, can_swap, &nr_to_scan); - if (!need_aging) ++ *need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, can_swap, &nr_to_scan); + if (!*need_aging) return nr_to_scan; /* skip the aging path at the default priority */ -@@ -4972,6 +4998,8 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * +@@ -4944,10 +4966,68 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * + return min_seq[!can_swap] + MIN_NR_GENS <= max_seq ? 
nr_to_scan : 0; + } + ++static bool should_abort_scan(struct lruvec *lruvec, unsigned long seq, ++ struct scan_control *sc, bool need_swapping) ++{ ++ int i; ++ DEFINE_MAX_SEQ(lruvec); ++ ++ if (!current_is_kswapd()) { ++ /* age each memcg once to ensure fairness */ ++ if (max_seq - seq > 1) ++ return true; ++ ++ /* over-swapping can increase allocation latency */ ++ if (sc->nr_reclaimed >= sc->nr_to_reclaim && need_swapping) ++ return true; ++ ++ /* give this thread a chance to exit and free its memory */ ++ if (fatal_signal_pending(current)) { ++ sc->nr_reclaimed += MIN_LRU_BATCH; ++ return true; ++ } ++ ++ if (cgroup_reclaim(sc)) ++ return false; ++ } else if (sc->nr_reclaimed - sc->last_reclaimed < sc->nr_to_reclaim) ++ return false; ++ ++ /* keep scanning at low priorities to ensure fairness */ ++ if (sc->priority > DEF_PRIORITY - 2) ++ return false; ++ ++ /* ++ * A minimum amount of work was done under global memory pressure. For ++ * kswapd, it may be overshooting. For direct reclaim, the target isn't ++ * met, and yet the allocation may still succeed, since kswapd may have ++ * caught up. In either case, it's better to stop now, and restart if ++ * necessary. ++ */ ++ for (i = 0; i <= sc->reclaim_idx; i++) { ++ unsigned long wmark; ++ struct zone *zone = lruvec_pgdat(lruvec)->node_zones + i; ++ ++ if (!managed_zone(zone)) ++ continue; ++ ++ wmark = current_is_kswapd() ? high_wmark_pages(zone) : low_wmark_pages(zone); ++ if (wmark > zone_page_state(zone, NR_FREE_PAGES)) ++ return false; ++ } ++ ++ sc->nr_reclaimed += MIN_LRU_BATCH; ++ ++ return true; ++} ++ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) { struct blk_plug plug; + bool need_aging = false; + bool need_swapping = false; unsigned long scanned = 0; - unsigned long reclaimed = sc->nr_reclaimed; ++ unsigned long reclaimed = sc->nr_reclaimed; ++ DEFINE_MAX_SEQ(lruvec); -@@ -4993,21 +5021,30 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc + lru_add_drain(); + +@@ -4967,21 +5047,28 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc else swappiness = 0; -- nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness, reclaimed); -+ nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness, reclaimed, &need_aging); +- nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness); ++ nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness, &need_aging); if (!nr_to_scan) - break; + goto done; @@ -5774,36 +5809,34 @@ index d1dfc0a77b6f..ee51c752a3af 100644 if (scanned >= nr_to_scan) break; -+ if (sc->memcgs_avoid_swapping && swappiness < 200 && need_swapping) ++ if (should_abort_scan(lruvec, max_seq, sc, need_swapping)) + break; + cond_resched(); } + /* see the comment in lru_gen_age_node() */ -+ if (!need_aging) ++ if (sc->nr_reclaimed - reclaimed >= MIN_LRU_BATCH && !need_aging) + sc->memcgs_need_aging = false; -+ if (!need_swapping) -+ sc->memcgs_need_swapping = false; +done: clear_mm_walk(); blk_finish_plug(&plug); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 10/14] mm: multi-gen LRU: kill switch - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 10/14] mm: multi-gen LRU: kill switch + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (8 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 09/14] mm: multi-gen LRU: optimize multiple memcgs Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 11/14] mm: multi-gen LRU: thrashing prevention Yu Zhao - ` (3 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 09/14] mm: multi-gen LRU: optimize multiple memcgs Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 11/14] mm: multi-gen LRU: thrashing prevention Yu Zhao + ` (4 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -5870,11 +5903,11 @@ Tested-by: Vaibhav Jain include/linux/mmzone.h | 9 ++ kernel/cgroup/cgroup-internal.h | 1 - mm/Kconfig | 6 + - mm/vmscan.c | 231 +++++++++++++++++++++++++++++++- - 6 files changed, 268 insertions(+), 9 deletions(-) + mm/vmscan.c | 228 +++++++++++++++++++++++++++++++- + 6 files changed, 265 insertions(+), 9 deletions(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h -index ed53bfe7c46c..dee80e670291 100644 +index ac5d0515680e..9179463c3c9f 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -432,6 +432,18 @@ static inline void cgroup_put(struct cgroup *cgrp) @@ -5950,7 +5983,7 @@ index f2b2296a42f9..4949eda9a9a2 100644 /* * There are three common cases for this page: diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 51e521465742..7f8c529b46ad 100644 +index b1635c4020dc..95c58c7fbdff 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -387,6 +387,13 @@ enum { @@ -5965,7 +5998,7 @@ index 51e521465742..7f8c529b46ad 100644 +}; + #define MIN_LRU_BATCH BITS_PER_LONG - #define MAX_LRU_BATCH (MIN_LRU_BATCH * 128) + #define MAX_LRU_BATCH (MIN_LRU_BATCH * 64) @@ -428,6 +435,8 @@ struct lru_gen_struct { /* can be modified without holding the LRU lock */ @@ -5989,10 +6022,10 @@ index 36b740cb3d59..63dc3e82be4f 100644 extern struct cgroup_subsys *cgroup_subsys[]; extern struct list_head cgroup_roots; diff --git a/mm/Kconfig b/mm/Kconfig -index 5101dca8f21c..6c86849c4db9 100644 +index 5c5dcbdcfe34..ab6ef5115eb8 100644 --- a/mm/Kconfig +++ b/mm/Kconfig -@@ -1133,6 +1133,12 @@ config LRU_GEN +@@ -1127,6 +1127,12 @@ config LRU_GEN help A high performance LRU implementation to overcommit memory. 
@@ -6006,7 +6039,7 @@ index 5101dca8f21c..6c86849c4db9 100644 bool "Full stats for debugging" depends on LRU_GEN diff --git a/mm/vmscan.c b/mm/vmscan.c -index ee51c752a3af..5502c553e32e 100644 +index 3f83325fdc71..10f31f3c5054 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -51,6 +51,7 @@ @@ -6017,7 +6050,7 @@ index ee51c752a3af..5502c553e32e 100644 #include #include -@@ -3071,6 +3072,14 @@ static bool can_age_anon_pages(struct pglist_data *pgdat, +@@ -3070,6 +3071,14 @@ static bool can_age_anon_pages(struct pglist_data *pgdat, #ifdef CONFIG_LRU_GEN @@ -6032,7 +6065,7 @@ index ee51c752a3af..5502c553e32e 100644 /****************************************************************************** * shorthand helpers ******************************************************************************/ -@@ -3948,7 +3957,8 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long next, struct vm_area +@@ -3946,7 +3955,8 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long next, struct vm_area goto next; if (!pmd_trans_huge(pmd[i])) { @@ -6042,7 +6075,7 @@ index ee51c752a3af..5502c553e32e 100644 pmdp_test_and_clear_young(vma, addr, pmd + i); goto next; } -@@ -4046,10 +4056,12 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, +@@ -4044,10 +4054,12 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, walk->mm_stats[MM_NONLEAF_TOTAL]++; #ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG @@ -6058,7 +6091,7 @@ index ee51c752a3af..5502c553e32e 100644 #endif if (!walk->force_scan && !test_bloom_filter(walk->lruvec, walk->max_seq, pmd + i)) continue; -@@ -4314,7 +4326,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, +@@ -4309,7 +4321,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, * handful of PTEs. Spreading the work out over a period of time usually * is less efficient, but it avoids bursty page faults. 
*/ @@ -6067,7 +6100,7 @@ index ee51c752a3af..5502c553e32e 100644 success = iterate_mm_list_nowalk(lruvec, max_seq); goto done; } -@@ -5050,6 +5062,211 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc +@@ -5074,6 +5086,208 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc blk_finish_plug(&plug); } @@ -6092,9 +6125,6 @@ index ee51c752a3af..5502c553e32e 100644 + for_each_gen_type_zone(gen, type, zone) { + if (!list_empty(&lrugen->lists[gen][type][zone])) + return false; -+ -+ /* unlikely but not a bug when reset_batch_size() is pending */ -+ VM_WARN_ON_ONCE(lrugen->nr_pages[gen][type][zone]); + } + } + @@ -6279,7 +6309,7 @@ index ee51c752a3af..5502c553e32e 100644 /****************************************************************************** * initialization ******************************************************************************/ -@@ -5060,6 +5277,7 @@ void lru_gen_init_lruvec(struct lruvec *lruvec) +@@ -5084,6 +5298,7 @@ void lru_gen_init_lruvec(struct lruvec *lruvec) struct lru_gen_struct *lrugen = &lruvec->lrugen; lrugen->max_seq = MIN_NR_GENS + 1; @@ -6287,7 +6317,7 @@ index ee51c752a3af..5502c553e32e 100644 for_each_gen_type_zone(gen, type, zone) INIT_LIST_HEAD(&lrugen->lists[gen][type][zone]); -@@ -5099,6 +5317,9 @@ static int __init init_lru_gen(void) +@@ -5123,6 +5338,9 @@ static int __init init_lru_gen(void) BUILD_BUG_ON(MIN_NR_GENS + 1 >= MAX_NR_GENS); BUILD_BUG_ON(BIT(LRU_GEN_WIDTH) <= MAX_NR_GENS); @@ -6298,20 +6328,20 @@ index ee51c752a3af..5502c553e32e 100644 }; late_initcall(init_lru_gen); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 11/14] mm: multi-gen LRU: thrashing prevention - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 11/14] mm: multi-gen LRU: thrashing prevention + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (9 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 10/14] mm: multi-gen LRU: kill switch Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 12/14] mm: multi-gen LRU: debugfs interface Yu Zhao - ` (2 subsequent siblings) - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 10/14] mm: multi-gen LRU: kill switch Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 12/14] mm: multi-gen LRU: debugfs interface Yu Zhao + ` (3 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -6358,11 +6388,11 @@ Tested-by: Sofia Trinh Tested-by: Vaibhav Jain --- include/linux/mmzone.h | 2 ++ - mm/vmscan.c | 75 +++++++++++++++++++++++++++++++++++++++--- - 2 files changed, 73 insertions(+), 4 deletions(-) + mm/vmscan.c | 74 ++++++++++++++++++++++++++++++++++++++++-- + 2 files changed, 73 insertions(+), 3 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h -index 7f8c529b46ad..2558b57a05bc 100644 +index 95c58c7fbdff..87347945270b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -422,6 +422,8 @@ struct lru_gen_struct { @@ -6375,10 +6405,10 @@ index 7f8c529b46ad..2558b57a05bc 100644 struct list_head lists[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES]; /* the multi-gen LRU sizes, eventually consistent */ diff --git a/mm/vmscan.c b/mm/vmscan.c -index 5502c553e32e..08727f3b7171 100644 +index 10f31f3c5054..9ef2ec3d3c0c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c -@@ -4298,6 +4298,7 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) +@@ -4293,6 +4293,7 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) for (type = 0; type < ANON_AND_FILE; type++) reset_ctrl_pos(lruvec, type, false); @@ -6386,8 +6416,8 @@ index 5502c553e32e..08727f3b7171 100644 /* make sure preceding modifications appear */ smp_store_release(&lrugen->max_seq, lrugen->max_seq + 1); -@@ -4424,7 +4425,7 @@ static unsigned long get_nr_evictable(struct lruvec *lruvec, unsigned long max_s - return total; +@@ -4422,7 +4423,7 @@ static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, unsig + return false; } -static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) @@ -6395,20 +6425,15 @@ index 5502c553e32e..08727f3b7171 100644 { bool need_aging; unsigned long nr_to_scan; -@@ -4438,21 +4439,40 @@ static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) +@@ -4436,16 +4437,36 @@ static void age_lruvec(struct lruvec *lruvec, struct scan_control *sc) mem_cgroup_calculate_protection(NULL, memcg); if (mem_cgroup_below_min(memcg)) - return; + return false; - nr_to_scan = get_nr_evictable(lruvec, max_seq, min_seq, swappiness, &need_aging); - if (!nr_to_scan) -- return; -+ return false; - - nr_to_scan >>= mem_cgroup_online(memcg) ? 
sc->priority : 0; - + need_aging = should_run_aging(lruvec, max_seq, min_seq, sc, swappiness, &nr_to_scan); ++ + if (min_ttl) { + int gen = lru_gen_from_seq(min_seq[LRU_GEN_FILE]); + unsigned long birth = READ_ONCE(lruvec->lrugen.timestamps[gen]); @@ -6421,7 +6446,7 @@ index 5502c553e32e..08727f3b7171 100644 + return false; + } + - if (nr_to_scan && need_aging) + if (need_aging) try_to_inc_max_seq(lruvec, max_seq, sc, swappiness); + + return true; @@ -6438,7 +6463,7 @@ index 5502c553e32e..08727f3b7171 100644 VM_WARN_ON_ONCE(!current_is_kswapd()); -@@ -4478,12 +4498,32 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) +@@ -4468,12 +4489,32 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) do { struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat); @@ -6472,7 +6497,7 @@ index 5502c553e32e..08727f3b7171 100644 } /* -@@ -5210,6 +5250,28 @@ static void lru_gen_change_state(bool enabled) +@@ -5231,6 +5272,28 @@ static void lru_gen_change_state(bool enabled) * sysfs interface ******************************************************************************/ @@ -6501,7 +6526,7 @@ index 5502c553e32e..08727f3b7171 100644 static ssize_t show_enabled(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { unsigned int caps = 0; -@@ -5258,6 +5320,7 @@ static struct kobj_attribute lru_gen_enabled_attr = __ATTR( +@@ -5279,6 +5342,7 @@ static struct kobj_attribute lru_gen_enabled_attr = __ATTR( ); static struct attribute *lru_gen_attrs[] = { @@ -6509,7 +6534,7 @@ index 5502c553e32e..08727f3b7171 100644 &lru_gen_enabled_attr.attr, NULL }; -@@ -5273,12 +5336,16 @@ static struct attribute_group lru_gen_attr_group = { +@@ -5294,12 +5358,16 @@ static struct attribute_group lru_gen_attr_group = { void lru_gen_init_lruvec(struct lruvec *lruvec) { @@ -6527,20 +6552,20 @@ index 5502c553e32e..08727f3b7171 100644 INIT_LIST_HEAD(&lrugen->lists[gen][type][zone]); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 12/14] mm: multi-gen LRU: debugfs interface - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 12/14] mm: multi-gen LRU: debugfs interface + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (10 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 11/14] mm: multi-gen LRU: thrashing prevention Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 13/14] mm: multi-gen LRU: admin guide Yu Zhao - 2022-08-15 7:13 ` [PATCH v14 14/14] mm: multi-gen LRU: design doc Yu Zhao - 13 siblings, 0 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 11/14] mm: multi-gen LRU: thrashing prevention Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:00 ` [PATCH mm-unstable v15 13/14] mm: multi-gen LRU: admin guide Yu Zhao + ` (2 subsequent siblings) + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -6601,7 +6626,7 @@ index 4b71a96190a8..3a0eec9f2faa 100644 #define nr_online_nodes 1U diff --git a/mm/vmscan.c b/mm/vmscan.c -index 08727f3b7171..509989fb39ef 100644 +index 9ef2ec3d3c0c..7657d54c9c42 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -52,6 +52,7 @@ @@ -6612,7 +6637,7 @@ index 08727f3b7171..509989fb39ef 100644 #include #include -@@ -4202,12 +4203,40 @@ static void clear_mm_walk(void) +@@ -4197,12 +4198,40 @@ static void clear_mm_walk(void) kfree(walk); } @@ -6654,7 +6679,7 @@ index 08727f3b7171..509989fb39ef 100644 } static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap) -@@ -4253,7 +4282,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap) +@@ -4248,7 +4277,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap) return success; } @@ -6663,7 +6688,7 @@ index 08727f3b7171..509989fb39ef 100644 { int prev, next; int type, zone; -@@ -4267,9 +4296,13 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) +@@ -4262,9 +4291,13 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) if (get_nr_gens(lruvec, type) != MAX_NR_GENS) continue; @@ -6679,7 +6704,7 @@ index 08727f3b7171..509989fb39ef 100644 } /* -@@ -4306,7 +4339,7 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) +@@ -4301,7 +4334,7 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap) } static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, @@ -6688,7 +6713,7 @@ index 08727f3b7171..509989fb39ef 100644 { bool success; struct lru_gen_mm_walk *walk; -@@ -4327,7 +4360,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, +@@ -4322,7 +4355,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, * handful of PTEs. Spreading the work out over a period of time usually * is less efficient, but it avoids bursty page faults. 
*/ @@ -6697,7 +6722,7 @@ index 08727f3b7171..509989fb39ef 100644 success = iterate_mm_list_nowalk(lruvec, max_seq); goto done; } -@@ -4341,7 +4374,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, +@@ -4336,7 +4369,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, walk->lruvec = lruvec; walk->max_seq = max_seq; walk->can_swap = can_swap; @@ -6706,7 +6731,7 @@ index 08727f3b7171..509989fb39ef 100644 do { success = iterate_mm_list(lruvec, walk, &mm); -@@ -4361,7 +4394,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, +@@ -4356,7 +4389,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, VM_WARN_ON_ONCE(max_seq != READ_ONCE(lrugen->max_seq)); @@ -6715,16 +6740,16 @@ index 08727f3b7171..509989fb39ef 100644 /* either this sees any waiters or they will see updated max_seq */ if (wq_has_sleeper(&lruvec->mm_state.wait)) wake_up_all(&lruvec->mm_state.wait); -@@ -4460,7 +4493,7 @@ static bool age_lruvec(struct lruvec *lruvec, struct scan_control *sc, unsigned +@@ -4454,7 +4487,7 @@ static bool age_lruvec(struct lruvec *lruvec, struct scan_control *sc, unsigned } - if (nr_to_scan && need_aging) + if (need_aging) - try_to_inc_max_seq(lruvec, max_seq, sc, swappiness); + try_to_inc_max_seq(lruvec, max_seq, sc, swappiness, false); return true; } -@@ -5041,7 +5074,7 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * +@@ -5013,7 +5046,7 @@ static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control * if (current_is_kswapd()) return 0; @@ -6733,7 +6758,7 @@ index 08727f3b7171..509989fb39ef 100644 return nr_to_scan; done: return min_seq[!can_swap] + MIN_NR_GENS <= max_seq ? nr_to_scan : 0; -@@ -5330,6 +5363,361 @@ static struct attribute_group lru_gen_attr_group = { +@@ -5352,6 +5385,361 @@ static struct attribute_group lru_gen_attr_group = { .attrs = lru_gen_attrs, }; @@ -7095,7 +7120,7 @@ index 08727f3b7171..509989fb39ef 100644 /****************************************************************************** * initialization ******************************************************************************/ -@@ -5387,6 +5775,9 @@ static int __init init_lru_gen(void) +@@ -5409,6 +5797,9 @@ static int __init init_lru_gen(void) if (sysfs_create_group(mm_kobj, &lru_gen_attr_group)) pr_err("lru_gen: failed to create sysfs group\n"); @@ -7106,21 +7131,21 @@ index 08727f3b7171..509989fb39ef 100644 }; late_initcall(init_lru_gen); -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 13/14] mm: multi-gen LRU: admin guide - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 13/14] mm: multi-gen LRU: admin guide + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (11 preceding siblings ...) 
- 2022-08-15 7:13 ` [PATCH v14 12/14] mm: multi-gen LRU: debugfs interface Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 9:06 ` Bagas Sanjaya - 2022-08-15 9:12 ` Mike Rapoport - 2022-08-15 7:13 ` [PATCH v14 14/14] mm: multi-gen LRU: design doc Yu Zhao - 13 siblings, 2 replies; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 12/14] mm: multi-gen LRU: debugfs interface Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-18 8:26 ` Mike Rapoport + 2022-09-18 8:00 ` [PATCH mm-unstable v15 14/14] mm: multi-gen LRU: design doc Yu Zhao + 2022-09-19 2:08 ` [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Bagas Sanjaya + 14 siblings, 1 reply; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -7150,10 +7175,10 @@ Tested-by: Sofia Trinh Tested-by: Vaibhav Jain --- Documentation/admin-guide/mm/index.rst | 1 + - Documentation/admin-guide/mm/multigen_lru.rst | 156 ++++++++++++++++++ + Documentation/admin-guide/mm/multigen_lru.rst | 162 ++++++++++++++++++ mm/Kconfig | 3 +- mm/vmscan.c | 4 + - 4 files changed, 163 insertions(+), 1 deletion(-) + 4 files changed, 169 insertions(+), 1 deletion(-) create mode 100644 Documentation/admin-guide/mm/multigen_lru.rst diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst @@ -7170,10 +7195,10 @@ index 1bd11118dfb1..d1064e0ba34a 100644 numaperf diff --git a/Documentation/admin-guide/mm/multigen_lru.rst b/Documentation/admin-guide/mm/multigen_lru.rst new file mode 100644 -index 000000000000..6355f2b5019d +index 000000000000..33e068830497 --- /dev/null +++ b/Documentation/admin-guide/mm/multigen_lru.rst -@@ -0,0 +1,156 @@ +@@ -0,0 +1,162 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============= @@ -7295,15 +7320,18 @@ index 000000000000..6355f2b5019d +and ``max_gen_nr`` contains the hottest pages, since ``age_in_ms`` of +the former is the largest and that of the latter is the smallest. + -+Users can write ``+ memcg_id node_id max_gen_nr -+[can_swap [force_scan]]`` to ``lru_gen`` to create a new generation -+``max_gen_nr+1``. ``can_swap`` defaults to the swap setting and, if it -+is set to ``1``, it forces the scan of anon pages when swap is off, -+and vice versa. ``force_scan`` defaults to ``1`` and, if it is set to -+``0``, it employs heuristics to reduce the overhead, which is likely -+to reduce the coverage as well. ++Users can write the following command to ``lru_gen`` to create a new ++generation ``max_gen_nr+1``: + -+A typical use case is that a job scheduler writes to ``lru_gen`` at a ++ ``+ memcg_id node_id max_gen_nr [can_swap [force_scan]]`` ++ ++``can_swap`` defaults to the swap setting and, if it is set to ``1``, ++it forces the scan of anon pages when swap is off, and vice versa. ++``force_scan`` defaults to ``1`` and, if it is set to ``0``, it ++employs heuristics to reduce the overhead, which is likely to reduce ++the coverage as well. ++ ++A typical use case is that a job scheduler runs this command at a +certain time interval to create new generations, and it ranks the +servers it manages based on the sizes of their cold pages defined by +this time interval. @@ -7313,28 +7341,31 @@ index 000000000000..6355f2b5019d +Proactive reclaim induces page reclaim when there is no memory +pressure. It usually targets cold pages only. 
E.g., when a new job +comes in, the job scheduler wants to proactively reclaim cold pages on -+the server it selected to improve the chance of successfully landing ++the server it selected, to improve the chance of successfully landing +this new job. + -+Users can write ``- memcg_id node_id min_gen_nr [swappiness -+[nr_to_reclaim]]`` to ``lru_gen`` to evict generations less than or -+equal to ``min_gen_nr``. Note that ``min_gen_nr`` should be less than -+``max_gen_nr-1`` as ``max_gen_nr`` and ``max_gen_nr-1`` are not fully -+aged and therefore cannot be evicted. ``swappiness`` overrides the -+default value in ``/proc/sys/vm/swappiness``. ``nr_to_reclaim`` limits -+the number of pages to evict. ++Users can write the following command to ``lru_gen`` to evict ++generations less than or equal to ``min_gen_nr``. + -+A typical use case is that a job scheduler writes to ``lru_gen`` -+before it tries to land a new job on a server. If it fails to -+materialize enough cold pages because of the overestimation, it -+retries on the next server according to the ranking result obtained -+from the working set estimation step. This less forceful approach -+limits the impacts on the existing jobs. ++ ``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]`` ++ ++``min_gen_nr`` should be less than ``max_gen_nr-1``, since ++``max_gen_nr`` and ``max_gen_nr-1`` are not fully aged (equivalent to ++the active list) and therefore cannot be evicted. ``swappiness`` ++overrides the default value in ``/proc/sys/vm/swappiness``. ++``nr_to_reclaim`` limits the number of pages to evict. ++ ++A typical use case is that a job scheduler runs this command before it ++tries to land a new job on a server. If it fails to materialize enough ++cold pages because of the overestimation, it retries on the next ++server according to the ranking result obtained from the working set ++estimation step. This less forceful approach limits the impacts on the ++existing jobs. 
diff --git a/mm/Kconfig b/mm/Kconfig -index 6c86849c4db9..96cd3ae25c6f 100644 +index ab6ef5115eb8..ceec438c0741 100644 --- a/mm/Kconfig +++ b/mm/Kconfig -@@ -1131,7 +1131,8 @@ config LRU_GEN +@@ -1125,7 +1125,8 @@ config LRU_GEN # make sure folio->flags has enough spare bits depends on 64BIT || !SPARSEMEM || SPARSEMEM_VMEMMAP help @@ -7345,10 +7376,10 @@ index 6c86849c4db9..96cd3ae25c6f 100644 config LRU_GEN_ENABLED bool "Enable by default" diff --git a/mm/vmscan.c b/mm/vmscan.c -index 509989fb39ef..f693720047db 100644 +index 7657d54c9c42..1456f133f256 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c -@@ -5288,6 +5288,7 @@ static ssize_t show_min_ttl(struct kobject *kobj, struct kobj_attribute *attr, c +@@ -5310,6 +5310,7 @@ static ssize_t show_min_ttl(struct kobject *kobj, struct kobj_attribute *attr, c return sprintf(buf, "%u\n", jiffies_to_msecs(READ_ONCE(lru_gen_min_ttl))); } @@ -7356,7 +7387,7 @@ index 509989fb39ef..f693720047db 100644 static ssize_t store_min_ttl(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t len) { -@@ -5321,6 +5322,7 @@ static ssize_t show_enabled(struct kobject *kobj, struct kobj_attribute *attr, c +@@ -5343,6 +5344,7 @@ static ssize_t show_enabled(struct kobject *kobj, struct kobj_attribute *attr, c return snprintf(buf, PAGE_SIZE, "0x%04x\n", caps); } @@ -7364,7 +7395,7 @@ index 509989fb39ef..f693720047db 100644 static ssize_t store_enabled(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t len) { -@@ -5468,6 +5470,7 @@ static void lru_gen_seq_show_full(struct seq_file *m, struct lruvec *lruvec, +@@ -5490,6 +5492,7 @@ static void lru_gen_seq_show_full(struct seq_file *m, struct lruvec *lruvec, seq_putc(m, '\n'); } @@ -7372,7 +7403,7 @@ index 509989fb39ef..f693720047db 100644 static int lru_gen_seq_show(struct seq_file *m, void *v) { unsigned long seq; -@@ -5626,6 +5629,7 @@ static int run_cmd(char cmd, int memcg_id, int nid, unsigned long seq, +@@ -5648,6 +5651,7 @@ static int run_cmd(char cmd, int memcg_id, int nid, unsigned long seq, return err; } @@ -7381,19 +7412,19 @@ index 509989fb39ef..f693720047db 100644 size_t len, loff_t *pos) { -- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog -^ permalink raw reply [flat|nested] 18+ messages in thread -* [PATCH v14 14/14] mm: multi-gen LRU: design doc - 2022-08-15 7:13 [PATCH v14 00/14] Multi-Gen LRU Framework Yu Zhao + +* [PATCH mm-unstable v15 14/14] mm: multi-gen LRU: design doc + 2022-09-18 7:59 [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Yu Zhao ` (12 preceding siblings ...) - 2022-08-15 7:13 ` [PATCH v14 13/14] mm: multi-gen LRU: admin guide Yu Zhao -@ 2022-08-15 7:13 ` Yu Zhao - 2022-08-15 9:07 ` Bagas Sanjaya - 13 siblings, 1 reply; 18+ messages in thread -From: Yu Zhao @ 2022-08-15 7:13 UTC (permalink / raw) + 2022-09-18 8:00 ` [PATCH mm-unstable v15 13/14] mm: multi-gen LRU: admin guide Yu Zhao +@ 2022-09-18 8:00 ` Yu Zhao + 2022-09-19 2:08 ` [PATCH mm-unstable v15 00/14] Multi-Gen LRU Framework Bagas Sanjaya + 14 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-18 8:00 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, @@ -7605,4 +7636,120 @@ index 000000000000..d7062c6a8946 +Within the eviction, the PID controller uses refaults as the feedback +to select types to evict and tiers to protect. 
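The closing point about refault feedback can be made concrete with a deliberately simplified, standalone model. It assumes only two types (anon and file) compete and that refaulted/evicted counts are already collected; pick_eviction_type() is an invented name, and this is not the kernel's controller, which also weighs tiers and gain:

/*
 * Toy model of refault feedback: evict the type whose evicted pages
 * refault less often. Illustration only, not the kernel implementation.
 */
#include <stdio.h>

enum { TYPE_ANON, TYPE_FILE };

struct refaults {
	unsigned long refaulted;	/* evicted pages that came back */
	unsigned long evicted;		/* pages evicted in the same window */
};

/* compare refaulted/evicted ratios by cross-multiplying, avoiding division */
static int pick_eviction_type(struct refaults anon, struct refaults file)
{
	unsigned long anon_cost = anon.refaulted * (file.evicted + 1);
	unsigned long file_cost = file.refaulted * (anon.evicted + 1);

	return anon_cost <= file_cost ? TYPE_ANON : TYPE_FILE;
}

int main(void)
{
	struct refaults anon = { .refaulted = 50, .evicted = 1000 };
	struct refaults file = { .refaulted = 400, .evicted = 1000 };

	printf("evict %s first\n",
	       pick_eviction_type(anon, file) == TYPE_ANON ? "anon" : "file");
	return 0;
}

Cross-multiplying the two ratios keeps the comparison in integer arithmetic, the usual way to sidestep floating point in kernel-style code paths.
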
-- -2.37.1.595.g718a3a8f04-goog +2.37.3.968.ga6b4b080e4-goog + + + +* Re: [PATCH mm-unstable v15 09/14] mm: multi-gen LRU: optimize multiple memcgs + 2022-09-18 8:00 ` [PATCH mm-unstable v15 09/14] mm: multi-gen LRU: optimize multiple memcgs Yu Zhao +@ 2022-09-28 18:46 ` Yu Zhao + 0 siblings, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-28 18:46 UTC (permalink / raw) + To: Andrew Morton + Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, + Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, + Linus Torvalds, Matthew Wilcox, Mel Gorman, Michael Larabel, + Michal Hocko, Mike Rapoport, Peter Zijlstra, Tejun Heo, + Vlastimil Babka, Will Deacon, linux-arm-kernel, linux-doc, + linux-kernel, linux-mm, x86, page-reclaim, Brian Geffon, + Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett, + Suleiman Souhlal, Daniel Byrne, Donald Carr, + Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, + Sofia Trinh, Vaibhav Jain + +Hi Andrew, + +Can you please take this fixlet? Thanks. + +Fix imprecise comments. + +Signed-off-by: Yu Zhao +--- + mm/vmscan.c | 9 ++++----- + 1 file changed, 4 insertions(+), 5 deletions(-) + +diff --git a/mm/vmscan.c b/mm/vmscan.c +index a8fd6300fa7e..5b565470286b 100644 +--- a/mm/vmscan.c ++++ b/mm/vmscan.c +@@ -5078,7 +5078,7 @@ static bool should_abort_scan(struct lruvec *lruvec, unsigned long seq, + DEFINE_MAX_SEQ(lruvec); + + if (!current_is_kswapd()) { +- /* age each memcg once to ensure fairness */ ++ /* age each memcg at most once to ensure fairness */ + if (max_seq - seq > 1) + return true; + +@@ -5103,10 +5103,9 @@ static bool should_abort_scan(struct lruvec *lruvec, unsigned long seq, + + /* + * A minimum amount of work was done under global memory pressure. For +- * kswapd, it may be overshooting. For direct reclaim, the target isn't +- * met, and yet the allocation may still succeed, since kswapd may have +- * caught up. In either case, it's better to stop now, and restart if +- * necessary. ++ * kswapd, it may be overshooting. For direct reclaim, the allocation ++ * may succeed if all suitable zones are somewhat safe. In either case, ++ * it's better to stop now, and restart later if necessary. + */ + for (i = 0; i <= sc->reclaim_idx; i++) { + unsigned long wmark; +-- +2.37.3.998.g577e59143f-goog + + + + +* Re: [PATCH mm-unstable v15 08/14] mm: multi-gen LRU: support page table walks + 2022-09-18 8:00 ` [PATCH mm-unstable v15 08/14] mm: multi-gen LRU: support page table walks Yu Zhao + 2022-09-18 8:17 ` Yu Zhao +@ 2022-09-28 19:36 ` Yu Zhao + 1 sibling, 0 replies; 23+ messages in thread +From: Yu Zhao @ 2022-09-28 19:36 UTC (permalink / raw) + To: Andrew Morton + Cc: Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, + Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, + Linus Torvalds, Matthew Wilcox, Mel Gorman, Michael Larabel, + Michal Hocko, Mike Rapoport, Peter Zijlstra, Tejun Heo, + Vlastimil Babka, Will Deacon, linux-arm-kernel, linux-doc, + linux-kernel, linux-mm, x86, page-reclaim, Brian Geffon, + Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett, + Suleiman Souhlal, Daniel Byrne, Donald Carr, + Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, + Sofia Trinh, Vaibhav Jain + +Hi Andrew, + +Can you please take another fixlet? Thanks. + +Don't sync disk for each aging cycle. + +wakeup_flusher_threads() was added under the assumption that if a +system runs out of clean cold pages, it might want to write back dirty +pages more aggressively so that they can become clean and be dropped. 
+ +However, doing so can breach the rate limit a system wants to impose +on writeback, resulting in early SSD wearout. + +Reported-by: Axel Rasmussen +Signed-off-by: Yu Zhao +--- + mm/vmscan.c | 2 -- + 1 file changed, 2 deletions(-) + +diff --git a/mm/vmscan.c b/mm/vmscan.c +index 5b565470286b..0317d4cf4884 100644 +--- a/mm/vmscan.c ++++ b/mm/vmscan.c +@@ -4413,8 +4413,6 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, + if (wq_has_sleeper(&lruvec->mm_state.wait)) + wake_up_all(&lruvec->mm_state.wait); + +- wakeup_flusher_threads(WB_REASON_VMSCAN); +- + return true; + } + +-- +2.37.3.998.g577e59143f-goog
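
For reference, the tail of try_to_inc_max_seq() after this fixlet reads roughly as below; it is reconstructed from the hunk context shown in this thread rather than quoted from a tree, so surrounding lines not visible in the diffs are omitted:

	VM_WARN_ON_ONCE(max_seq != READ_ONCE(lrugen->max_seq));

	inc_max_seq(lruvec, can_swap, force_scan);
	/* either this sees any waiters or they will see updated max_seq */
	if (wq_has_sleeper(&lruvec->mm_state.wait))
		wake_up_all(&lruvec->mm_state.wait);

	/* wakeup_flusher_threads(WB_REASON_VMSCAN) removed: no disk sync per aging cycle */
	return true;
}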