Version Bump
This commit is contained in:
parent 4a10695673
commit 185ec9aad6
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
From: John Chen <johnchen902@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Rohit Pidaparthi <rohitpid@gmail.com>,
	RicardoEPRodrigues <ricardo.e.p.rodrigues@gmail.com>,
	Jiri Kosina <jikos@kernel.org>,
	Benjamin Tissoires <benjamin.tissoires@redhat.com>,
	John Chen <johnchen902@gmail.com>
Subject: [PATCH 1/4] HID: magicmouse: add Apple Magic Mouse 2 support
Date: Sat, 27 Mar 2021 21:05:05 +0800
Message-Id: <20210327130508.24849-2-johnchen902@gmail.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210327130508.24849-1-johnchen902@gmail.com>
References: <20210327130508.24849-1-johnchen902@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210327130508.24849-2-johnchen902@gmail.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Bluetooth device
	Vendor 004c (Apple)
	Device 0269 (Magic Mouse 2)

Add support for Apple Magic Mouse 2, putting the device in multi-touch
mode.

Co-authored-by: Rohit Pidaparthi <rohitpid@gmail.com>
Co-authored-by: RicardoEPRodrigues <ricardo.e.p.rodrigues@gmail.com>
Signed-off-by: John Chen <johnchen902@gmail.com>
---
 drivers/hid/hid-ids.h        |  1 +
 drivers/hid/hid-magicmouse.c | 53 ++++++++++++++++++++++++++++++++----
 2 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index e42aaae3138f..fa0edf03570a 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -93,6 +93,7 @@
 #define BT_VENDOR_ID_APPLE		0x004c
 #define USB_DEVICE_ID_APPLE_MIGHTYMOUSE	0x0304
 #define USB_DEVICE_ID_APPLE_MAGICMOUSE	0x030d
+#define USB_DEVICE_ID_APPLE_MAGICMOUSE2	0x0269
 #define USB_DEVICE_ID_APPLE_MAGICTRACKPAD	0x030e
 #define USB_DEVICE_ID_APPLE_MAGICTRACKPAD2	0x0265
 #define USB_DEVICE_ID_APPLE_FOUNTAIN_ANSI	0x020e
diff --git a/drivers/hid/hid-magicmouse.c b/drivers/hid/hid-magicmouse.c
index abd86903875f..7aad6ca56780 100644
--- a/drivers/hid/hid-magicmouse.c
+++ b/drivers/hid/hid-magicmouse.c
@@ -54,6 +54,7 @@ MODULE_PARM_DESC(report_undeciphered, "Report undeciphered multi-touch state fie
 #define TRACKPAD2_USB_REPORT_ID 0x02
 #define TRACKPAD2_BT_REPORT_ID 0x31
 #define MOUSE_REPORT_ID    0x29
+#define MOUSE2_REPORT_ID   0x12
 #define DOUBLE_REPORT_ID   0xf7
 /* These definitions are not precise, but they're close enough.  (Bits
  * 0x03 seem to indicate the aspect ratio of the touch, bits 0x70 seem
@@ -195,7 +196,8 @@ static void magicmouse_emit_touch(struct magicmouse_sc *msc, int raw_id, u8 *tda
 	int id, x, y, size, orientation, touch_major, touch_minor, state, down;
 	int pressure = 0;
 
-	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE) {
+	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE ||
+	    input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
 		id = (tdata[6] << 2 | tdata[5] >> 6) & 0xf;
 		x = (tdata[1] << 28 | tdata[0] << 20) >> 20;
 		y = -((tdata[2] << 24 | tdata[1] << 16) >> 20);
@@ -296,7 +298,8 @@ static void magicmouse_emit_touch(struct magicmouse_sc *msc, int raw_id, u8 *tda
 		input_report_abs(input, ABS_MT_PRESSURE, pressure);
 
 		if (report_undeciphered) {
-			if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE)
+			if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE ||
+			    input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2)
 				input_event(input, EV_MSC, MSC_RAW, tdata[7]);
 			else if (input->id.product !=
 					USB_DEVICE_ID_APPLE_MAGICTRACKPAD2)
@@ -380,6 +383,34 @@ static int magicmouse_raw_event(struct hid_device *hdev,
 		 * ts = data[3] >> 6 | data[4] << 2 | data[5] << 10;
 		 */
 		break;
+	case MOUSE2_REPORT_ID:
+		/* Size is either 8 or (14 + 8 * N) */
+		if (size != 8 && (size < 14 || (size - 14) % 8 != 0))
+			return 0;
+		npoints = (size - 14) / 8;
+		if (npoints > 15) {
+			hid_warn(hdev, "invalid size value (%d) for MOUSE2_REPORT_ID\n",
+					size);
+			return 0;
+		}
+		msc->ntouches = 0;
+		for (ii = 0; ii < npoints; ii++)
+			magicmouse_emit_touch(msc, ii, data + ii * 8 + 14);
+
+		/* When emulating three-button mode, it is important
+		 * to have the current touch information before
+		 * generating a click event.
+		 */
+		x = (int)((data[3] << 24) | (data[2] << 16)) >> 16;
+		y = (int)((data[5] << 24) | (data[4] << 16)) >> 16;
+		clicks = data[1];
+
+		/* The following bits provide a device specific timestamp. They
+		 * are unused here.
+		 *
+		 * ts = data[11] >> 6 | data[12] << 2 | data[13] << 10;
+		 */
+		break;
 	case DOUBLE_REPORT_ID:
 		/* Sometimes the trackpad sends two touch reports in one
 		 * packet.
@@ -392,7 +423,8 @@ static int magicmouse_raw_event(struct hid_device *hdev,
 		return 0;
 	}
 
-	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE) {
+	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE ||
+	    input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
 		magicmouse_emit_buttons(msc, clicks & 3);
 		input_report_rel(input, REL_X, x);
 		input_report_rel(input, REL_Y, y);
@@ -415,7 +447,8 @@ static int magicmouse_setup_input(struct input_dev *input, struct hid_device *hd
 
 	__set_bit(EV_KEY, input->evbit);
 
-	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE) {
+	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE ||
+	    input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
 		__set_bit(BTN_LEFT, input->keybit);
 		__set_bit(BTN_RIGHT, input->keybit);
 		if (emulate_3button)
@@ -480,7 +513,8 @@ static int magicmouse_setup_input(struct input_dev *input, struct hid_device *hd
 	 * the origin at the same position, and just uses the additive
 	 * inverse of the reported Y.
 	 */
-	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE) {
+	if (input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE ||
+	    input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
 		input_set_abs_params(input, ABS_MT_ORIENTATION, -31, 32, 1, 0);
 		input_set_abs_params(input, ABS_MT_POSITION_X,
 				     MOUSE_MIN_X, MOUSE_MAX_X, 4, 0);
@@ -586,6 +620,7 @@ static int magicmouse_probe(struct hid_device *hdev,
 {
 	const u8 *feature;
 	const u8 feature_mt[] = { 0xD7, 0x01 };
+	const u8 feature_mt_mouse2[] = { 0xF1, 0x02, 0x01 };
 	const u8 feature_mt_trackpad2_usb[] = { 0x02, 0x01 };
 	const u8 feature_mt_trackpad2_bt[] = { 0xF1, 0x02, 0x01 };
 	u8 *buf;
@@ -631,6 +666,9 @@ static int magicmouse_probe(struct hid_device *hdev,
 	if (id->product == USB_DEVICE_ID_APPLE_MAGICMOUSE)
 		report = hid_register_report(hdev, HID_INPUT_REPORT,
 			MOUSE_REPORT_ID, 0);
+	else if (id->product == USB_DEVICE_ID_APPLE_MAGICMOUSE2)
+		report = hid_register_report(hdev, HID_INPUT_REPORT,
+			MOUSE2_REPORT_ID, 0);
 	else if (id->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2) {
 		if (id->vendor == BT_VENDOR_ID_APPLE)
 			report = hid_register_report(hdev, HID_INPUT_REPORT,
@@ -660,6 +698,9 @@ static int magicmouse_probe(struct hid_device *hdev,
 			feature_size = sizeof(feature_mt_trackpad2_usb);
 			feature = feature_mt_trackpad2_usb;
 		}
+	} else if (id->product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
+		feature_size = sizeof(feature_mt_mouse2);
+		feature = feature_mt_mouse2;
 	} else {
 		feature_size = sizeof(feature_mt);
 		feature = feature_mt;
@@ -696,6 +737,8 @@ static int magicmouse_probe(struct hid_device *hdev,
 static const struct hid_device_id magic_mice[] = {
 	{ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_APPLE,
 		USB_DEVICE_ID_APPLE_MAGICMOUSE), .driver_data = 0 },
+	{ HID_BLUETOOTH_DEVICE(BT_VENDOR_ID_APPLE,
+		USB_DEVICE_ID_APPLE_MAGICMOUSE2), .driver_data = 0 },
 	{ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_APPLE,
 		USB_DEVICE_ID_APPLE_MAGICTRACKPAD), .driver_data = 0 },
 	{ HID_BLUETOOTH_DEVICE(BT_VENDOR_ID_APPLE,
-- 
2.31.0

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
From: John Chen <johnchen902@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Rohit Pidaparthi <rohitpid@gmail.com>,
	RicardoEPRodrigues <ricardo.e.p.rodrigues@gmail.com>,
	Jiri Kosina <jikos@kernel.org>,
	Benjamin Tissoires <benjamin.tissoires@redhat.com>,
	John Chen <johnchen902@gmail.com>
Subject: [PATCH 2/4] HID: magicmouse: fix 3 button emulation of Mouse 2
Date: Sat, 27 Mar 2021 21:05:06 +0800
Message-Id: <20210327130508.24849-3-johnchen902@gmail.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210327130508.24849-1-johnchen902@gmail.com>
References: <20210327130508.24849-1-johnchen902@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210327130508.24849-3-johnchen902@gmail.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

It is observed that, with 3 button emulation, when the middle button is
clicked, either the left or the right button is clicked as well. This is
caused by hidinput "correctly" acting on the event, oblivious to the
3 button emulation.

As raw_event has taken care of everything, no further processing is
needed. However, the only way to stop at raw_event is to return an error
(negative) value. Therefore, the processing is stopped at event instead.

Signed-off-by: John Chen <johnchen902@gmail.com>
---
 drivers/hid/hid-magicmouse.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/hid/hid-magicmouse.c b/drivers/hid/hid-magicmouse.c
index 7aad6ca56780..c646b4cd3783 100644
--- a/drivers/hid/hid-magicmouse.c
+++ b/drivers/hid/hid-magicmouse.c
@@ -440,6 +440,21 @@ static int magicmouse_raw_event(struct hid_device *hdev,
 	return 1;
 }
 
+static int magicmouse_event(struct hid_device *hdev, struct hid_field *field,
+		struct hid_usage *usage, __s32 value)
+{
+	struct magicmouse_sc *msc = hid_get_drvdata(hdev);
+	if (msc->input->id.product == USB_DEVICE_ID_APPLE_MAGICMOUSE2 &&
+	    field->report->id == MOUSE2_REPORT_ID) {
+		// magic_mouse_raw_event has done all the work. Skip hidinput.
+		//
+		// Specifically, hidinput may modify BTN_LEFT and BTN_RIGHT,
+		// breaking emulate_3button.
+		return 1;
+	}
+	return 0;
+}
+
 static int magicmouse_setup_input(struct input_dev *input, struct hid_device *hdev)
 {
 	int error;
@@ -754,6 +769,7 @@ static struct hid_driver magicmouse_driver = {
 	.id_table = magic_mice,
 	.probe = magicmouse_probe,
 	.raw_event = magicmouse_raw_event,
+	.event = magicmouse_event,
 	.input_mapping = magicmouse_input_mapping,
 	.input_configured = magicmouse_input_configured,
 };
-- 
2.31.0

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
From: John Chen <johnchen902@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Rohit Pidaparthi <rohitpid@gmail.com>,
	RicardoEPRodrigues <ricardo.e.p.rodrigues@gmail.com>,
	Jiri Kosina <jikos@kernel.org>,
	Benjamin Tissoires <benjamin.tissoires@redhat.com>,
	John Chen <johnchen902@gmail.com>
Subject: [PATCH 3/4] HID: magicmouse: fix reconnection of Magic Mouse 2
Date: Sat, 27 Mar 2021 21:05:07 +0800
Message-Id: <20210327130508.24849-4-johnchen902@gmail.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210327130508.24849-1-johnchen902@gmail.com>
References: <20210327130508.24849-1-johnchen902@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210327130508.24849-4-johnchen902@gmail.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

It is observed that the Magic Mouse 2 would not enter multi-touch mode
unless the mouse is connected before loading the module. It seems to be
a quirk specific to the Magic Mouse 2.

Retrying after 500ms fixes the problem for me. The delay can't be
reduced much further --- 300ms didn't work for me. Retrying immediately
after receiving an event didn't work either.

Signed-off-by: John Chen <johnchen902@gmail.com>
---
 drivers/hid/hid-magicmouse.c | 93 ++++++++++++++++++++++++------------
 1 file changed, 63 insertions(+), 30 deletions(-)

diff --git a/drivers/hid/hid-magicmouse.c b/drivers/hid/hid-magicmouse.c
index c646b4cd3783..69aefef9fe07 100644
--- a/drivers/hid/hid-magicmouse.c
+++ b/drivers/hid/hid-magicmouse.c
@@ -16,6 +16,7 @@
 #include <linux/input/mt.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 
 #include "hid-ids.h"
 
@@ -128,6 +129,9 @@ struct magicmouse_sc {
 		u8 size;
 	} touches[16];
 	int tracking_ids[16];
+
+	struct hid_device *hdev;
+	struct delayed_work work;
 };
 
 static int magicmouse_firm_touch(struct magicmouse_sc *msc)
@@ -629,9 +633,7 @@ static int magicmouse_input_configured(struct hid_device *hdev,
 	return 0;
 }
 
-
-static int magicmouse_probe(struct hid_device *hdev,
-	const struct hid_device_id *id)
+static int magicmouse_enable_multitouch(struct hid_device *hdev)
 {
 	const u8 *feature;
 	const u8 feature_mt[] = { 0xD7, 0x01 };
@@ -639,10 +641,52 @@ static int magicmouse_probe(struct hid_device *hdev,
 	const u8 feature_mt_trackpad2_usb[] = { 0x02, 0x01 };
 	const u8 feature_mt_trackpad2_bt[] = { 0xF1, 0x02, 0x01 };
 	u8 *buf;
+	int ret;
+	int feature_size;
+
+	if (hdev->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2) {
+		if (hdev->vendor == BT_VENDOR_ID_APPLE) {
+			feature_size = sizeof(feature_mt_trackpad2_bt);
+			feature = feature_mt_trackpad2_bt;
+		} else { /* USB_VENDOR_ID_APPLE */
+			feature_size = sizeof(feature_mt_trackpad2_usb);
+			feature = feature_mt_trackpad2_usb;
+		}
+	} else if (hdev->product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
+		feature_size = sizeof(feature_mt_mouse2);
+		feature = feature_mt_mouse2;
+	} else {
+		feature_size = sizeof(feature_mt);
+		feature = feature_mt;
+	}
+
+	buf = kmemdup(feature, feature_size, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	ret = hid_hw_raw_request(hdev, buf[0], buf, feature_size,
+			HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
+	kfree(buf);
+	return ret;
+}
+
+static void magicmouse_enable_mt_work(struct work_struct *work)
+{
+	struct magicmouse_sc *msc =
+		container_of(work, struct magicmouse_sc, work.work);
+	int ret;
+
+	ret = magicmouse_enable_multitouch(msc->hdev);
+	if (ret < 0)
+		hid_err(msc->hdev, "unable to request touch data (%d)\n", ret);
+}
+
+static int magicmouse_probe(struct hid_device *hdev,
+	const struct hid_device_id *id)
+{
 	struct magicmouse_sc *msc;
 	struct hid_report *report;
 	int ret;
-	int feature_size;
 
 	if (id->vendor == USB_VENDOR_ID_APPLE &&
 	    id->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2 &&
@@ -656,6 +700,8 @@ static int magicmouse_probe(struct hid_device *hdev,
 	}
 
 	msc->scroll_accel = SCROLL_ACCEL_DEFAULT;
+	msc->hdev = hdev;
+	INIT_DEFERRABLE_WORK(&msc->work, magicmouse_enable_mt_work);
 
 	msc->quirks = id->driver_data;
 	hid_set_drvdata(hdev, msc);
@@ -705,28 +751,6 @@ static int magicmouse_probe(struct hid_device *hdev,
 	}
 	report->size = 6;
 
-	if (id->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2) {
-		if (id->vendor == BT_VENDOR_ID_APPLE) {
-			feature_size = sizeof(feature_mt_trackpad2_bt);
-			feature = feature_mt_trackpad2_bt;
-		} else { /* USB_VENDOR_ID_APPLE */
-			feature_size = sizeof(feature_mt_trackpad2_usb);
-			feature = feature_mt_trackpad2_usb;
-		}
-	} else if (id->product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
-		feature_size = sizeof(feature_mt_mouse2);
-		feature = feature_mt_mouse2;
-	} else {
-		feature_size = sizeof(feature_mt);
-		feature = feature_mt;
-	}
-
-	buf = kmemdup(feature, feature_size, GFP_KERNEL);
-	if (!buf) {
-		ret = -ENOMEM;
-		goto err_stop_hw;
-	}
-
 	/*
 	 * Some devices repond with 'invalid report id' when feature
 	 * report switching it into multitouch mode is sent to it.
@@ -735,13 +759,14 @@ static int magicmouse_probe(struct hid_device *hdev,
 	 * but there seems to be no other way of switching the mode.
 	 * Thus the super-ugly hacky success check below.
 	 */
-	ret = hid_hw_raw_request(hdev, buf[0], buf, feature_size,
-			HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
-	kfree(buf);
-	if (ret != -EIO && ret != feature_size) {
+	ret = magicmouse_enable_multitouch(hdev);
+	if (ret != -EIO && ret < 0) {
 		hid_err(hdev, "unable to request touch data (%d)\n", ret);
 		goto err_stop_hw;
 	}
+	if (ret == -EIO && id->product == USB_DEVICE_ID_APPLE_MAGICMOUSE2) {
+		schedule_delayed_work(&msc->work, msecs_to_jiffies(500));
+	}
 
 	return 0;
 err_stop_hw:
@@ -749,6 +774,13 @@ static int magicmouse_probe(struct hid_device *hdev,
 	return ret;
 }
 
+static void magicmouse_remove(struct hid_device *hdev)
+{
+	struct magicmouse_sc *msc = hid_get_drvdata(hdev);
+	cancel_delayed_work_sync(&msc->work);
+	hid_hw_stop(hdev);
+}
+
 static const struct hid_device_id magic_mice[] = {
 	{ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_APPLE,
 		USB_DEVICE_ID_APPLE_MAGICMOUSE), .driver_data = 0 },
|
@@ -768,6 +800,7 @@ static struct hid_driver magicmouse_driver = {
|
||||||
|
.name = "magicmouse",
|
||||||
|
.id_table = magic_mice,
|
||||||
|
.probe = magicmouse_probe,
|
||||||
|
+ .remove = magicmouse_remove,
|
||||||
|
.raw_event = magicmouse_raw_event,
|
||||||
|
.event = magicmouse_event,
|
||||||
|
.input_mapping = magicmouse_input_mapping,
|
||||||
|
--
|
||||||
|
2.31.0
|
||||||
|
|
||||||
|
|
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:18 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-2-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 01/16] include/linux/memcontrol.h: do not warn in
 page_memcg_rcu() if !CONFIG_MEMCG
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
 Andrew Morton <akpm@linux-foundation.org>,
 Benjamin Manes <ben.manes@gmail.com>,
 Dave Chinner <david@fromorbit.com>,
 Dave Hansen <dave.hansen@linux.intel.com>,
 Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
 Johannes Weiner <hannes@cmpxchg.org>,
 Jonathan Corbet <corbet@lwn.net>,
 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
 Matthew Wilcox <willy@infradead.org>,
 Mel Gorman <mgorman@suse.de>,
 Miaohe Lin <linmiaohe@huawei.com>,
 Michael Larabel <michael@michaellarabel.com>,
 Michal Hocko <mhocko@suse.com>,
 Michel Lespinasse <michel@lespinasse.org>,
 Rik van Riel <riel@surriel.com>,
 Roman Gushchin <guro@fb.com>,
 Rong Chen <rong.a.chen@intel.com>,
 SeongJae Park <sjpark@amazon.de>,
 Tim Chen <tim.c.chen@linux.intel.com>,
 Vlastimil Babka <vbabka@suse.cz>,
 Yang Shi <shy828301@gmail.com>,
 Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
 linux-kernel@vger.kernel.org, lkp@lists.01.org,
 page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-2-yuzhao@google.com/>

page_memcg_rcu() warns on !rcu_read_lock_held() regardless of
CONFIG_MEMCG. The following code is legit, but it triggers the warning
when !CONFIG_MEMCG, since lock_page_memcg() and unlock_page_memcg()
are empty for this config.

	memcg = lock_page_memcg(page1)
	(rcu_read_lock() if CONFIG_MEMCG=y)

	do something to page1

	if (page_memcg_rcu(page2) == memcg)
		do something to page2 too as it cannot be migrated away
		from the memcg either.

	unlock_page_memcg(page1)
	(rcu_read_unlock() if CONFIG_MEMCG=y)

Locking/unlocking rcu consistently for both configs is rigorous but it
also forces unnecessary locking upon users who have no interest in
CONFIG_MEMCG.

This patch removes the assertion for !CONFIG_MEMCG, because
page_memcg_rcu() has a few callers and there are no concerns regarding
their correctness at the moment.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/memcontrol.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0c04d39a7967..f13dc02cf277 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1077,7 +1077,6 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
 
 static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
 {
-	WARN_ON_ONCE(!rcu_read_lock_held());
 	return NULL;
 }
 
-- 
2.31.1.295.g9ea45b61b8-goog

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:19 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-3-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 02/16] include/linux/nodemask.h: define next_memory_node()
 if !CONFIG_NUMA
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
 Andrew Morton <akpm@linux-foundation.org>,
 Benjamin Manes <ben.manes@gmail.com>,
 Dave Chinner <david@fromorbit.com>,
 Dave Hansen <dave.hansen@linux.intel.com>,
 Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
 Johannes Weiner <hannes@cmpxchg.org>,
 Jonathan Corbet <corbet@lwn.net>,
 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
 Matthew Wilcox <willy@infradead.org>,
 Mel Gorman <mgorman@suse.de>,
 Miaohe Lin <linmiaohe@huawei.com>,
 Michael Larabel <michael@michaellarabel.com>,
 Michal Hocko <mhocko@suse.com>,
 Michel Lespinasse <michel@lespinasse.org>,
 Rik van Riel <riel@surriel.com>,
 Roman Gushchin <guro@fb.com>,
 Rong Chen <rong.a.chen@intel.com>,
 SeongJae Park <sjpark@amazon.de>,
 Tim Chen <tim.c.chen@linux.intel.com>,
 Vlastimil Babka <vbabka@suse.cz>,
 Yang Shi <shy828301@gmail.com>,
 Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
 linux-kernel@vger.kernel.org, lkp@lists.01.org,
 page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-3-yuzhao@google.com/>

Currently next_memory_node only exists when CONFIG_NUMA=y. This patch
adds the macro for !CONFIG_NUMA.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/nodemask.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index ac398e143c9a..89fe4e3592f9 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -486,6 +486,7 @@ static inline int num_node_state(enum node_states state)
 #define first_online_node	0
 #define first_memory_node	0
 #define next_online_node(nid)	(MAX_NUMNODES)
+#define next_memory_node(nid)	(MAX_NUMNODES)
 #define nr_node_ids		1U
 #define nr_online_nodes		1U
 
-- 
2.31.1.295.g9ea45b61b8-goog

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:20 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-4-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 03/16] include/linux/huge_mm.h: define is_huge_zero_pmd()
 if !CONFIG_TRANSPARENT_HUGEPAGE
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
 Andrew Morton <akpm@linux-foundation.org>,
 Benjamin Manes <ben.manes@gmail.com>,
 Dave Chinner <david@fromorbit.com>,
 Dave Hansen <dave.hansen@linux.intel.com>,
 Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
 Johannes Weiner <hannes@cmpxchg.org>,
 Jonathan Corbet <corbet@lwn.net>,
 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
 Matthew Wilcox <willy@infradead.org>,
 Mel Gorman <mgorman@suse.de>,
 Miaohe Lin <linmiaohe@huawei.com>,
 Michael Larabel <michael@michaellarabel.com>,
 Michal Hocko <mhocko@suse.com>,
 Michel Lespinasse <michel@lespinasse.org>,
 Rik van Riel <riel@surriel.com>,
 Roman Gushchin <guro@fb.com>,
 Rong Chen <rong.a.chen@intel.com>,
 SeongJae Park <sjpark@amazon.de>,
 Tim Chen <tim.c.chen@linux.intel.com>,
 Vlastimil Babka <vbabka@suse.cz>,
 Yang Shi <shy828301@gmail.com>,
 Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
 linux-kernel@vger.kernel.org, lkp@lists.01.org,
 page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-4-yuzhao@google.com/>

Currently is_huge_zero_pmd() only exists when
CONFIG_TRANSPARENT_HUGEPAGE=y. This patch adds the function for
!CONFIG_TRANSPARENT_HUGEPAGE.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/huge_mm.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index ba973efcd369..0ba7b3f9029c 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -443,6 +443,11 @@ static inline bool is_huge_zero_page(struct page *page)
 	return false;
 }
 
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+	return false;
+}
+
 static inline bool is_huge_zero_pud(pud_t pud)
 {
 	return false;
-- 
2.31.1.295.g9ea45b61b8-goog

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:21 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-5-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 04/16] include/linux/cgroup.h: export cgroup_mutex
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
 Andrew Morton <akpm@linux-foundation.org>,
 Benjamin Manes <ben.manes@gmail.com>,
 Dave Chinner <david@fromorbit.com>,
 Dave Hansen <dave.hansen@linux.intel.com>,
 Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
 Johannes Weiner <hannes@cmpxchg.org>,
 Jonathan Corbet <corbet@lwn.net>,
 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
 Matthew Wilcox <willy@infradead.org>,
 Mel Gorman <mgorman@suse.de>,
 Miaohe Lin <linmiaohe@huawei.com>,
 Michael Larabel <michael@michaellarabel.com>,
 Michal Hocko <mhocko@suse.com>,
 Michel Lespinasse <michel@lespinasse.org>,
 Rik van Riel <riel@surriel.com>,
 Roman Gushchin <guro@fb.com>,
 Rong Chen <rong.a.chen@intel.com>,
 SeongJae Park <sjpark@amazon.de>,
 Tim Chen <tim.c.chen@linux.intel.com>,
 Vlastimil Babka <vbabka@suse.cz>,
 Yang Shi <shy828301@gmail.com>,
 Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
 linux-kernel@vger.kernel.org, lkp@lists.01.org,
 page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-5-yuzhao@google.com/>

cgroup_mutex is needed to synchronize with memcg creations.
|
||||||
|
Signed-off-by: Yu Zhao <yuzhao@google.com>
|
||||||
|
---
|
||||||
|
include/linux/cgroup.h | 15 ++++++++++++++-
|
||||||
|
1 file changed, 14 insertions(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
|
||||||
|
index 4f2f79de083e..bd5744360cfa 100644
|
||||||
|
--- a/include/linux/cgroup.h
|
||||||
|
+++ b/include/linux/cgroup.h
|
||||||
|
@@ -432,6 +432,18 @@ static inline void cgroup_put(struct cgroup *cgrp)
|
||||||
|
css_put(&cgrp->self);
|
||||||
|
}
|
||||||
|
|
||||||
|
+extern struct mutex cgroup_mutex;
|
||||||
|
+
|
||||||
|
+static inline void cgroup_lock(void)
|
||||||
|
+{
|
||||||
|
+ mutex_lock(&cgroup_mutex);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline void cgroup_unlock(void)
|
||||||
|
+{
|
||||||
|
+ mutex_unlock(&cgroup_mutex);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/**
|
||||||
|
* task_css_set_check - obtain a task's css_set with extra access conditions
|
||||||
|
* @task: the task to obtain css_set for
|
||||||
|
@@ -446,7 +458,6 @@ static inline void cgroup_put(struct cgroup *cgrp)
|
||||||
|
* as locks used during the cgroup_subsys::attach() methods.
|
||||||
|
*/
|
||||||
|
#ifdef CONFIG_PROVE_RCU
|
||||||
|
-extern struct mutex cgroup_mutex;
|
||||||
|
extern spinlock_t css_set_lock;
|
||||||
|
#define task_css_set_check(task, __c) \
|
||||||
|
rcu_dereference_check((task)->cgroups, \
|
||||||
|
@@ -704,6 +715,8 @@ struct cgroup;
|
||||||
|
static inline u64 cgroup_id(const struct cgroup *cgrp) { return 1; }
|
||||||
|
static inline void css_get(struct cgroup_subsys_state *css) {}
|
||||||
|
static inline void css_put(struct cgroup_subsys_state *css) {}
|
||||||
|
+static inline void cgroup_lock(void) {}
|
||||||
|
+static inline void cgroup_unlock(void) {}
|
||||||
|
static inline int cgroup_attach_task_all(struct task_struct *from,
|
||||||
|
struct task_struct *t) { return 0; }
|
||||||
|
static inline int cgroupstats_build(struct cgroupstats *stats,
|
||||||
|
--
|
||||||
|
2.31.1.295.g9ea45b61b8-goog
|
||||||
|
|
||||||
|
|
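[Editorial sketch] The wrappers this patch adds can be illustrated with a tiny user-space model. This is an assumption-laden sketch, not kernel code: a plain flag stands in for the real `struct mutex`, and `create_memcg()` is a hypothetical caller representing the memcg-creation paths that will use the new API instead of touching cgroup_mutex directly.

```c
#include <stdbool.h>

/* Stand-in state for the kernel's cgroup_mutex; a plain flag is an
 * assumption made so the sketch runs single-threaded in user space. */
static bool cgroup_mutex_held;

/* Wrappers mirroring the patch: callers take and release the lock
 * without ever naming cgroup_mutex directly. */
static inline void cgroup_lock(void)   { cgroup_mutex_held = true; }
static inline void cgroup_unlock(void) { cgroup_mutex_held = false; }

/* Hypothetical caller: a "memcg creation" path that must hold
 * cgroup_mutex around its critical section. */
static int memcg_count;

int create_memcg(void)
{
	int id;

	cgroup_lock();
	id = ++memcg_count;	/* serialized by cgroup_mutex */
	cgroup_unlock();
	return id;
}
```

The point of the wrapper pair is encapsulation: when CONFIG_CGROUPS is off, the patch provides empty stubs, so callers need no #ifdefs.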
From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
        DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
        INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
        USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 3D894C433ED
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:01 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 16A5761278
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1345104AbhDMG5S (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Tue, 13 Apr 2021 02:57:18 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44168 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S237169AbhDMG5I (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 13 Apr 2021 02:57:08 -0400
Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 01AF4C06175F
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:49 -0700 (PDT)
Received: by mail-qk1-x749.google.com with SMTP id j24so9889811qkg.7
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=kZ40TZQJmz2zt6lYwCpeAnxVbOWM8KwFdCtsfH6CbQ4=;
        b=Lo7XMOOHbyzBoRlK8b2GE15qCT4QqS9ijyXSl1ryGVj5Alkuv2mcfhY4vR1gU/ak5i
         HPCaNU4SNyd/togq6z9pJeIcKdhVNoakHlBzalPajFLmRC9Qbai2K4MiOiC3w/4zVP3/
         NtLrS3pnu6kRnE/1OF1NCyaMABOTJ1Ahmg/dZPqItxMI54CzXgYo6GdLYksK4AzjBKx6
         3OPkxOXxP71Nm7Tjl273X7BKZEBEv2cYYpFtO65/dAM6wU+OCRnD0EkkgtX7e7+gTBso
         oX16tOXHwiiZ6sLaMJLirvmeW9Lp7bXGjP63ZC1IEHuQFyVaxg7TzhpG+PXULs33Mwht
         64KQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=kZ40TZQJmz2zt6lYwCpeAnxVbOWM8KwFdCtsfH6CbQ4=;
        b=m5HbExYCzmc21c5OLCzzHa8Xe8EdXvMRiTtiR09Dq8ChzNpcxJHIjjhpQyFMcUJWLj
         +EmmgKiIE+uS4OHdEXmzNSv8MNhhEq7kUHf2SgjNDKlYLuCdTyrGG1MSWfK/msnX8s0I
         ed03u8uPvY4i5nrXUPDSK0dSOilJdsKsbJ2GZF+UbwvHZb/bl7np8JUMFzrB2dYfV3GD
         rJFKMpvlKiHjGv/usQSGWtLVDxlNl2ZH02SQETt2ZwtrhNj3g1Je8bALwt2ZVdzkZCGJ
         ieq/RzKjaSqH69A9hehJuecmBRowdH3vtX4JtNR1N62OtoE92KN5JhRy7UIVzomglFHL
         9n1A==
X-Gm-Message-State: AOAM533DVaJizLoTWtX7Zoe1e9yCLp7H3odxXAoCcHrMJ9IzNh+lDvEB
        F0NqK2LlktrIoIPLMrk68BAVCsE0tyc=
X-Google-Smtp-Source: ABdhPJx0OFD8QshALbNm7ufdWhFpw5ctF+y/1hKbFM42Olw0k5XnLx6uQVu5On95xo6CAByxMQgtMhVbOBY=
X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:d02d:cccc:9ebe:9fe9])
        (user=yuzhao job=sendgmr) by 2002:a0c:fa12:: with SMTP id q18mr9972206qvn.2.1618297008125;
        Mon, 12 Apr 2021 23:56:48 -0700 (PDT)
Date: Tue, 13 Apr 2021 00:56:22 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-6-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 05/16] mm/swap.c: export activate_page()
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-6-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

activate_page() is needed to activate pages that are already on lru or
queued in lru_pvecs.lru_add. The exported function is a merger between
the existing activate_page() and __lru_cache_activate_page().

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/swap.h |  1 +
 mm/swap.c            | 28 +++++++++++++++-------------
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4cc6ec3bf0ab..de2bbbf181ba 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -344,6 +344,7 @@ extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
 extern void lru_add_drain_all(void);
 extern void rotate_reclaimable_page(struct page *page);
+extern void activate_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
diff --git a/mm/swap.c b/mm/swap.c
index 31b844d4ed94..f20ed56ebbbf 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -334,7 +334,7 @@ static bool need_activate_page_drain(int cpu)
 	return pagevec_count(&per_cpu(lru_pvecs.activate_page, cpu)) != 0;
 }
 
-static void activate_page(struct page *page)
+static void activate_page_on_lru(struct page *page)
 {
 	page = compound_head(page);
 	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
@@ -354,7 +354,7 @@ static inline void activate_page_drain(int cpu)
 {
 }
 
-static void activate_page(struct page *page)
+static void activate_page_on_lru(struct page *page)
 {
 	struct lruvec *lruvec;
 
@@ -368,11 +368,22 @@ static void activate_page(struct page *page)
 }
 #endif
 
-static void __lru_cache_activate_page(struct page *page)
+/*
+ * If the page is on the LRU, queue it for activation via
+ * lru_pvecs.activate_page. Otherwise, assume the page is on a
+ * pagevec, mark it active and it'll be moved to the active
+ * LRU on the next drain.
+ */
+void activate_page(struct page *page)
 {
 	struct pagevec *pvec;
 	int i;
 
+	if (PageLRU(page)) {
+		activate_page_on_lru(page);
+		return;
+	}
+
 	local_lock(&lru_pvecs.lock);
 	pvec = this_cpu_ptr(&lru_pvecs.lru_add);
 
@@ -421,16 +432,7 @@ void mark_page_accessed(struct page *page)
 	 * evictable page accessed has no effect.
 	 */
 	} else if (!PageActive(page)) {
-		/*
-		 * If the page is on the LRU, queue it for activation via
-		 * lru_pvecs.activate_page. Otherwise, assume the page is on a
-		 * pagevec, mark it active and it'll be moved to the active
-		 * LRU on the next drain.
-		 */
-		if (PageLRU(page))
-			activate_page(page);
-		else
-			__lru_cache_activate_page(page);
+		activate_page(page);
 		ClearPageReferenced(page);
 		workingset_activation(page);
 	}
-- 
2.31.1.295.g9ea45b61b8-goog

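[Editorial sketch] The control flow of the merged activate_page() above can be modeled in user space. This is an illustration under loud assumptions: page flags are reduced to plain booleans, and `on_lru`, `queued` are made-up field names standing in for PageLRU() and the per-CPU activate_page pagevec.

```c
#include <stdbool.h>

/* Toy page: on_lru stands in for PageLRU(), queued for membership
 * in lru_pvecs.activate_page, active for PageActive(). */
struct page { bool on_lru; bool active; bool queued; };

/* Path taken when the page is already on an LRU list: queue it for
 * activation via lru_pvecs.activate_page (modeled as a flag). */
static void activate_page_on_lru(struct page *p)
{
	p->queued = true;
}

/* Merged entry point mirroring the patch: dispatch on PageLRU(),
 * otherwise fall back to the old __lru_cache_activate_page()
 * behavior of marking the page active so the next pagevec drain
 * moves it to the active LRU. */
void activate_page(struct page *p)
{
	if (p->on_lru) {
		activate_page_on_lru(p);
		return;
	}
	p->active = true;	/* moved to the active LRU on next drain */
}
```

The merge moves the PageLRU() test inside the exported function, so callers such as mark_page_accessed() no longer need to choose between two helpers.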
From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
        DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
        INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
        USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id AE093C433B4
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 867F3613B6
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:02 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S237032AbhDMG5T (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Tue, 13 Apr 2021 02:57:19 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44174 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S245189AbhDMG5J (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 13 Apr 2021 02:57:09 -0400
Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 804B0C061756
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:50 -0700 (PDT)
Received: by mail-yb1-xb49.google.com with SMTP id t9so4737272ybd.11
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=ClG8Hyf5TTtr5EO4ugQ7lEF5g9jg1Khbtn9fiHHHUO4=;
        b=Bz/NCIqrnVElEbVXzKEuDo/rZuQj9KS3qgxTKdWtHhz5pm8i/K2zNVWoVZLOT3rUSR
         LsBDpHnPsr/ZpnLlRjgBWaTe1LWedpUZEH5ms55YmlHa6b6jgezdJL3RT6PspSs7PC0D
         X2Cp8BNNHZoXRtz4WK/5SGU3p+K+AzCV3OWzqDVroA6mh4+0ezV8mgPVSzwRPD5kb0gr
         h1rkXixNjOMz9WdBgGoShJ+IdH8LzpJqTgis+qWDrFblJngv4Of0j7VP1YZiUBDZBIO8
         UPhfTPDB4QZtT8MN0GMlMXbeAlUWYEo/7WcySgFwiSO0kt7YfrA1ke9uBnFFX4PziJEZ
         ISaA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=ClG8Hyf5TTtr5EO4ugQ7lEF5g9jg1Khbtn9fiHHHUO4=;
        b=Ln0JHJYmVa2eSlKqpGtl/4uP0U/tFRs/pk5G6Sl8Iec4RrR5oqZdSeZC19j8TSeMUO
         DmIZ5X8vhdMmgBAkWF7E4NxzMbBEJfzjseP4tvMHiWSQ+ZWeCLuYCrW6DEaObyCK+T7t
         zIVNPEeJOIg1zDbSyPA0EVnJqpe6Gkec8ahBEG03YbyTmfuG6vb0McULQljJ5OhniFfX
         UripKlgaIV1a55hf1KsyL81MPaz5nGMe/cCHrm8EHqvFhxWzKWFO1Qk4Tc1VI45wYTHS
         YVo0QOvbSbampG2ears9RXvYdJ9QVT1M8JfO5/+bVnbN3VbRLxG7g4jVuwkA4zPKOHYI
         dISw==
X-Gm-Message-State: AOAM531fA312edJF5bN6zMI4xlJ2NDI7L0pqlv/7HXEcSl6sGX7pfMuO
        8LvKSxlzMxN/BLov7kCFr0vqNk/bYbk=
X-Google-Smtp-Source: ABdhPJwc8JriuoHPQ23GGBqKR69oc5Gp+cE2EiR0xXWJLv2glle7kn2s+OHctKLTVqR0qrsNshOCMzVz8BQ=
X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:d02d:cccc:9ebe:9fe9])
        (user=yuzhao job=sendgmr) by 2002:a25:8b0f:: with SMTP id i15mr42151231ybl.277.1618297009506;
        Mon, 12 Apr 2021 23:56:49 -0700 (PDT)
Date: Tue, 13 Apr 2021 00:56:23 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-7-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 06/16] mm, x86: support the access bit on non-leaf PMD entries
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-7-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Some architectures support the accessed bit on non-leaf PMD entries
(parents) in addition to leaf PTE entries (children) where pages are
mapped, e.g., x86_64 sets the accessed bit on a parent when using it
as part of linear-address translation [1]. Page table walkers who are
interested in the accessed bit on children can take advantage of this:
they do not need to search the children when the accessed bit is not
set on a parent, given that they have previously cleared the accessed
bit on this parent.

[1]: Intel 64 and IA-32 Architectures Software Developer's Manual
     Volume 3 (October 2019), section 4.8

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/Kconfig                   | 9 +++++++++
 arch/x86/Kconfig               | 1 +
 arch/x86/include/asm/pgtable.h | 2 +-
 arch/x86/mm/pgtable.c          | 5 ++++-
 include/linux/pgtable.h        | 4 ++--
 5 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index ecfd3520b676..cbd7f66734ee 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -782,6 +782,15 @@ config HAVE_ARCH_TRANSPARENT_HUGEPAGE
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 	bool
 
+config HAVE_ARCH_PARENT_PMD_YOUNG
+	bool
+	depends on PGTABLE_LEVELS > 2
+	help
+	  Architectures that select this are able to set the accessed bit on
+	  non-leaf PMD entries in addition to leaf PTE entries where pages are
+	  mapped. For them, page table walkers that clear the accessed bit may
+	  stop at non-leaf PMD entries when they do not see the accessed bit.
+
 config HAVE_ARCH_HUGE_VMAP
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2792879d398e..b5972eb82337 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -163,6 +163,7 @@ config X86
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
+	select HAVE_ARCH_PARENT_PMD_YOUNG	if X86_64
 	select HAVE_ARCH_USERFAULTFD_WP		if X86_64 && USERFAULTFD
 	select HAVE_ARCH_VMAP_STACK		if X86_64
 	select HAVE_ARCH_WITHIN_STACK_FRAMES
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index a02c67291cfc..a6b5cfe1fc5a 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -846,7 +846,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 
 static inline int pmd_bad(pmd_t pmd)
 {
-	return (pmd_flags(pmd) & ~_PAGE_USER) != _KERNPG_TABLE;
+	return ((pmd_flags(pmd) | _PAGE_ACCESSED) & ~_PAGE_USER) != _KERNPG_TABLE;
 }
 
 static inline unsigned long pages_to_mb(unsigned long npg)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index f6a9e2e36642..1c27e6f43f80 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -550,7 +550,7 @@ int ptep_test_and_clear_young(struct vm_area_struct *vma,
 	return ret;
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG)
 int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 			      unsigned long addr, pmd_t *pmdp)
 {
@@ -562,6 +562,9 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 
 	return ret;
 }
+#endif
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 int pudp_test_and_clear_young(struct vm_area_struct *vma,
 			      unsigned long addr, pud_t *pudp)
 {
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5e772392a379..08dd9b8c055a 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -193,7 +193,7 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
 #endif
 
 #ifndef __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG)
 static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
 					    pmd_t *pmdp)
@@ -214,7 +214,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 	BUILD_BUG();
 	return 0;
 }
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG */
 #endif
 
 #ifndef __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
-- 
2.31.1.295.g9ea45b61b8-goog

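[Editorial sketch] The walker optimization the commit message describes can be modeled as a two-level walk in user space. This is an illustration only: the four-entry fan-out, boolean "bits", and the names `pmd`/`count_young_ptes` are assumptions, not the kernel's page-table format.

```c
#include <stdbool.h>

#define PTES_PER_PMD 4	/* assumption: tiny fan-out for illustration */

struct pmd {
	bool accessed;				/* non-leaf (parent) accessed bit */
	bool pte_accessed[PTES_PER_PMD];	/* leaf (child) accessed bits */
};

/* Count accessed PTEs while skipping every PMD whose own accessed
 * bit is clear. Skipping is valid only if the walker cleared that
 * bit on a previous pass -- the precondition the commit message
 * states -- because the CPU re-sets it on any translation through
 * the PMD. */
int count_young_ptes(const struct pmd *pmds, int n)
{
	int young = 0;

	for (int i = 0; i < n; i++) {
		if (!pmds[i].accessed)
			continue;	/* no need to search the children */
		for (int j = 0; j < PTES_PER_PMD; j++)
			young += pmds[i].pte_accessed[j];
	}
	return young;
}
```

In the test below, the first PMD's children are never examined because its parent bit is clear, which is exactly the work the patch lets a walker avoid.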
From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
        DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
        INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
        USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 51FFEC43460
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:09 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 379F261278
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1345127AbhDMG5Z (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Tue, 13 Apr 2021 02:57:25 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44184 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S242333AbhDMG5L (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 13 Apr 2021 02:57:11 -0400
Received: from mail-qk1-x74a.google.com (mail-qk1-x74a.google.com [IPv6:2607:f8b0:4864:20::74a])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5CAAC06138C
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:51 -0700 (PDT)
Received: by mail-qk1-x74a.google.com with SMTP id g62so10544674qkf.18
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:56:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=1g9DmXJ6S7uvtuGOH48osWaF0/2fGGaQ6ChmAYKTm4o=;
        b=VX7vOS1iaX+Hrwo31qklSok4an751KXHjlIezhTcoCSLXRV871k6PBsw+EibR4qWwF
         i7kN3+4V671SYh9T69KvNxd786HKo+6WHv6Cd77TeqTfMbKijle6EBM4m+gl3DmNgnt0
         ZA8WH1LPEZfGwn3JGivnRSoUPFkulI9NBk9pGJpe7wwngua0FZfbXjlpD6td2UZKxBbD
         sm8Xc+HrppZn5mA4exh2/iFeR515mlnGTrbTx70pum7Y/iYPYQ2/HgcjccRGsGWUBLbF
         bSOTnALSUrqOctmdDO2fO0EzfSnndPfVgKwv5QWLNUcXAi3ZlYRs7lyuvShH4lnaJxFe
         LTUA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=1g9DmXJ6S7uvtuGOH48osWaF0/2fGGaQ6ChmAYKTm4o=;
        b=oh0TJS5Iv72EGLBpsE6HR3bE5cZX3J2uuz3z3TwZZpsfqvBQ4F+ZjeXnT9ZM8znSwl
         DwO2yHU9V2acH3+Fw0txUASuMMXsp1h+eHsdlfoqbA5zx2G/8OJbldp/rudOwBO+wc4D
         Wu7IiJYBc9jidKDE7Rputac3XOWXhSIhHMN1UGb8rIrlefaHD89A6pEKF6H/v6TSV99v
         1MEtFUSmceep3K2EmUGX64fyXznC0KPZIkHHX/LcuC8xgYK2Go0LXGglt5x6U6QQ+Yk8
         QGNr4pv1ynAg5b5FcA5bQe34gJ4JarQfXZx82+zF84UGh0Hj4hR4I60qEnSwVJBlCNqE
         o7DA==
X-Gm-Message-State: AOAM532mqZo9PBRpK7zpxWavyuHSPxCR5uYKAcywst7dl0qA/ZdHQHKq
        TyCJ6Kl6g2of6qtWwfJ7m9Y3UH3EDGM=
X-Google-Smtp-Source: ABdhPJwH+ey8nBGqYBlYs+cX0y6B8vZ/ifwsZXXs+V8u1FJGnhfXc1ufux+fOtI1iR9OnRAE6E9FqbhZIZQ=
X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:d02d:cccc:9ebe:9fe9])
        (user=yuzhao job=sendgmr) by 2002:a0c:db82:: with SMTP id m2mr21253979qvk.37.1618297010980;
        Mon, 12 Apr 2021 23:56:50 -0700 (PDT)
Date: Tue, 13 Apr 2021 00:56:24 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-8-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 07/16] mm/vmscan.c: refactor shrink_node()
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-8-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Heuristics that determine scan balance between anon and file LRUs are
rather independent. Move them into a separate function to improve
readability.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/vmscan.c | 186 +++++++++++++++++++++++++++-------------------------
 1 file changed, 98 insertions(+), 88 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 562e87cbd7a1..1a24d2e0a4cb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2224,6 +2224,103 @@ enum scan_balance {
 	SCAN_FILE,
 };
 
+static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
+{
+	unsigned long file;
+	struct lruvec *target_lruvec;
+
+	target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
+
+	/*
+	 * Determine the scan balance between anon and file LRUs.
+	 */
+	spin_lock_irq(&target_lruvec->lru_lock);
+	sc->anon_cost = target_lruvec->anon_cost;
+	sc->file_cost = target_lruvec->file_cost;
+	spin_unlock_irq(&target_lruvec->lru_lock);
+
+	/*
+	 * Target desirable inactive:active list ratios for the anon
+	 * and file LRU lists.
+	 */
+	if (!sc->force_deactivate) {
+		unsigned long refaults;
+
+		refaults = lruvec_page_state(target_lruvec,
+					     WORKINGSET_ACTIVATE_ANON);
+		if (refaults != target_lruvec->refaults[0] ||
+		    inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
+			sc->may_deactivate |= DEACTIVATE_ANON;
+		else
+			sc->may_deactivate &= ~DEACTIVATE_ANON;
+
+		/*
+		 * When refaults are being observed, it means a new
+		 * workingset is being established. Deactivate to get
+		 * rid of any stale active pages quickly.
+		 */
+		refaults = lruvec_page_state(target_lruvec,
+					     WORKINGSET_ACTIVATE_FILE);
+		if (refaults != target_lruvec->refaults[1] ||
+		    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
+			sc->may_deactivate |= DEACTIVATE_FILE;
+		else
+			sc->may_deactivate &= ~DEACTIVATE_FILE;
+	} else
+		sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
+
+	/*
+	 * If we have plenty of inactive file pages that aren't
+	 * thrashing, try to reclaim those first before touching
+	 * anonymous pages.
+	 */
+	file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
+	if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
+		sc->cache_trim_mode = 1;
+	else
+		sc->cache_trim_mode = 0;
+
+	/*
+	 * Prevent the reclaimer from falling into the cache trap: as
+	 * cache pages start out inactive, every cache fault will tip
+	 * the scan balance towards the file LRU. And as the file LRU
+	 * shrinks, so does the window for rotation from references.
+	 * This means we have a runaway feedback loop where a tiny
+	 * thrashing file LRU becomes infinitely more attractive than
+	 * anon pages. Try to detect this based on file LRU size.
+	 */
+	if (!cgroup_reclaim(sc)) {
+		unsigned long total_high_wmark = 0;
+		unsigned long free, anon;
+		int z;
+
+		free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
+		file = node_page_state(pgdat, NR_ACTIVE_FILE) +
+		       node_page_state(pgdat, NR_INACTIVE_FILE);
+
+		for (z = 0; z < MAX_NR_ZONES; z++) {
+			struct zone *zone = &pgdat->node_zones[z];
+
+			if (!managed_zone(zone))
+				continue;
+
+			total_high_wmark += high_wmark_pages(zone);
+		}
+
+		/*
+		 * Consider anon: if that's low too, this isn't a
+		 * runaway file reclaim problem, but rather just
+		 * extreme pressure. Reclaim as per usual then.
+		 */
+		anon = node_page_state(pgdat, NR_INACTIVE_ANON);
+
+		sc->file_is_tiny =
+			file + free <= total_high_wmark &&
+			!(sc->may_deactivate & DEACTIVATE_ANON) &&
+			anon >> sc->priority;
+	}
+}
+
 /*
  * Determine how aggressively the anon and file LRU lists should be
  * scanned. The relative value of each set of LRU lists is determined
@@ -2669,7 +2766,6 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	unsigned long nr_reclaimed, nr_scanned;
 	struct lruvec *target_lruvec;
 	bool reclaimable = false;
-	unsigned long file;
 
 	target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
 
@@ -2679,93 +2775,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	nr_reclaimed = sc->nr_reclaimed;
 	nr_scanned = sc->nr_scanned;
 
-	/*
-	 * Determine the scan balance between anon and file LRUs.
-	 */
|
- spin_lock_irq(&target_lruvec->lru_lock);
|
||||||
|
- sc->anon_cost = target_lruvec->anon_cost;
|
||||||
|
- sc->file_cost = target_lruvec->file_cost;
|
||||||
|
- spin_unlock_irq(&target_lruvec->lru_lock);
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * Target desirable inactive:active list ratios for the anon
|
||||||
|
- * and file LRU lists.
|
||||||
|
- */
|
||||||
|
- if (!sc->force_deactivate) {
|
||||||
|
- unsigned long refaults;
|
||||||
|
-
|
||||||
|
- refaults = lruvec_page_state(target_lruvec,
|
||||||
|
- WORKINGSET_ACTIVATE_ANON);
|
||||||
|
- if (refaults != target_lruvec->refaults[0] ||
|
||||||
|
- inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
|
||||||
|
- sc->may_deactivate |= DEACTIVATE_ANON;
|
||||||
|
- else
|
||||||
|
- sc->may_deactivate &= ~DEACTIVATE_ANON;
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * When refaults are being observed, it means a new
|
||||||
|
- * workingset is being established. Deactivate to get
|
||||||
|
- * rid of any stale active pages quickly.
|
||||||
|
- */
|
||||||
|
- refaults = lruvec_page_state(target_lruvec,
|
||||||
|
- WORKINGSET_ACTIVATE_FILE);
|
||||||
|
- if (refaults != target_lruvec->refaults[1] ||
|
||||||
|
- inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
|
||||||
|
- sc->may_deactivate |= DEACTIVATE_FILE;
|
||||||
|
- else
|
||||||
|
- sc->may_deactivate &= ~DEACTIVATE_FILE;
|
||||||
|
- } else
|
||||||
|
- sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * If we have plenty of inactive file pages that aren't
|
||||||
|
- * thrashing, try to reclaim those first before touching
|
||||||
|
- * anonymous pages.
|
||||||
|
- */
|
||||||
|
- file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
|
||||||
|
- if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
|
||||||
|
- sc->cache_trim_mode = 1;
|
||||||
|
- else
|
||||||
|
- sc->cache_trim_mode = 0;
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * Prevent the reclaimer from falling into the cache trap: as
|
||||||
|
- * cache pages start out inactive, every cache fault will tip
|
||||||
|
- * the scan balance towards the file LRU. And as the file LRU
|
||||||
|
- * shrinks, so does the window for rotation from references.
|
||||||
|
- * This means we have a runaway feedback loop where a tiny
|
||||||
|
- * thrashing file LRU becomes infinitely more attractive than
|
||||||
|
- * anon pages. Try to detect this based on file LRU size.
|
||||||
|
- */
|
||||||
|
- if (!cgroup_reclaim(sc)) {
|
||||||
|
- unsigned long total_high_wmark = 0;
|
||||||
|
- unsigned long free, anon;
|
||||||
|
- int z;
|
||||||
|
-
|
||||||
|
- free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
|
||||||
|
- file = node_page_state(pgdat, NR_ACTIVE_FILE) +
|
||||||
|
- node_page_state(pgdat, NR_INACTIVE_FILE);
|
||||||
|
-
|
||||||
|
- for (z = 0; z < MAX_NR_ZONES; z++) {
|
||||||
|
- struct zone *zone = &pgdat->node_zones[z];
|
||||||
|
- if (!managed_zone(zone))
|
||||||
|
- continue;
|
||||||
|
-
|
||||||
|
- total_high_wmark += high_wmark_pages(zone);
|
||||||
|
- }
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * Consider anon: if that's low too, this isn't a
|
||||||
|
- * runaway file reclaim problem, but rather just
|
||||||
|
- * extreme pressure. Reclaim as per usual then.
|
||||||
|
- */
|
||||||
|
- anon = node_page_state(pgdat, NR_INACTIVE_ANON);
|
||||||
|
-
|
||||||
|
- sc->file_is_tiny =
|
||||||
|
- file + free <= total_high_wmark &&
|
||||||
|
- !(sc->may_deactivate & DEACTIVATE_ANON) &&
|
||||||
|
- anon >> sc->priority;
|
||||||
|
- }
|
||||||
|
+ prepare_scan_count(pgdat, sc);
|
||||||
|
|
||||||
|
shrink_node_memcgs(pgdat, sc);
|
||||||
|
|
||||||
|
--
|
||||||
|
2.31.1.295.g9ea45b61b8-goog
|
||||||
|
|
||||||
|
|
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
Date: Tue, 13 Apr 2021 00:56:26 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-10-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 09/16] mm: multigenerational lru: activation
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-10-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

For pages accessed multiple times via file descriptors, instead of
activating them upon the second access, we activate them based on the
refault rates of their tiers. Pages accessed N times via file
descriptors belong to tier order_base_2(N). Pages from tier 0, i.e.,
those read ahead, accessed once via file descriptors, or accessed only
via page tables, are evicted regardless of the refault rate. Pages
from other tiers will be moved to the next generation, i.e.,
activated, if the refault rates of their tiers are higher than that of
tier 0. Each generation contains at most MAX_NR_TIERS tiers, and they
require an additional MAX_NR_TIERS-2 bits in page->flags. This
feedback model has a few advantages over the current feedforward
model:
1) It has a negligible overhead in the access path because
   activations are done in the reclaim path.
2) It takes mapped pages into account and avoids overprotecting
   pages accessed multiple times via file descriptors.
3) More tiers offer better protection to pages accessed more than
   twice when buffered-I/O-intensive workloads are under memory
   pressure.

For pages mapped upon page faults, the accessed bit is set and they
must be properly aged. We add them to the per-zone lists indexed by
max_seq, i.e., the youngest generation. For pages not in page cache
or swap cache, this can be done easily in the page fault path: we
rename lru_cache_add_inactive_or_unevictable() to
lru_cache_add_page_vma() and add a new parameter, which is set to true
for pages mapped upon page faults. For pages in page cache or swap
cache, we cannot differentiate the page fault path from the read-ahead
path at the time we call lru_cache_add() in add_to_page_cache_lru()
and __read_swap_cache_async(). So we add a new function,
lru_gen_activation(), which is essentially activate_page(), to move
pages to the per-zone lists indexed by max_seq at a later time.
Hopefully we will find those pages in lru_pvecs.lru_add and can simply
set PageActive() on them without having to actually move them.

Finally, we need to be compatible with the existing notion of active
and inactive. We cannot use PageActive() because it is not set on
active pages unless they are isolated, in order to spare the aging the
trouble of clearing it when an active generation becomes inactive. A
new function, page_is_active(), compares the generation number of a
page with max_seq and max_seq-1 (modulo MAX_NR_GENS), which are
considered active and protected from eviction. Other generations,
which may or may not exist, are considered inactive.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 fs/proc/task_mmu.c        |   3 +-
 include/linux/mm_inline.h | 101 +++++++++++++++++++++
 include/linux/swap.h      |   4 +-
 kernel/events/uprobes.c   |   2 +-
 mm/huge_memory.c          |   2 +-
 mm/khugepaged.c           |   2 +-
 mm/memory.c               |  14 +--
 mm/migrate.c              |   2 +-
 mm/swap.c                 |  26 +++---
 mm/swapfile.c             |   2 +-
 mm/userfaultfd.c          |   2 +-
 mm/vmscan.c               |  91 ++++++++++++++++++-
 mm/workingset.c           | 179 +++++++++++++++++++++++++++++++-------
 13 files changed, 371 insertions(+), 59 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e862cab69583..d292f20c4e3d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -19,6 +19,7 @@
 #include <linux/shmem_fs.h>
 #include <linux/uaccess.h>
 #include <linux/pkeys.h>
+#include <linux/mm_inline.h>

 #include <asm/elf.h>
 #include <asm/tlb.h>
@@ -1718,7 +1719,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
 	if (PageSwapCache(page))
 		md->swapcache += nr_pages;

-	if (PageActive(page) || PageUnevictable(page))
+	if (PageUnevictable(page) || page_is_active(compound_head(page), NULL))
 		md->active += nr_pages;

 	if (PageWriteback(page))
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 2bf910eb3dd7..5eb4b12972ec 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -95,6 +95,12 @@ static inline int lru_gen_from_seq(unsigned long seq)
 	return seq % MAX_NR_GENS;
 }

+/* Convert the level of usage to a tier. See the comment on MAX_NR_TIERS. */
+static inline int lru_tier_from_usage(int usage)
+{
+	return order_base_2(usage + 1);
+}
+
 /* Return a proper index regardless whether we keep a full history of stats. */
 static inline int sid_from_seq_or_gen(int seq_or_gen)
 {
@@ -238,12 +244,93 @@ static inline bool lru_gen_deletion(struct page *page, struct lruvec *lruvec)
 	return true;
 }

+/* Activate a page from page cache or swap cache after it's mapped. */
+static inline void lru_gen_activation(struct page *page, struct vm_area_struct *vma)
+{
+	if (!lru_gen_enabled())
+		return;
+
+	if (PageActive(page) || PageUnevictable(page) || vma_is_dax(vma) ||
+	    (vma->vm_flags & (VM_LOCKED | VM_SPECIAL)))
+		return;
+	/*
+	 * TODO: pass vm_fault to add_to_page_cache_lru() and
+	 * __read_swap_cache_async() so they can activate pages directly when in
+	 * the page fault path.
+	 */
+	activate_page(page);
+}
+
 /* Return -1 when a page is not on a list of the multigenerational lru. */
 static inline int page_lru_gen(struct page *page)
 {
 	return ((READ_ONCE(page->flags) & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
 }

+/* This function works regardless whether the multigenerational lru is enabled. */
+static inline bool page_is_active(struct page *page, struct lruvec *lruvec)
+{
+	struct mem_cgroup *memcg;
+	int gen = page_lru_gen(page);
+	bool active = false;
+
+	VM_BUG_ON_PAGE(PageTail(page), page);
+
+	if (gen < 0)
+		return PageActive(page);
+
+	if (lruvec) {
+		VM_BUG_ON_PAGE(PageUnevictable(page), page);
+		VM_BUG_ON_PAGE(PageActive(page), page);
+		lockdep_assert_held(&lruvec->lru_lock);
+
+		return lru_gen_is_active(lruvec, gen);
+	}
+
+	rcu_read_lock();
+
+	memcg = page_memcg_rcu(page);
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	active = lru_gen_is_active(lruvec, gen);
+
+	rcu_read_unlock();
+
+	return active;
+}
+
+/* Return the level of usage of a page. See the comment on MAX_NR_TIERS. */
+static inline int page_tier_usage(struct page *page)
+{
+	unsigned long flags = READ_ONCE(page->flags);
+
+	return flags & BIT(PG_workingset) ?
+	       ((flags & LRU_USAGE_MASK) >> LRU_USAGE_PGOFF) + 1 : 0;
+}
+
+/* Increment the usage counter after a page is accessed via file descriptors. */
+static inline bool page_inc_usage(struct page *page)
+{
+	unsigned long old_flags, new_flags;
+
+	if (!lru_gen_enabled())
+		return PageActive(page);
+
+	do {
+		old_flags = READ_ONCE(page->flags);
+
+		if (!(old_flags & BIT(PG_workingset)))
+			new_flags = old_flags | BIT(PG_workingset);
+		else
+			new_flags = (old_flags & ~LRU_USAGE_MASK) | min(LRU_USAGE_MASK,
+				    (old_flags & LRU_USAGE_MASK) + BIT(LRU_USAGE_PGOFF));
+
+		if (old_flags == new_flags)
+			break;
+	} while (cmpxchg(&page->flags, old_flags, new_flags) != old_flags);
+
+	return true;
+}
+
 #else /* CONFIG_LRU_GEN */

 static inline bool lru_gen_enabled(void)
@@ -261,6 +348,20 @@ static inline bool lru_gen_deletion(struct page *page, struct lruvec *lruvec)
 	return false;
 }

+static inline void lru_gen_activation(struct page *page, struct vm_area_struct *vma)
+{
+}
+
+static inline bool page_is_active(struct page *page, struct lruvec *lruvec)
+{
+	return PageActive(page);
+}
+
+static inline bool page_inc_usage(struct page *page)
+{
+	return PageActive(page);
+}
+
 #endif /* CONFIG_LRU_GEN */

 static __always_inline void add_page_to_lru_list(struct page *page,
diff --git a/include/linux/swap.h b/include/linux/swap.h
index de2bbbf181ba..0e7532c7db22 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -350,8 +350,8 @@ extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
 extern void swap_setup(void);

-extern void lru_cache_add_inactive_or_unevictable(struct page *page,
-						struct vm_area_struct *vma);
+extern void lru_cache_add_page_vma(struct page *page, struct vm_area_struct *vma,
+				   bool faulting);

 /* linux/mm/vmscan.c */
 extern unsigned long zone_reclaimable_pages(struct zone *zone);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 6addc9780319..4e93e5602723 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -184,7 +184,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	if (new_page) {
 		get_page(new_page);
 		page_add_new_anon_rmap(new_page, vma, addr, false);
-		lru_cache_add_inactive_or_unevictable(new_page, vma);
+		lru_cache_add_page_vma(new_page, vma, false);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
 		dec_mm_counter(mm, MM_ANONPAGES);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 26d3cc4a7a0b..2cf46270c84b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -637,7 +637,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		entry = mk_huge_pmd(page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 		page_add_new_anon_rmap(page, vma, haddr, true);
-		lru_cache_add_inactive_or_unevictable(page, vma);
+		lru_cache_add_page_vma(page, vma, true);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a7d6cb912b05..08a43910f232 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1199,7 +1199,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
 	page_add_new_anon_rmap(new_page, vma, address, true);
-	lru_cache_add_inactive_or_unevictable(new_page, vma);
+	lru_cache_add_page_vma(new_page, vma, true);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
diff --git a/mm/memory.c b/mm/memory.c
index 550405fc3b5e..9a6cb6d31430 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -73,6 +73,7 @@
 #include <linux/perf_event.h>
 #include <linux/ptrace.h>
 #include <linux/vmalloc.h>
+#include <linux/mm_inline.h>

 #include <trace/events/kmem.h>

@@ -839,7 +840,7 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	copy_user_highpage(new_page, page, addr, src_vma);
 	__SetPageUptodate(new_page);
 	page_add_new_anon_rmap(new_page, dst_vma, addr, false);
-	lru_cache_add_inactive_or_unevictable(new_page, dst_vma);
+	lru_cache_add_page_vma(new_page, dst_vma, false);
 	rss[mm_counter(new_page)]++;

 	/* All done, just insert the new page copy in the child */
@@ -2907,7 +2908,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		 */
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
 		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
-		lru_cache_add_inactive_or_unevictable(new_page, vma);
+		lru_cache_add_page_vma(new_page, vma, true);
 		/*
 		 * We call the notify macro here because, when using secondary
 		 * mmu page tables (such as kvm shadow page tables), we want the
@@ -3438,9 +3439,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	/* ksm created a completely new copy */
 	if (unlikely(page != swapcache && swapcache)) {
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
-		lru_cache_add_inactive_or_unevictable(page, vma);
+		lru_cache_add_page_vma(page, vma, true);
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
+		lru_gen_activation(page, vma);
 	}

 	swap_free(entry);
@@ -3584,7 +3586,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)

 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
-	lru_cache_add_inactive_or_unevictable(page, vma);
+	lru_cache_add_page_vma(page, vma, true);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);

@@ -3709,6 +3711,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)

 	add_mm_counter(vma->vm_mm, mm_counter_file(page), HPAGE_PMD_NR);
 	page_add_file_rmap(page, true);
+	lru_gen_activation(page, vma);
 	/*
 	 * deposit and withdraw with pmd lock held
 	 */
@@ -3752,10 +3755,11 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, addr, false);
-		lru_cache_add_inactive_or_unevictable(page, vma);
+		lru_cache_add_page_vma(page, vma, true);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
+		lru_gen_activation(page, vma);
 	}
 	set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index 62b81d5257aa..1064b03cac33 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -3004,7 +3004,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	inc_mm_counter(mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, addr, false);
 	if (!is_zone_device_page(page))
-		lru_cache_add_inactive_or_unevictable(page, vma);
+		lru_cache_add_page_vma(page, vma, false);
 	get_page(page);

 	if (flush) {
diff --git a/mm/swap.c b/mm/swap.c
index f20ed56ebbbf..d6458ee1e9f8 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -306,7 +306,7 @@ void lru_note_cost_page(struct page *page)

 static void __activate_page(struct page *page, struct lruvec *lruvec)
 {
-	if (!PageActive(page) && !PageUnevictable(page)) {
+	if (!PageUnevictable(page) && !page_is_active(page, lruvec)) {
 		int nr_pages = thp_nr_pages(page);

 		del_page_from_lru_list(page, lruvec);
@@ -337,7 +337,7 @@ static bool need_activate_page_drain(int cpu)
 static void activate_page_on_lru(struct page *page)
 {
 	page = compound_head(page);
-	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
+	if (PageLRU(page) && !PageUnevictable(page) && !page_is_active(page, NULL)) {
 		struct pagevec *pvec;

 		local_lock(&lru_pvecs.lock);
@@ -431,7 +431,7 @@ void mark_page_accessed(struct page *page)
 		 * this list is never rotated or maintained, so marking an
 		 * evictable page accessed has no effect.
 		 */
-	} else if (!PageActive(page)) {
+	} else if (!page_inc_usage(page)) {
 		activate_page(page);
 		ClearPageReferenced(page);
 		workingset_activation(page);
@@ -467,15 +467,14 @@ void lru_cache_add(struct page *page)
 EXPORT_SYMBOL(lru_cache_add);

 /**
- * lru_cache_add_inactive_or_unevictable
+ * lru_cache_add_page_vma
  * @page:  the page to be added to LRU
  * @vma:   vma in which page is mapped for determining reclaimability
  *
- * Place @page on the inactive or unevictable LRU list, depending on its
- * evictability.
+ * Place @page on an LRU list, depending on its evictability.
  */
-void lru_cache_add_inactive_or_unevictable(struct page *page,
-					 struct vm_area_struct *vma)
+void lru_cache_add_page_vma(struct page *page, struct vm_area_struct *vma,
+			    bool faulting)
 {
 	bool unevictable;

@@ -492,6 +491,11 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
 		__mod_zone_page_state(page_zone(page), NR_MLOCK, nr_pages);
 		count_vm_events(UNEVICTABLE_PGMLOCKED, nr_pages);
 	}
+
+	/* tell the multigenerational lru that the page is being faulted in */
+	if (lru_gen_enabled() && !unevictable && faulting)
+		SetPageActive(page);
+
 	lru_cache_add(page);
 }

@@ -518,7 +522,7 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
 */
 static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
 {
-	bool active = PageActive(page);
+	bool active = page_is_active(page, lruvec);
 	int nr_pages = thp_nr_pages(page);

 	if (PageUnevictable(page))
@@ -558,7 +562,7 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)

 static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
 {
-	if (PageActive(page) && !PageUnevictable(page)) {
+	if (!PageUnevictable(page) && page_is_active(page, lruvec)) {
 		int nr_pages = thp_nr_pages(page);

 		del_page_from_lru_list(page, lruvec);
@@ -672,7 +676,7 @@ void deactivate_file_page(struct page *page)
 */
 void deactivate_page(struct page *page)
 {
-	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
+	if (PageLRU(page) && !PageUnevictable(page) && page_is_active(page, NULL)) {
 		struct pagevec *pvec;

 		local_lock(&lru_pvecs.lock);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index c6041d10a73a..ab3b5ca404fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1936,7 +1936,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		page_add_anon_rmap(page, vma, addr, false);
 	} else { /* ksm created a completely new copy */
 		page_add_new_anon_rmap(page, vma, addr, false);
-		lru_cache_add_inactive_or_unevictable(page, vma);
+		lru_cache_add_page_vma(page, vma, false);
 	}
 	swap_free(entry);
 out:
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9a3d451402d7..e1d4cd3103b8 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -123,7 +123,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,

 	inc_mm_counter(dst_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, dst_vma, dst_addr, false);
-	lru_cache_add_inactive_or_unevictable(page, dst_vma);
+	lru_cache_add_page_vma(page, dst_vma, true);

 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8559bb94d452..c74ebe2039f7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -898,9 +898,11 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
|
||||||
|
|
||||||
|
if (PageSwapCache(page)) {
|
||||||
|
swp_entry_t swap = { .val = page_private(page) };
|
||||||
|
- mem_cgroup_swapout(page, swap);
|
||||||
|
+
|
||||||
|
+ /* get a shadow entry before page_memcg() is cleared */
|
||||||
|
if (reclaimed && !mapping_exiting(mapping))
|
||||||
|
shadow = workingset_eviction(page, target_memcg);
|
||||||
|
+ mem_cgroup_swapout(page, swap);
|
||||||
|
__delete_from_swap_cache(page, swap, shadow);
|
||||||
|
xa_unlock_irqrestore(&mapping->i_pages, flags);
|
||||||
|
put_swap_page(page, swap);
|
||||||
|
@@ -4375,6 +4377,93 @@ static bool __maybe_unused seq_is_valid(struct lruvec *lruvec)
|
||||||
|
get_nr_gens(lruvec, 1) <= MAX_NR_GENS;
|
||||||
|
}
 
+/******************************************************************************
+ *                          refault feedback loop
+ ******************************************************************************/
+
+/*
+ * A feedback loop modeled after the PID controller. Currently supports the
+ * proportional (P) and the integral (I) terms; the derivative (D) term can be
+ * added if necessary. The setpoint (SP) is the desired position; the process
+ * variable (PV) is the measured position. The error is the difference between
+ * the SP and the PV. A positive error results in a positive control output
+ * correction, which, in our case, is to allow eviction.
+ *
+ * The P term is the current refault rate refaulted/(evicted+activated), which
+ * has a weight of 1. The I term is the arithmetic mean of the last N refault
+ * rates, weighted by geometric series 1/2, 1/4, ..., 1/(1<<N).
+ *
+ * Our goal is to make sure upper tiers have similar refault rates as the base
+ * tier. That is, we try to be fair to all tiers by maintaining similar refault
+ * rates across them.
+ */
+struct controller_pos {
+	unsigned long refaulted;
+	unsigned long total;
+	int gain;
+};
+
+static void read_controller_pos(struct controller_pos *pos, struct lruvec *lruvec,
+				int file, int tier, int gain)
+{
+	struct lrugen *lrugen = &lruvec->evictable;
+	int sid = sid_from_seq_or_gen(lrugen->min_seq[file]);
+
+	pos->refaulted = lrugen->avg_refaulted[file][tier] +
+			 atomic_long_read(&lrugen->refaulted[sid][file][tier]);
+	pos->total = lrugen->avg_total[file][tier] +
+		     atomic_long_read(&lrugen->evicted[sid][file][tier]);
+	if (tier)
+		pos->total += lrugen->activated[sid][file][tier - 1];
+	pos->gain = gain;
+}
+
+static void reset_controller_pos(struct lruvec *lruvec, int gen, int file)
+{
+	int tier;
+	int sid = sid_from_seq_or_gen(gen);
+	struct lrugen *lrugen = &lruvec->evictable;
+	bool carryover = gen == lru_gen_from_seq(lrugen->min_seq[file]);
+
+	if (!carryover && NR_STAT_GENS == 1)
+		return;
+
+	for (tier = 0; tier < MAX_NR_TIERS; tier++) {
+		if (carryover) {
+			unsigned long sum;
+
+			sum = lrugen->avg_refaulted[file][tier] +
+			      atomic_long_read(&lrugen->refaulted[sid][file][tier]);
+			WRITE_ONCE(lrugen->avg_refaulted[file][tier], sum >> 1);
+
+			sum = lrugen->avg_total[file][tier] +
+			      atomic_long_read(&lrugen->evicted[sid][file][tier]);
+			if (tier)
+				sum += lrugen->activated[sid][file][tier - 1];
+			WRITE_ONCE(lrugen->avg_total[file][tier], sum >> 1);
+
+			if (NR_STAT_GENS > 1)
+				continue;
+		}
+
+		atomic_long_set(&lrugen->refaulted[sid][file][tier], 0);
+		atomic_long_set(&lrugen->evicted[sid][file][tier], 0);
+		if (tier)
+			WRITE_ONCE(lrugen->activated[sid][file][tier - 1], 0);
+	}
+}
+
+static bool positive_ctrl_err(struct controller_pos *sp, struct controller_pos *pv)
+{
+	/*
+	 * Allow eviction if the PV has a limited number of refaulted pages or a
+	 * lower refault rate than the SP.
+	 */
+	return pv->refaulted < SWAP_CLUSTER_MAX ||
+	       pv->refaulted * max(sp->total, 1UL) * sp->gain <=
+	       sp->refaulted * max(pv->total, 1UL) * pv->gain;
+}
+
 /******************************************************************************
  *                          state change
  ******************************************************************************/
diff --git a/mm/workingset.c b/mm/workingset.c
index cd39902c1062..df363f9419fc 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -168,9 +168,9 @@
  * refault distance will immediately activate the refaulting page.
  */
 
-#define EVICTION_SHIFT	((BITS_PER_LONG - BITS_PER_XA_VALUE) + \
-			 1 + NODES_SHIFT + MEM_CGROUP_ID_SHIFT)
-#define EVICTION_MASK	(~0UL >> EVICTION_SHIFT)
+#define EVICTION_SHIFT	(BITS_PER_XA_VALUE - MEM_CGROUP_ID_SHIFT - NODES_SHIFT)
+#define EVICTION_MASK	(BIT(EVICTION_SHIFT) - 1)
+#define WORKINGSET_WIDTH	1
 
 /*
  * Eviction timestamps need to be able to cover the full range of
@@ -182,38 +182,139 @@
  */
 static unsigned int bucket_order __read_mostly;
 
-static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
-			 bool workingset)
+static void *pack_shadow(int memcg_id, struct pglist_data *pgdat, unsigned long val)
 {
-	eviction >>= bucket_order;
-	eviction &= EVICTION_MASK;
-	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
-	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
-	eviction = (eviction << 1) | workingset;
+	val = (val << MEM_CGROUP_ID_SHIFT) | memcg_id;
+	val = (val << NODES_SHIFT) | pgdat->node_id;
 
-	return xa_mk_value(eviction);
+	return xa_mk_value(val);
 }
 
-static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
-			  unsigned long *evictionp, bool *workingsetp)
+static unsigned long unpack_shadow(void *shadow, int *memcg_id, struct pglist_data **pgdat)
 {
-	unsigned long entry = xa_to_value(shadow);
-	int memcgid, nid;
-	bool workingset;
-
-	workingset = entry & 1;
-	entry >>= 1;
-	nid = entry & ((1UL << NODES_SHIFT) - 1);
-	entry >>= NODES_SHIFT;
-	memcgid = entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1);
-	entry >>= MEM_CGROUP_ID_SHIFT;
-
-	*memcgidp = memcgid;
-	*pgdat = NODE_DATA(nid);
-	*evictionp = entry << bucket_order;
-	*workingsetp = workingset;
+	unsigned long val = xa_to_value(shadow);
+
+	*pgdat = NODE_DATA(val & (BIT(NODES_SHIFT) - 1));
+	val >>= NODES_SHIFT;
+	*memcg_id = val & (BIT(MEM_CGROUP_ID_SHIFT) - 1);
+
+	return val >> MEM_CGROUP_ID_SHIFT;
+}
+
+#ifdef CONFIG_LRU_GEN
+
+#if LRU_GEN_SHIFT + LRU_USAGE_SHIFT >= EVICTION_SHIFT
+#error "Please try smaller NODES_SHIFT, NR_LRU_GENS and TIERS_PER_GEN configurations"
+#endif
+
+static void page_set_usage(struct page *page, int usage)
+{
+	unsigned long old_flags, new_flags;
+
+	VM_BUG_ON(usage > BIT(LRU_USAGE_WIDTH));
+
+	if (!usage)
+		return;
+
+	do {
+		old_flags = READ_ONCE(page->flags);
+		new_flags = (old_flags & ~LRU_USAGE_MASK) | LRU_TIER_FLAGS |
+			    ((usage - 1UL) << LRU_USAGE_PGOFF);
+		if (old_flags == new_flags)
+			break;
+	} while (cmpxchg(&page->flags, old_flags, new_flags) != old_flags);
+}
+
+/* Return a token to be stored in the shadow entry of a page being evicted. */
+static void *lru_gen_eviction(struct page *page)
+{
+	int sid, tier;
+	unsigned long token;
+	unsigned long min_seq;
+	struct lruvec *lruvec;
+	struct lrugen *lrugen;
+	int file = page_is_file_lru(page);
+	int usage = page_tier_usage(page);
+	struct mem_cgroup *memcg = page_memcg(page);
+	struct pglist_data *pgdat = page_pgdat(page);
+
+	if (!lru_gen_enabled())
+		return NULL;
+
+	lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	lrugen = &lruvec->evictable;
+	min_seq = READ_ONCE(lrugen->min_seq[file]);
+	token = (min_seq << LRU_USAGE_SHIFT) | usage;
+
+	sid = sid_from_seq_or_gen(min_seq);
+	tier = lru_tier_from_usage(usage);
+	atomic_long_add(thp_nr_pages(page), &lrugen->evicted[sid][file][tier]);
+
+	return pack_shadow(mem_cgroup_id(memcg), pgdat, token);
+}
+
+/* Account a refaulted page based on the token stored in its shadow entry. */
+static bool lru_gen_refault(struct page *page, void *shadow)
+{
+	int sid, tier, usage;
+	int memcg_id;
+	unsigned long token;
+	unsigned long min_seq;
+	struct lruvec *lruvec;
+	struct lrugen *lrugen;
+	struct pglist_data *pgdat;
+	struct mem_cgroup *memcg;
+	int file = page_is_file_lru(page);
+
+	if (!lru_gen_enabled())
+		return false;
+
+	token = unpack_shadow(shadow, &memcg_id, &pgdat);
+	if (page_pgdat(page) != pgdat)
+		return true;
+
+	rcu_read_lock();
+	memcg = page_memcg_rcu(page);
+	if (mem_cgroup_id(memcg) != memcg_id)
+		goto unlock;
+
+	usage = token & (BIT(LRU_USAGE_SHIFT) - 1);
+	token >>= LRU_USAGE_SHIFT;
+
+	lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	lrugen = &lruvec->evictable;
+	min_seq = READ_ONCE(lrugen->min_seq[file]);
+	if (token != (min_seq & (EVICTION_MASK >> LRU_USAGE_SHIFT)))
+		goto unlock;
+
+	page_set_usage(page, usage);
+
+	sid = sid_from_seq_or_gen(min_seq);
+	tier = lru_tier_from_usage(usage);
+	atomic_long_add(thp_nr_pages(page), &lrugen->refaulted[sid][file][tier]);
+	inc_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file);
+	if (tier)
+		inc_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file);
+unlock:
+	rcu_read_unlock();
+
+	return true;
+}
+
+#else /* CONFIG_LRU_GEN */
+
+static void *lru_gen_eviction(struct page *page)
+{
+	return NULL;
 }
 
+static bool lru_gen_refault(struct page *page, void *shadow)
+{
+	return false;
+}
+
+#endif /* CONFIG_LRU_GEN */
+
 /**
  * workingset_age_nonresident - age non-resident entries as LRU ages
  * @lruvec: the lruvec that was aged
@@ -256,18 +357,25 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
 	unsigned long eviction;
 	struct lruvec *lruvec;
 	int memcgid;
+	void *shadow;
 
 	/* Page is fully exclusive and pins page's memory cgroup pointer */
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
+	shadow = lru_gen_eviction(page);
+	if (shadow)
+		return shadow;
+
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
 	/* XXX: target_memcg can be NULL, go through lruvec */
 	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
 	eviction = atomic_long_read(&lruvec->nonresident_age);
+	eviction >>= bucket_order;
+	eviction = (eviction << WORKINGSET_WIDTH) | PageWorkingset(page);
 	workingset_age_nonresident(lruvec, thp_nr_pages(page));
-	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
+	return pack_shadow(memcgid, pgdat, eviction);
 }
 
 /**
@@ -294,7 +402,10 @@ void workingset_refault(struct page *page, void *shadow)
 	bool workingset;
 	int memcgid;
 
-	unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
+	if (lru_gen_refault(page, shadow))
+		return;
+
+	eviction = unpack_shadow(shadow, &memcgid, &pgdat);
 
 	rcu_read_lock();
 	/*
@@ -318,6 +429,8 @@ void workingset_refault(struct page *page, void *shadow)
 		goto out;
 	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
 	refault = atomic_long_read(&eviction_lruvec->nonresident_age);
+	workingset = eviction & (BIT(WORKINGSET_WIDTH) - 1);
+	eviction = (eviction >> WORKINGSET_WIDTH) << bucket_order;
 
 	/*
 	 * Calculate the refault distance
@@ -335,7 +448,7 @@ void workingset_refault(struct page *page, void *shadow)
 	 * longest time, so the occasional inappropriate activation
 	 * leading to pressure on the active list is not a problem.
 	 */
-	refault_distance = (refault - eviction) & EVICTION_MASK;
+	refault_distance = (refault - eviction) & (EVICTION_MASK >> WORKINGSET_WIDTH);
 
 	/*
 	 * The activation decision for this page is made at the level
@@ -594,7 +707,7 @@ static int __init workingset_init(void)
 	unsigned int max_order;
 	int ret;
 
-	BUILD_BUG_ON(BITS_PER_LONG < EVICTION_SHIFT);
+	BUILD_BUG_ON(EVICTION_SHIFT < WORKINGSET_WIDTH);
 	/*
 	 * Calculate the eviction bucket size to cover the longest
 	 * actionable refault distance, which is currently half of
@@ -602,7 +715,7 @@ static int __init workingset_init(void)
 	 * some more pages at runtime, so keep working with up to
 	 * double the initial memory by using totalram_pages as-is.
 	 */
-	timestamp_bits = BITS_PER_LONG - EVICTION_SHIFT;
+	timestamp_bits = EVICTION_SHIFT - WORKINGSET_WIDTH;
 	max_order = fls_long(totalram_pages() - 1);
 	if (max_order > timestamp_bits)
 		bucket_order = max_order - timestamp_bits;
-- 
2.31.1.295.g9ea45b61b8-goog

@ -0,0 +1,814 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:27 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-11-yuzhao@google.com>
References: <20210413065633.2782273-1-yuzhao@google.com>
Subject: [PATCH v2 10/16] mm: multigenerational lru: mm_struct list
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Benjamin Manes <ben.manes@gmail.com>,
    Dave Chinner <david@fromorbit.com>,
    Dave Hansen <dave.hansen@linux.intel.com>,
    Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
    Johannes Weiner <hannes@cmpxchg.org>,
    Jonathan Corbet <corbet@lwn.net>,
    Joonsoo Kim <iamjoonsoo.kim@lge.com>,
    Matthew Wilcox <willy@infradead.org>,
    Mel Gorman <mgorman@suse.de>,
    Miaohe Lin <linmiaohe@huawei.com>,
    Michael Larabel <michael@michaellarabel.com>,
    Michal Hocko <mhocko@suse.com>,
    Michel Lespinasse <michel@lespinasse.org>,
    Rik van Riel <riel@surriel.com>,
    Roman Gushchin <guro@fb.com>,
    Rong Chen <rong.a.chen@intel.com>,
    SeongJae Park <sjpark@amazon.de>,
    Tim Chen <tim.c.chen@linux.intel.com>,
    Vlastimil Babka <vbabka@suse.cz>,
    Yang Shi <shy828301@gmail.com>,
    Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
    linux-kernel@vger.kernel.org, lkp@lists.01.org,
    page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-11-yuzhao@google.com/>

In order to scan page tables, we add an infrastructure to maintain
either a system-wide mm_struct list or per-memcg mm_struct lists.
Multiple threads can concurrently work on the same mm_struct list, and
each of them will be given a different mm_struct.

This infrastructure also tracks whether an mm_struct is being used on
any CPUs or has been used since the last time a worker looked at it.
In other words, workers will not be given an mm_struct that belongs to
a process that has been sleeping.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 fs/exec.c                  |   2 +
 include/linux/memcontrol.h |   6 +
 include/linux/mm_types.h   | 117 ++++++++++++++
 include/linux/mmzone.h     |   2 -
 kernel/exit.c              |   1 +
 kernel/fork.c              |  10 ++
 kernel/kthread.c           |   1 +
 kernel/sched/core.c        |   2 +
 mm/memcontrol.c            |  28 ++++
 mm/vmscan.c                | 316 +++++++++++++++++++++++++++++++++++++
 10 files changed, 483 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 18594f11c31f..c691d4d7720c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1008,6 +1008,7 @@ static int exec_mmap(struct mm_struct *mm)
 	active_mm = tsk->active_mm;
 	tsk->active_mm = mm;
 	tsk->mm = mm;
+	lru_gen_add_mm(mm);
 	/*
 	 * This prevents preemption while active_mm is being loaded and
 	 * it and mm are being updated, which could cause problems for
@@ -1018,6 +1019,7 @@ static int exec_mmap(struct mm_struct *mm)
 	if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
 		local_irq_enable();
 	activate_mm(active_mm, mm);
+	lru_gen_switch_mm(active_mm, mm);
 	if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
 		local_irq_enable();
 	tsk->mm->vmacache_seqnum = 0;
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f13dc02cf277..cff95ed1ee2b 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -212,6 +212,8 @@ struct obj_cgroup {
 	};
 };
 
+struct lru_gen_mm_list;
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
@@ -335,6 +337,10 @@ struct mem_cgroup {
 	struct deferred_split deferred_split_queue;
 #endif
 
+#ifdef CONFIG_LRU_GEN
+	struct lru_gen_mm_list *mm_list;
+#endif
+
 	struct mem_cgroup_per_node *nodeinfo[0];
 	/* WARNING: nodeinfo must be the last member here */
 };
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6613b26a8894..f8a239fbb958 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -15,6 +15,8 @@
 #include <linux/page-flags-layout.h>
 #include <linux/workqueue.h>
 #include <linux/seqlock.h>
+#include <linux/nodemask.h>
+#include <linux/mmdebug.h>
 
 #include <asm/mmu.h>
 
@@ -383,6 +385,8 @@ struct core_state {
 	struct completion startup;
 };
 
+#define ANON_AND_FILE 2
+
 struct kioctx_table;
 struct mm_struct {
 	struct {
@@ -561,6 +565,22 @@ struct mm_struct {
 
 #ifdef CONFIG_IOMMU_SUPPORT
 	u32 pasid;
+#endif
+#ifdef CONFIG_LRU_GEN
+	struct {
+		/* the node of a global or per-memcg mm_struct list */
+		struct list_head list;
+#ifdef CONFIG_MEMCG
+		/* points to memcg of the owner task above */
+		struct mem_cgroup *memcg;
+#endif
+		/* whether this mm_struct has been used since the last walk */
+		nodemask_t nodes[ANON_AND_FILE];
+#ifndef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+		/* the number of CPUs using this mm_struct */
+		atomic_t nr_cpus;
+#endif
+	} lrugen;
 #endif
 } __randomize_layout;
 
@@ -588,6 +608,103 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
 	return (struct cpumask *)&mm->cpu_bitmap;
 }
 
+#ifdef CONFIG_LRU_GEN
+
+void lru_gen_init_mm(struct mm_struct *mm);
+void lru_gen_add_mm(struct mm_struct *mm);
+void lru_gen_del_mm(struct mm_struct *mm);
+#ifdef CONFIG_MEMCG
+int lru_gen_alloc_mm_list(struct mem_cgroup *memcg);
+void lru_gen_free_mm_list(struct mem_cgroup *memcg);
+void lru_gen_migrate_mm(struct mm_struct *mm);
+#endif
+
+/*
+ * Track the usage so mm_struct's that haven't been used since the last walk can
+ * be skipped. This function adds a theoretical overhead to each context switch,
+ * which hasn't been measurable.
+ */
+static inline void lru_gen_switch_mm(struct mm_struct *old, struct mm_struct *new)
+{
+	int file;
+
+	/* exclude init_mm, efi_mm, etc. */
+	if (!core_kernel_data((unsigned long)old)) {
+		VM_BUG_ON(old == &init_mm);
+
+		for (file = 0; file < ANON_AND_FILE; file++)
+			nodes_setall(old->lrugen.nodes[file]);
+
+#ifndef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+		atomic_dec(&old->lrugen.nr_cpus);
+		VM_BUG_ON_MM(atomic_read(&old->lrugen.nr_cpus) < 0, old);
+#endif
+	} else
+		VM_BUG_ON_MM(READ_ONCE(old->lrugen.list.prev) ||
+			     READ_ONCE(old->lrugen.list.next), old);
+
+	if (!core_kernel_data((unsigned long)new)) {
+		VM_BUG_ON(new == &init_mm);
+
+#ifndef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+		atomic_inc(&new->lrugen.nr_cpus);
+		VM_BUG_ON_MM(atomic_read(&new->lrugen.nr_cpus) < 0, new);
+#endif
+	} else
+		VM_BUG_ON_MM(READ_ONCE(new->lrugen.list.prev) ||
+			     READ_ONCE(new->lrugen.list.next), new);
+}
+
+/* Return whether this mm_struct is being used on any CPUs. */
+static inline bool lru_gen_mm_is_active(struct mm_struct *mm)
+{
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+	return !cpumask_empty(mm_cpumask(mm));
+#else
+	return atomic_read(&mm->lrugen.nr_cpus);
+#endif
+}
|
||||||
|
+
|
||||||
|
+#else /* CONFIG_LRU_GEN */
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_init_mm(struct mm_struct *mm)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_add_mm(struct mm_struct *mm)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_del_mm(struct mm_struct *mm)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+#ifdef CONFIG_MEMCG
|
||||||
|
+static inline int lru_gen_alloc_mm_list(struct mem_cgroup *memcg)
|
||||||
|
+{
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_free_mm_list(struct mem_cgroup *memcg)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_migrate_mm(struct mm_struct *mm)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+#endif
|
||||||
|
+
|
||||||
|
+static inline void lru_gen_switch_mm(struct mm_struct *old, struct mm_struct *new)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static inline bool lru_gen_mm_is_active(struct mm_struct *mm)
|
||||||
|
+{
|
||||||
|
+ return false;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+#endif /* CONFIG_LRU_GEN */
|
||||||
|
+
|
||||||
|
struct mmu_gather;
|
||||||
|
extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm);
|
||||||
|
extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm);
|
||||||
|
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a60c7498afd7..dcfadf6a8c07 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -285,8 +285,6 @@ static inline bool is_active_lru(enum lru_list lru)
 	return (lru == LRU_ACTIVE_ANON || lru == LRU_ACTIVE_FILE);
 }
 
-#define ANON_AND_FILE 2
-
 enum lruvec_flags {
 	LRUVEC_CONGESTED,		/* lruvec has many dirty pages
 					 * backed by a congested BDI
diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e35e69a..e4292717ce37 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -422,6 +422,7 @@ void mm_update_next_owner(struct mm_struct *mm)
 		goto retry;
 	}
 	WRITE_ONCE(mm->owner, c);
+	lru_gen_migrate_mm(mm);
 	task_unlock(c);
 	put_task_struct(c);
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index 426cd0c51f9e..dfa84200229f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -665,6 +665,7 @@ static void check_mm(struct mm_struct *mm)
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	VM_BUG_ON_MM(mm->pmd_huge_pte, mm);
 #endif
+	VM_BUG_ON_MM(lru_gen_mm_is_active(mm), mm);
 }
 
 #define allocate_mm()	(kmem_cache_alloc(mm_cachep, GFP_KERNEL))
@@ -1055,6 +1056,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 		goto fail_nocontext;
 
 	mm->user_ns = get_user_ns(user_ns);
+	lru_gen_init_mm(mm);
 	return mm;
 
 fail_nocontext:
@@ -1097,6 +1099,7 @@ static inline void __mmput(struct mm_struct *mm)
 	}
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
+	lru_gen_del_mm(mm);
 	mmdrop(mm);
 }
 
@@ -2521,6 +2524,13 @@ pid_t kernel_clone(struct kernel_clone_args *args)
 		get_task_struct(p);
 	}
 
+	if (IS_ENABLED(CONFIG_LRU_GEN) && !(clone_flags & CLONE_VM)) {
+		/* lock the task to synchronize with memcg migration */
+		task_lock(p);
+		lru_gen_add_mm(p->mm);
+		task_unlock(p);
+	}
+
 	wake_up_new_task(p);
 
 	/* forking complete and child started to run, tell ptracer */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1578973c5740..8da7767bb06a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1303,6 +1303,7 @@ void kthread_use_mm(struct mm_struct *mm)
 	tsk->mm = mm;
 	membarrier_update_current_mm(mm);
 	switch_mm_irqs_off(active_mm, mm, tsk);
+	lru_gen_switch_mm(active_mm, mm);
 	local_irq_enable();
 	task_unlock(tsk);
 #ifdef finish_arch_post_lock_switch
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 98191218d891..bd626dbdb816 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4306,6 +4306,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 * finish_task_switch()'s mmdrop().
 	 */
 	switch_mm_irqs_off(prev->active_mm, next->mm, next);
+	lru_gen_switch_mm(prev->active_mm, next->mm);
 
 	if (!prev->mm) {                                // from kernel
 		/* will mmdrop() in finish_task_switch(). */
@@ -7597,6 +7598,7 @@ void idle_task_exit(void)
 
 	if (mm != &init_mm) {
 		switch_mm(mm, &init_mm, current);
+		lru_gen_switch_mm(mm, &init_mm);
 		finish_arch_post_lock_switch();
 	}
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e064ac0d850a..496e91e813af 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5206,6 +5206,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
 		free_mem_cgroup_per_node_info(memcg, node);
 	free_percpu(memcg->vmstats_percpu);
 	free_percpu(memcg->vmstats_local);
+	lru_gen_free_mm_list(memcg);
 	kfree(memcg);
 }
 
@@ -5258,6 +5259,9 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 		if (alloc_mem_cgroup_per_node_info(memcg, node))
 			goto fail;
 
+	if (lru_gen_alloc_mm_list(memcg))
+		goto fail;
+
 	if (memcg_wb_domain_init(memcg, GFP_KERNEL))
 		goto fail;
 
@@ -6162,6 +6166,29 @@ static void mem_cgroup_move_task(void)
 }
 #endif
 
+#ifdef CONFIG_LRU_GEN
+static void mem_cgroup_attach(struct cgroup_taskset *tset)
+{
+	struct cgroup_subsys_state *css;
+	struct task_struct *task = NULL;
+
+	cgroup_taskset_for_each_leader(task, css, tset)
+		;
+
+	if (!task)
+		return;
+
+	task_lock(task);
+	if (task->mm && task->mm->owner == task)
+		lru_gen_migrate_mm(task->mm);
+	task_unlock(task);
+}
+#else
+static void mem_cgroup_attach(struct cgroup_taskset *tset)
+{
+}
+#endif
+
 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
 {
 	if (value == PAGE_COUNTER_MAX)
@@ -6502,6 +6529,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_free = mem_cgroup_css_free,
 	.css_reset = mem_cgroup_css_reset,
 	.can_attach = mem_cgroup_can_attach,
+	.attach = mem_cgroup_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.post_attach = mem_cgroup_move_task,
 	.dfl_cftypes = memory_files,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c74ebe2039f7..d67dfd1e3930 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4464,6 +4464,313 @@ static bool positive_ctrl_err(struct controller_pos *sp, struct controller_pos *
 	       sp->refaulted * max(pv->total, 1UL) * pv->gain;
 }
 
+/******************************************************************************
+ *                          mm_struct list
+ ******************************************************************************/
+
+enum {
+	MM_SCHED_ACTIVE,	/* running processes */
+	MM_SCHED_INACTIVE,	/* sleeping processes */
+	MM_LOCK_CONTENTION,	/* lock contentions */
+	MM_VMA_INTERVAL,	/* VMAs within the range of the current table */
+	MM_LEAF_OTHER_NODE,	/* entries not from node under reclaim */
+	MM_LEAF_OTHER_MEMCG,	/* entries not from memcg under reclaim */
+	MM_LEAF_OLD,		/* old entries */
+	MM_LEAF_YOUNG,		/* young entries */
+	MM_LEAF_DIRTY,		/* dirty entries */
+	MM_LEAF_HOLE,		/* non-present entries */
+	MM_NONLEAF_OLD,		/* old non-leaf pmd entries */
+	MM_NONLEAF_YOUNG,	/* young non-leaf pmd entries */
+	NR_MM_STATS
+};
+
+/* mnemonic codes for the stats above */
+#define MM_STAT_CODES "aicvnmoydhlu"
+
+struct lru_gen_mm_list {
+	/* the head of a global or per-memcg mm_struct list */
+	struct list_head head;
+	/* protects the list */
+	spinlock_t lock;
+	struct {
+		/* set to max_seq after each round of walk */
+		unsigned long cur_seq;
+		/* the next mm on the list to walk */
+		struct list_head *iter;
+		/* to wait for the last worker to finish */
+		struct wait_queue_head wait;
+		/* the number of concurrent workers */
+		int nr_workers;
+		/* stats for debugging */
+		unsigned long stats[NR_STAT_GENS][NR_MM_STATS];
+	} nodes[0];
+};
+
+static struct lru_gen_mm_list *global_mm_list;
+
+static struct lru_gen_mm_list *alloc_mm_list(void)
+{
+	int nid;
+	struct lru_gen_mm_list *mm_list;
+
+	mm_list = kzalloc(struct_size(mm_list, nodes, nr_node_ids), GFP_KERNEL);
+	if (!mm_list)
+		return NULL;
+
+	INIT_LIST_HEAD(&mm_list->head);
+	spin_lock_init(&mm_list->lock);
+
+	for_each_node(nid) {
+		mm_list->nodes[nid].cur_seq = MIN_NR_GENS;
+		mm_list->nodes[nid].iter = &mm_list->head;
+		init_waitqueue_head(&mm_list->nodes[nid].wait);
+	}
+
+	return mm_list;
+}
+
+static struct lru_gen_mm_list *get_mm_list(struct mem_cgroup *memcg)
+{
+#ifdef CONFIG_MEMCG
+	if (!mem_cgroup_disabled())
+		return memcg ? memcg->mm_list : root_mem_cgroup->mm_list;
+#endif
+	VM_BUG_ON(memcg);
+
+	return global_mm_list;
+}
+
+void lru_gen_init_mm(struct mm_struct *mm)
+{
+	int file;
+
+	INIT_LIST_HEAD(&mm->lrugen.list);
+#ifdef CONFIG_MEMCG
+	mm->lrugen.memcg = NULL;
+#endif
+#ifndef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+	atomic_set(&mm->lrugen.nr_cpus, 0);
+#endif
+	for (file = 0; file < ANON_AND_FILE; file++)
+		nodes_clear(mm->lrugen.nodes[file]);
+}
+
+void lru_gen_add_mm(struct mm_struct *mm)
+{
+	struct mem_cgroup *memcg = get_mem_cgroup_from_mm(mm);
+	struct lru_gen_mm_list *mm_list = get_mm_list(memcg);
+
+	VM_BUG_ON_MM(!list_empty(&mm->lrugen.list), mm);
+#ifdef CONFIG_MEMCG
+	VM_BUG_ON_MM(mm->lrugen.memcg, mm);
+	WRITE_ONCE(mm->lrugen.memcg, memcg);
+#endif
+	spin_lock(&mm_list->lock);
+	list_add_tail(&mm->lrugen.list, &mm_list->head);
+	spin_unlock(&mm_list->lock);
+}
+
+void lru_gen_del_mm(struct mm_struct *mm)
+{
+	int nid;
+#ifdef CONFIG_MEMCG
+	struct lru_gen_mm_list *mm_list = get_mm_list(mm->lrugen.memcg);
+#else
+	struct lru_gen_mm_list *mm_list = get_mm_list(NULL);
+#endif
+
+	spin_lock(&mm_list->lock);
+
+	for_each_node(nid) {
+		if (mm_list->nodes[nid].iter != &mm->lrugen.list)
+			continue;
+
+		mm_list->nodes[nid].iter = mm_list->nodes[nid].iter->next;
+		if (mm_list->nodes[nid].iter == &mm_list->head)
+			WRITE_ONCE(mm_list->nodes[nid].cur_seq,
+				   mm_list->nodes[nid].cur_seq + 1);
+	}
+
+	list_del_init(&mm->lrugen.list);
+
+	spin_unlock(&mm_list->lock);
+
+#ifdef CONFIG_MEMCG
+	mem_cgroup_put(mm->lrugen.memcg);
+	WRITE_ONCE(mm->lrugen.memcg, NULL);
+#endif
+}
+
+#ifdef CONFIG_MEMCG
+int lru_gen_alloc_mm_list(struct mem_cgroup *memcg)
+{
+	if (mem_cgroup_disabled())
+		return 0;
+
+	memcg->mm_list = alloc_mm_list();
+
+	return memcg->mm_list ? 0 : -ENOMEM;
+}
+
+void lru_gen_free_mm_list(struct mem_cgroup *memcg)
+{
+	kfree(memcg->mm_list);
+	memcg->mm_list = NULL;
+}
+
+void lru_gen_migrate_mm(struct mm_struct *mm)
+{
+	struct mem_cgroup *memcg;
+
+	lockdep_assert_held(&mm->owner->alloc_lock);
+
+	if (mem_cgroup_disabled())
+		return;
+
+	rcu_read_lock();
+	memcg = mem_cgroup_from_task(mm->owner);
+	rcu_read_unlock();
+	if (memcg == mm->lrugen.memcg)
+		return;
+
+	VM_BUG_ON_MM(!mm->lrugen.memcg, mm);
+	VM_BUG_ON_MM(list_empty(&mm->lrugen.list), mm);
+
+	lru_gen_del_mm(mm);
+	lru_gen_add_mm(mm);
+}
+
+static bool mm_has_migrated(struct mm_struct *mm, struct mem_cgroup *memcg)
+{
+	return READ_ONCE(mm->lrugen.memcg) != memcg;
+}
+#else
+static bool mm_has_migrated(struct mm_struct *mm, struct mem_cgroup *memcg)
+{
+	return false;
+}
+#endif
+
+struct mm_walk_args {
+	struct mem_cgroup *memcg;
+	unsigned long max_seq;
+	unsigned long next_addr;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
+	int node_id;
+	int batch_size;
+	int mm_stats[NR_MM_STATS];
+	int nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
+	bool should_walk[ANON_AND_FILE];
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG)
+	unsigned long bitmap[BITS_TO_LONGS(PTRS_PER_PMD)];
+#endif
+};
+
+static void reset_mm_stats(struct lru_gen_mm_list *mm_list, bool last,
+			   struct mm_walk_args *args)
+{
+	int i;
+	int nid = args->node_id;
+	int sid = sid_from_seq_or_gen(args->max_seq);
+
+	lockdep_assert_held(&mm_list->lock);
+
+	for (i = 0; i < NR_MM_STATS; i++) {
+		WRITE_ONCE(mm_list->nodes[nid].stats[sid][i],
+			   mm_list->nodes[nid].stats[sid][i] + args->mm_stats[i]);
+		args->mm_stats[i] = 0;
+	}
+
+	if (!last || NR_STAT_GENS == 1)
+		return;
+
+	sid = sid_from_seq_or_gen(args->max_seq + 1);
+	for (i = 0; i < NR_MM_STATS; i++)
+		WRITE_ONCE(mm_list->nodes[nid].stats[sid][i], 0);
+}
+
+static bool should_skip_mm(struct mm_struct *mm, int nid, int swappiness)
+{
+	int file;
+	unsigned long size = 0;
+
+	if (mm_is_oom_victim(mm))
+		return true;
+
+	for (file = !swappiness; file < ANON_AND_FILE; file++) {
+		if (lru_gen_mm_is_active(mm) || node_isset(nid, mm->lrugen.nodes[file]))
+			size += file ? get_mm_counter(mm, MM_FILEPAGES) :
+				       get_mm_counter(mm, MM_ANONPAGES) +
+				       get_mm_counter(mm, MM_SHMEMPAGES);
+	}
+
+	/* leave the legwork to the rmap if mapped pages are too sparse */
+	if (size < max(SWAP_CLUSTER_MAX, mm_pgtables_bytes(mm) / PAGE_SIZE))
+		return true;
+
+	return !mmget_not_zero(mm);
+}
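The density check in should_skip_mm() above can be lifted into a tiny userspace sketch. The constants and the function name below are illustrative stand-ins, not the kernel's: the idea is only that a page table walk pays off when the mapped pages outnumber max(SWAP_CLUSTER_MAX, page-table bytes / page size).

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel constants used above. */
#define SWAP_CLUSTER_MAX	32UL
#define PAGE_SIZE_		4096UL

/* Returns nonzero when a page table walk is worthwhile: the mapped
 * pages must reach max(SWAP_CLUSTER_MAX, pgtable_bytes / PAGE_SIZE);
 * below that, the rmap handles the few mapped pages more cheaply. */
static int worth_walking(unsigned long mapped_pages, unsigned long pgtable_bytes)
{
	unsigned long min_pages = pgtable_bytes / PAGE_SIZE_;

	if (min_pages < SWAP_CLUSTER_MAX)
		min_pages = SWAP_CLUSTER_MAX;

	return mapped_pages >= min_pages;
}
```

The second term matters for huge sparse mappings: a process with 200 pages of page tables has to map at least 200 resident pages before the walk is cheaper than rmap-based scanning.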
+
+/* To support multiple workers that concurrently walk the mm_struct list. */
+static bool get_next_mm(struct mm_walk_args *args, int swappiness, struct mm_struct **iter)
+{
+	bool last = true;
+	struct mm_struct *mm = NULL;
+	int nid = args->node_id;
+	struct lru_gen_mm_list *mm_list = get_mm_list(args->memcg);
+
+	if (*iter)
+		mmput_async(*iter);
+	else if (args->max_seq <= READ_ONCE(mm_list->nodes[nid].cur_seq))
+		return false;
+
+	spin_lock(&mm_list->lock);
+
+	VM_BUG_ON(args->max_seq > mm_list->nodes[nid].cur_seq + 1);
+	VM_BUG_ON(*iter && args->max_seq < mm_list->nodes[nid].cur_seq);
+	VM_BUG_ON(*iter && !mm_list->nodes[nid].nr_workers);
+
+	if (args->max_seq <= mm_list->nodes[nid].cur_seq) {
+		last = *iter;
+		goto done;
+	}
+
+	if (mm_list->nodes[nid].iter == &mm_list->head) {
+		VM_BUG_ON(*iter || mm_list->nodes[nid].nr_workers);
+		mm_list->nodes[nid].iter = mm_list->nodes[nid].iter->next;
+	}
+
+	while (!mm && mm_list->nodes[nid].iter != &mm_list->head) {
+		mm = list_entry(mm_list->nodes[nid].iter, struct mm_struct, lrugen.list);
+		mm_list->nodes[nid].iter = mm_list->nodes[nid].iter->next;
+		if (should_skip_mm(mm, nid, swappiness))
+			mm = NULL;
+
+		args->mm_stats[mm ? MM_SCHED_ACTIVE : MM_SCHED_INACTIVE]++;
+	}
+
+	if (mm_list->nodes[nid].iter == &mm_list->head)
+		WRITE_ONCE(mm_list->nodes[nid].cur_seq,
+			   mm_list->nodes[nid].cur_seq + 1);
+done:
+	if (*iter && !mm)
+		mm_list->nodes[nid].nr_workers--;
+	if (!*iter && mm)
+		mm_list->nodes[nid].nr_workers++;
+
+	last = last && !mm_list->nodes[nid].nr_workers &&
+	       mm_list->nodes[nid].iter == &mm_list->head;
+
+	reset_mm_stats(mm_list, last, args);
+
+	spin_unlock(&mm_list->lock);
+
+	*iter = mm;
+
+	return last;
+}
+
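As a rough userspace analogue of the per-node cursor that get_next_mm() maintains, the sketch below (all names hypothetical) keeps only the round-robin core: hand out entries of a circular list until the cursor wraps back to the sentinel, then bump the sequence number so callers of the finished round get NULL. Locking, skipping, and the worker accounting are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of the per-node cursor in lru_gen_mm_list:
 * a circular list with a sentinel head, a shared iterator, and a
 * sequence number bumped once per completed round. */
struct item {
	struct item *next;
};

struct walk_list {
	struct item head;	/* sentinel; head.next is the first item */
	struct item *iter;	/* next item to hand out */
	unsigned long cur_seq;	/* rounds completed so far */
};

/* Hand out the next item of the round identified by max_seq, or NULL
 * once that round is complete. */
static struct item *walk_next(struct walk_list *wl, unsigned long max_seq)
{
	struct item *it;

	if (max_seq <= wl->cur_seq)	/* this round has already finished */
		return NULL;

	if (wl->iter == &wl->head)	/* starting a new round */
		wl->iter = wl->head.next;

	if (wl->iter == &wl->head) {	/* empty list: round ends at once */
		wl->cur_seq++;
		return NULL;
	}

	it = wl->iter;
	wl->iter = it->next;
	if (wl->iter == &wl->head)	/* wrapped: the round is complete */
		wl->cur_seq++;

	return it;
}
```

Because cur_seq only advances when the cursor reaches the sentinel, a caller asking for an already-completed round bails out immediately, which is how concurrent workers in the patch avoid re-walking the same round.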
/******************************************************************************
 *                          state change
 ******************************************************************************/
@@ -4694,6 +5001,15 @@ static int __init init_lru_gen(void)
 {
 	BUILD_BUG_ON(MIN_NR_GENS + 1 >= MAX_NR_GENS);
 	BUILD_BUG_ON(BIT(LRU_GEN_WIDTH) <= MAX_NR_GENS);
+	BUILD_BUG_ON(sizeof(MM_STAT_CODES) != NR_MM_STATS + 1);
+
+	if (mem_cgroup_disabled()) {
+		global_mm_list = alloc_mm_list();
+		if (!global_mm_list) {
+			pr_err("lru_gen: failed to allocate global mm_struct list\n");
+			return -ENOMEM;
+		}
+	}
 
 	if (hotplug_memory_notifier(lru_gen_online_mem, 0))
 		pr_err("lru_gen: failed to subscribe hotplug notifications\n");
-- 
2.31.1.295.g9ea45b61b8-goog

@@ -0,0 +1,853 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
Date: Tue, 13 Apr 2021 00:56:28 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-12-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 11/16] mm: multigenerational lru: aging
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-12-yuzhao@google.com/>

The aging produces young generations. Given an lruvec, the aging walks
the mm_struct list associated with this lruvec to scan page tables for
referenced pages. Upon finding one, the aging updates the generation
number of this page to max_seq. After each round of scan, the aging
increments max_seq. The aging is due when both elements of min_seq[2]
reach max_seq-1, assuming both anon and file types are reclaimable.

The aging uses the following optimizations when scanning page tables:
  1) It will not scan page tables from processes that have been
  sleeping since the last scan.
  2) It will not scan PTE tables under non-leaf PMD entries that do
  not have the accessed bit set, when
  CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y.
  3) It will not zigzag between the PGD table and the same PMD or PTE
  table spanning multiple VMAs. In other words, it finishes all the
  VMAs within the range of the same PMD or PTE table before it returns
  to the PGD table. This optimizes workloads that have large numbers
  of tiny VMAs, especially when CONFIG_PGTABLE_LEVELS=5.

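The "aging is due" rule stated above can be expressed as a minimal sketch; the function name and parameters below are illustrative, not from the patch, and the per-type reclaimable flag models cases like a swapless system where the anon window is ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define ANON_AND_FILE 2

/* Sketch of the rule above: the aging is due once every reclaimable
 * type's min_seq has caught up to max_seq-1, i.e. only the youngest
 * generations remain to be evicted. */
static bool aging_is_due(unsigned long max_seq,
			 const unsigned long min_seq[ANON_AND_FILE],
			 const bool reclaimable[ANON_AND_FILE])
{
	int type;

	for (type = 0; type < ANON_AND_FILE; type++) {
		if (!reclaimable[type])
			continue;	/* e.g. no swap: ignore the anon window */
		if (min_seq[type] + 1 < max_seq)
			return false;	/* older generations still to evict */
	}

	return true;
}
```

With max_seq=5, both min_seq values must reach 4 before another round of page table scanning is warranted; a lagging but unreclaimable type does not hold the aging back.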
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/vmscan.c | 700 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 700 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d67dfd1e3930..31e1b4155677 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -50,6 +50,7 @@
 #include <linux/dax.h>
 #include <linux/psi.h>
 #include <linux/memory.h>
+#include <linux/pagewalk.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -4771,6 +4772,702 @@ static bool get_next_mm(struct mm_walk_args *args, int swappiness, struct mm_str
 	return last;
 }
 
+/******************************************************************************
+ *                          the aging
+ ******************************************************************************/
+
+static void update_batch_size(struct page *page, int old_gen, int new_gen,
+			      struct mm_walk_args *args)
+{
+	int file = page_is_file_lru(page);
+	int zone = page_zonenum(page);
+	int delta = thp_nr_pages(page);
+
+	VM_BUG_ON(old_gen >= MAX_NR_GENS);
+	VM_BUG_ON(new_gen >= MAX_NR_GENS);
+
+	args->batch_size++;
+
+	args->nr_pages[old_gen][file][zone] -= delta;
+	args->nr_pages[new_gen][file][zone] += delta;
+}
+
+static void reset_batch_size(struct lruvec *lruvec, struct mm_walk_args *args)
+{
+	int gen, file, zone;
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	args->batch_size = 0;
+
+	spin_lock_irq(&lruvec->lru_lock);
+
+	for_each_gen_type_zone(gen, file, zone) {
+		enum lru_list lru = LRU_FILE * file;
+		int total = args->nr_pages[gen][file][zone];
+
+		if (!total)
+			continue;
+
+		args->nr_pages[gen][file][zone] = 0;
+		WRITE_ONCE(lrugen->sizes[gen][file][zone],
+			   lrugen->sizes[gen][file][zone] + total);
+
+		if (lru_gen_is_active(lruvec, gen))
+			lru += LRU_ACTIVE;
+		update_lru_size(lruvec, lru, zone, total);
+	}
+
+	spin_unlock_irq(&lruvec->lru_lock);
+}
+
+static int page_update_gen(struct page *page, int new_gen)
+{
+	int old_gen;
+	unsigned long old_flags, new_flags;
+
+	VM_BUG_ON(new_gen >= MAX_NR_GENS);
+
+	do {
+		old_flags = READ_ONCE(page->flags);
+
+		old_gen = ((old_flags & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
+		if (old_gen < 0)
+			new_flags = old_flags | BIT(PG_referenced);
+		else
+			new_flags = (old_flags & ~(LRU_GEN_MASK | LRU_USAGE_MASK |
+				    LRU_TIER_FLAGS)) | ((new_gen + 1UL) << LRU_GEN_PGOFF);
+
+		if (old_flags == new_flags)
+			break;
+	} while (cmpxchg(&page->flags, old_flags, new_flags) != old_flags);
+
+	return old_gen;
+}
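page_update_gen() above can be mimicked in userspace with C11 atomics. This sketch keeps only the compare-and-swap core: the PG_referenced fallback and the extra usage/tier bits are omitted, and the field layout (offset 8, 4 bits) is made up for illustration. The generation is stored as gen+1 so that 0 means "not tracked".

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative field layout, not the kernel's. */
#define GEN_PGOFF	8
#define GEN_MASK	(0xfUL << GEN_PGOFF)

/* Move the generation field of *flags to new_gen with compare-and-swap,
 * so a concurrent update is never blindly clobbered; returns the
 * previous generation, or -1 when the field was not set. */
static int flags_update_gen(_Atomic unsigned long *flags, int new_gen)
{
	unsigned long old_flags, new_flags;
	int old_gen;

	old_flags = atomic_load(flags);
	do {
		old_gen = (int)((old_flags & GEN_MASK) >> GEN_PGOFF) - 1;
		if (old_gen < 0)	/* not on an lru list: nothing to move */
			return old_gen;

		new_flags = (old_flags & ~GEN_MASK) |
			    ((unsigned long)(new_gen + 1) << GEN_PGOFF);
		/* on failure, old_flags is reloaded and the loop retries */
	} while (!atomic_compare_exchange_weak(flags, &old_flags, new_flags));

	return old_gen;
}
```

The kernel loop re-reads page->flags explicitly and compares cmpxchg's return value; atomic_compare_exchange_weak folds the reload into its failure path, so the two retry loops are equivalent.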
+
|
||||||
|
+static int should_skip_vma(unsigned long start, unsigned long end, struct mm_walk *walk)
|
||||||
|
+{
|
||||||
|
+ struct vm_area_struct *vma = walk->vma;
|
||||||
|
+ struct mm_walk_args *args = walk->private;
|
||||||
|
+
|
||||||
|
+ if (!vma_is_accessible(vma) || is_vm_hugetlb_page(vma) ||
|
||||||
|
+ (vma->vm_flags & (VM_LOCKED | VM_SPECIAL)))
|
||||||
|
+ return true;
|
||||||
|
+
|
||||||
|
+ if (vma_is_anonymous(vma))
|
||||||
|
+ return !args->should_walk[0];
|
||||||
|
+
|
||||||
|
+ if (vma_is_shmem(vma))
|
||||||
|
+ return !args->should_walk[0] ||
|
||||||
|
+ mapping_unevictable(vma->vm_file->f_mapping);
|
||||||
|
+
|
||||||
|
+ return !args->should_walk[1] || vma_is_dax(vma) ||
|
||||||
|
+ vma == get_gate_vma(vma->vm_mm) ||
|
||||||
|
+ mapping_unevictable(vma->vm_file->f_mapping);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+/*
|
||||||
|
+ * Some userspace memory allocators create many single-page VMAs. So instead of
|
||||||
|
+ * returning back to the PGD table for each of such VMAs, we finish at least an
|
||||||
|
+ * entire PMD table and therefore avoid many zigzags. This optimizes page table
|
||||||
|
+ * walks for workloads that have large numbers of tiny VMAs.
|
||||||
|
+ *
|
||||||
|
+ * We scan PMD tables in two pass. The first pass reaches to PTE tables and
|
||||||
|
+ * doesn't take the PMD lock. The second pass clears the accessed bit on PMD
|
||||||
|
+ * entries and needs to take the PMD lock. The second pass is only done on the
|
||||||
|
+ * PMD entries that first pass has found the accessed bit is set, and they must
|
||||||
|
+ * be:
|
||||||
|
+ * 1) leaf entries mapping huge pages from the node under reclaim
|
||||||
|
+ * 2) non-leaf entries whose leaf entries only map pages from the node under
|
||||||
|
+ * reclaim, when CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y.
|
||||||
|
+ */
|
||||||
|
+static bool get_next_interval(struct mm_walk *walk, unsigned long mask, unsigned long size,
+			      unsigned long *start, unsigned long *end)
+{
+	unsigned long next = round_up(*end, size);
+	struct mm_walk_args *args = walk->private;
+
+	VM_BUG_ON(mask & size);
+	VM_BUG_ON(*start != *end);
+	VM_BUG_ON(!(*end & ~mask));
+	VM_BUG_ON((*end & mask) != (next & mask));
+
+	while (walk->vma) {
+		if (next >= walk->vma->vm_end) {
+			walk->vma = walk->vma->vm_next;
+			continue;
+		}
+
+		if ((next & mask) != (walk->vma->vm_start & mask))
+			return false;
+
+		if (next <= walk->vma->vm_start &&
+		    should_skip_vma(walk->vma->vm_start, walk->vma->vm_end, walk)) {
+			walk->vma = walk->vma->vm_next;
+			continue;
+		}
+
+		args->mm_stats[MM_VMA_INTERVAL]++;
+
+		*start = max(next, walk->vma->vm_start);
+		next = (next | ~mask) + 1;
+		/* rounded-up boundaries can wrap to 0 */
+		*end = next && next < walk->vma->vm_end ? next : walk->vma->vm_end;
+
+		return true;
+	}
+
+	return false;
+}
+
+static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
+			   struct mm_walk *walk)
+{
+	int i;
+	pte_t *pte;
+	spinlock_t *ptl;
+	int remote = 0;
+	struct mm_walk_args *args = walk->private;
+	int old_gen, new_gen = lru_gen_from_seq(args->max_seq);
+
+	VM_BUG_ON(pmd_leaf(*pmd));
+
+	pte = pte_offset_map_lock(walk->mm, pmd, start & PMD_MASK, &ptl);
+	arch_enter_lazy_mmu_mode();
+restart:
+	for (i = pte_index(start); start != end; i++, start += PAGE_SIZE) {
+		struct page *page;
+		unsigned long pfn = pte_pfn(pte[i]);
+
+		if (!pte_present(pte[i]) || is_zero_pfn(pfn)) {
+			args->mm_stats[MM_LEAF_HOLE]++;
+			continue;
+		}
+
+		if (!pte_young(pte[i])) {
+			args->mm_stats[MM_LEAF_OLD]++;
+			continue;
+		}
+
+		if (pfn < args->start_pfn || pfn >= args->end_pfn) {
+			remote++;
+			args->mm_stats[MM_LEAF_OTHER_NODE]++;
+			continue;
+		}
+
+		page = compound_head(pfn_to_page(pfn));
+		if (page_to_nid(page) != args->node_id) {
+			remote++;
+			args->mm_stats[MM_LEAF_OTHER_NODE]++;
+			continue;
+		}
+
+		if (!ptep_test_and_clear_young(walk->vma, start, pte + i))
+			continue;
+
+		if (pte_dirty(pte[i]) && !PageDirty(page) &&
+		    !(PageAnon(page) && PageSwapBacked(page) && !PageSwapCache(page))) {
+			set_page_dirty(page);
+			args->mm_stats[MM_LEAF_DIRTY]++;
+		}
+
+		if (page_memcg_rcu(page) != args->memcg) {
+			args->mm_stats[MM_LEAF_OTHER_MEMCG]++;
+			continue;
+		}
+
+		old_gen = page_update_gen(page, new_gen);
+		if (old_gen >= 0 && old_gen != new_gen)
+			update_batch_size(page, old_gen, new_gen, args);
+		args->mm_stats[MM_LEAF_YOUNG]++;
+	}
+
+	if (i < PTRS_PER_PTE && get_next_interval(walk, PMD_MASK, PAGE_SIZE, &start, &end))
+		goto restart;
+
+	arch_leave_lazy_mmu_mode();
+	pte_unmap_unlock(pte, ptl);
+
+	return !remote;
+}
+
+static bool walk_pmd_range_unlocked(pud_t *pud, unsigned long start, unsigned long end,
+				    struct mm_walk *walk)
+{
+	int i;
+	pmd_t *pmd;
+	unsigned long next;
+	int young = 0;
+	struct mm_walk_args *args = walk->private;
+
+	VM_BUG_ON(pud_leaf(*pud));
+
+	pmd = pmd_offset(pud, start & PUD_MASK);
+restart:
+	for (i = pmd_index(start); start != end; i++, start = next) {
+		pmd_t val = pmd_read_atomic(pmd + i);
+
+		next = pmd_addr_end(start, end);
+
+		barrier();
+		if (!pmd_present(val) || is_huge_zero_pmd(val)) {
+			args->mm_stats[MM_LEAF_HOLE]++;
+			continue;
+		}
+
+		if (pmd_trans_huge(val)) {
+			unsigned long pfn = pmd_pfn(val);
+
+			if (!pmd_young(val)) {
+				args->mm_stats[MM_LEAF_OLD]++;
+				continue;
+			}
+
+			if (pfn < args->start_pfn || pfn >= args->end_pfn) {
+				args->mm_stats[MM_LEAF_OTHER_NODE]++;
+				continue;
+			}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+			young++;
+			__set_bit(i, args->bitmap);
+#endif
+			continue;
+		}
+
+#ifdef CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG
+		if (!pmd_young(val)) {
+			args->mm_stats[MM_NONLEAF_OLD]++;
+			continue;
+		}
+#endif
+
+		if (walk_pte_range(&val, start, next, walk)) {
+#ifdef CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG
+			young++;
+			__set_bit(i, args->bitmap);
+#endif
+		}
+	}
+
+	if (i < PTRS_PER_PMD && get_next_interval(walk, PUD_MASK, PMD_SIZE, &start, &end))
+		goto restart;
+
+	return young;
+}
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG)
+static void walk_pmd_range_locked(pud_t *pud, unsigned long start, unsigned long end,
+				  struct mm_walk *walk)
+{
+	int i;
+	pmd_t *pmd;
+	spinlock_t *ptl;
+	struct mm_walk_args *args = walk->private;
+	int old_gen, new_gen = lru_gen_from_seq(args->max_seq);
+
+	VM_BUG_ON(pud_leaf(*pud));
+
+	start &= PUD_MASK;
+	pmd = pmd_offset(pud, start);
+	ptl = pmd_lock(walk->mm, pmd);
+	arch_enter_lazy_mmu_mode();
+
+	for_each_set_bit(i, args->bitmap, PTRS_PER_PMD) {
+		struct page *page;
+		unsigned long pfn = pmd_pfn(pmd[i]);
+		unsigned long addr = start + PMD_SIZE * i;
+
+		if (!pmd_present(pmd[i]) || is_huge_zero_pmd(pmd[i])) {
+			args->mm_stats[MM_LEAF_HOLE]++;
+			continue;
+		}
+
+		if (!pmd_young(pmd[i])) {
+			args->mm_stats[MM_LEAF_OLD]++;
+			continue;
+		}
+
+		if (!pmd_trans_huge(pmd[i])) {
+#ifdef CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG
+			args->mm_stats[MM_NONLEAF_YOUNG]++;
+			pmdp_test_and_clear_young(walk->vma, addr, pmd + i);
+#endif
+			continue;
+		}
+
+		if (pfn < args->start_pfn || pfn >= args->end_pfn) {
+			args->mm_stats[MM_LEAF_OTHER_NODE]++;
+			continue;
+		}
+
+		page = pfn_to_page(pfn);
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		if (page_to_nid(page) != args->node_id) {
+			args->mm_stats[MM_LEAF_OTHER_NODE]++;
+			continue;
+		}
+
+		if (!pmdp_test_and_clear_young(walk->vma, addr, pmd + i))
+			continue;
+
+		if (pmd_dirty(pmd[i]) && !PageDirty(page) &&
+		    !(PageAnon(page) && PageSwapBacked(page) && !PageSwapCache(page))) {
+			set_page_dirty(page);
+			args->mm_stats[MM_LEAF_DIRTY]++;
+		}
+
+		if (page_memcg_rcu(page) != args->memcg) {
+			args->mm_stats[MM_LEAF_OTHER_MEMCG]++;
+			continue;
+		}
+
+		old_gen = page_update_gen(page, new_gen);
+		if (old_gen >= 0 && old_gen != new_gen)
+			update_batch_size(page, old_gen, new_gen, args);
+		args->mm_stats[MM_LEAF_YOUNG]++;
+	}
+
+	arch_leave_lazy_mmu_mode();
+	spin_unlock(ptl);
+
+	memset(args->bitmap, 0, sizeof(args->bitmap));
+}
+#else
+static void walk_pmd_range_locked(pud_t *pud, unsigned long start, unsigned long end,
+				  struct mm_walk *walk)
+{
+}
+#endif
+
+static int walk_pud_range(p4d_t *p4d, unsigned long start, unsigned long end,
+			  struct mm_walk *walk)
+{
+	int i;
+	pud_t *pud;
+	unsigned long next;
+	struct mm_walk_args *args = walk->private;
+
+	VM_BUG_ON(p4d_leaf(*p4d));
+
+	pud = pud_offset(p4d, start & P4D_MASK);
+restart:
+	for (i = pud_index(start); start != end; i++, start = next) {
+		pud_t val = READ_ONCE(pud[i]);
+
+		next = pud_addr_end(start, end);
+
+		if (!pud_present(val) || WARN_ON_ONCE(pud_leaf(val)))
+			continue;
+
+		if (walk_pmd_range_unlocked(&val, start, next, walk))
+			walk_pmd_range_locked(&val, start, next, walk);
+
+		if (args->batch_size >= MAX_BATCH_SIZE) {
+			end = (start | ~PUD_MASK) + 1;
+			goto done;
+		}
+	}
+
+	if (i < PTRS_PER_PUD && get_next_interval(walk, P4D_MASK, PUD_SIZE, &start, &end))
+		goto restart;
+
+	end = round_up(end, P4D_SIZE);
+done:
+	/* rounded-up boundaries can wrap to 0 */
+	args->next_addr = end && walk->vma ? max(end, walk->vma->vm_start) : 0;
+
+	return -EAGAIN;
+}
+
+static void walk_mm(struct mm_walk_args *args, int swappiness, struct mm_struct *mm)
+{
+	static const struct mm_walk_ops mm_walk_ops = {
+		.test_walk = should_skip_vma,
+		.p4d_entry = walk_pud_range,
+	};
+
+	int err;
+	int file;
+	int nid = args->node_id;
+	struct mem_cgroup *memcg = args->memcg;
+	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
+
+	args->next_addr = FIRST_USER_ADDRESS;
+	for (file = !swappiness; file < ANON_AND_FILE; file++)
+		args->should_walk[file] = lru_gen_mm_is_active(mm) ||
+					  node_isset(nid, mm->lrugen.nodes[file]);
+
+	do {
+		unsigned long start = args->next_addr;
+		unsigned long end = mm->highest_vm_end;
+
+		err = -EBUSY;
+
+		preempt_disable();
+		rcu_read_lock();
+
+#ifdef CONFIG_MEMCG
+		if (memcg && atomic_read(&memcg->moving_account)) {
+			args->mm_stats[MM_LOCK_CONTENTION]++;
+			goto contended;
+		}
+#endif
+		if (!mmap_read_trylock(mm)) {
+			args->mm_stats[MM_LOCK_CONTENTION]++;
+			goto contended;
+		}
+
+		err = walk_page_range(mm, start, end, &mm_walk_ops, args);
+
+		mmap_read_unlock(mm);
+
+		if (args->batch_size)
+			reset_batch_size(lruvec, args);
+contended:
+		rcu_read_unlock();
+		preempt_enable();
+
+		cond_resched();
+	} while (err == -EAGAIN && args->next_addr &&
+		 !mm_is_oom_victim(mm) && !mm_has_migrated(mm, memcg));
+
+	if (err == -EBUSY)
+		return;
+
+	for (file = !swappiness; file < ANON_AND_FILE; file++) {
+		if (args->should_walk[file])
+			node_clear(nid, mm->lrugen.nodes[file]);
+	}
+}
+
+static void page_inc_gen(struct page *page, struct lruvec *lruvec, bool front)
+{
+	int old_gen, new_gen;
+	unsigned long old_flags, new_flags;
+	int file = page_is_file_lru(page);
+	int zone = page_zonenum(page);
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	old_gen = lru_gen_from_seq(lrugen->min_seq[file]);
+
+	do {
+		old_flags = READ_ONCE(page->flags);
+		new_gen = ((old_flags & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
+		VM_BUG_ON_PAGE(new_gen < 0, page);
+		if (new_gen >= 0 && new_gen != old_gen)
+			goto sort;
+
+		new_gen = (old_gen + 1) % MAX_NR_GENS;
+		new_flags = (old_flags & ~(LRU_GEN_MASK | LRU_USAGE_MASK | LRU_TIER_FLAGS)) |
+			    ((new_gen + 1UL) << LRU_GEN_PGOFF);
+		/* mark the page for reclaim if it's pending writeback */
+		if (front)
+			new_flags |= BIT(PG_reclaim);
+	} while (cmpxchg(&page->flags, old_flags, new_flags) != old_flags);
+
+	lru_gen_update_size(page, lruvec, old_gen, new_gen);
+sort:
+	if (front)
+		list_move(&page->lru, &lrugen->lists[new_gen][file][zone]);
+	else
+		list_move_tail(&page->lru, &lrugen->lists[new_gen][file][zone]);
+}
+
+static bool try_inc_min_seq(struct lruvec *lruvec, int file)
+{
+	int gen, zone;
+	bool success = false;
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	VM_BUG_ON(!seq_is_valid(lruvec));
+
+	while (get_nr_gens(lruvec, file) > MIN_NR_GENS) {
+		gen = lru_gen_from_seq(lrugen->min_seq[file]);
+
+		for (zone = 0; zone < MAX_NR_ZONES; zone++) {
+			if (!list_empty(&lrugen->lists[gen][file][zone]))
+				return success;
+		}
+
+		reset_controller_pos(lruvec, gen, file);
+		WRITE_ONCE(lrugen->min_seq[file], lrugen->min_seq[file] + 1);
+
+		success = true;
+	}
+
+	return success;
+}
+
+static bool inc_min_seq(struct lruvec *lruvec, int file)
+{
+	int gen, zone;
+	int batch_size = 0;
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	VM_BUG_ON(!seq_is_valid(lruvec));
+
+	if (get_nr_gens(lruvec, file) != MAX_NR_GENS)
+		return true;
+
+	gen = lru_gen_from_seq(lrugen->min_seq[file]);
+
+	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
+		struct list_head *head = &lrugen->lists[gen][file][zone];
+
+		while (!list_empty(head)) {
+			struct page *page = lru_to_page(head);
+
+			VM_BUG_ON_PAGE(PageTail(page), page);
+			VM_BUG_ON_PAGE(PageUnevictable(page), page);
+			VM_BUG_ON_PAGE(PageActive(page), page);
+			VM_BUG_ON_PAGE(page_is_file_lru(page) != file, page);
+			VM_BUG_ON_PAGE(page_zonenum(page) != zone, page);
+
+			prefetchw_prev_lru_page(page, head, flags);
+
+			page_inc_gen(page, lruvec, false);
+
+			if (++batch_size == MAX_BATCH_SIZE)
+				return false;
+		}
+
+		VM_BUG_ON(lrugen->sizes[gen][file][zone]);
+	}
+
+	reset_controller_pos(lruvec, gen, file);
+	WRITE_ONCE(lrugen->min_seq[file], lrugen->min_seq[file] + 1);
+
+	return true;
+}
+
+static void inc_max_seq(struct lruvec *lruvec)
+{
+	int gen, file, zone;
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	spin_lock_irq(&lruvec->lru_lock);
+
+	VM_BUG_ON(!seq_is_valid(lruvec));
+
+	for (file = 0; file < ANON_AND_FILE; file++) {
+		if (try_inc_min_seq(lruvec, file))
+			continue;
+
+		while (!inc_min_seq(lruvec, file)) {
+			spin_unlock_irq(&lruvec->lru_lock);
+			cond_resched();
+			spin_lock_irq(&lruvec->lru_lock);
+		}
+	}
+
+	gen = lru_gen_from_seq(lrugen->max_seq - 1);
+	for_each_type_zone(file, zone) {
+		enum lru_list lru = LRU_FILE * file;
+		long total = lrugen->sizes[gen][file][zone];
+
+		if (!total)
+			continue;
+
+		WARN_ON_ONCE(total != (int)total);
+
+		update_lru_size(lruvec, lru, zone, total);
+		update_lru_size(lruvec, lru + LRU_ACTIVE, zone, -total);
+	}
+
+	gen = lru_gen_from_seq(lrugen->max_seq + 1);
+	for_each_type_zone(file, zone) {
+		VM_BUG_ON(lrugen->sizes[gen][file][zone]);
+		VM_BUG_ON(!list_empty(&lrugen->lists[gen][file][zone]));
+	}
+
+	for (file = 0; file < ANON_AND_FILE; file++)
+		reset_controller_pos(lruvec, gen, file);
+
+	WRITE_ONCE(lrugen->timestamps[gen], jiffies);
+	/* make sure all preceding modifications appear first */
+	smp_store_release(&lrugen->max_seq, lrugen->max_seq + 1);
+
+	spin_unlock_irq(&lruvec->lru_lock);
+}
+
+/* Main function used by foreground, background and user-triggered aging. */
+static bool walk_mm_list(struct lruvec *lruvec, unsigned long max_seq,
+			 struct scan_control *sc, int swappiness, struct mm_walk_args *args)
+{
+	bool last;
+	bool alloc = !args;
+	struct mm_struct *mm = NULL;
+	struct lrugen *lrugen = &lruvec->evictable;
+	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
+	int nid = pgdat->node_id;
+	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
+	struct lru_gen_mm_list *mm_list = get_mm_list(memcg);
+
+	VM_BUG_ON(max_seq > READ_ONCE(lrugen->max_seq));
+
+	/*
+	 * For each walk of the mm_struct list of a memcg, we decrement the
+	 * priority of its lrugen. For each walk of all memcgs in kswapd, we
+	 * increment the priority of every lrugen.
+	 *
+	 * So if this lrugen has a higher priority (smaller value), it means
+	 * other concurrent reclaimers have walked its mm list, and we skip it
+	 * for this priority in order to balance the pressure on all memcgs.
+	 */
+	if (!mem_cgroup_disabled() && !cgroup_reclaim(sc) &&
+	    sc->priority > atomic_read(&lrugen->priority))
+		return false;
+
+	if (alloc) {
+		args = kvzalloc_node(sizeof(*args), GFP_KERNEL, nid);
+		if (!args)
+			return false;
+	}
+
+	args->memcg = memcg;
+	args->max_seq = max_seq;
+	args->start_pfn = pgdat->node_start_pfn;
+	args->end_pfn = pgdat_end_pfn(pgdat);
+	args->node_id = nid;
+
+	do {
+		last = get_next_mm(args, swappiness, &mm);
+		if (mm)
+			walk_mm(args, swappiness, mm);
+
+		cond_resched();
+	} while (mm);
+
+	if (alloc)
+		kvfree(args);
+
+	if (!last) {
+		/* foreground aging prefers not to wait unless "necessary" */
+		if (!current_is_kswapd() && sc->priority < DEF_PRIORITY - 2)
+			wait_event_killable(mm_list->nodes[nid].wait,
+					    max_seq < READ_ONCE(lrugen->max_seq));
+
+		return max_seq < READ_ONCE(lrugen->max_seq);
+	}
+
+	VM_BUG_ON(max_seq != READ_ONCE(lrugen->max_seq));
+
+	inc_max_seq(lruvec);
+
+	if (!mem_cgroup_disabled())
+		atomic_add_unless(&lrugen->priority, -1, 0);
+
+	/* order against inc_max_seq() */
+	smp_mb();
+	/* either we see any waiters or they will see the updated max_seq */
+	if (waitqueue_active(&mm_list->nodes[nid].wait))
+		wake_up_all(&mm_list->nodes[nid].wait);
+
+	wakeup_flusher_threads(WB_REASON_VMSCAN);
+
+	return true;
+}
+
 /******************************************************************************
  * state change
  ******************************************************************************/
@@ -5002,6 +5699,9 @@ static int __init init_lru_gen(void)
 	BUILD_BUG_ON(MIN_NR_GENS + 1 >= MAX_NR_GENS);
 	BUILD_BUG_ON(BIT(LRU_GEN_WIDTH) <= MAX_NR_GENS);
 	BUILD_BUG_ON(sizeof(MM_STAT_CODES) != NR_MM_STATS + 1);
+	BUILD_BUG_ON(PMD_SIZE / PAGE_SIZE != PTRS_PER_PTE);
+	BUILD_BUG_ON(PUD_SIZE / PMD_SIZE != PTRS_PER_PMD);
+	BUILD_BUG_ON(P4D_SIZE / PUD_SIZE != PTRS_PER_PUD);
 
 	if (mem_cgroup_disabled()) {
 		global_mm_list = alloc_mm_list();
--
2.31.1.295.g9ea45b61b8-goog

@@ -0,0 +1,474 @@
From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
Date: Tue, 13 Apr 2021 00:56:29 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-13-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 12/16] mm: multigenerational lru: eviction
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-13-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

The eviction consumes old generations. Given an lruvec, the eviction
scans the pages on the per-zone lists indexed by either of min_seq[2].
It first tries to select a type based on the values of min_seq[2].
When anon and file types are both available from the same generation,
it selects the one that has a lower refault rate.

During a scan, the eviction sorts pages according to their generation
numbers, if the aging has found them referenced. It also moves pages
from the tiers that have higher refault rates than tier 0 to the next
generation. When it finds all the per-zone lists of a selected type
type.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/vmscan.c | 341 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 341 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31e1b4155677..6239b1acd84f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5468,6 +5468,347 @@ static bool walk_mm_list(struct lruvec *lruvec, unsigned long max_seq,
 	return true;
 }
 
+/******************************************************************************
+ * the eviction
+ ******************************************************************************/
+
+static bool sort_page(struct page *page, struct lruvec *lruvec, int tier_to_isolate)
+{
+	bool success;
+	int gen = page_lru_gen(page);
+	int file = page_is_file_lru(page);
+	int zone = page_zonenum(page);
+	int tier = lru_tier_from_usage(page_tier_usage(page));
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	VM_BUG_ON_PAGE(gen == -1, page);
+	VM_BUG_ON_PAGE(tier_to_isolate < 0, page);
+
+	/* a lazy-free page that has been written into? */
+	if (file && PageDirty(page) && PageAnon(page)) {
+		success = lru_gen_deletion(page, lruvec);
+		VM_BUG_ON_PAGE(!success, page);
+		SetPageSwapBacked(page);
+		add_page_to_lru_list_tail(page, lruvec);
+		return true;
+	}
+
+	/* page_update_gen() has updated the page? */
+	if (gen != lru_gen_from_seq(lrugen->min_seq[file])) {
+		list_move(&page->lru, &lrugen->lists[gen][file][zone]);
+		return true;
+	}
+
+	/* activate the page if its tier has a higher refault rate */
+	if (tier_to_isolate < tier) {
+		int sid = sid_from_seq_or_gen(gen);
+
+		page_inc_gen(page, lruvec, false);
+		WRITE_ONCE(lrugen->activated[sid][file][tier - 1],
+			   lrugen->activated[sid][file][tier - 1] + thp_nr_pages(page));
+		inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + file);
+		return true;
+	}
+
+	/*
+	 * A page can't be immediately evicted, and page_inc_gen() will mark it
+	 * for reclaim and hopefully writeback will write it soon if it's dirty.
+	 */
+	if (PageLocked(page) || PageWriteback(page) || (file && PageDirty(page))) {
+		page_inc_gen(page, lruvec, true);
+		return true;
+	}
+
+	return false;
+}
+
+static bool should_skip_page(struct page *page, struct scan_control *sc)
+{
+	if (!sc->may_unmap && page_mapped(page))
+		return true;
+
+	if (!(sc->may_writepage && (sc->gfp_mask & __GFP_IO)) &&
+	    (PageDirty(page) || (PageAnon(page) && !PageSwapCache(page))))
+		return true;
+
+	if (!get_page_unless_zero(page))
+		return true;
+
+	if (!TestClearPageLRU(page)) {
+		put_page(page);
+		return true;
+	}
+
+	return false;
+}
+
+static void isolate_page(struct page *page, struct lruvec *lruvec)
+{
+	bool success;
+
+	success = lru_gen_deletion(page, lruvec);
+	VM_BUG_ON_PAGE(!success, page);
+
+	if (PageActive(page)) {
+		ClearPageActive(page);
+		/* make sure shrink_page_list() rejects this page */
+		SetPageReferenced(page);
+		return;
+	}
+
+	/* make sure shrink_page_list() doesn't try to write this page */
+	ClearPageReclaim(page);
+	/* make sure shrink_page_list() doesn't reject this page */
+	ClearPageReferenced(page);
+}
+
+static int scan_lru_gen_pages(struct lruvec *lruvec, struct scan_control *sc,
+			      long *nr_to_scan, int file, int tier,
+			      struct list_head *list)
+{
+	bool success;
+	int gen, zone;
+	enum vm_event_item item;
+	int sorted = 0;
+	int scanned = 0;
+	int isolated = 0;
+	int batch_size = 0;
+	struct lrugen *lrugen = &lruvec->evictable;
+
+	VM_BUG_ON(!list_empty(list));
+
+	if (get_nr_gens(lruvec, file) == MIN_NR_GENS)
+		return -ENOENT;
+
+	gen = lru_gen_from_seq(lrugen->min_seq[file]);
+
+	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
+		LIST_HEAD(moved);
+		int skipped = 0;
+		struct list_head *head = &lrugen->lists[gen][file][zone];
|
||||||
|
+
|
||||||
|
+ while (!list_empty(head)) {
|
||||||
|
+ struct page *page = lru_to_page(head);
|
||||||
|
+ int delta = thp_nr_pages(page);
|
||||||
|
+
|
||||||
|
+ VM_BUG_ON_PAGE(PageTail(page), page);
|
||||||
|
+ VM_BUG_ON_PAGE(PageUnevictable(page), page);
|
||||||
|
+ VM_BUG_ON_PAGE(PageActive(page), page);
|
||||||
|
+ VM_BUG_ON_PAGE(page_is_file_lru(page) != file, page);
|
||||||
|
+ VM_BUG_ON_PAGE(page_zonenum(page) != zone, page);
|
||||||
|
+
|
||||||
|
+ prefetchw_prev_lru_page(page, head, flags);
|
||||||
|
+
|
||||||
|
+ scanned += delta;
|
||||||
|
+
|
||||||
|
+ if (sort_page(page, lruvec, tier))
|
||||||
|
+ sorted += delta;
|
||||||
|
+ else if (should_skip_page(page, sc)) {
|
||||||
|
+ list_move(&page->lru, &moved);
|
||||||
|
+ skipped += delta;
|
||||||
|
+ } else {
|
||||||
|
+ isolate_page(page, lruvec);
|
||||||
|
+ list_add(&page->lru, list);
|
||||||
|
+ isolated += delta;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (scanned >= *nr_to_scan || isolated >= SWAP_CLUSTER_MAX ||
|
||||||
|
+ ++batch_size == MAX_BATCH_SIZE)
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ list_splice(&moved, head);
|
||||||
|
+ __count_zid_vm_events(PGSCAN_SKIP, zone, skipped);
|
||||||
|
+
|
||||||
|
+ if (scanned >= *nr_to_scan || isolated >= SWAP_CLUSTER_MAX ||
|
||||||
|
+ batch_size == MAX_BATCH_SIZE)
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ success = try_inc_min_seq(lruvec, file);
|
||||||
|
+
|
||||||
|
+ item = current_is_kswapd() ? PGSCAN_KSWAPD : PGSCAN_DIRECT;
|
||||||
|
+ if (!cgroup_reclaim(sc))
|
||||||
|
+ __count_vm_events(item, scanned);
|
||||||
|
+ __count_memcg_events(lruvec_memcg(lruvec), item, scanned);
|
||||||
|
+ __count_vm_events(PGSCAN_ANON + file, scanned);
|
||||||
|
+
|
||||||
|
+ *nr_to_scan -= scanned;
|
||||||
|
+
|
||||||
|
+ if (*nr_to_scan <= 0 || success || isolated)
|
||||||
|
+ return isolated;
|
||||||
|
+ /*
|
||||||
|
+ * We may have trouble finding eligible pages due to reclaim_idx,
|
||||||
|
+ * may_unmap and may_writepage. The following check makes sure we won't
|
||||||
|
+ * be stuck if we aren't making enough progress.
|
||||||
|
+ */
|
||||||
|
+ return batch_size == MAX_BATCH_SIZE && sorted >= SWAP_CLUSTER_MAX ? 0 : -ENOENT;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static int get_tier_to_isolate(struct lruvec *lruvec, int file)
|
||||||
|
+{
|
||||||
|
+ int tier;
|
||||||
|
+ struct controller_pos sp, pv;
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * Ideally we don't want to evict upper tiers that have higher refault
|
||||||
|
+ * rates. However, we need to leave some margin for the fluctuation in
|
||||||
|
+ * refault rates. So we use a larger gain factor to make sure upper
|
||||||
|
+ * tiers are indeed more active. We choose 2 because the lowest upper
|
||||||
|
+ * tier would have twice of the refault rate of the base tier, according
|
||||||
|
+ * to their numbers of accesses.
|
||||||
|
+ */
|
||||||
|
+ read_controller_pos(&sp, lruvec, file, 0, 1);
|
||||||
|
+ for (tier = 1; tier < MAX_NR_TIERS; tier++) {
|
||||||
|
+ read_controller_pos(&pv, lruvec, file, tier, 2);
|
||||||
|
+ if (!positive_ctrl_err(&sp, &pv))
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return tier - 1;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static int get_type_to_scan(struct lruvec *lruvec, int swappiness, int *tier_to_isolate)
|
||||||
|
+{
|
||||||
|
+ int file, tier;
|
||||||
|
+ struct controller_pos sp, pv;
|
||||||
|
+ int gain[ANON_AND_FILE] = { swappiness, 200 - swappiness };
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * Compare the refault rates between the base tiers of anon and file to
|
||||||
|
+ * determine which type to evict. Also need to compare the refault rates
|
||||||
|
+ * of the upper tiers of the selected type with that of the base tier to
|
||||||
|
+ * determine which tier of the selected type to evict.
|
||||||
|
+ */
|
||||||
|
+ read_controller_pos(&sp, lruvec, 0, 0, gain[0]);
|
||||||
|
+ read_controller_pos(&pv, lruvec, 1, 0, gain[1]);
|
||||||
|
+ file = positive_ctrl_err(&sp, &pv);
|
||||||
|
+
|
||||||
|
+ read_controller_pos(&sp, lruvec, !file, 0, gain[!file]);
|
||||||
|
+ for (tier = 1; tier < MAX_NR_TIERS; tier++) {
|
||||||
|
+ read_controller_pos(&pv, lruvec, file, tier, gain[file]);
|
||||||
|
+ if (!positive_ctrl_err(&sp, &pv))
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ *tier_to_isolate = tier - 1;
|
||||||
|
+
|
||||||
|
+ return file;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static int isolate_lru_gen_pages(struct lruvec *lruvec, struct scan_control *sc,
|
||||||
|
+ int swappiness, long *nr_to_scan, int *type_to_scan,
|
||||||
|
+ struct list_head *list)
|
||||||
|
+{
|
||||||
|
+ int i;
|
||||||
|
+ int file;
|
||||||
|
+ int isolated;
|
||||||
|
+ int tier = -1;
|
||||||
|
+ DEFINE_MAX_SEQ();
|
||||||
|
+ DEFINE_MIN_SEQ();
|
||||||
|
+
|
||||||
|
+ VM_BUG_ON(!seq_is_valid(lruvec));
|
||||||
|
+
|
||||||
|
+ if (max_nr_gens(max_seq, min_seq, swappiness) == MIN_NR_GENS)
|
||||||
|
+ return 0;
|
||||||
|
+ /*
|
||||||
|
+ * Try to select a type based on generations and swappiness, and if that
|
||||||
|
+ * fails, fall back to get_type_to_scan(). When anon and file are both
|
||||||
|
+ * available from the same generation, swappiness 200 is interpreted as
|
||||||
|
+ * anon first and swappiness 1 is interpreted as file first.
|
||||||
|
+ */
|
||||||
|
+ file = !swappiness || min_seq[0] > min_seq[1] ||
|
||||||
|
+ (min_seq[0] == min_seq[1] && swappiness != 200 &&
|
||||||
|
+ (swappiness == 1 || get_type_to_scan(lruvec, swappiness, &tier)));
|
||||||
|
+
|
||||||
|
+ if (tier == -1)
|
||||||
|
+ tier = get_tier_to_isolate(lruvec, file);
|
||||||
|
+
|
||||||
|
+ for (i = !swappiness; i < ANON_AND_FILE; i++) {
|
||||||
|
+ isolated = scan_lru_gen_pages(lruvec, sc, nr_to_scan, file, tier, list);
|
||||||
|
+ if (isolated >= 0)
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
+ file = !file;
|
||||||
|
+ tier = get_tier_to_isolate(lruvec, file);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (isolated < 0)
|
||||||
|
+ isolated = *nr_to_scan = 0;
|
||||||
|
+
|
||||||
|
+ *type_to_scan = file;
|
||||||
|
+
|
||||||
|
+ return isolated;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+/* Main function used by foreground, background and user-triggered eviction. */
|
||||||
|
+static bool evict_lru_gen_pages(struct lruvec *lruvec, struct scan_control *sc,
|
||||||
|
+ int swappiness, long *nr_to_scan)
|
||||||
|
+{
|
||||||
|
+ int file;
|
||||||
|
+ int isolated;
|
||||||
|
+ int reclaimed;
|
||||||
|
+ LIST_HEAD(list);
|
||||||
|
+ struct page *page;
|
||||||
|
+ enum vm_event_item item;
|
||||||
|
+ struct reclaim_stat stat;
|
||||||
|
+ struct pglist_data *pgdat = lruvec_pgdat(lruvec);
|
||||||
|
+
|
||||||
|
+ spin_lock_irq(&lruvec->lru_lock);
|
||||||
|
+
|
||||||
|
+ isolated = isolate_lru_gen_pages(lruvec, sc, swappiness, nr_to_scan, &file, &list);
|
||||||
|
+ VM_BUG_ON(list_empty(&list) == !!isolated);
|
||||||
|
+
|
||||||
|
+ if (isolated)
|
||||||
|
+ __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, isolated);
|
||||||
|
+
|
||||||
|
+ spin_unlock_irq(&lruvec->lru_lock);
|
||||||
|
+
|
||||||
|
+ if (!isolated)
|
||||||
|
+ goto done;
|
||||||
|
+
|
||||||
|
+ reclaimed = shrink_page_list(&list, pgdat, sc, &stat, false);
|
||||||
|
+ /*
|
||||||
|
+ * We need to prevent rejected pages from being added back to the same
|
||||||
|
+ * lists they were isolated from. Otherwise we may risk looping on them
|
||||||
|
+ * forever. We use PageActive() or !PageReferenced() && PageWorkingset()
|
||||||
|
+ * to tell lru_gen_addition() not to add them to the oldest generation.
|
||||||
|
+ */
|
||||||
|
+ list_for_each_entry(page, &list, lru) {
|
||||||
|
+ if (PageMlocked(page))
|
||||||
|
+ continue;
|
||||||
|
+
|
||||||
|
+ if (PageReferenced(page)) {
|
||||||
|
+ SetPageActive(page);
|
||||||
|
+ ClearPageReferenced(page);
|
||||||
|
+ } else {
|
||||||
|
+ ClearPageActive(page);
|
||||||
|
+ SetPageWorkingset(page);
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ spin_lock_irq(&lruvec->lru_lock);
|
||||||
|
+
|
||||||
|
+ move_pages_to_lru(lruvec, &list);
|
||||||
|
+
|
||||||
|
+ __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -isolated);
|
||||||
|
+
|
||||||
|
+ item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
|
||||||
|
+ if (!cgroup_reclaim(sc))
|
||||||
|
+ __count_vm_events(item, reclaimed);
|
||||||
|
+ __count_memcg_events(lruvec_memcg(lruvec), item, reclaimed);
|
||||||
|
+ __count_vm_events(PGSTEAL_ANON + file, reclaimed);
|
||||||
|
+
|
||||||
|
+ spin_unlock_irq(&lruvec->lru_lock);
|
||||||
|
+
|
||||||
|
+ mem_cgroup_uncharge_list(&list);
|
||||||
|
+ free_unref_page_list(&list);
|
||||||
|
+
|
||||||
|
+ sc->nr_reclaimed += reclaimed;
|
||||||
|
+done:
|
||||||
|
+ return *nr_to_scan > 0 && sc->nr_reclaimed < sc->nr_to_reclaim;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/******************************************************************************
|
||||||
|
* state change
|
||||||
|
******************************************************************************/
|
||||||
|
--
|
||||||
|
2.31.1.295.g9ea45b61b8-goog
|
||||||
|
|
||||||
|
|
@@ -0,0 +1,479 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:30 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-14-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 13/16] mm: multigenerational lru: page reclaim
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-14-yuzhao@google.com/>

With the aging and the eviction in place, we can build the page
reclaim in a straightforward manner:
1) In order to reduce the latency, direct reclaim only invokes the
aging when both min_seq[2] reach max_seq-1; otherwise it invokes
the eviction.
2) In order to avoid the aging in the direct reclaim path, kswapd
does the background aging more proactively. It invokes the aging
when either of min_seq[2] reaches max_seq-1; otherwise it invokes
the eviction.

We also add another optimization: pages mapped around a referenced PTE
may themselves have been referenced, due to spatial locality. In the
reclaim path, if the rmap finds the PTE mapping a page under reclaim
referenced, it calls a new function, lru_gen_scan_around(), to scan the
vicinity of that PTE. If this new function finds other referenced
PTEs, it updates the generation number of the pages mapped by those
PTEs.
|
||||||
|
|
||||||
|
Signed-off-by: Yu Zhao <yuzhao@google.com>
|
||||||
|
---
|
||||||
|
include/linux/mmzone.h | 6 ++
|
||||||
|
mm/rmap.c | 6 ++
|
||||||
|
mm/vmscan.c | 236 +++++++++++++++++++++++++++++++++++++++++
|
||||||
|
3 files changed, 248 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
|
||||||
|
index dcfadf6a8c07..a22e9e40083f 100644
|
||||||
|
--- a/include/linux/mmzone.h
|
||||||
|
+++ b/include/linux/mmzone.h
|
||||||
|
@@ -292,6 +292,7 @@ enum lruvec_flags {
|
||||||
|
};
|
||||||
|
|
||||||
|
struct lruvec;
|
||||||
|
+struct page_vma_mapped_walk;
|
||||||
|
|
||||||
|
#define LRU_GEN_MASK ((BIT(LRU_GEN_WIDTH) - 1) << LRU_GEN_PGOFF)
|
||||||
|
#define LRU_USAGE_MASK ((BIT(LRU_USAGE_WIDTH) - 1) << LRU_USAGE_PGOFF)
|
||||||
|
@@ -384,6 +385,7 @@ struct lrugen {
|
||||||
|
|
||||||
|
void lru_gen_init_lruvec(struct lruvec *lruvec);
|
||||||
|
void lru_gen_set_state(bool enable, bool main, bool swap);
|
||||||
|
+void lru_gen_scan_around(struct page_vma_mapped_walk *pvmw);
|
||||||
|
|
||||||
|
#else /* CONFIG_LRU_GEN */
|
||||||
|
|
||||||
|
@@ -395,6 +397,10 @@ static inline void lru_gen_set_state(bool enable, bool main, bool swap)
|
||||||
|
{
|
||||||
|
}
|
||||||
|
|
||||||
|
+static inline void lru_gen_scan_around(struct page_vma_mapped_walk *pvmw)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
#endif /* CONFIG_LRU_GEN */
|
||||||
|
|
||||||
|
struct lruvec {
|
||||||
|
diff --git a/mm/rmap.c b/mm/rmap.c
|
||||||
|
index b0fc27e77d6d..d600b282ced5 100644
|
||||||
|
--- a/mm/rmap.c
|
||||||
|
+++ b/mm/rmap.c
|
||||||
|
@@ -72,6 +72,7 @@
|
||||||
|
#include <linux/page_idle.h>
|
||||||
|
#include <linux/memremap.h>
|
||||||
|
#include <linux/userfaultfd_k.h>
|
||||||
|
+#include <linux/mm_inline.h>
|
||||||
|
|
||||||
|
#include <asm/tlbflush.h>
|
||||||
|
|
||||||
|
@@ -792,6 +793,11 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
|
||||||
|
}
|
||||||
|
|
||||||
|
if (pvmw.pte) {
|
||||||
|
+ /* the multigenerational lru exploits the spatial locality */
|
||||||
|
+ if (lru_gen_enabled() && pte_young(*pvmw.pte)) {
|
||||||
|
+ lru_gen_scan_around(&pvmw);
|
||||||
|
+ referenced++;
|
||||||
|
+ }
|
||||||
|
if (ptep_clear_flush_young_notify(vma, address,
|
||||||
|
pvmw.pte)) {
|
||||||
|
/*
|
||||||
|
diff --git a/mm/vmscan.c b/mm/vmscan.c
|
||||||
|
index 6239b1acd84f..01c475386379 100644
|
||||||
|
--- a/mm/vmscan.c
|
||||||
|
+++ b/mm/vmscan.c
|
||||||
|
@@ -1114,6 +1114,10 @@ static unsigned int shrink_page_list(struct list_head *page_list,
|
||||||
|
if (!sc->may_unmap && page_mapped(page))
|
||||||
|
goto keep_locked;
|
||||||
|
|
||||||
|
+ /* in case the page was found accessed by lru_gen_scan_around() */
|
||||||
|
+ if (lru_gen_enabled() && !ignore_references && PageReferenced(page))
|
||||||
|
+ goto keep_locked;
|
||||||
|
+
|
||||||
|
may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
|
||||||
|
(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));
|
||||||
|
|
||||||
|
@@ -2233,6 +2237,10 @@ static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
|
||||||
|
unsigned long file;
|
||||||
|
struct lruvec *target_lruvec;
|
||||||
|
|
||||||
|
+ /* the multigenerational lru doesn't use these counters */
|
||||||
|
+ if (lru_gen_enabled())
|
||||||
|
+ return;
|
||||||
|
+
|
||||||
|
target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
|
||||||
|
|
||||||
|
/*
|
||||||
|
@@ -2522,6 +2530,19 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
+#ifdef CONFIG_LRU_GEN
|
||||||
|
+static void age_lru_gens(struct pglist_data *pgdat, struct scan_control *sc);
|
||||||
|
+static void shrink_lru_gens(struct lruvec *lruvec, struct scan_control *sc);
|
||||||
|
+#else
|
||||||
|
+static void age_lru_gens(struct pglist_data *pgdat, struct scan_control *sc)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static void shrink_lru_gens(struct lruvec *lruvec, struct scan_control *sc)
|
||||||
|
+{
|
||||||
|
+}
|
||||||
|
+#endif
|
||||||
|
+
|
||||||
|
static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
|
||||||
|
{
|
||||||
|
unsigned long nr[NR_LRU_LISTS];
|
||||||
|
@@ -2533,6 +2554,11 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
|
||||||
|
struct blk_plug plug;
|
||||||
|
bool scan_adjusted;
|
||||||
|
|
||||||
|
+ if (lru_gen_enabled()) {
|
||||||
|
+ shrink_lru_gens(lruvec, sc);
|
||||||
|
+ return;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
get_scan_count(lruvec, sc, nr);
|
||||||
|
|
||||||
|
/* Record the original scan target for proportional adjustments later */
|
||||||
|
@@ -2999,6 +3025,10 @@ static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_data_t *pgdat)
|
||||||
|
struct lruvec *target_lruvec;
|
||||||
|
unsigned long refaults;
|
||||||
|
|
||||||
|
+ /* the multigenerational lru doesn't use these counters */
|
||||||
|
+ if (lru_gen_enabled())
|
||||||
|
+ return;
|
||||||
|
+
|
||||||
|
target_lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
|
||||||
|
refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_ANON);
|
||||||
|
target_lruvec->refaults[0] = refaults;
|
||||||
|
@@ -3373,6 +3403,11 @@ static void age_active_anon(struct pglist_data *pgdat,
|
||||||
|
struct mem_cgroup *memcg;
|
||||||
|
struct lruvec *lruvec;
|
||||||
|
|
||||||
|
+ if (lru_gen_enabled()) {
|
||||||
|
+ age_lru_gens(pgdat, sc);
|
||||||
|
+ return;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
if (!total_swap_pages)
|
||||||
|
return;
|
||||||
|
|
||||||
|
@@ -5468,6 +5503,57 @@ static bool walk_mm_list(struct lruvec *lruvec, unsigned long max_seq,
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
+void lru_gen_scan_around(struct page_vma_mapped_walk *pvmw)
|
||||||
|
+{
|
||||||
|
+ pte_t *pte;
|
||||||
|
+ unsigned long start, end;
|
||||||
|
+ int old_gen, new_gen;
|
||||||
|
+ unsigned long flags;
|
||||||
|
+ struct lruvec *lruvec;
|
||||||
|
+ struct mem_cgroup *memcg;
|
||||||
|
+ struct pglist_data *pgdat = page_pgdat(pvmw->page);
|
||||||
|
+
|
||||||
|
+ lockdep_assert_held(pvmw->ptl);
|
||||||
|
+
|
||||||
|
+ start = max(pvmw->address & PMD_MASK, pvmw->vma->vm_start);
|
||||||
|
+ end = pmd_addr_end(pvmw->address, pvmw->vma->vm_end);
|
||||||
|
+ pte = pvmw->pte - ((pvmw->address - start) >> PAGE_SHIFT);
|
||||||
|
+
|
||||||
|
+ memcg = lock_page_memcg(pvmw->page);
|
||||||
|
+ lruvec = lock_page_lruvec_irqsave(pvmw->page, &flags);
|
||||||
|
+
|
||||||
|
+ new_gen = lru_gen_from_seq(lruvec->evictable.max_seq);
|
||||||
|
+
|
||||||
|
+ for (; start != end; pte++, start += PAGE_SIZE) {
|
||||||
|
+ struct page *page;
|
||||||
|
+ unsigned long pfn = pte_pfn(*pte);
|
||||||
|
+
|
||||||
|
+ if (!pte_present(*pte) || !pte_young(*pte) || is_zero_pfn(pfn))
|
||||||
|
+ continue;
|
||||||
|
+
|
||||||
|
+ if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat))
|
||||||
|
+ continue;
|
||||||
|
+
|
||||||
|
+ page = compound_head(pfn_to_page(pfn));
|
||||||
|
+ if (page_to_nid(page) != pgdat->node_id)
|
||||||
|
+ continue;
|
||||||
|
+
|
||||||
|
+ if (page_memcg_rcu(page) != memcg)
|
||||||
|
+ continue;
|
||||||
|
+ /*
|
||||||
|
+ * We may be holding many locks. So try to finish as fast as
|
||||||
|
+ * possible and leave the accessed and the dirty bits to page
|
||||||
|
+ * table walks.
|
||||||
|
+ */
|
||||||
|
+ old_gen = page_update_gen(page, new_gen);
|
||||||
|
+ if (old_gen >= 0 && old_gen != new_gen)
|
||||||
|
+ lru_gen_update_size(page, lruvec, old_gen, new_gen);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ unlock_page_lruvec_irqrestore(lruvec, flags);
|
||||||
|
+ unlock_page_memcg(pvmw->page);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/******************************************************************************
|
||||||
|
* the eviction
|
||||||
|
******************************************************************************/
|
||||||
|
@@ -5809,6 +5895,156 @@ static bool evict_lru_gen_pages(struct lruvec *lruvec, struct scan_control *sc,
|
||||||
|
return *nr_to_scan > 0 && sc->nr_reclaimed < sc->nr_to_reclaim;
|
||||||
|
}
|
||||||
|
|
||||||
|
+/******************************************************************************
|
||||||
|
+ * page reclaim
|
||||||
|
+ ******************************************************************************/
|
||||||
|
+
|
||||||
|
+static int get_swappiness(struct lruvec *lruvec)
|
||||||
|
+{
|
||||||
|
+ struct mem_cgroup *memcg = lruvec_memcg(lruvec);
|
||||||
|
+ int swappiness = mem_cgroup_get_nr_swap_pages(memcg) >= (long)SWAP_CLUSTER_MAX ?
|
||||||
|
+ mem_cgroup_swappiness(memcg) : 0;
|
||||||
|
+
|
||||||
|
+ VM_BUG_ON(swappiness > 200U);
|
||||||
|
+
|
||||||
|
+ return swappiness;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static unsigned long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc,
|
||||||
|
+ int swappiness)
|
||||||
|
+{
|
||||||
|
+ int gen, file, zone;
|
||||||
|
+ long nr_to_scan = 0;
|
||||||
|
+ struct lrugen *lrugen = &lruvec->evictable;
|
||||||
|
+ DEFINE_MAX_SEQ();
|
||||||
|
+ DEFINE_MIN_SEQ();
|
||||||
|
+
|
||||||
|
+ lru_add_drain();
|
||||||
|
+
|
||||||
|
+ for (file = !swappiness; file < ANON_AND_FILE; file++) {
|
||||||
|
+ unsigned long seq;
|
||||||
|
+
|
||||||
|
+ for (seq = min_seq[file]; seq <= max_seq; seq++) {
|
||||||
|
+ gen = lru_gen_from_seq(seq);
|
||||||
|
+
|
||||||
|
+ for (zone = 0; zone <= sc->reclaim_idx; zone++)
|
||||||
|
+ nr_to_scan += READ_ONCE(lrugen->sizes[gen][file][zone]);
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ nr_to_scan = max(nr_to_scan, 0L);
|
||||||
|
+ nr_to_scan = round_up(nr_to_scan >> sc->priority, SWAP_CLUSTER_MAX);
|
||||||
|
+
|
||||||
|
+ if (max_nr_gens(max_seq, min_seq, swappiness) > MIN_NR_GENS)
|
||||||
|
+ return nr_to_scan;
|
||||||
|
+
|
||||||
|
+ /* kswapd uses age_lru_gens() */
|
||||||
|
+ if (current_is_kswapd())
|
||||||
|
+ return 0;
|
||||||
|
+
|
||||||
|
+ return walk_mm_list(lruvec, max_seq, sc, swappiness, NULL) ? nr_to_scan : 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static void shrink_lru_gens(struct lruvec *lruvec, struct scan_control *sc)
|
||||||
|
+{
|
||||||
|
+ struct blk_plug plug;
|
||||||
|
+ unsigned long scanned = 0;
|
||||||
|
+ struct mem_cgroup *memcg = lruvec_memcg(lruvec);
|
||||||
|
+
|
||||||
|
+ blk_start_plug(&plug);
|
||||||
|
+
|
||||||
|
+ while (true) {
|
||||||
|
+ long nr_to_scan;
|
||||||
|
+ int swappiness = sc->may_swap ? get_swappiness(lruvec) : 0;
|
||||||
|
+
|
||||||
|
+ nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness) - scanned;
|
||||||
|
+ if (nr_to_scan < (long)SWAP_CLUSTER_MAX)
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
+ scanned += nr_to_scan;
|
||||||
|
+
|
||||||
|
+ if (!evict_lru_gen_pages(lruvec, sc, swappiness, &nr_to_scan))
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
+ scanned -= nr_to_scan;
|
||||||
|
+
|
||||||
|
+ if (mem_cgroup_below_min(memcg) ||
|
||||||
|
+ (mem_cgroup_below_low(memcg) && !sc->memcg_low_reclaim))
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
+ cond_resched();
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ blk_finish_plug(&plug);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+/******************************************************************************
|
||||||
|
+ * the background aging
|
||||||
|
+ ******************************************************************************/
|
||||||
|
+
|
||||||
|
+static int lru_gen_spread = MIN_NR_GENS;
|
||||||
|
+
|
||||||
|
+static void try_walk_mm_list(struct lruvec *lruvec, struct scan_control *sc)
|
||||||
|
+{
|
||||||
|
+        int gen, file, zone;
+        long old_and_young[2] = {};
+        struct mm_walk_args args = {};
+        int spread = READ_ONCE(lru_gen_spread);
+        int swappiness = get_swappiness(lruvec);
+        struct lrugen *lrugen = &lruvec->evictable;
+        DEFINE_MAX_SEQ();
+        DEFINE_MIN_SEQ();
+
+        lru_add_drain();
+
+        for (file = !swappiness; file < ANON_AND_FILE; file++) {
+                unsigned long seq;
+
+                for (seq = min_seq[file]; seq <= max_seq; seq++) {
+                        gen = lru_gen_from_seq(seq);
+
+                        for (zone = 0; zone < MAX_NR_ZONES; zone++)
+                                old_and_young[seq == max_seq] +=
+                                        READ_ONCE(lrugen->sizes[gen][file][zone]);
+                }
+        }
+
+        old_and_young[0] = max(old_and_young[0], 0L);
+        old_and_young[1] = max(old_and_young[1], 0L);
+
+        if (old_and_young[0] + old_and_young[1] < SWAP_CLUSTER_MAX)
+                return;
+
+        /* try to spread pages out across spread+1 generations */
+        if (old_and_young[0] >= old_and_young[1] * spread &&
+            min_nr_gens(max_seq, min_seq, swappiness) > max(spread, MIN_NR_GENS))
+                return;
+
+        walk_mm_list(lruvec, max_seq, sc, swappiness, &args);
+}
+
+static void age_lru_gens(struct pglist_data *pgdat, struct scan_control *sc)
+{
+        struct mem_cgroup *memcg;
+
+        VM_BUG_ON(!current_is_kswapd());
+
+        memcg = mem_cgroup_iter(NULL, NULL, NULL);
+        do {
+                struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
+                struct lrugen *lrugen = &lruvec->evictable;
+
+                if (!mem_cgroup_below_min(memcg) &&
+                    (!mem_cgroup_below_low(memcg) || sc->memcg_low_reclaim))
+                        try_walk_mm_list(lruvec, sc);
+
+                if (!mem_cgroup_disabled())
+                        atomic_add_unless(&lrugen->priority, 1, DEF_PRIORITY);
+
+                cond_resched();
+        } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
+}
+
 /******************************************************************************
  *                            state change
  ******************************************************************************/
|
||||||
|
--
2.31.1.295.g9ea45b61b8-goog

@ -0,0 +1,575 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:31 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-15-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 14/16] mm: multigenerational lru: user interface
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Benjamin Manes <ben.manes@gmail.com>,
    Dave Chinner <david@fromorbit.com>,
    Dave Hansen <dave.hansen@linux.intel.com>,
    Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
    Johannes Weiner <hannes@cmpxchg.org>,
    Jonathan Corbet <corbet@lwn.net>,
    Joonsoo Kim <iamjoonsoo.kim@lge.com>,
    Matthew Wilcox <willy@infradead.org>,
    Mel Gorman <mgorman@suse.de>,
    Miaohe Lin <linmiaohe@huawei.com>,
    Michael Larabel <michael@michaellarabel.com>,
    Michal Hocko <mhocko@suse.com>,
    Michel Lespinasse <michel@lespinasse.org>,
    Rik van Riel <riel@surriel.com>,
    Roman Gushchin <guro@fb.com>,
    Rong Chen <rong.a.chen@intel.com>,
    SeongJae Park <sjpark@amazon.de>,
    Tim Chen <tim.c.chen@linux.intel.com>,
    Vlastimil Babka <vbabka@suse.cz>,
    Yang Shi <shy828301@gmail.com>,
    Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
    linux-kernel@vger.kernel.org, lkp@lists.01.org,
    page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-15-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Add a sysfs file /sys/kernel/mm/lru_gen/enabled so users can enable
and disable the multigenerational lru at runtime.

Add a sysfs file /sys/kernel/mm/lru_gen/spread so users can spread
pages out across multiple generations. More generations make the
background aging more aggressive.

Add a debugfs file /sys/kernel/debug/lru_gen so users can monitor the
multigenerational lru and trigger the aging and the eviction. This
file has the following output:
  memcg memcg_id memcg_path
    node node_id
      min_gen birth_time anon_size file_size
      ...
      max_gen birth_time anon_size file_size

Given a memcg and a node, "min_gen" is the oldest generation (number)
and "max_gen" is the youngest. Birth time is in milliseconds. The
sizes of anon and file types are in pages.

This file takes the following input:
  + memcg_id node_id gen [swappiness]
  - memcg_id node_id gen [swappiness] [nr_to_reclaim]
The first command line accounts referenced pages to generation
"max_gen" and creates the next generation "max_gen"+1. In this case,
"gen" should be equal to "max_gen". A swap file and a non-zero
"swappiness" are required to scan the anon type. If swapping is not
desired, set vm.swappiness to 0. The second command line evicts
generations less than or equal to "gen". In this case, "gen" should be
less than "max_gen"-1, as "max_gen" and "max_gen"-1 are active
generations and therefore protected from eviction. Use
"nr_to_reclaim" to limit the number of pages to be evicted. Multiple
command lines are supported, as is concatenation with the delimiters
"," and ";".

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/vmscan.c | 405 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 405 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 01c475386379..284e32d897cf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -51,6 +51,8 @@
 #include <linux/psi.h>
 #include <linux/memory.h>
 #include <linux/pagewalk.h>
+#include <linux/ctype.h>
+#include <linux/debugfs.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -6248,6 +6250,403 @@ static int __meminit __maybe_unused lru_gen_online_mem(struct notifier_block *se
         return NOTIFY_DONE;
 }
 
+/******************************************************************************
+ *                            sysfs interface
+ ******************************************************************************/
+
+static ssize_t show_lru_gen_spread(struct kobject *kobj, struct kobj_attribute *attr,
+                                   char *buf)
+{
+        return sprintf(buf, "%d\n", READ_ONCE(lru_gen_spread));
+}
+
+static ssize_t store_lru_gen_spread(struct kobject *kobj, struct kobj_attribute *attr,
+                                    const char *buf, size_t len)
+{
+        int spread;
+
+        if (kstrtoint(buf, 10, &spread) || spread >= MAX_NR_GENS)
+                return -EINVAL;
+
+        WRITE_ONCE(lru_gen_spread, spread);
+
+        return len;
+}
+
+static struct kobj_attribute lru_gen_spread_attr = __ATTR(
+        spread, 0644, show_lru_gen_spread, store_lru_gen_spread
+);
+
+static ssize_t show_lru_gen_enabled(struct kobject *kobj, struct kobj_attribute *attr,
+                                    char *buf)
+{
+        return snprintf(buf, PAGE_SIZE, "%ld\n", lru_gen_enabled());
+}
+
+static ssize_t store_lru_gen_enabled(struct kobject *kobj, struct kobj_attribute *attr,
+                                     const char *buf, size_t len)
+{
+        int enable;
+
+        if (kstrtoint(buf, 10, &enable))
+                return -EINVAL;
+
+        lru_gen_set_state(enable, true, false);
+
+        return len;
+}
+
+static struct kobj_attribute lru_gen_enabled_attr = __ATTR(
+        enabled, 0644, show_lru_gen_enabled, store_lru_gen_enabled
+);
+
+static struct attribute *lru_gen_attrs[] = {
+        &lru_gen_spread_attr.attr,
+        &lru_gen_enabled_attr.attr,
+        NULL
+};
+
+static struct attribute_group lru_gen_attr_group = {
+        .name = "lru_gen",
+        .attrs = lru_gen_attrs,
+};
+
+/******************************************************************************
+ *                            debugfs interface
+ ******************************************************************************/
+
+static void *lru_gen_seq_start(struct seq_file *m, loff_t *pos)
+{
+        struct mem_cgroup *memcg;
+        loff_t nr_to_skip = *pos;
+
+        m->private = kzalloc(PATH_MAX, GFP_KERNEL);
+        if (!m->private)
+                return ERR_PTR(-ENOMEM);
+
+        memcg = mem_cgroup_iter(NULL, NULL, NULL);
+        do {
+                int nid;
+
+                for_each_node_state(nid, N_MEMORY) {
+                        if (!nr_to_skip--)
+                                return mem_cgroup_lruvec(memcg, NODE_DATA(nid));
+                }
+        } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
+
+        return NULL;
+}
+
+static void lru_gen_seq_stop(struct seq_file *m, void *v)
+{
+        if (!IS_ERR_OR_NULL(v))
+                mem_cgroup_iter_break(NULL, lruvec_memcg(v));
+
+        kfree(m->private);
+        m->private = NULL;
+}
+
+static void *lru_gen_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+        int nid = lruvec_pgdat(v)->node_id;
+        struct mem_cgroup *memcg = lruvec_memcg(v);
+
+        ++*pos;
+
+        nid = next_memory_node(nid);
+        if (nid == MAX_NUMNODES) {
+                memcg = mem_cgroup_iter(NULL, memcg, NULL);
+                if (!memcg)
+                        return NULL;
+
+                nid = first_memory_node;
+        }
+
+        return mem_cgroup_lruvec(memcg, NODE_DATA(nid));
+}
+
+static void lru_gen_seq_show_full(struct seq_file *m, struct lruvec *lruvec,
+                                  unsigned long max_seq, unsigned long *min_seq,
+                                  unsigned long seq)
+{
+        int i;
+        int file, tier;
+        int sid = sid_from_seq_or_gen(seq);
+        struct lrugen *lrugen = &lruvec->evictable;
+        int nid = lruvec_pgdat(lruvec)->node_id;
+        struct mem_cgroup *memcg = lruvec_memcg(lruvec);
+        struct lru_gen_mm_list *mm_list = get_mm_list(memcg);
+
+        for (tier = 0; tier < MAX_NR_TIERS; tier++) {
+                seq_printf(m, "            %10d", tier);
+                for (file = 0; file < ANON_AND_FILE; file++) {
+                        unsigned long n[3] = {};
+
+                        if (seq == max_seq) {
+                                n[0] = READ_ONCE(lrugen->avg_refaulted[file][tier]);
+                                n[1] = READ_ONCE(lrugen->avg_total[file][tier]);
+
+                                seq_printf(m, " %10luR %10luT %10lu ", n[0], n[1], n[2]);
+                        } else if (seq == min_seq[file] || NR_STAT_GENS > 1) {
+                                n[0] = atomic_long_read(&lrugen->refaulted[sid][file][tier]);
+                                n[1] = atomic_long_read(&lrugen->evicted[sid][file][tier]);
+                                if (tier)
+                                        n[2] = READ_ONCE(lrugen->activated[sid][file][tier - 1]);
+
+                                seq_printf(m, " %10lur %10lue %10lua", n[0], n[1], n[2]);
+                        } else
+                                seq_puts(m, " 0 0 0 ");
+                }
+                seq_putc(m, '\n');
+        }
+
+        seq_puts(m, "                      ");
+        for (i = 0; i < NR_MM_STATS; i++) {
+                if (seq == max_seq && NR_STAT_GENS == 1)
+                        seq_printf(m, " %10lu%c", READ_ONCE(mm_list->nodes[nid].stats[sid][i]),
+                                   toupper(MM_STAT_CODES[i]));
+                else if (seq != max_seq && NR_STAT_GENS > 1)
+                        seq_printf(m, " %10lu%c", READ_ONCE(mm_list->nodes[nid].stats[sid][i]),
+                                   MM_STAT_CODES[i]);
+                else
+                        seq_puts(m, " 0 ");
+        }
+        seq_putc(m, '\n');
+}
+
+static int lru_gen_seq_show(struct seq_file *m, void *v)
+{
+        unsigned long seq;
+        bool full = !debugfs_real_fops(m->file)->write;
+        struct lruvec *lruvec = v;
+        struct lrugen *lrugen = &lruvec->evictable;
+        int nid = lruvec_pgdat(lruvec)->node_id;
+        struct mem_cgroup *memcg = lruvec_memcg(lruvec);
+        DEFINE_MAX_SEQ();
+        DEFINE_MIN_SEQ();
+
+        if (nid == first_memory_node) {
+#ifdef CONFIG_MEMCG
+                if (memcg)
+                        cgroup_path(memcg->css.cgroup, m->private, PATH_MAX);
+#endif
+                seq_printf(m, "memcg %5hu %s\n",
+                           mem_cgroup_id(memcg), (char *)m->private);
+        }
+
+        seq_printf(m, " node %5d %10d\n", nid, atomic_read(&lrugen->priority));
+
+        seq = full ? (max_seq < MAX_NR_GENS ? 0 : max_seq - MAX_NR_GENS + 1) :
+                     min(min_seq[0], min_seq[1]);
+
+        for (; seq <= max_seq; seq++) {
+                int gen, file, zone;
+                unsigned int msecs;
+
+                gen = lru_gen_from_seq(seq);
+                msecs = jiffies_to_msecs(jiffies - READ_ONCE(lrugen->timestamps[gen]));
+
+                seq_printf(m, " %10lu %10u", seq, msecs);
+
+                for (file = 0; file < ANON_AND_FILE; file++) {
+                        long size = 0;
+
+                        if (seq < min_seq[file]) {
+                                seq_puts(m, "         -0 ");
+                                continue;
+                        }
+
+                        for (zone = 0; zone < MAX_NR_ZONES; zone++)
+                                size += READ_ONCE(lrugen->sizes[gen][file][zone]);
+
+                        seq_printf(m, " %10lu ", max(size, 0L));
+                }
+
+                seq_putc(m, '\n');
+
+                if (full)
+                        lru_gen_seq_show_full(m, lruvec, max_seq, min_seq, seq);
+        }
+
+        return 0;
+}
+
+static const struct seq_operations lru_gen_seq_ops = {
+        .start = lru_gen_seq_start,
+        .stop = lru_gen_seq_stop,
+        .next = lru_gen_seq_next,
+        .show = lru_gen_seq_show,
+};
+
+static int advance_max_seq(struct lruvec *lruvec, unsigned long seq, int swappiness)
+{
+        struct mm_walk_args args = {};
+        struct scan_control sc = {
+                .target_mem_cgroup = lruvec_memcg(lruvec),
+        };
+        DEFINE_MAX_SEQ();
+
+        if (seq == max_seq)
+                walk_mm_list(lruvec, max_seq, &sc, swappiness, &args);
+
+        return seq > max_seq ? -EINVAL : 0;
+}
+
+static int advance_min_seq(struct lruvec *lruvec, unsigned long seq, int swappiness,
+                           unsigned long nr_to_reclaim)
+{
+        struct blk_plug plug;
+        int err = -EINTR;
+        long nr_to_scan = LONG_MAX;
+        struct scan_control sc = {
+                .nr_to_reclaim = nr_to_reclaim,
+                .target_mem_cgroup = lruvec_memcg(lruvec),
+                .may_writepage = 1,
+                .may_unmap = 1,
+                .may_swap = 1,
+                .reclaim_idx = MAX_NR_ZONES - 1,
+                .gfp_mask = GFP_KERNEL,
+        };
+        DEFINE_MAX_SEQ();
+
+        if (seq >= max_seq - 1)
+                return -EINVAL;
+
+        blk_start_plug(&plug);
+
+        while (!signal_pending(current)) {
+                DEFINE_MIN_SEQ();
+
+                if (seq < min(min_seq[!swappiness], min_seq[swappiness < 200]) ||
+                    !evict_lru_gen_pages(lruvec, &sc, swappiness, &nr_to_scan)) {
+                        err = 0;
+                        break;
+                }
+
+                cond_resched();
+        }
+
+        blk_finish_plug(&plug);
+
+        return err;
+}
+
+static int advance_seq(char cmd, int memcg_id, int nid, unsigned long seq,
+                       int swappiness, unsigned long nr_to_reclaim)
+{
+        struct lruvec *lruvec;
+        int err = -EINVAL;
+        struct mem_cgroup *memcg = NULL;
+
+        if (!mem_cgroup_disabled()) {
+                rcu_read_lock();
+                memcg = mem_cgroup_from_id(memcg_id);
+#ifdef CONFIG_MEMCG
+                if (memcg && !css_tryget(&memcg->css))
+                        memcg = NULL;
+#endif
+                rcu_read_unlock();
+
+                if (!memcg)
+                        goto done;
+        }
+        if (memcg_id != mem_cgroup_id(memcg))
+                goto done;
+
+        if (nid < 0 || nid >= MAX_NUMNODES || !node_state(nid, N_MEMORY))
+                goto done;
+
+        lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
+
+        if (swappiness == -1)
+                swappiness = get_swappiness(lruvec);
+        else if (swappiness > 200U)
+                goto done;
+
+        switch (cmd) {
+        case '+':
+                err = advance_max_seq(lruvec, seq, swappiness);
+                break;
+        case '-':
+                err = advance_min_seq(lruvec, seq, swappiness, nr_to_reclaim);
+                break;
+        }
+done:
+        mem_cgroup_put(memcg);
+
+        return err;
+}
+
+static ssize_t lru_gen_seq_write(struct file *file, const char __user *src,
+                                 size_t len, loff_t *pos)
+{
+        void *buf;
+        char *cur, *next;
+        int err = 0;
+
+        buf = kvmalloc(len + 1, GFP_USER);
+        if (!buf)
+                return -ENOMEM;
+
+        if (copy_from_user(buf, src, len)) {
+                kvfree(buf);
+                return -EFAULT;
+        }
+
+        next = buf;
+        next[len] = '\0';
+
+        while ((cur = strsep(&next, ",;\n"))) {
+                int n;
+                int end;
+                char cmd;
+                int memcg_id;
+                int nid;
+                unsigned long seq;
+                int swappiness = -1;
+                unsigned long nr_to_reclaim = -1;
+
+                cur = skip_spaces(cur);
+                if (!*cur)
+                        continue;
+
+                n = sscanf(cur, "%c %u %u %lu %n %u %n %lu %n", &cmd, &memcg_id, &nid,
+                           &seq, &end, &swappiness, &end, &nr_to_reclaim, &end);
+                if (n < 4 || cur[end]) {
+                        err = -EINVAL;
+                        break;
+                }
+
+                err = advance_seq(cmd, memcg_id, nid, seq, swappiness, nr_to_reclaim);
+                if (err)
+                        break;
+        }
+
+        kvfree(buf);
+
+        return err ? : len;
+}
+
+static int lru_gen_seq_open(struct inode *inode, struct file *file)
+{
+        return seq_open(file, &lru_gen_seq_ops);
+}
+
+static const struct file_operations lru_gen_rw_fops = {
+        .open = lru_gen_seq_open,
+        .read = seq_read,
+        .write = lru_gen_seq_write,
+        .llseek = seq_lseek,
+        .release = seq_release,
+};
+
+static const struct file_operations lru_gen_ro_fops = {
+        .open = lru_gen_seq_open,
+        .read = seq_read,
+        .llseek = seq_lseek,
+        .release = seq_release,
+};
+
 /******************************************************************************
  *                            initialization
  ******************************************************************************/
@@ -6291,6 +6690,12 @@ static int __init init_lru_gen(void)
         if (hotplug_memory_notifier(lru_gen_online_mem, 0))
                 pr_err("lru_gen: failed to subscribe hotplug notifications\n");
 
+        if (sysfs_create_group(mm_kobj, &lru_gen_attr_group))
+                pr_err("lru_gen: failed to create sysfs group\n");
+
+        debugfs_create_file("lru_gen", 0644, NULL, NULL, &lru_gen_rw_fops);
+        debugfs_create_file("lru_gen_full", 0444, NULL, NULL, &lru_gen_ro_fops);
+
         return 0;
 };
 /*
--
2.31.1.295.g9ea45b61b8-goog

@ -0,0 +1,175 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 00:56:32 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-16-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 15/16] mm: multigenerational lru: Kconfig
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Benjamin Manes <ben.manes@gmail.com>,
    Dave Chinner <david@fromorbit.com>,
    Dave Hansen <dave.hansen@linux.intel.com>,
    Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
    Johannes Weiner <hannes@cmpxchg.org>,
    Jonathan Corbet <corbet@lwn.net>,
    Joonsoo Kim <iamjoonsoo.kim@lge.com>,
    Matthew Wilcox <willy@infradead.org>,
    Mel Gorman <mgorman@suse.de>,
    Miaohe Lin <linmiaohe@huawei.com>,
    Michael Larabel <michael@michaellarabel.com>,
    Michal Hocko <mhocko@suse.com>,
    Michel Lespinasse <michel@lespinasse.org>,
    Rik van Riel <riel@surriel.com>,
    Roman Gushchin <guro@fb.com>,
    Rong Chen <rong.a.chen@intel.com>,
    SeongJae Park <sjpark@amazon.de>,
    Tim Chen <tim.c.chen@linux.intel.com>,
    Vlastimil Babka <vbabka@suse.cz>,
    Yang Shi <shy828301@gmail.com>,
    Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
    linux-kernel@vger.kernel.org, lkp@lists.01.org,
    page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-16-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Add configuration options for the multigenerational lru.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/Kconfig | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
|
||||||
|
index 24c045b24b95..0be1c6c90cc0 100644
|
||||||
|
--- a/mm/Kconfig
|
||||||
|
+++ b/mm/Kconfig
|
||||||
|
@@ -872,4 +872,59 @@ config MAPPING_DIRTY_HELPERS
|
||||||
|
config KMAP_LOCAL
|
||||||
|
bool
|
||||||
|
|
||||||
|
+config LRU_GEN
|
||||||
|
+ bool "Multigenerational LRU"
|
||||||
|
+ depends on MMU
|
||||||
|
+ help
|
||||||
|
+ A high performance LRU implementation to heavily overcommit workloads
|
||||||
|
+ that are not IO bound. See Documentation/vm/multigen_lru.rst for
|
||||||
|
+ details.
|
||||||
|
+
|
||||||
|
+ Warning: do not enable this option unless you plan to use it because
|
||||||
|
+ it introduces a small per-process and per-memcg and per-node memory
|
||||||
|
+ overhead.
|
||||||
|
+
|
||||||
|
+config NR_LRU_GENS
|
||||||
|
+ int "Max number of generations"
|
||||||
|
+ depends on LRU_GEN
|
||||||
|
+ range 4 31
|
||||||
|
+ default 7
|
||||||
|
+ help
|
||||||
|
+ This will use order_base_2(N+1) spare bits from page flags.
|
||||||
|
+
|
||||||
|
+ Warning: do not use numbers larger than necessary because each
|
||||||
|
+ generation introduces a small per-node and per-memcg memory overhead.
|
||||||
|
+
|
||||||
|
+config TIERS_PER_GEN
|
||||||
|
+ int "Number of tiers per generation"
|
||||||
|
+ depends on LRU_GEN
|
||||||
|
+ range 2 5
|
||||||
|
+ default 4
|
||||||
|
+ help
|
||||||
|
+ This will use N-2 spare bits from page flags.
|
||||||
|
+
|
||||||
|
+ Higher values generally offer better protection to active pages under
|
||||||
|
+ heavy buffered I/O workloads.
|
||||||
|
+
|
||||||
|
+config LRU_GEN_ENABLED
|
||||||
|
+ bool "Turn on by default"
|
||||||
|
+ depends on LRU_GEN
|
||||||
|
+ help
|
||||||
|
+ The default value of /sys/kernel/mm/lru_gen/enabled is 0. This option
|
||||||
|
+ changes it to 1.
|
||||||
|
+
|
||||||
|
+ Warning: the default value is the fast path. See
|
||||||
|
+ Documentation/static-keys.txt for details.
|
||||||
|
+
|
||||||
|
+config LRU_GEN_STATS
|
||||||
|
+ bool "Full stats for debugging"
|
||||||
|
+ depends on LRU_GEN
|
||||||
|
+ help
|
||||||
|
+ This option keeps full stats for each generation, which can be read
|
||||||
|
+ from /sys/kernel/debug/lru_gen_full.
|
||||||
|
+
|
||||||
|
+ Warning: do not enable this option unless you plan to use it because
|
||||||
|
+ it introduces an additional small per-process and per-memcg and
|
||||||
|
+ per-node memory overhead.
|
||||||
|
+
|
||||||
|
endmenu
|
||||||
|
--
|
||||||
|
2.31.1.295.g9ea45b61b8-goog
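Taken together, the options introduced by this patch can be selected with a kernel ``.config`` fragment along these lines (illustrative only; the values follow the defaults and help texts in the patch above):

```
CONFIG_LRU_GEN=y
CONFIG_NR_LRU_GENS=7
CONFIG_TIERS_PER_GEN=4
# CONFIG_LRU_GEN_ENABLED is not set
# CONFIG_LRU_GEN_STATS is not set
```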
@ -0,0 +1,322 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-26.2 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
        DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
        INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,
        USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 8D664C433B4
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 5CED260FDB
        for <linux-kernel@archiver.kernel.org>; Tue, 13 Apr 2021 06:57:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S244503AbhDMG6L (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Tue, 13 Apr 2021 02:58:11 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44250 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1345123AbhDMG5Y (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 13 Apr 2021 02:57:24 -0400
Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 14C06C061756
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:57:05 -0700 (PDT)
Received: by mail-yb1-xb49.google.com with SMTP id p75so9209574ybc.8
        for <linux-kernel@vger.kernel.org>; Mon, 12 Apr 2021 23:57:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=fZsS4S+ppDN6vse6LQilTb+995ZpejDyoXEkWEzhPiI=;
        b=JPzEmLg8IXqkikE/b+k7FNKSdKIPd2lLmXlP9sfI87JvOkw09qdZ+KRrlaAD+a9Dhn
         005sbjcbFZ0lFEPYPSKaDUzlN3hBr3DSo7pYAg76+SLl3Ga5vXEbxhKRzSwelQO0SjpX
         rhHL0KytAzNOPmRXNi0zkAQkCW4EAqyrBAkMJuC7dTB6jIRG6ER1dzInKps5oaOL1wQs
         HLIiBt2/Ahnea89fcjAFJPIS7nNG2lwTqqUVTkoanckNkavhBDYk0VsP07i7LdiYi9zN
         +LOuJNV+snejmLdfr2/3+aMXbxqjF2clhWnkNv/9X/ng5LI35tZxiwJOcncdT6c0vONU
         rPQA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=fZsS4S+ppDN6vse6LQilTb+995ZpejDyoXEkWEzhPiI=;
        b=Mmy7jkv8AlhXjPNjblEwvM3ZtDGk7NKvJ6rsLmF6f0BWgbZq1tIB6pdyHgFU312oCj
         y4lT+2OfaNXkHdc1m9GGWuWIiWBODWDms6SOZyoSt3DzZKzcdOzZvjUSS2YPZRhtMBP8
         dB9FKMTZmwSiNzB4tdOneaAVzDRY5bshb8bACVfCaWFqtKUYRJ7IUedFh3omjJHSY8FV
         6STGtMN3VWQZjRvtH7TufrAvCfWEWJ4oYHPhHmGG2DIS+7aQ6CbYgjel6Xiw7E9VkAg2
         JoiFRDcRNv+ByQW+uYw+Z96cYJm5wf4hkkC+/iCib2vWT1vXRgZ7CRYsjyRwZmHJd2Jy
         fKJA==
X-Gm-Message-State: AOAM532ohDzhQEIUgvNgG4R8COEdtptVwp/WFnYFKQYURGql6xBpawoF
        Y2GA+8fymXJP5OJ1UDw0RBDHBeXkM1Q=
X-Google-Smtp-Source: ABdhPJzHOTHYLMuXC88wBZEF39dm7Sun3+0TVIBRLg85pDR3z2FX1I51OcfzuM68n03ioC4rVU3FQw4etPM=
X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:d02d:cccc:9ebe:9fe9])
        (user=yuzhao job=sendgmr) by 2002:a25:e00f:: with SMTP id x15mr25695207ybg.85.1618297024186;
        Mon, 12 Apr 2021 23:57:04 -0700 (PDT)
Date: Tue, 13 Apr 2021 00:56:33 -0600
In-Reply-To: <20210413065633.2782273-1-yuzhao@google.com>
Message-Id: <20210413065633.2782273-17-yuzhao@google.com>
Mime-Version: 1.0
References: <20210413065633.2782273-1-yuzhao@google.com>
X-Mailer: git-send-email 2.31.1.295.g9ea45b61b8-goog
Subject: [PATCH v2 16/16] mm: multigenerational lru: documentation
From: Yu Zhao <yuzhao@google.com>
To: linux-mm@kvack.org
Cc: Alex Shi <alexs@kernel.org>, Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Benjamin Manes <ben.manes@gmail.com>,
        Dave Chinner <david@fromorbit.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Jonathan Corbet <corbet@lwn.net>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mel Gorman <mgorman@suse.de>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Michael Larabel <michael@michaellarabel.com>,
        Michal Hocko <mhocko@suse.com>,
        Michel Lespinasse <michel@lespinasse.org>,
        Rik van Riel <riel@surriel.com>,
        Roman Gushchin <guro@fb.com>,
        Rong Chen <rong.a.chen@intel.com>,
        SeongJae Park <sjpark@amazon.de>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Vlastimil Babka <vbabka@suse.cz>,
        Yang Shi <shy828301@gmail.com>,
        Ying Huang <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
        linux-kernel@vger.kernel.org, lkp@lists.01.org,
        page-reclaim@google.com, Yu Zhao <yuzhao@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Archived-At: <https://lore.kernel.org/lkml/20210413065633.2782273-17-yuzhao@google.com/>
List-Archive: <https://lore.kernel.org/lkml/>
List-Post: <mailto:linux-kernel@vger.kernel.org>

Add Documentation/vm/multigen_lru.rst.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 Documentation/vm/index.rst        |   1 +
 Documentation/vm/multigen_lru.rst | 192 ++++++++++++++++++++++++++++++
 2 files changed, 193 insertions(+)
 create mode 100644 Documentation/vm/multigen_lru.rst

diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index eff5fbd492d0..c353b3f55924 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -17,6 +17,7 @@ various features of the Linux memory management
 
    swap_numa
    zswap
+   multigen_lru
 
 Kernel developers MM documentation
 ==================================
diff --git a/Documentation/vm/multigen_lru.rst b/Documentation/vm/multigen_lru.rst
new file mode 100644
index 000000000000..cf772aeca317
--- /dev/null
+++ b/Documentation/vm/multigen_lru.rst
@@ -0,0 +1,192 @@
+=====================
+Multigenerational LRU
+=====================
+
+Quick Start
+===========
+Build Options
+-------------
+:Required: Set ``CONFIG_LRU_GEN=y``.
+
+:Optional: Change ``CONFIG_NR_LRU_GENS`` to a number ``X`` to support
+ a maximum of ``X`` generations.
+
+:Optional: Change ``CONFIG_TIERS_PER_GEN`` to a number ``Y`` to support
+ a maximum of ``Y`` tiers per generation.
+
+:Optional: Set ``CONFIG_LRU_GEN_ENABLED=y`` to turn the feature on by
+ default.
+
+Runtime Options
+---------------
+:Required: Write ``1`` to ``/sys/kernel/mm/lru_gen/enabled`` if the
+ feature was not turned on by default.
+
+:Optional: Change ``/sys/kernel/mm/lru_gen/spread`` to a number ``N``
+ to spread pages out across ``N+1`` generations. ``N`` should be less
+ than ``X``. Larger values make the background aging more aggressive.
+
+:Optional: Read ``/sys/kernel/debug/lru_gen`` to verify the feature.
+ This file has the following output:
+
+::
+
+  memcg memcg_id memcg_path
+    node node_id
+      min_gen birth_time anon_size file_size
+      ...
+      max_gen birth_time anon_size file_size
+
+Given a memcg and a node, ``min_gen`` is the oldest generation
+(number) and ``max_gen`` is the youngest. Birth time is in
+milliseconds. The sizes of anon and file types are in pages.
+
+Recipes
+-------
+:Android on ARMv8.1+: ``X=4``, ``N=0``
+
+:Android on pre-ARMv8.1 CPUs: Not recommended due to the lack of
+ ``ARM64_HW_AFDBM``
+
+:Laptops running Chrome on x86_64: ``X=7``, ``N=2``
+
+:Working set estimation: Write ``+ memcg_id node_id gen [swappiness]``
+ to ``/sys/kernel/debug/lru_gen`` to account referenced pages to
+ generation ``max_gen`` and create the next generation ``max_gen+1``.
+ ``gen`` should be equal to ``max_gen``. A swap file and a non-zero
+ ``swappiness`` are required to scan anon type. If swapping is not
+ desired, set ``vm.swappiness`` to ``0``.
+
+:Proactive reclaim: Write ``- memcg_id node_id gen [swappiness]
+ [nr_to_reclaim]`` to ``/sys/kernel/debug/lru_gen`` to evict
+ generations less than or equal to ``gen``. ``gen`` should be less
+ than ``max_gen-1`` as ``max_gen`` and ``max_gen-1`` are active
+ generations and therefore protected from eviction. Use
+ ``nr_to_reclaim`` to limit the number of pages to be evicted.
+ Multiple command lines are supported, as is concatenation with
+ delimiters ``,`` and ``;``.
+
+Framework
+=========
+For each ``lruvec``, evictable pages are divided into multiple
+generations. The youngest generation number is stored in ``max_seq``
+for both anon and file types as they are aged on an equal footing. The
+oldest generation numbers are stored in ``min_seq[2]`` separately for
+anon and file types as clean file pages can be evicted regardless of
+swap and write-back constraints. Generation numbers are truncated into
+``order_base_2(CONFIG_NR_LRU_GENS+1)`` bits in order to fit into
+``page->flags``. The sliding window technique is used to prevent
+truncated generation numbers from overlapping. Each truncated
+generation number is an index to an array of per-type and per-zone
+lists. Evictable pages are added to the per-zone lists indexed by
+``max_seq`` or ``min_seq[2]`` (modulo ``CONFIG_NR_LRU_GENS``),
+depending on whether they are being faulted in.
+
+Each generation is then divided into multiple tiers. Tiers represent
+levels of usage from file descriptors only. Pages accessed ``N`` times
+via file descriptors belong to tier ``order_base_2(N)``. In contrast
+to moving across generations, which requires the LRU lock, moving
+across tiers only involves an atomic operation on ``page->flags`` and
+therefore has a negligible cost.
+
+The workflow comprises two conceptually independent functions: the
+aging and the eviction.
+
+Aging
+-----
+The aging produces young generations. Given an ``lruvec``, the aging
+scans page tables for referenced pages of this ``lruvec``. Upon
+finding one, the aging updates its generation number to ``max_seq``.
+After each round of scan, the aging increments ``max_seq``.
+
+The aging maintains either a system-wide ``mm_struct`` list or
+per-memcg ``mm_struct`` lists, and it only scans page tables of
+processes that have been scheduled since the last scan. Since scans
+are differential with respect to referenced pages, the cost is roughly
+proportional to their number.
+
+The aging is due when both of ``min_seq[2]`` reaches ``max_seq-1``,
+assuming both anon and file types are reclaimable.
+
+Eviction
+--------
+The eviction consumes old generations. Given an ``lruvec``, the
+eviction scans the pages on the per-zone lists indexed by either of
+``min_seq[2]``. It first tries to select a type based on the values of
+``min_seq[2]``. When anon and file types are both available from the
+same generation, it selects the one that has a lower refault rate.
+
+During a scan, the eviction sorts pages according to their generation
+numbers, if the aging has found them referenced. It also moves pages
+from the tiers that have higher refault rates than tier 0 to the next
+generation.
+
+When it finds all the per-zone lists of a selected type are empty, the
+eviction increments ``min_seq[2]`` indexed by this selected type.
+
+Rationale
+=========
+Limitations of Current Implementation
+-------------------------------------
+Notion of Active/Inactive
+~~~~~~~~~~~~~~~~~~~~~~~~~
+For servers equipped with hundreds of gigabytes of memory, the
+granularity of the active/inactive is too coarse to be useful for job
+scheduling. False active/inactive rates are relatively high, and thus
+the assumed savings may not materialize.
+
+For phones and laptops, executable pages are frequently evicted
+despite the fact that there are many less recently used anon pages.
+Major faults on executable pages cause ``janks`` (slow UI renderings)
+and negatively impact user experience.
+
+For ``lruvec``\s from different memcgs or nodes, comparisons are
+impossible due to the lack of a common frame of reference.
+
+Incremental Scans via ``rmap``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each incremental scan picks up at where the last scan left off and
+stops after it has found a handful of unreferenced pages. For
+workloads using a large amount of anon memory, incremental scans lose
+the advantage under sustained memory pressure due to high ratios of
+the number of scanned pages to the number of reclaimed pages. On top
+of that, the ``rmap`` has poor memory locality due to its complex data
+structures. The combined effects typically result in a high amount of
+CPU usage in the reclaim path.
+
+Benefits of Multigenerational LRU
+---------------------------------
+Notion of Generation Numbers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The notion of generation numbers introduces a quantitative approach to
+memory overcommit. A larger number of pages can be spread out across
+configurable generations, and thus they have relatively low false
+active/inactive rates. Each generation includes all pages that have
+been referenced since the last generation.
+
+Given an ``lruvec``, scans and the selections between anon and file
+types are all based on generation numbers, which are simple and yet
+effective. For different ``lruvec``\s, comparisons are still possible
+based on birth times of generations.
+
+Differential Scans via Page Tables
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each differential scan discovers all pages that have been referenced
+since the last scan. Specifically, it walks the ``mm_struct`` list
+associated with an ``lruvec`` to scan page tables of processes that
+have been scheduled since the last scan. The cost of each differential
+scan is roughly proportional to the number of referenced pages it
+discovers. Unless address spaces are extremely sparse, page tables
+usually have better memory locality than the ``rmap``. The end result
+is generally a significant reduction in CPU usage for workloads using
+a large amount of anon memory.
+
+To-do List
+==========
+KVM Optimization
+----------------
+Support shadow page table scanning.
+
+NUMA Optimization
+-----------------
+Support NUMA policies and per-node RSS counters.
--
2.31.1.295.g9ea45b61b8-goog
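The ``/sys/kernel/debug/lru_gen`` dump format documented in the patch above can be post-processed with standard tools. The following sketch pulls the oldest and youngest generation numbers out of a captured dump; the sample numbers in the here-document are made up, not real kernel output:

```shell
# Parse a dump in the documented format:
#   memcg memcg_id memcg_path
#     node node_id
#       gen birth_time anon_size file_size
# Generation lines are the only ones with exactly four fields.
summary=$(awk '
    $1 == "memcg" { memcg = $2 }                   # remember current memcg id
    $1 == "node"  { node = $2 }                    # remember current node id
    NF == 4 { if (min == "") min = $1; max = $1 }  # first gen seen = min_gen, last = max_gen
    END { print "oldest gen: " min ", youngest gen: " max }
' <<'EOF'
memcg 3 /system
  node 0
    170 552 18432 36864
    171 401 12288 20480
    172 213 8192 10240
EOF
)
echo "$summary"   # → oldest gen: 170, youngest gen: 172
```

On a kernel built with ``CONFIG_LRU_GEN=y``, the same awk program would read ``/sys/kernel/debug/lru_gen`` directly instead of the here-document.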
@ -48,7 +48,7 @@ pkg_postinst() {
 	einfo "To build the kernel use the following command:"
 	einfo "make Image Image.gz modules"
 	einfo "make DTC_FLAGS="-@" dtbs"
-	einfo "make install; make modules_intall; make dtbs_install"
+	einfo "make install; make modules_install; make dtbs_install"
 	einfo "If you use kernel config coming with this ebuild, don't forget to also copy dracut-pp.conf to /etc/dracut.conf.d/"
 	einfo "to make sure proper kernel modules are loaded into initramfs"
 	einfo "if you want to cross compile pinephone kernel on amd64 host, follow the https://wiki.gentoo.org/wiki/Cross_build_environment"
92	sys-kernel/pinephone-sources/pinephone-sources-5.12.0.ebuild	Normal file
@ -0,0 +1,92 @@
|
|||||||
|
# Copyright 1999-2021 Gentoo Authors
|
||||||
|
# Distributed under the terms of the GNU General Public License v2
|
||||||
|
|
||||||
|
EAPI="6"
|
||||||
|
UNIPATCH_STRICTORDER="yes"
|
||||||
|
K_NOUSENAME="yes"
|
||||||
|
K_NOSETEXTRAVERSION="yes"
|
||||||
|
K_NOUSEPR="yes"
|
||||||
|
K_SECURITY_UNSUPPORTED="1"
|
||||||
|
K_BASE_VER="5.12"
|
||||||
|
K_EXP_GENPATCHES_NOUSE="1"
|
||||||
|
K_FROM_GIT="yes"
|
||||||
|
ETYPE="sources"
|
||||||
|
CKV="${PVR/-r/-git}"
|
||||||
|
|
||||||
|
# only use this if it's not an _rc/_pre release
|
||||||
|
[ "${PV/_pre}" == "${PV}" ] && [ "${PV/_rc}" == "${PV}" ] && OKV="${PV}"
|
||||||
|
inherit kernel-2
|
||||||
|
detect_version
|
||||||
|
|
||||||
|
|
||||||
|
DEPEND="${RDEPEND}
|
||||||
|
>=sys-devel/patch-2.7.5"
|
||||||
|
|
||||||
|
DESCRIPTION="Full sources for the Linux kernel, with megi's patch for pinephone"
|
||||||
|
HOMEPAGE="https://www.kernel.org"
|
||||||
|
|
||||||
|
KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~s390 ~sparc ~x86"
|
||||||
|
MEGI_PATCH_URI="https://xff.cz/kernels/${PV:0:4}/patches/all.patch"
|
||||||
|
SRC_URI="${KERNEL_URI} ${MEGI_PATCH_URI} -> all-${PV}.patch"
|
||||||
|
|
||||||
|
PATCHES=(
|
||||||
|
${DISTDIR}/all-${PV}.patch
|
||||||
|
${FILESDIR}/enable-hdmi-output-pinetab.patch
|
||||||
|
${FILESDIR}/enable-jack-detection-pinetab.patch
|
||||||
|
${FILESDIR}/pinetab-bluetooth.patch
|
||||||
|
${FILESDIR}/pinetab-accelerometer.patch
|
||||||
|
${FILESDIR}/dts-pinephone-drop-modem-power-node.patch
|
||||||
|
${FILESDIR}/dts-headphone-jack-detection.patch
|
||||||
|
${FILESDIR}/media-ov5640-Implement-autofocus.patch
|
||||||
|
${FILESDIR}/0011-dts-pinetab-hardcode-mmc-numbers.patch
|
||||||
|
${FILESDIR}/0012-pinephone-fix-pogopin-i2c.patch
|
||||||
|
${FILESDIR}/0107-quirk-kernel-org-bug-210681-firmware_rome_error.patch
|
||||||
|
${FILESDIR}/0177-leds-gpio-make-max_brightness-configurable.patch
|
||||||
|
${FILESDIR}/0178-sun8i-codec-fix-headphone-jack-pin-name.patch
|
||||||
|
${FILESDIR}/0179-arm64-dts-allwinner-pinephone-improve-device-tree-5.12.patch
|
||||||
|
${FILESDIR}/panic-led-5.12.patch
|
||||||
|
${FILESDIR}/PATCH-1-4-HID-magicmouse-add-Apple-Magic-Mouse-2-support.patch
|
||||||
|
${FILESDIR}/PATCH-2-4-HID-magicmouse-fix-3-button-emulation-of-Mouse-2.patch
|
||||||
|
${FILESDIR}/PATCH-3-4-HID-magicmouse-fix-reconnection-of-Magic-Mouse-2.patch
|
||||||
|
${FILESDIR}/PATCH-v2-01-16-include-linux-memcontrol.h-do-not-warn-in-page_memcg_rcu-if-CONFIG_MEMCG.patch
|
||||||
|
${FILESDIR}/PATCH-v2-02-16-include-linux-nodemask.h-define-next_memory_node-if-CONFIG_NUMA.patch
|
||||||
|
${FILESDIR}/PATCH-v2-03-16-include-linux-huge_mm.h-define-is_huge_zero_pmd-if-CONFIG_TRANSPARENT_HUGEPAGE.patch
|
||||||
|
${FILESDIR}/PATCH-v2-04-16-include-linux-cgroup.h-export-cgroup_mutex.patch
|
||||||
|
${FILESDIR}/PATCH-v2-05-16-mm-swap.c-export-activate_page.patch
|
||||||
|
${FILESDIR}/PATCH-v2-06-16-mm-x86-support-the-access-bit-on-non-leaf-PMD-entries.patch
|
||||||
|
${FILESDIR}/PATCH-v2-07-16-mm-vmscan.c-refactor-shrink_node.patch
|
||||||
|
${FILESDIR}/PATCH-v2-08-16-mm-multigenerational-lru-groundwork.patch
|
||||||
|
${FILESDIR}/PATCH-v2-09-16-mm-multigenerational-lru-activation.patch
|
||||||
|
${FILESDIR}/PATCH-v2-10-16-mm-multigenerational-lru-mm_struct-list.patch
|
||||||
|
${FILESDIR}/PATCH-v2-11-16-mm-multigenerational-lru-aging.patch
|
||||||
|
${FILESDIR}/PATCH-v2-12-16-mm-multigenerational-lru-eviction.patch
|
||||||
|
${FILESDIR}/PATCH-v2-13-16-mm-multigenerational-lru-page-reclaim.patch
|
||||||
|
${FILESDIR}/PATCH-v2-14-16-mm-multigenerational-lru-user-interface.patch
|
||||||
|
${FILESDIR}/PATCH-v2-15-16-mm-multigenerational-lru-Kconfig.patch
|
||||||
|
${FILESDIR}/PATCH-v2-16-16-mm-multigenerational-lru-documentation.patch
|
||||||
|
)
|
||||||
|
|
||||||
|
src_prepare() {
|
||||||
|
default
|
||||||
|
eapply_user
|
||||||
|
}
|
||||||
|
|
||||||
|
pkg_postinst() {
|
||||||
|
kernel-2_pkg_postinst
|
||||||
|
einfo "For more info on this patchset, and how to report problems, see:"
|
||||||
|
einfo "${HOMEPAGE}"
|
||||||
|
einfo "To build the kernel use the following command:"
|
||||||
|
einfo "make Image Image.gz modules"
|
||||||
|
einfo "make DTC_FLAGS="-@" dtbs"
|
||||||
|
einfo "make install; make modules_install; make dtbs_install"
|
||||||
|
einfo "If you use kernel config coming with this ebuild, don't forget to also copy dracut-pp.conf to /etc/dracut.conf.d/"
|
||||||
|
einfo "to make sure proper kernel modules are loaded into initramfs"
|
||||||
|
einfo "if you want to cross compile pinephone kernel on amd64 host, follow the https://wiki.gentoo.org/wiki/Cross_build_environment"
|
||||||
|
einfo "to setup cross toolchain environment, then create a xmake wrapper like the following, and replace make with xmake in above commands"
|
||||||
|
einfo "#!/bin/sh"
|
||||||
|
einfo "exec make ARCH='arm64' CROSS_COMPILE='aarch64-unknown-linux-gnu-' INSTALL_MOD_PATH='${SYSROOT}' '$@'"
|
||||||
|
}
|
||||||
|
|
||||||
|
pkg_postrm() {
|
||||||
|
kernel-2_pkg_postrm
|
||||||
|
}
|