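1. What are lmk and oom adj
(The code in this article is based on Android 14.)
When memory runs low, Android frees memory by throttling or killing processes it does not need, so that the system keeps running at an acceptable level of performance. The component responsible for killing processes under low memory is lmk.
lmk stands for Low Memory Killer. Early Android versions monitored memory pressure with an in-kernel Low Memory Killer (LMK) driver. As of kernel 4.12 that driver was removed from the upstream kernel, and the userspace daemon lmkd took over memory monitoring and process killing.
In this article, oom adj means oom_score_adj (Out-of-Memory Score Adjustment). It can be viewed as a number that represents a process's priority: every process has one, and it changes dynamically. Once memory pressure reaches a certain level, lmk uses oom_score_adj to choose which processes to kill.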
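2. oom adj
2.1 How to check a process's adj value
# 1. adb shell
# 2. cd /proc/<pid>, where <pid> is the process ID of some process
# 3. ls the /proc/<pid> directory; it contains the following three files
oom_adj oom_score oom_score_adj
# 4. Use cat to print the corresponding value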
cat oom_score_adj
-1000
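What do oom_adj, oom_score and oom_score_adj each represent, and how do they differ?
oom_adj is an older parameter for expressing process priority, with a range of -17 to 15. Newer Linux kernels have gradually replaced it with oom_score_adj, because its narrow range cannot express the fine-grained OOM priority adjustments that complex modern systems need.
oom_score is a composite score reflecting how likely the OOM killer is to select the process in the current system state. The kernel computes it, mainly from the process's memory usage and then shifted by oom_score_adj. The larger the oom_score, the more likely the process is to be killed.
oom_score_adj is the parameter used to adjust a process's oom_score. It is an integer in the range -1000 to 1000. The larger the oom_score_adj, the more likely the process is to be killed.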
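The manual steps above can also be scripted. The following is a minimal sketch, assuming a Linux or Android system where /proc/<pid>/oom_score_adj is readable; the class OomScoreAdjReader and its method names are hypothetical helpers, not part of AOSP.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of AOSP) for reading oom_score_adj from procfs. */
final class OomScoreAdjReader {
    static final int OOM_SCORE_ADJ_MIN = -1000;
    static final int OOM_SCORE_ADJ_MAX = 1000;

    /** Parses the text of an oom_score_adj file and checks the documented range. */
    static int parse(String text) {
        int value = Integer.parseInt(text.trim());
        if (value < OOM_SCORE_ADJ_MIN || value > OOM_SCORE_ADJ_MAX) {
            throw new IllegalArgumentException("oom_score_adj out of range: " + value);
        }
        return value;
    }

    /** Builds /proc/<pid>/oom_score_adj; "self" refers to the calling process. */
    static Path pathFor(String pid) {
        return Path.of("/proc", pid, "oom_score_adj");
    }

    public static void main(String[] args) throws IOException {
        Path path = pathFor(args.length > 0 ? args[0] : "self");
        if (Files.isReadable(path)) {
            System.out.println(path + " = " + parse(Files.readString(path)));
        } else {
            System.out.println(path + " is not readable on this system");
        }
    }
}
```

Run without arguments it prints the value for the JVM process itself; pass a pid to inspect another process, subject to the usual procfs permissions.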
2.2 What adj values exist
oom_score_adj ranges from -1000 to 1000. A number of predefined values are declared in ProcessList, and the comment on each one explains clearly what it means.
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
// OOM adjustments for processes in various states:
// Uninitialized value for any major or minor adj fields
public static final int INVALID_ADJ = -10000;
// Adjustment used in certain places where we don't know it yet.
// (Generally this is something that is going to be cached, but we
// don't know the exact value in the cached range to assign yet.)
public static final int UNKNOWN_ADJ = 1001;
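// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
public static final int CACHED_APP_MAX_ADJ = 999;
public static final int CACHED_APP_MIN_ADJ = 900;
// This is the oom_adj level that we allow to die first. This cannot be equal to
// CACHED_APP_MAX_ADJ unless processes are actively being assigned an oom_score_adj of CACHED_APP_MAX_ADJ.
public static final int CACHED_APP_LMK_FIRST_ADJ = 950;
// Number of levels we have available for different service connection group importance
// levels.
static final int CACHED_APP_IMPORTANCE_LEVELS = 5;
// The B list of SERVICE_ADJ -- these are the old and decrepit services that aren't as shiny and interesting as the ones in the A list.
public static final int SERVICE_B_ADJ = 800;
// This is the process of the previous application that the user was in.
// This process is kept above other things, because it is very common to switch back to the previous app. This is important both for recent task switch (toggling between the two top recent apps) as well as normal UI flow such as clicking on a URI in the e-mail app to view in the browser, and then pressing back to return to e-mail.
public static final int PREVIOUS_APP_ADJ = 700;
// This is a process holding the home application -- we want to try avoiding killing it, even if it would normally be in the background,
// because the user interacts with it so much.
public static final int HOME_APP_ADJ = 600;
// This is a process holding an application service -- killing it will not have much of an impact as far as the user is concerned.
public static final int SERVICE_ADJ = 500;
// This is a process with a heavy-weight application. It is in the background, but we want to try to avoid killing it. Value set in system/rootdir/init.rc on startup.
public static final int HEAVY_WEIGHT_APP_ADJ = 400;
// This is a process currently hosting a backup operation. Killing it
// is not entirely fatal but is generally a bad idea.
public static final int BACKUP_APP_ADJ = 300;
// This is a process bound by the system (or other app) that's more important than services but
// not so perceptible that it affects the user immediately if killed.
public static final int PERCEPTIBLE_LOW_APP_ADJ = 250;
// This is a process hosting services that are not perceptible to the user but the
// client (system) binding to it requested to treat it as if it is perceptible and avoid killing it if possible.
public static final int PERCEPTIBLE_MEDIUM_APP_ADJ = 225;
// This is a process only hosting components that are perceptible to the user, and we really want to avoid killing them, but they are not
// immediately visible. An example is background music playback.
public static final int PERCEPTIBLE_APP_ADJ = 200;
// This is a process only hosting activities that are visible to the user, so we'd prefer they don't disappear.
public static final int VISIBLE_APP_ADJ = 100;
static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;
// This is a process that was recently TOP and moved to FGS. Continue to treat it almost like a foreground app for a while.
public static final int PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ = 50;
// This is the process running the current foreground app. We'd really rather not kill it!
public static final int FOREGROUND_APP_ADJ = 0;
// This is a process that the system or a persistent process has bound to, and indicated it is important.
public static final int PERSISTENT_SERVICE_ADJ = -700;
// This is a system persistent process, such as telephony. Definitely
// don't want to kill it, but doing so is not completely fatal.
public static final int PERSISTENT_PROC_ADJ = -800;
// The system process runs at the default adjustment.
public static final int SYSTEM_ADJ = -900;
// Special code for native processes that are not being managed by the system (so don't have an oom adj assigned by the system).
public static final int NATIVE_ADJ = -1000;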
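When reading a raw value from /proc/<pid>/oom_score_adj it helps to map it back to one of the names above. The sketch below is one possible way to do that, assuming that a value lying between two constants belongs to the band of the nearest named constant at or below it. That banding rule and the class AdjBands are illustrative assumptions, not something ProcessList itself provides.

```java
import java.util.TreeMap;

/** Hypothetical lookup table built from the ProcessList constants listed above. */
final class AdjBands {
    private static final TreeMap<Integer, String> BANDS = new TreeMap<>();

    static {
        BANDS.put(-1000, "NATIVE_ADJ");
        BANDS.put(-900, "SYSTEM_ADJ");
        BANDS.put(-800, "PERSISTENT_PROC_ADJ");
        BANDS.put(-700, "PERSISTENT_SERVICE_ADJ");
        BANDS.put(0, "FOREGROUND_APP_ADJ");
        BANDS.put(50, "PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ");
        BANDS.put(100, "VISIBLE_APP_ADJ");
        BANDS.put(200, "PERCEPTIBLE_APP_ADJ");
        BANDS.put(225, "PERCEPTIBLE_MEDIUM_APP_ADJ");
        BANDS.put(250, "PERCEPTIBLE_LOW_APP_ADJ");
        BANDS.put(300, "BACKUP_APP_ADJ");
        BANDS.put(400, "HEAVY_WEIGHT_APP_ADJ");
        BANDS.put(500, "SERVICE_ADJ");
        BANDS.put(600, "HOME_APP_ADJ");
        BANDS.put(700, "PREVIOUS_APP_ADJ");
        BANDS.put(800, "SERVICE_B_ADJ");
        BANDS.put(900, "CACHED_APP_MIN_ADJ");
        BANDS.put(999, "CACHED_APP_MAX_ADJ");
    }

    /** Returns the name of the nearest named constant at or below adj (assumed banding rule). */
    static String bandOf(int adj) {
        if (adj < -1000 || adj > 1000) {
            throw new IllegalArgumentException("oom_score_adj out of range: " + adj);
        }
        return BANDS.floorEntry(adj).getValue();
    }

    public static void main(String[] args) {
        for (int adj : new int[] {-1000, 0, 150, 200, 915, 999}) {
            System.out.println(adj + " -> " + bandOf(adj));
        }
    }
}
```

A TreeMap keeps the constants sorted, so floorEntry finds the band in a single lookup without a chain of if statements.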
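3. How the system updates a process's adj value
3.1 ProcessList
ProcessList is mainly responsible for communicating with the lmkd process: it sets up the socket connection, provides the setOomAdj method that updates a process's adj value, and so on. It also contains other process-related logic, such as starting processes and maintaining the LRU process list; this article focuses only on the parts related to the low memory killer.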
// release/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
public ActivityManagerService(Context systemContext, ActivityTaskManagerService atm) {
......
mProcessList = mInjector.getProcessList(this);
mProcessList.init(this, activeUids, mPlatformCompat);
mAppProfiler = new AppProfiler(this, BackgroundThread.getHandler().getLooper(),
new LowMemDetector(this));
mPhantomProcessList = new PhantomProcessList(this);
mOomAdjuster = new OomAdjuster(this, mProcessList, activeUids);
......
}
ProcessList and OomAdjuster are both initialized in the ActivityManagerService constructor.
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
ProcessList() {
MemInfoReader minfo = new MemInfoReader();
minfo.readMemInfo();
mTotalMemMb = minfo.getTotalSize()/(1024*1024);
updateOomLevels(0, 0, false);
}
The ProcessList constructor calls updateOomLevels, whose main job is to fill in the mOomMinFree member array. First, look at the four arrays in ProcessList:
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
// These are the various interesting memory levels that we will give to
// the OOM killer. Note that the OOM killer only supports 6 slots, so we
// can't give it a different value for every possible kind of process.
private final int[] mOomAdj = new int[] {
FOREGROUND_APP_ADJ, VISIBLE_APP_ADJ, PERCEPTIBLE_APP_ADJ,
PERCEPTIBLE_LOW_APP_ADJ, CACHED_APP_MIN_ADJ, CACHED_APP_LMK_FIRST_ADJ
};
// These are the low-end OOM level limits. This is appropriate for an
// HVGA or smaller phone with less than 512MB. Values are in KB.
private final int[] mOomMinFreeLow = new int[] {
12288, 18432, 24576,
36864, 43008, 49152
};
// These are the high-end OOM level limits. This is appropriate for a
// 1280x800 or larger screen with around 1GB RAM. Values are in KB.
private final int[] mOomMinFreeHigh = new int[] {
73728, 92160, 110592,
129024, 147456, 184320
};
// The actual OOM killer memory levels we are using.
private final int[] mOomMinFree = new int[mOomAdj.length];
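Only mOomMinFree and mOomAdj take effect in the end. mOomMinFreeLow and mOomMinFreeHigh are just reference values for devices with less than 512MB of RAM and devices with around 1GB of RAM, and mOomMinFree is computed from them. The elements of mOomMinFree and mOomAdj correspond one to one: mOomMinFree[i] is a free-memory threshold and mOomAdj[i] is the adj value paired with it.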
private void updateOomLevels(int displayWidth, int displayHeight, boolean write) {
......
if (write) {
ByteBuffer buf = ByteBuffer.allocate(4 * (2 * mOomAdj.length + 1));
buf.putInt(LMK_TARGET);
for (int i = 0; i < mOomAdj.length; i++) {
buf.putInt((mOomMinFree[i] * 1024)/PAGE_SIZE);
buf.putInt(mOomAdj[i]);
}
writeLmkd(buf, null);
......
}
}
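The exact algorithm in updateOomLevels is not covered here. Its main inputs are total system RAM, the display size, mOomMinFreeLow and mOomMinFreeHigh.
Besides the ProcessList constructor, updateOomLevels is also called from applyDisplaySize, this time with write set to true, so it ends up calling writeLmkd to send mOomMinFree and mOomAdj to the lmkd process over the socket.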
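To make the layout of the LMK_TARGET packet concrete, here is a minimal standalone sketch of the packing loop shown above. It assumes a 4096-byte PAGE_SIZE and feeds in the mOomMinFreeHigh values purely as sample input (the real mOomMinFree values are computed at runtime); the class LmkTargetPacket is hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical re-creation of the LMK_TARGET packing loop in updateOomLevels. */
final class LmkTargetPacket {
    static final int LMK_TARGET = 0;
    static final int PAGE_SIZE = 4096; // assumption: 4 KB pages

    /** Packs the command code followed by (minfree in pages, adj) pairs. */
    static ByteBuffer build(int[] minFreeKb, int[] adj) {
        if (minFreeKb.length != adj.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        // One int for the command, then two ints per level.
        ByteBuffer buf = ByteBuffer.allocate(4 * (2 * adj.length + 1));
        buf.putInt(LMK_TARGET);
        for (int i = 0; i < adj.length; i++) {
            buf.putInt((minFreeKb[i] * 1024) / PAGE_SIZE); // KB -> bytes -> pages
            buf.putInt(adj[i]);
        }
        return buf;
    }

    public static void main(String[] args) {
        int[] sampleMinFreeKb = {73728, 92160, 110592, 129024, 147456, 184320};
        int[] adj = {0, 100, 200, 250, 900, 950};
        ByteBuffer buf = build(sampleMinFreeKb, adj);
        System.out.println("bytes written: " + buf.position());
        for (int i = 0; i < adj.length; i++) {
            System.out.println("adj " + buf.getInt(8 + 8 * i)
                    + " -> minfree pages " + buf.getInt(4 + 8 * i));
        }
    }
}
```

ByteBuffer writes ints in big-endian order by default, so no explicit byte-order call is needed in the sketch.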
// release/system/memory/lmkd/lmkd.cpp
static void cmd_target(int ntargets, LMKD_CTRL_PACKET packet) {
......
for (i = 0; i < ntargets; i++) {
lmkd_pack_get_target(packet, i, &target);
lowmem_minfree[i] = target.minfree;
lowmem_adj[i] = target.oom_adj_score;
......
}
lowmem_targets_size = ntargets;
......
// Write the levels to a system property
property_set("sys.lmk.minfree_levels", minfree_str);
if (has_inkernel_module) {
char minfreestr[128];
char killpriostr[128];
minfreestr[0] = '\0';
killpriostr[0] = '\0';
for (i = 0; i < lowmem_targets_size; i++) {
char val[40];
if (i) {
strlcat(minfreestr, ",", sizeof(minfreestr));
strlcat(killpriostr, ",", sizeof(killpriostr));
}
snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_minfree[i] : 0);
strlcat(minfreestr, val, sizeof(minfreestr));
snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_adj[i] : 0);
strlcat(killpriostr, val, sizeof(killpriostr));
}
// Write to /sys/module/lowmemorykiller/parameters/minfree
writefilestring(INKERNEL_MINFREE_PATH, minfreestr, true);
// Write to /sys/module/lowmemorykiller/parameters/adj
writefilestring(INKERNEL_ADJ_PATH, killpriostr, true);
}
}
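As the function above shows, the contents of mOomMinFree and mOomAdj in ProcessList are ultimately meant for the nodes /sys/module/lowmemorykiller/parameters/minfree and /sys/module/lowmemorykiller/parameters/adj. On Android 14, however, has_inkernel_module is false, so nothing is written to those two nodes. In addition, use_minfree_levels defaults to false in lmkd, so by default mOomMinFree and mOomAdj have no effect.
3.2 Establishing the socket connection between ProcessList and lmkd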
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
void init(ActivityManagerService service, ActiveUids activeUids,
PlatformCompat platformCompat) {
......
if (sKillHandler == null) {
sKillThread = new ServiceThread(TAG + ":kill",
THREAD_PRIORITY_BACKGROUND, true /* allowIo */);
sKillThread.start();
sKillHandler = new KillHandler(sKillThread.getLooper());
sLmkdConnection = new LmkdConnection(sKillThread.getLooper().getQueue(),
new LmkdConnection.LmkdConnectionListener() {......}
);
......
}
}
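ProcessList.init mainly initializes the KillHandler and the LmkdConnection. The first time writeLmkd is called, it sends an LMKD_RECONNECT_MSG message to the KillHandler, which then calls LmkdConnection.connect to establish the socket connection to lmkd.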
// release/frameworks/base/services/core/java/com/android/server/am/LmkdConnection.java
......
public boolean connect() {
synchronized (mLmkdSocketLock) {
......
// temporary sockets and I/O streams
final LocalSocket socket = openSocket();
......
}
return true;
}
......
private LocalSocket openSocket() {
final LocalSocket socket;
try {
socket = new LocalSocket(LocalSocket.SOCKET_SEQPACKET);
socket.connect(
new LocalSocketAddress("lmkd",
LocalSocketAddress.Namespace.RESERVED));
} catch (IOException ex) {
Slog.e(TAG, "Connection failed: " + ex.toString());
return null;
}
return socket;
}
3.3 Communication between ProcessList and lmkd
ProcessList writes commands and their arguments to the socket through writeLmkd:
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
private static boolean writeLmkd(ByteBuffer buf, ByteBuffer repl) {
if (!sLmkdConnection.isConnected()) {
// try to connect immediately and then keep retrying
sKillHandler.sendMessage(
sKillHandler.obtainMessage(KillHandler.LMKD_RECONNECT_MSG));
// wait for connection retrying 3 times (up to 3 seconds)
if (!sLmkdConnection.waitForConnection(3 * LMKD_RECONNECT_DELAY_MS)) {
return false;
}
}
return sLmkdConnection.exchange(buf, repl);
}
// release/frameworks/base/services/core/java/com/android/server/am/LmkdConnection.java
public boolean exchange(ByteBuffer req, ByteBuffer repl) {
if (repl == null) {
return write(req);
}
boolean result = false;
// set reply buffer to user-defined one to fill it
synchronized (mReplyBufLock) {
mReplyBuf = repl;
if (write(req)) {
try {
// wait for the reply
mReplyBufLock.wait();
result = (mReplyBuf != null);
} catch (InterruptedException ie) {
result = false;
}
}
// reset reply buffer
mReplyBuf = null;
}
return result;
}
......
private boolean write(ByteBuffer buf) {
synchronized (mLmkdSocketLock) {
......
mLmkdOutputStream.write(buf.array(), 0, buf.position());
......
}
}
The commands supported between ProcessList and lmkd are defined in ProcessList, and their values correspond one to one with the commands defined in lmkd:
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
// Low Memory Killer Daemon command codes.
// These must be kept in sync with lmk_cmd definitions in lmkd.h
//
// LMK_TARGET <minfree> <minkillprio> ... (up to 6 pairs)
// LMK_PROCPRIO <pid> <uid> <prio>
// LMK_PROCREMOVE <pid>
// LMK_PROCPURGE
// LMK_GETKILLCNT
// LMK_SUBSCRIBE
// LMK_PROCKILL
// LMK_UPDATE_PROPS
// LMK_KILL_OCCURRED
// LMK_STATE_CHANGED
static final byte LMK_TARGET = 0;
static final byte LMK_PROCPRIO = 1;
static final byte LMK_PROCREMOVE = 2;
static final byte LMK_PROCPURGE = 3;
static final byte LMK_GETKILLCNT = 4;
static final byte LMK_SUBSCRIBE = 5;
static final byte LMK_PROCKILL = 6; // Note: this is an unsolicited command
static final byte LMK_UPDATE_PROPS = 7;
static final byte LMK_KILL_OCCURRED = 8; // Msg to subscribed clients on kill occurred event
static final byte LMK_STATE_CHANGED = 9; // Msg to subscribed clients on state changed
// release/system/memory/lmkd/include/lmkd.h
/*
* Supported LMKD commands
*/
enum lmk_cmd {
LMK_TARGET = 0, /* Associate minfree with oom_adj_score */
LMK_PROCPRIO, /* Register a process and set its oom_adj_score */
LMK_PROCREMOVE, /* Unregister a process */
LMK_PROCPURGE, /* Purge all registered processes */
LMK_GETKILLCNT, /* Get number of kills */
LMK_SUBSCRIBE, /* Subscribe for asynchronous events */
LMK_PROCKILL, /* Unsolicited msg to subscribed clients on proc kills */
LMK_UPDATE_PROPS, /* Reinit properties */
LMK_STAT_KILL_OCCURRED, /* Unsolicited msg to subscribed clients on proc kills for statsd log */
LMK_STAT_STATE_CHANGED, /* Unsolicited msg to subscribed clients on state changed */
};
3.4 Updating a process's adj value
ActivityManagerService computes a process's adj value from its current state, mainly through two AMS methods:
// release/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
@GuardedBy("this")
final void updateOomAdjLocked(@OomAdjReason int oomAdjReason) {
mOomAdjuster.updateOomAdjLocked(oomAdjReason);
}
/**
* Update OomAdj for a specific process and its reachable processes.
*
* @param app The process to update
* @param oomAdjReason
* @return whether updateOomAdjLocked(app) was successful.
*/
@GuardedBy("this")
final boolean updateOomAdjLocked(ProcessRecord app, @OomAdjReason int oomAdjReason) {
return mOomAdjuster.updateOomAdjLocked(app, oomAdjReason);
}
One updates the adj values of all processes; the other updates the adj value of a single process.
What triggers an adj update? ActivityManagerInternal lists the reasons:
// release/frameworks/base/core/java/android/app/ActivityManagerInternal.java
@IntDef(prefix = {"OOM_ADJ_REASON_"}, value = {
OOM_ADJ_REASON_NONE,
OOM_ADJ_REASON_ACTIVITY,
OOM_ADJ_REASON_FINISH_RECEIVER,
OOM_ADJ_REASON_START_RECEIVER,
OOM_ADJ_REASON_BIND_SERVICE,
OOM_ADJ_REASON_UNBIND_SERVICE,
OOM_ADJ_REASON_START_SERVICE,
OOM_ADJ_REASON_GET_PROVIDER,
OOM_ADJ_REASON_REMOVE_PROVIDER,
OOM_ADJ_REASON_UI_VISIBILITY,
OOM_ADJ_REASON_ALLOWLIST,
OOM_ADJ_REASON_PROCESS_BEGIN,
OOM_ADJ_REASON_PROCESS_END,
OOM_ADJ_REASON_SHORT_FGS_TIMEOUT,
OOM_ADJ_REASON_SYSTEM_INIT,
OOM_ADJ_REASON_BACKUP,
OOM_ADJ_REASON_SHELL,
OOM_ADJ_REASON_REMOVE_TASK,
OOM_ADJ_REASON_UID_IDLE,
OOM_ADJ_REASON_STOP_SERVICE,
OOM_ADJ_REASON_EXECUTING_SERVICE,
OOM_ADJ_REASON_RESTRICTION_CHANGE,
OOM_ADJ_REASON_COMPONENT_DISABLED,
})
@Retention(RetentionPolicy.SOURCE)
public @interface OomAdjReason {}
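As the source above shows, the AMS updateOomAdjLocked methods simply delegate to the OomAdjuster methods of the same name, so let's move on to OomAdjuster.
First, the method that updates the adj of all processes: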
// release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
/**
* Update OomAdj for all processes in LRU list
*/
@GuardedBy("mService")
void updateOomAdjLocked(@OomAdjReason int oomAdjReason) {
synchronized (mProcLock) {
updateOomAdjLSP(oomAdjReason);
}
}
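updateOomAdjLocked calls updateOomAdjLSP directly.
Note: the Locked suffix in updateOomAdjLocked means the caller holds the lock on the ActivityManagerService object. The LSP suffix in updateOomAdjLSP means the caller holds both the ActivityManagerService lock and the mProcLock lock (and no, it does not stand for anything ruder ~_~).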
// release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
@GuardedBy({"mService", "mProcLock"})
private void updateOomAdjLSP(@OomAdjReason int oomAdjReason) {
// Check whether an adj update is already in progress
if (checkAndEnqueueOomAdjTargetLocked(null)) {
// Simply return as there is an oomAdjUpdate ongoing
return;
}
try {
// Mark that an adj update is in progress
mOomAdjUpdateOngoing = true;
// Carry on with the rest of the logic
performUpdateOomAdjLSP(oomAdjReason);
} finally {
......
}
}
// release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
@GuardedBy({"mService", "mProcLock"})
private void performUpdateOomAdjLSP(@OomAdjReason int oomAdjReason) {
final ProcessRecord topApp = mService.getTopApp();
......
updateOomAdjInnerLSP(oomAdjReason, topApp , null, null, true, true);
}
// release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
/**
* Update OomAdj for all processes within the given list (could be partial), or the whole LRU
* list if the given list is null; when it's partial update, each process's client proc won't
* get evaluated recursively here.
*/
@GuardedBy({"mService", "mProcLock"})
private void updateOomAdjInnerLSP(@OomAdjReason int oomAdjReason, final ProcessRecord topApp,
ArrayList<ProcessRecord> processes, ActiveUids uids, boolean potentialCycles,
boolean startProfiling) {
......
for (int i = numProc - 1; i >= 0; i--) {
......
computeOomAdjLSP(app, UNKNOWN_ADJ, topApp, fullUpdate, now, false,
computeClients);
......
}
if (computeClients) {
......
if (computeOomAdjLSP(app, UNKNOWN_ADJ, topApp, true, now,
true, true)) {
retryCycles = true;
}
......
}
......
boolean allChanged = updateAndTrimProcessLSP(now, nowElapsed, oldTime, activeUids,
oomAdjReason);
......
}
@GuardedBy({"mService", "mProcLock"})
private boolean updateAndTrimProcessLSP(final long now, final long nowElapsed,
final long oldTime, final ActiveUids activeUids, @OomAdjReason int oomAdjReason) {
......
applyOomAdjLSP(app, true, now, nowElapsed, oomAdjReason);
......
}
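Inside updateOomAdjInnerLSP:
First, computeOomAdjLSP is called to compute each process's adj value. I won't walk through computeOomAdjLSP here: it is the heart of the adj calculation, but it is far too long to digest in one go, so I'll analyze it in detail when the need arises.
AOSP also ships a document describing the logic of computeOomAdjLSP, at: release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.md
Then updateAndTrimProcessLSP iterates over the processes and calls applyOomAdjLSP to apply each process's adj, process state, scheduling group and frozen state.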
// release/frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
/** Applies the computed oomadj, procstate and sched group values and freezes them in set* */
@GuardedBy({"mService", "mProcLock"})
private boolean applyOomAdjLSP(ProcessRecord app, boolean doingAll, long now,
long nowElapsed, @OomAdjReason int oomAdjReson) {
......
if (state.getCurAdj() != state.getSetAdj()) {
ProcessList.setOomAdj(app.getPid(), app.uid, state.getCurAdj());
......
}
......
}
// release/frameworks/base/services/core/java/com/android/server/am/ProcessList.java
/**
* Set the out-of-memory badness adjustment for a process.
* If {@code pid <= 0}, this method will be a no-op.
*
* @param pid The process identifier to set.
* @param uid The uid of the app
* @param amt Adjustment value -- lmkd allows -1000 to +1000
*
* {@hide}
*/
public static void setOomAdj(int pid, int uid, int amt) {
......
ByteBuffer buf = ByteBuffer.allocate(4 * 4);
buf.putInt(LMK_PROCPRIO);
buf.putInt(pid);
buf.putInt(uid);
buf.putInt(amt);
writeLmkd(buf, null);
......
}
applyOomAdjLSP ends up calling ProcessList.setOomAdj, which writes the LMK_PROCPRIO command together with the process's pid, uid and adj value into a buffer and calls writeLmkd to send it to lmkd over the socket. Now let's look at the lmkd side.
// release/system/memory/lmkd/lmkd.cpp
static void ctrl_command_handler(int dsock_idx) {
......
case LMK_PROCPRIO:
......
cmd_procprio(packet, nargs, &cred);
break;
......
}
static void cmd_procprio(LMKD_CTRL_PACKET packet, int field_count, struct ucred *cred) {
......
// Parse the socket data into params
lmkd_pack_get_procprio(packet, field_count, &params);
// Check that oomadj is within range
if (params.oomadj < OOM_SCORE_ADJ_MIN ||
params.oomadj > OOM_SCORE_ADJ_MAX) {
ALOGE("Invalid PROCPRIO oomadj argument %d", params.oomadj);
return;
}
......
// Write the adj value to /proc/<pid>/oom_score_adj
snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", params.pid);
snprintf(val, sizeof(val), "%d", params.oomadj);
if (!writefilestring(path, val, false)) {
......
}
......
// Update the oom_score_adj array of linked lists; this data structure is analyzed later
procp = pid_lookup(params.pid);
if (!procp) {
......
proc_insert(procp);
} else {
......
proc_unslot(procp);
procp->oomadj = params.oomadj;
proc_slot(procp);
}
}
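The round trip of an LMK_PROCPRIO packet can be sketched in a few lines. The example below mirrors the four-int layout written by setOomAdj and the range check performed in cmd_procprio; the class LmkProcPrioPacket and its decode method are hypothetical, and the real lmkd performs further checks that are elided in the excerpt above.

```java
import java.nio.ByteBuffer;

/** Hypothetical model of the LMK_PROCPRIO packet: cmd, pid, uid, oom_score_adj. */
final class LmkProcPrioPacket {
    static final int LMK_PROCPRIO = 1;
    static final int OOM_SCORE_ADJ_MIN = -1000;
    static final int OOM_SCORE_ADJ_MAX = 1000;

    /** Sender side: same layout as ProcessList.setOomAdj. */
    static ByteBuffer encode(int pid, int uid, int adj) {
        ByteBuffer buf = ByteBuffer.allocate(4 * 4);
        buf.putInt(LMK_PROCPRIO);
        buf.putInt(pid);
        buf.putInt(uid);
        buf.putInt(adj);
        return buf;
    }

    /** Receiver side: returns {pid, uid, adj}, or null if the packet is rejected. */
    static int[] decode(ByteBuffer packet) {
        ByteBuffer in = packet.duplicate(); // independent position, shared content
        in.flip();                          // read back what was written
        if (in.remaining() != 16 || in.getInt() != LMK_PROCPRIO) {
            return null;
        }
        int pid = in.getInt();
        int uid = in.getInt();
        int adj = in.getInt();
        // Same range check as cmd_procprio: reject values outside [-1000, 1000].
        if (adj < OOM_SCORE_ADJ_MIN || adj > OOM_SCORE_ADJ_MAX) {
            return null;
        }
        return new int[] {pid, uid, adj};
    }

    public static void main(String[] args) {
        int[] ok = decode(encode(1234, 10086, 900));
        System.out.println("accepted adj: " + ok[2]);
        System.out.println("out-of-range rejected: " + (decode(encode(1234, 10086, 1001)) == null));
    }
}
```

The second call uses 1001 (UNKNOWN_ADJ), which lies outside the range lmkd accepts, so the sketch rejects it just as cmd_procprio logs an error and returns.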
Updating a single process works similarly. The call chain is:
ActivityManagerService#updateOomAdjLocked(ProcessRecord app, @OomAdjReason int oomAdjReason)
--->OomAdjuster#updateOomAdjLocked(ProcessRecord app, @OomAdjReason int oomAdjReason)
-------->OomAdjuster#updateOomAdjLSP(ProcessRecord app, @OomAdjReason int oomAdjReason)
------------>OomAdjuster#performUpdateOomAdjLSP(ProcessRecord app, @OomAdjReason int oomAdjReason)
---------------->OomAdjuster#performUpdateOomAdjLSP(ProcessRecord app, int cachedAdj,ProcessRecord topApp, long now, @OomAdjReason int oomAdjReason)
-------------------->OomAdjuster#computeOomAdjLSP(ProcessRecord app, int cachedAdj, ProcessRecord topApp, boolean doingAll, long now, boolean cycleReEval, boolean computeClients)
-------------------->OomAdjuster#applyOomAdjLSP(ProcessRecord app, boolean doingAll, long now, long nowElapsed, @OomAdjReason int oomAdjReson)
4. The lmkd process
4.1 Starting lmkd
The init process starts lmkd from the on init trigger while processing init.rc:
// release/system/core/rootdir/init.rc
on init
......
start lmkd
......
The lmkd service is defined in lmkd.rc:
// release/system/memory/lmkd/lmkd.rc
service lmkd /system/bin/lmkd
class core
user lmkd
group lmkd system readproc
capabilities DAC_OVERRIDE KILL IPC_LOCK SYS_NICE SYS_RESOURCE
critical
socket lmkd seqpacket+passcred 0660 system system
task_profiles ServiceCapacityLow
Once the lmkd process is created it executes the /system/bin/lmkd binary. Android.bp shows that the entry point of /system/bin/lmkd is the main function in lmkd.cpp.
// release/system/memory/lmkd/Android.bp
cc_binary {
name: "lmkd",
srcs: [
"lmkd.cpp",
"reaper.cpp",
"watchdog.cpp",
],
......
}
4.2 lmkd initialization
4.2.1 The main function
Start from the main function in lmkd.cpp:
// release/system/memory/lmkd/lmkd.cpp
int main(int argc, char **argv) {
......
// Load the lmkd properties
if (!update_props()) {
ALOGE("Failed to initialize props, exiting.");
return -1;
}
ctx = create_android_logger(KILLINFO_LOG_TAG);
// Run init(); it returns 0 on success, so the block below runs only when initialization succeeded
if (!init()) {
if (!use_inkernel_interface) {
...
}
if (init_reaper()) {
ALOGI("Process reaper initialized with %d threads in the pool",
reaper.thread_cnt());
}
// Enter the loop that waits for all events registered with epoll
mainloop();
}
android_log_destroy(&ctx);
ALOGI("exiting");
return 0;
}
4.2.2 init
// release/system/memory/lmkd/lmkd.cpp
static int init(void) {
......
// Create the epoll instance and get its file descriptor
epollfd = epoll_create(MAX_EPOLL_EVENTS);
......
// Mark all data socket slots as not connected
for (int i = 0; i < MAX_DATA_CONN; i++) {
data_sock[i].sock = -1;
}
// Get the descriptor of the control socket /dev/socket/lmkd
ctrl_sock.sock = android_get_control_socket("lmkd");
......
// Put the socket into the listening state
ret = listen(ctrl_sock.sock, MAX_DATA_CONN);
......
// Watch for readable events on ctrl_sock.sock
epev.events = EPOLLIN;
// Attach the handler: when epoll reports EPOLLIN on /dev/socket/lmkd, ctrl_connect_handler is called
ctrl_sock.handler_info.handler = ctrl_connect_handler;
// Set the data pointer associated with the event
epev.data.ptr = (void *)&(ctrl_sock.handler_info);
// Add /dev/socket/lmkd to the epoll instance
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_sock.sock, &epev) == -1) {
ALOGE("epoll_ctl for lmkd control socket failed (errno=%d)", errno);
return -1;
}
maxevents++;
// use_inkernel_interface is set according to whether /sys/module/lowmemorykiller/parameters/minfree is writable
// lmk has a kernel implementation and a userspace one: true selects the kernel implementation, false the userspace one
// On Android 14 use_inkernel_interface is false
has_inkernel_module = !access(INKERNEL_MINFREE_PATH, W_OK);
use_inkernel_interface = has_inkernel_module;
if (use_inkernel_interface) {
......
} else {
// Initialize the memory pressure monitors
if (!init_monitors()) {
return -1;
}
/* let the others know it does support reporting kills */
property_set("sys.lmk.reportkills", "1");
}
......
}
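init does two main things: it sets up and listens on the /dev/socket/lmkd socket, waiting for clients to connect, and it calls init_monitors to initialize memory pressure monitoring.
4.2.3 Installing the connection handler and the message handler for the lmkd socket
When epoll reports that a new client is requesting a connection on the lmkd socket, ctrl_connect_handler is called back: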
// release/system/memory/lmkd/lmkd.cpp
static void ctrl_connect_handler(int data __unused, uint32_t events __unused,
struct polling_params *poll_params __unused) {
struct epoll_event epev;
// Find a free connection slot
int free_dscock_idx = get_free_dsock();
......
// Accept the socket connection
data_sock[free_dscock_idx].sock = accept(ctrl_sock.sock, NULL, NULL);
......
ALOGI("lmkd data connection established");
/* use data to store data connection idx */
data_sock[free_dscock_idx].handler_info.data = free_dscock_idx;
// Set the handler for messages on this socket
data_sock[free_dscock_idx].handler_info.handler = ctrl_data_handler;
data_sock[free_dscock_idx].async_event_mask = 0;
// Have epoll watch for client messages
epev.events = EPOLLIN;
epev.data.ptr = (void *)&(data_sock[free_dscock_idx].handler_info);
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, data_sock[free_dscock_idx].sock, &epev) == -1) {
ALOGE("epoll_ctl for data connection socket failed; errno=%d", errno);
ctrl_data_close(free_dscock_idx);
return;
}
maxevents++;
}
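When a connection request arrives, ctrl_connect_handler looks for a free connection slot (guarding against exceeding the maximum number of connections), accepts the connection, installs the message handler for it, and registers it with epoll. When a message arrives from the client, ctrl_data_handler is called back to process it.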
// release/system/memory/lmkd/lmkd.cpp
static void ctrl_data_handler(int data, uint32_t events,
struct polling_params *poll_params __unused) {
if (events & EPOLLIN) {
ctrl_command_handler(data);
}
}
static void ctrl_command_handler(int dsock_idx) {
......
len = ctrl_data_read(dsock_idx, (char *)packet, CTRL_PACKET_MAX_SIZE, &cred);
......
cmd = lmkd_pack_get_cmd(packet);
......
switch(cmd) {
case LMK_TARGET:
......
cmd_target(targets, packet);
break;
case LMK_PROCPRIO:
......
cmd_procprio(packet, nargs, &cred);
break;
......
}
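Messages sent by a client (for example ProcessList in AMS) end up in ctrl_command_handler, which dispatches to a different handler function depending on cmd.
4.2.4 Process kill strategy
To kill processes, lmk needs to know whether system resources such as memory, io and cpu are under pressure. Early Android used the in-kernel LMK. After the switch to the userspace lmkd, lmkd initially relied on kernel vmpressure events to assess memory pressure; starting with Android 10 it uses kernel pressure stall information (PSI) monitors instead.
Userspace lmkd also supports a legacy mode in which it makes kill decisions with the same strategy as the in-kernel LMK driver, that is, free memory and file cache thresholds. To enable the legacy mode, set the ro.lmk.use_minfree_levels property to true.
The kill strategy is initialized in init_monitors: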
// release/system/memory/lmkd/lmkd.cpp
static bool init_monitors() {
// The code in this article is based on Android 14, where PSI is used by default
use_psi_monitors = GET_LMK_PROPERTY(bool, "use_psi", true) &&
init_psi_monitors();
// Otherwise fall back to vmpressure events
if (!use_psi_monitors &&
(!init_mp_common(VMPRESS_LEVEL_LOW) ||
!init_mp_common(VMPRESS_LEVEL_MEDIUM) ||
!init_mp_common(VMPRESS_LEVEL_CRITICAL))) {
ALOGE("Kernel does not support memory pressure events or in-kernel low memory killer");
return false;
}
if (use_psi_monitors) {
ALOGI("Using psi monitors for memory pressure detection");
} else {
ALOGI("Using vmpressure for memory pressure detection");
}
return true;
}
// release/system/memory/lmkd/lmkd.cpp
static bool init_psi_monitors() {
// use_minfree_levels defaults to false, so use_new_strategy defaults to true
bool use_new_strategy =
GET_LMK_PROPERTY(bool, "use_new_strategy", low_ram_device || !use_minfree_levels);
......
// Set the thresholds of the 3 pressure levels; in init_mp_psi only VMPRESS_LEVEL_MEDIUM and VMPRESS_LEVEL_CRITICAL are actually written to PSI
if (use_new_strategy) {
// threshold_ms of VMPRESS_LEVEL_LOW is 0, so init_mp_psi returns true for it right away
psi_thresholds[VMPRESS_LEVEL_LOW].threshold_ms = 0;
// psi_partial_stall_ms is 70 by default
psi_thresholds[VMPRESS_LEVEL_MEDIUM].threshold_ms = psi_partial_stall_ms;
// psi_complete_stall_ms is 700 by default
psi_thresholds[VMPRESS_LEVEL_CRITICAL].threshold_ms = psi_complete_stall_ms;
}
if (!init_mp_psi(VMPRESS_LEVEL_LOW, use_new_strategy)) {
return false;
}
if (!init_mp_psi(VMPRESS_LEVEL_MEDIUM, use_new_strategy)) {
destroy_mp_psi(VMPRESS_LEVEL_LOW);
return false;
}
if (!init_mp_psi(VMPRESS_LEVEL_CRITICAL, use_new_strategy)) {
destroy_mp_psi(VMPRESS_LEVEL_MEDIUM);
destroy_mp_psi(VMPRESS_LEVEL_LOW);
return false;
}
return true;
}
After the assignments above, psi_thresholds contains:
static struct psi_threshold psi_thresholds[VMPRESS_LEVEL_COUNT] = {
{ PSI_SOME, 0 },
{ PSI_SOME, 70 },
{ PSI_FULL, 700 },
};
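These thresholds are later turned into PSI trigger strings. As a rough illustration, the sketch below derives the strings from the table above, assuming US_PER_MS is 1000 and PSI_WINDOW_SIZE_MS is 1000 (a 1 second window); the class PsiTrigger is hypothetical and only mimics the formatting step, it does not touch /proc/pressure/memory.

```java
/** Hypothetical formatter for PSI trigger strings of the form "<type> <threshold_us> <window_us>". */
final class PsiTrigger {
    static final int US_PER_MS = 1000;
    static final int PSI_WINDOW_SIZE_MS = 1000; // assumption: 1 second window

    /** Returns the trigger string, or null when the level has no threshold and is skipped. */
    static String format(String stallType, int thresholdMs) {
        if (thresholdMs == 0) {
            return null; // mirrors the early return for a zero threshold
        }
        return stallType + " " + thresholdMs * US_PER_MS + " " + PSI_WINDOW_SIZE_MS * US_PER_MS;
    }

    public static void main(String[] args) {
        String[] types = {"some", "some", "full"};
        int[] thresholdsMs = {0, 70, 700};
        for (int level = 0; level < types.length; level++) {
            System.out.println("level " + level + ": " + format(types[level], thresholdsMs[level]));
        }
    }
}
```

Only the medium and critical levels produce a string; the low level has a threshold of 0 and is therefore never registered with PSI.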
At this point it helps to read up on PSI first:
https://facebookmicrosites.github.io/psi/docs/overview
https://www.cnblogs.com/Linux-tech/p/12961296.html
https://zhuanlan.zhihu.com/p/656580184
// release/system/memory/lmkd/lmkd.cpp
static bool init_mp_psi(enum vmpressure_level level, bool use_new_strategy) {
int fd;
// For VMPRESS_LEVEL_LOW (threshold 0) return true right away
if (!psi_thresholds[level].threshold_ms) {
return true;
}
fd = init_psi_monitor(psi_thresholds[level].stall_type,
psi_thresholds[level].threshold_ms * US_PER_MS,
PSI_WINDOW_SIZE_MS * US_PER_MS);
......
vmpressure_hinfo[level].handler = use_new_strategy ? mp_event_psi : mp_event_common;
vmpressure_hinfo[level].data = level;
if (register_psi_monitor(epollfd, fd, &vmpressure_hinfo[level]) < 0) {
destroy_psi_monitor(fd);
return false;
}
......
}
// release/system/memory/lmkd/libpsi/psi.cpp
int init_psi_monitor(enum psi_stall_type stall_type,
int threshold_us, int window_us) {
......
fd = TEMP_FAILURE_RETRY(open(PSI_PATH_MEMORY, O_WRONLY | O_CLOEXEC));
......
switch (stall_type) {
case (PSI_SOME):
case (PSI_FULL):
// "some 70000 1000000"
// "full 700000 1000000"
res = snprintf(buf, sizeof(buf), "%s %d %d",
stall_type_name[stall_type], threshold_us, window_us);
break;
......
}
res = TEMP_FAILURE_RETRY(write(fd, buf, strlen(buf) + 1));
......
}
// release/system/memory/lmkd/libpsi/psi.cpp
int register_psi_monitor(int epollfd, int fd, void* data) {
......
epev.events = EPOLLPRI;
epev.data.ptr = data;
res = epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &epev);
......
}
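Inside init_mp_psi:
1. init_psi_monitor writes the thresholds to monitor into /proc/pressure/memory. Two triggers are written in total, each in the form "stall_type threshold_us window_us": "some 70000 1000000" and "full 700000 1000000". Taking the first as an example: if the some stall time exceeds 70 ms (threshold_us) within a 1 second window (window_us), PSI reports the event to lmkd.
How does the kernel PSI code learn about the data written to /proc/pressure/memory? Here is a short excerpt from the kernel PSI source: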
// linux(v6.12)/kernel/sched/psi.c
static const struct proc_ops psi_memory_proc_ops = {
.proc_open = psi_memory_open,
.proc_read = seq_read,
.proc_lseek = seq_lseek,
.proc_write = psi_memory_write,
.proc_poll = psi_fop_poll,
.proc_release = psi_fop_release,
};
static int __init psi_proc_init(void)
{
if (psi_enable) {
proc_mkdir("pressure", NULL);
proc_create("pressure/io", 0666, NULL, &psi_io_proc_ops);
proc_create("pressure/memory", 0666, NULL, &psi_memory_proc_ops);
proc_create("pressure/cpu", 0666, NULL, &psi_cpu_proc_ops);
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
proc_create("pressure/irq", 0666, NULL, &psi_irq_proc_ops);
#endif
}
return 0;
}
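In other words, the operations table psi_memory_proc_ops is registered for the /proc/pressure/memory node. Opening the node runs psi_memory_open, writing to it runs psi_memory_write, and so on.
The /proc filesystem in Linux is a virtual filesystem that exposes detailed information about the current state of the running kernel, processes, system hardware and other system resources. It acts as an interface between the kernel and userspace, letting users and programs read and modify kernel parameters and runtime information.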
https://zhuanlan.zhihu.com/p/694564574
2. register_psi_monitor adds the file descriptor opened on /proc/pressure/memory to epoll. When the kernel PSI code detects that a threshold has been crossed, mp_event_psi is called back.
// release/system/memory/lmkd/lmkd.cpp
static void mp_event_psi(int data, uint32_t events, struct polling_params *poll_params) {
......
// Data from /proc/meminfo
union meminfo mi;
// Data from /proc/vmstat
union vmstat vs;
......
// Memory pressure level of the event that triggered this call
enum vmpressure_level level = (enum vmpressure_level)data;
// Reason for killing; a process is killed only when kill_reason != NONE
enum kill_reasons kill_reason = NONE;
......
// Whether a kill is still in progress
bool kill_pending = is_kill_pending();
......
// Stop waiting for the previous kill
stop_wait_for_proc_kill(!kill_pending);
// Read /proc/vmstat
if (vmstat_parse(&vs) < 0) {
ALOGE("Failed to parse vmstat!");
return;
}
......
// Read /proc/meminfo
if (meminfo_parse(&mi) < 0) {
ALOGE("Failed to parse meminfo!");
return;
}
......
// Check whether free swap is running low
if (swap_free_low_percentage) {
swap_low_threshold = mi.field.total_swap * swap_free_low_percentage / 100;
swap_is_low = get_free_swap(&mi) < swap_low_threshold;
} else {
swap_low_threshold = 0;
}
// Identify the reclaim state; this requires knowing what the fields in /proc/vmstat mean
// https://www.cnblogs.com/pengdonglin137/p/17877411.html
// https://wenku.baidu.com/view/a40d9e16bad528ea81c758f5f61fb7360b4c2b89.html?_wkts_=1733388128044&bdQuery=workingset_refault_file
if (vs.field.pgscan_direct != init_pgscan_direct) {
init_pgscan_direct = vs.field.pgscan_direct;
init_pgscan_kswapd = vs.field.pgscan_kswapd;
reclaim = DIRECT_RECLAIM;
} else if (vs.field.pgscan_kswapd != init_pgscan_kswapd) {
init_pgscan_kswapd = vs.field.pgscan_kswapd;
reclaim = KSWAPD_RECLAIM;
} else if (workingset_refault_file == prev_workingset_refault) {
/*
* Device is not thrashing and not reclaiming, bail out early until we see these stats
* changing
*/
goto no_kill;
}
prev_workingset_refault = workingset_refault_file;
......
......
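// A large section is skipped here: I am not yet clear on what the fields involved mean and may come back to it later
......
// Read /proc/pressure/memory into psi_data
if (!psi_parse_mem(&psi_data)) {
// Check whether the PSI memory "full" avg10 value (percentage of time stalled over the last 10 s) exceeds stall_limit_critical (100 by default); if so, pressure is considered critical
critical_stall = psi_data.mem_stats[PSI_FULL].avg10 > (float)stall_limit_critical;
}
// The long section that follows uses the information above to decide kill_reason and min_score_adj; it is omitted here
// What matters is that the only value this logic assigns to min_score_adj is 201 (PERCEPTIBLE_APP_ADJ + 1)
......
// A process needs to be killed when kill_reason is not NONE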
if (kill_reason != NONE) {
struct kill_info ki = {
.kill_reason = kill_reason,
.kill_desc = kill_desc,
.thrashing = (int)thrashing,
.max_thrashing = max_thrashing,
};
// Under critical memory pressure, allow killing down to min_score_adj 0, which includes user-perceptible processes
if (critical_stall) {
min_score_adj = 0;
}
// Read /proc/pressure/io and /proc/pressure/cpu into psi_data
psi_parse_io(&psi_data);
psi_parse_cpu(&psi_data);
// Find and kill one process
int pages_freed = find_and_kill_process(min_score_adj, &ki, &mi, &wi, &curr_tm, &psi_data);
if (pages_freed > 0) {
killing = true;
max_thrashing = 0;
if (cut_thrashing_limit) {
// If the thrashing limit should be lowered, recompute it
thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
}
}
}
no_kill:
......
}
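Two pieces of arithmetic in mp_event_psi are easy to check by hand: the low-swap test and the decay of the thrashing limit after a kill. The sketch below reproduces both with integer math. The class PsiKillMath is hypothetical, and the sample percentages are illustrative inputs rather than lmkd's configured defaults.

```java
/** Hypothetical re-creation of two calculations from mp_event_psi. */
final class PsiKillMath {

    /** swap_low_threshold = total_swap * swap_free_low_percentage / 100. */
    static long swapLowThreshold(long totalSwapPages, int swapFreeLowPercentage) {
        return totalSwapPages * swapFreeLowPercentage / 100;
    }

    /** Swap counts as low only when the percentage is non-zero and free swap is below the threshold. */
    static boolean isSwapLow(long totalSwapPages, long freeSwapPages, int swapFreeLowPercentage) {
        return swapFreeLowPercentage != 0
                && freeSwapPages < swapLowThreshold(totalSwapPages, swapFreeLowPercentage);
    }

    /** thrashing_limit = thrashing_limit * (100 - decay_pct) / 100, using integer division. */
    static int decayThrashingLimit(int thrashingLimit, int decayPct) {
        return (thrashingLimit * (100 - decayPct)) / 100;
    }

    public static void main(String[] args) {
        System.out.println("swap low: " + isSwapLow(1000, 99, 10));
        int limit = 100;
        for (int kill = 1; kill <= 3; kill++) {
            limit = decayThrashingLimit(limit, 10);
            System.out.println("thrashing limit after kill " + kill + ": " + limit);
        }
    }
}
```

Because the decay uses integer division, repeated kills shrink the limit slightly faster than a pure geometric series would.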
Before looking at find_and_kill_process, it helps to understand how lmkd tracks the relationship between oom_score_adj values and processes.
// release/system/memory/lmkd/lmkd.cpp
#define ADJTOSLOT(adj) ((adj) + -OOM_SCORE_ADJ_MIN)
#define ADJTOSLOT_COUNT (ADJTOSLOT(OOM_SCORE_ADJ_MAX) + 1)
static struct adjslot_list procadjslot_list[ADJTOSLOT_COUNT];
struct adjslot_list {
struct adjslot_list *next;
struct adjslot_list *prev;
};
struct proc {
struct adjslot_list asl;
int pid;
int pidfd;
uid_t uid;
int oomadj;
pid_t reg_pid; /* PID of the process that registered this record */
bool valid;
struct proc *pidhash_next;
};
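lmkd stores process records in the procadjslot_list array. Its length, ADJTOSLOT_COUNT, is 2001, exactly the number of integers in the oom_score_adj range [-1000, 1000], so the valid indexes are [0, 2000]. For example, ADJTOSLOT(1000) yields index 2000 and ADJTOSLOT(-1000) yields index 0. Each array element is the head node of a circular doubly linked list that holds the processes with the corresponding oom_score_adj. The head node is a bare adjslot_list struct, and the nodes after it are proc structs, whose first member asl is the embedded list node. This layout can be inferred from the functions that operate on it, such as proc_insert, proc_slot and adjslot_insert.
[Figure: the procadjslot_list array, with each slot heading a circular doubly linked list of proc nodes]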
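To make the layout concrete, here is a minimal sketch of the same idea: an array of list heads indexed by adj, with each record embedding its list node as the first member. The names (ListNode, Proc, g_slots, InsertProc, SlotTail) are made up for this example, and the insert-after-head behaviour is my reading of adjslot_insert, so treat it as an assumption rather than a copy of lmkd.

```cpp
#include <cassert>

// Range of oom_score_adj as described in the text.
constexpr int kOomScoreAdjMin = -1000;
constexpr int kOomScoreAdjMax = 1000;

constexpr int AdjToSlot(int adj) { return adj - kOomScoreAdjMin; }
constexpr int kSlotCount = AdjToSlot(kOomScoreAdjMax) + 1;  // 2001 slots

struct ListNode {
    ListNode* next;
    ListNode* prev;
};

// The list node is the first member, so a ListNode* that points at a record
// can be cast back to Proc*.
struct Proc {
    ListNode node;
    int pid;
    int oomadj;
};

static ListNode g_slots[kSlotCount];

// Every slot starts as an empty circular list: the head points at itself.
void InitSlots() {
    for (ListNode& head : g_slots) {
        head.next = &head;
        head.prev = &head;
    }
}

// Insert right after the head, so the oldest record ends up at the tail.
void InsertProc(Proc* p) {
    ListNode* head = &g_slots[AdjToSlot(p->oomadj)];
    ListNode* next = head->next;
    p->node.prev = head;
    p->node.next = next;
    next->prev = &p->node;
    head->next = &p->node;
}

// Return the tail record of a slot, or nullptr when the slot is empty.
Proc* SlotTail(int oomadj) {
    ListNode* head = &g_slots[AdjToSlot(oomadj)];
    return head->prev == head ? nullptr : reinterpret_cast<Proc*>(head->prev);
}
```

With this layout, moving a process to a new oom_score_adj is just an unlink from one list and an insert into another, and finding candidates at a given adj needs no search across all processes.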

// release/system/memory/lmkd/lmkd.cpp
static int find_and_kill_process(int min_score_adj, struct kill_info *ki, union meminfo *mi,
struct wakeup_info *wi, struct timespec *tm,
struct psi_data *pd) {
int i;
int killed_size = 0;
bool lmk_state_change_start = false;
bool choose_heaviest_task = kill_heaviest_task;
// Scan from oom_score_adj 1000 downward, going no lower than min_score_adj
// From mp_event_psi we know that min_score_adj is either 0 or 201
for (i = OOM_SCORE_ADJ_MAX; i >= min_score_adj; i--) {
struct proc *procp;
// Even if kill_heaviest_task is disabled, once oom_score_adj is 200 (PERCEPTIBLE_APP_ADJ) or lower, the heaviest process is still chosen first
if (!choose_heaviest_task && i <= PERCEPTIBLE_APP_ADJ) {
/*
* If we have to choose a perceptible process, choose the heaviest one to
* hopefully minimize the number of victims.
*/
choose_heaviest_task = true;
}
// Search the list stored at slot i of the array
while (true) {
// When heavy processes are preferred, call proc_get_heaviest; otherwise call proc_adj_tail to take the last process in the list for oom_score_adj i
procp = choose_heaviest_task ?
proc_get_heaviest(i) : proc_adj_tail(i);
if (!procp)
break;
// Kill one process
killed_size = kill_one_process(procp, min_score_adj, ki, mi, wi, tm, pd);
if (killed_size >= 0) {
if (!lmk_state_change_start) {
lmk_state_change_start = true;
// Send a message to the socket clients
stats_write_lmk_state_changed(STATE_START);
}
break;
}
}
if (killed_size) {
break;
}
}
if (lmk_state_change_start) {
// Send a message to the socket clients
stats_write_lmk_state_changed(STATE_STOP);
}
return killed_size;
}
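The selection order in find_and_kill_process can be summarised in a small standalone sketch. It replaces lmkd's linked lists with a std::map of vectors purely for illustration; Candidate and FindVictim are hypothetical names, and the retry logic for failed kills is left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

constexpr int kOomScoreAdjMax = 1000;
constexpr int kPerceptibleAppAdj = 200;

// Illustrative record; the back of each vector plays the role of the list tail.
struct Candidate {
    int pid;
    long rss_pages;
};

// Walk from adj 1000 down to min_score_adj and return the pid of the first
// victim found, or -1 if every slot in that range is empty.
int FindVictim(const std::map<int, std::vector<Candidate>>& slots,
               int min_score_adj, bool kill_heaviest) {
    for (int adj = kOomScoreAdjMax; adj >= min_score_adj; adj--) {
        auto it = slots.find(adj);
        if (it == slots.end() || it->second.empty()) continue;
        const std::vector<Candidate>& procs = it->second;
        // Above the perceptible range the tail is taken unless heaviest-first
        // is enabled; at 200 or below the heaviest process is always taken.
        if (!kill_heaviest && adj > kPerceptibleAppAdj) {
            return procs.back().pid;
        }
        const Candidate* best = &procs.front();
        for (const Candidate& c : procs) {
            if (c.rss_pages > best->rss_pages) best = &c;
        }
        return best->pid;
    }
    return -1;
}
```

Note how min_score_adj acts as a floor: with a floor of 201 nothing perceptible is ever considered, while a floor of 0 lets the scan continue into the perceptible range once everything above it is gone.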
Let's look at how the heaviest process is found:
// release/system/memory/lmkd/lmkd.cpp
static struct proc *proc_get_heaviest(int oomadj) {
// Get the head node of the list for this oom_score_adj
struct adjslot_list *head = &procadjslot_list[ADJTOSLOT(oomadj)];
// The first process node after the head
struct adjslot_list *curr = head->next;
struct proc *maxprocp = NULL;
int maxsize = 0;
// Walk the list from the first node and use /proc/<pid>/statm to find the process with the largest resident (physical) memory
while (curr != head) {
int pid = ((struct proc *)curr)->pid;
int tasksize = proc_get_size(pid);
if (tasksize < 0) {
struct adjslot_list *next = curr->next;
pid_remove(pid);
curr = next;
} else {
if (tasksize > maxsize) {
maxsize = tasksize;
maxprocp = (struct proc *)curr;
}
curr = curr->next;
}
}
return maxprocp;
}
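In other words, the heaviest process is the one with the largest resident set size (RSS), that is, the most physical memory actually in use.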
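As a rough illustration of where that number comes from, the sketch below reads the second field of /proc/<pid>/statm, which is the resident set size in pages, and picks the largest one from a list of pids. ParseStatmRssPages, ReadRssPages and PickHeaviest are names invented for this example; lmkd's own proc_get_size is not reproduced here.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Parse the second field (resident pages) of a statm line; -1 if malformed.
long ParseStatmRssPages(const std::string& line) {
    std::istringstream in(line);
    long total = 0;
    long rss = 0;
    if (!(in >> total >> rss)) return -1;
    return rss;
}

// Read /proc/<pid>/statm; -1 if the process is gone or the file is unreadable.
long ReadRssPages(int pid) {
    std::ifstream f("/proc/" + std::to_string(pid) + "/statm");
    std::string line;
    if (!f || !std::getline(f, line)) return -1;
    return ParseStatmRssPages(line);
}

// Return the pid with the largest RSS, or -1 if no pid has a positive size.
// size_of is injectable so the selection can be exercised without /proc.
int PickHeaviest(const std::vector<int>& pids, long (*size_of)(int) = ReadRssPages) {
    int best_pid = -1;
    long best = 0;
    for (int pid : pids) {
        long size = size_of(pid);
        if (size < 0) continue;  // process vanished; skip it
        if (size > best) {
            best = size;
            best_pid = pid;
        }
    }
    return best_pid;
}
```

The value is in pages, so multiply by the page size if you want bytes. In the excerpt above, a pid whose size cannot be read is removed from the list rather than merely skipped.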
Next, let's look at how a process is killed:
// release/system/memory/lmkd/lmkd.cpp
/* Kill one process specified by procp. Returns the size (in pages) of the process killed */
static int kill_one_process(struct proc* procp, int min_oom_score, struct kill_info *ki,
union meminfo *mi, struct wakeup_info *wi, struct timespec *tm,
struct psi_data *pd) {
......
// Call reaper.kill to kill the process
kill_result = reaper.kill({ pidfd, pid, uid }, false);
......
// Send the LMK_STAT_KILL_OCCURRED message to the socket clients
stats_write_lmk_kill_occurred(&kill_st, mem_st);
// Send the LMK_PROCKILL message to the socket clients
ctrl_data_write_lmk_kill_occurred((pid_t)pid, uid);
......
}
// release/system/memory/lmkd/reaper.cpp
int Reaper::kill(const struct target_proc& target, bool synchronous) {
// If no process file descriptor (pidfd) is available, fall back to the traditional kill() call to send SIGKILL to the target process
if (target.pidfd < 0) {
return ::kill(target.pid, SIGKILL);
}
// Kill asynchronously
if (!synchronous && async_kill(target)) {
// we assume the kill will be successful and if it fails we will be notified
return 0;
}
// If a synchronous kill was requested or the asynchronous request could not be queued, call pidfd_send_signal directly to send SIGKILL
int result = pidfd_send_signal(target.pidfd, SIGKILL, NULL, 0);
if (result) {
return result;
}
return 0;
}
// release/system/memory/lmkd/reaper.cpp
bool Reaper::async_kill(const struct target_proc& target) {
......
// Enqueue the request
queue_.push_back({ dup(target.pidfd), target.pid, target.uid });
// Wake a worker thread
cond_.notify_one();
......
}
async_kill adds the process to be killed to a queue and wakes a worker thread that handles the requests.
// release/system/memory/lmkd/reaper.cpp
bool Reaper::init(int comm_fd) {
......
thread_pool_ = new pthread_t[THREAD_POOL_SIZE];
for (int i = 0; i < THREAD_POOL_SIZE; i++) {
if (pthread_create(&thread_pool_[thread_cnt_], NULL, reaper_main, this)) {
ALOGE("pthread_create failed: %s", strerror(errno));
continue;
}
......
}
......
}
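When the Reaper is initialized (init_reaper is called from main in lmkd.cpp), it calls pthread_create to create the worker threads, stores them in the thread pool thread_pool_, and sets reaper_main as the function each thread runs.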
// release/system/memory/lmkd/reaper.cpp
static void* reaper_main(void* param) {
......
for (;;) {
// Take one target_proc off the queue
target = reaper->dequeue_request();
......
// Call pidfd_send_signal to send SIGKILL to the process
if (pidfd_send_signal(target.pidfd, SIGKILL, NULL, 0)) {
// Inform the main thread about failure to kill
reaper->notify_kill_failure(target.pid);
goto done;
}
......
}
......
}
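Taken together, async_kill, init and reaper_main form a classic producer/consumer queue guarded by a condition variable. The sketch below shows that pattern with std::thread instead of pthreads. KillQueue and its members are hypothetical names, the handler stands in for the pidfd_send_signal call, and shutdown handling is something I added so the example can terminate cleanly.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class KillQueue {
  public:
    // Start `workers` threads that each run WorkerMain.
    KillQueue(int workers, std::function<void(int)> handler)
        : handler_(std::move(handler)) {
        for (int i = 0; i < workers; i++) {
            threads_.emplace_back([this] { WorkerMain(); });
        }
    }

    // Ask the workers to stop once the queue is drained, then join them.
    ~KillQueue() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cond_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    // Producer side: enqueue a request and wake one worker.
    void Push(int pid) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            queue_.push_back(pid);
        }
        cond_.notify_one();
    }

  private:
    // Consumer side: sleep until there is work, take one request, handle it
    // outside the lock so other workers are not blocked.
    void WorkerMain() {
        for (;;) {
            int pid;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cond_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                pid = queue_.front();
                queue_.pop_front();
            }
            handler_(pid);
        }
    }

    std::function<void(int)> handler_;
    std::mutex mu_;
    std::condition_variable cond_;
    std::deque<int> queue_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

The point of the design is that the caller returns as soon as the request is queued, so the thread that watches memory pressure is not held up while the kill itself is carried out.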
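pidfd_send_signal compared with the traditional way of sending signals (kill)
Accuracy and safety:
The traditional kill function addresses its target by process ID (pid). Because PIDs can be reused (a new process may be given the ID of one that has already exited), a signal can end up being delivered to the wrong process. pidfd_send_signal takes a pidfd instead, a file descriptor that stays bound to one specific process and never refers to a later process that reuses the same PID, which reduces the risk of signalling the wrong target.
Resource management and flexibility:
A pidfd integrates well with system calls and mechanisms built around file descriptors. For example, it can be added to an I/O multiplexing mechanism such as epoll, so the caller is notified promptly when the target process terminates and can then decide whether to send a signal or do something else. kill offers no such integration, so pidfd gives a more flexible way to manage process signals.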
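Here is a small Linux-only sketch of both points: it forks a child, opens a pidfd for it, sends SIGKILL through the pidfd, and then waits for the pidfd to become readable, which signals that the process has terminated. It uses poll instead of epoll to keep the example short. The wrappers call the raw syscalls because older C libraries do not export them, and the fallback syscall numbers (434 and 424) are the values used on common architectures such as x86_64 and arm64. KillChildViaPidfd is a made-up name and this is not how lmkd structures its code.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif

static int PidfdOpen(pid_t pid) {
    return static_cast<int>(syscall(SYS_pidfd_open, pid, 0));
}

static int PidfdSendSignal(int pidfd, int sig) {
    return static_cast<int>(syscall(SYS_pidfd_send_signal, pidfd, sig,
                                    static_cast<siginfo_t*>(nullptr), 0));
}

// Returns 0 if the child was killed through its pidfd and its exit was
// observed, 1 if pidfd syscalls are unavailable here, -1 on other failures.
int KillChildViaPidfd() {
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        pause();  // child: wait until a signal arrives
        _exit(0);
    }
    int result = -1;
    int pidfd = PidfdOpen(child);
    if (pidfd < 0) {
        result = (errno == ENOSYS || errno == EPERM) ? 1 : -1;
        kill(child, SIGKILL);  // fall back to the pid-based call
    } else {
        if (PidfdSendSignal(pidfd, SIGKILL) == 0) {
            // The pidfd becomes readable once the process has terminated.
            struct pollfd pfd = {pidfd, POLLIN, 0};
            if (poll(&pfd, 1, 5000) == 1 && (pfd.revents & POLLIN)) result = 0;
        } else {
            result = (errno == ENOSYS || errno == EPERM) ? 1 : -1;
            kill(child, SIGKILL);
        }
        close(pidfd);
    }
    int status = 0;
    waitpid(child, &status, 0);  // reap the child so no zombie is left
    return result;
}
```

The same readable-on-exit behaviour is what makes it possible to register a pidfd with epoll and learn that a kill has completed without polling /proc.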
Some recommended articles on lmk:
https://source.android.google.cn/docs/core/perf/lmkd?hl=zh-cn
https://blog.csdn.net/youthcowboy/article/details/140665606
https://blog.csdn.net/omnispace/article/details/73320950
https://blog.csdn.net/weixin_40214774/article/details/141687953
https://blog.csdn.net/weixin_40214774/article/details/141230790
https://gityuan.com/2016/09/17/android-lowmemorykiller/
https://gityuan.com/2018/05/19/android-process-adj/
https://blog.csdn.net/buhui912/article/details/107153804/
Please credit the source when reposting.