您的位置:首页 > 其它

ANR原因分析及解决方法

2016-05-16 16:09 661 查看
转载自:http://wenku.baidu.com/link?url=fUXG2wO3ECS2nZorr6MtTH8_PNCQESk-h8UpQdt1NH0_AjtgwbwW2b7PIO7oJ0OyJJzr62uB0202rXyNBK3tQtqGV3hxr_WVw6UrB7imO_a

内容目录

1 ANR是如何产生的.................................................................................................................................
5

1.1 ANR产生条件............................................................................................................................
5

1.2 引起ANR的根本原因............................................................................................................
5

2 ANR原因分析方法.................................................................................................................................
5

2.1 通过logcat日志分析原因....................................................................................................
5

2.2 通过trace文件分析原因......................................................................................................
9

3 ANR解决方法.........................................................................................................................................
12

4 参考资料..................................................................................................................................................
12


1
ANR是如何产生的


1.1
ANR产生条件


ANR的产生需要同时满足三个条件:

主线程:只有应用程序进程的主线程响应超时才会产生ANR;
超时时间:产生ANR的上下文不同,超时时间也不同,但只要超过这个时间上限没有响应就会产生ANR;
输入事件/特定操作:输入事件是指按键、触屏等设备输入事件,特定操作是指BroadcastReceiver和Service的生命周期中的各个函数调用。

产生ANR的上下文不同,导致ANR原因也不同,主要有以下三种情况:

应用进程的主线程对输入事件在5s内没有处理完毕;
应用进程的主线程在执行BroadcastRecevier的onReceive函数时10s内没有处理完毕;
应用进程的主线程在执行Service的各个生命周期函数时20s内没有处理完毕;


1.2
引起ANR的根本原因


引起ANR的根本原因,总的来说可以归纳为两类:

应用进程自身引起的,比如:主线程阻塞、挂起、死循环,执行耗时操作等;
其他进程引起的,比如:其他进程CPU占用率过高,导致当前应用进程无法抢占到CPU时间片。常见的问题如文件读写频繁,io进程CPU占用率过高,导致当前应用出现ANR;


2
ANR原因分析方法


2.1 通过logcat日志分析原因


举一个开发过程中的ANR实例,《从云盘上下载视频到本地,进入下载中心删除任务和文件,进入文件管理提示‘文件管理无响应’》,在logcat日志中输出了以下ANR信息:
01-21 23:58:40.265 975 991 E ActivityManager: ANR in com.letv.filemanager

01-21 23:58:40.265 975 991 E ActivityManager: Reason: Executing service com.letv.filemanager/.netstorage.service.DlnaService

01-21 23:58:40.265 975 991 E ActivityManager: Load: 5.62 / 4.3 / 3.55

01-21 23:58:40.265 975 991 E ActivityManager: CPU usage from 30331ms to 0ms ago:

01-21 23:58:40.265 975 991 E ActivityManager: 26% 1760/com.letv.airplay: 24% user + 1.6% kernel / faults: 1094 minor

01-21 23:58:40.265 975 991 E ActivityManager: 15% 787/cdnclient: 0% user + 15% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 13% 298/kswapd0: 0% user + 13% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 9.5% 1437/com.letv.dmr: 9% user + 0.5% kernel / faults: 572 minor

01-21 23:58:40.265 975 991 E ActivityManager: 9.1% 953/flush-179:0: 0% user + 9.1% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 5.8% 788/collector: 0.8% user + 4.9% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 5% 419/mmcqd: 0% user + 5% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 2.9% 814/ld-linux.so.3: 1.1% user + 1.8% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 2.5% 822/dd: 0.2% user + 2.2% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 1.4% 2533/com.letv.threeScreen: 0.7% user + 0.7% kernel / faults: 134 minor

01-21 23:58:40.265 975 991 E ActivityManager: 0.5% 817/logcat: 0% user + 0.4% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0.4% 7/events/0: 0% user + 0.4% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0.4% 1254/com.letv.t2.launcher: 0.4% user + 0% kernel / faults: 20 minor

01-21 23:58:40.265 975 991 E ActivityManager: 0.2% 975/system_server: 0% user + 0.1% kernel / faults: 34 minor 1 major

01-21 23:58:40.265 975 991 E ActivityManager: 0.2% 1452/com.letv.downloads: 0.1% user + 0.1% kernel / faults: 40 minor

01-21 23:58:40.265 975 991 E ActivityManager: 0.1% 808/virtualkeypad: 0% user + 0.1% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0.1% 1292/RTW_CMD_THREAD: 0% user + 0.1% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 753/jbd2/mmcblk9-8: 0% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 8/events/1: 0% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 177/bdi-default: 0% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 786/letvmanager: 0% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 1472/com.letv.filemanager: 0% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0% 1737/com.letv.medialog: 0% user + 0% kernel / faults: 48 minor

01-21 23:58:40.265 975 991 E ActivityManager: 62% TOTAL: 7.2% user + 10% kernel + 43% iowait + 1% softirq

01-21 23:58:40.265 975 991 E ActivityManager: CPU usage from 3195ms to 3733ms later:

01-21 23:58:40.265 975 991 E ActivityManager: 15% 975/system_server: 3.7% user + 11% kernel / faults: 1 minor

01-21 23:58:40.265 975 991 E ActivityManager: 15% 991/ActivityManager: 3.7% user + 11% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 7.5% 1760/com.letv.airplay: 7.5% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 5.6% 2615/SocketListener(: 5.6% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 5.5% 419/mmcqd: 0% user + 5.5% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 3.7% 298/kswapd0: 0% user + 3.7% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 3.7% 788/collector: 0% user + 3.7% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 3.7% 866/collector: 0% user + 3.7% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 1.8% 865/collector: 0% user + 1.8% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 3.7% 814/ld-linux.so.3: 1.8% user + 1.8% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 1.8% 1009/814Linux hotplu: 1.8% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 1.8% 1013/IRQThread_157: 0% user + 1.8% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 0.3% 1254/com.letv.t2.launcher: 0.3% user + 0% kernel / faults: 1 minor

01-21 23:58:40.265 975 991 E ActivityManager: 0.3% 1254/etv.t2.launcher: 0.3% user + 0% kernel

01-21 23:58:40.265 975 991 E ActivityManager: 54% TOTAL: 2.9% user + 5.9% kernel + 45% iowait
我们可以通过ActivityManagerService的appNotResponding方法看一下ANR信息是如何记录到logcat日志中的:

.\framework\base\services\Java\com\Android\server\am\ActivityManagerService.java
final void appNotResponding(ProcessRecord app, ActivityRecord activity,

ActivityRecord parent, boolean aboveSystem, final String annotation) {

......

synchronized (this) {

// PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.

if (mShuttingDown) {

//重启时忽略ANR

Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);

return;

} else if (app.notResponding) {

//之前已处理过则不再处理

Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);

return;

} else if (app.crashing) {

//已经Crash的进程则不再处理ANR

Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);

return;

}

//设置ANR标志,用于防止重复处理

// In case we come through here for the same app before completing

// this one, mark as anring now so we will bail out.

app.notResponding = true;

// Log the ANR to the event log.

EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,

app.processName, app.info.flags, annotation);

//将发生ANR的应用pid放入firstPids集合中

// Dump thread traces as quickly as we can, starting with "interesting" processes.

firstPids.add(app.pid);

//将父PID也放入firstPids集合

int parentPid = app.pid;

if (parent != null && parent.app != null && parent.app.pid > 0) parentPid = parent.app.pid;

if (parentPid != app.pid) firstPids.add(parentPid);

//将主PID放入firstPids集合

if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);

//将persistent进程放入firstPids集合

for (int i = mLruProcesses.size() - 1; i >= 0; i--) {

ProcessRecord r = mLruProcesses.get(i);

if (r != null && r.thread != null) {

int pid = r.pid;

if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {

if (r.persistent) {

firstPids.add(pid);

} else {

lastPids.put(pid, Boolean.TRUE);

}

}

}

}

}

// 将ANR信息写入log

// Log the ANR to the main log.

StringBuilder info = new StringBuilder();

info.setLength(0);

info.append("ANR in ").append(app.processName);

if (activity != null && activity.shortComponentName != null) {

info.append(" (").append(activity.shortComponentName).append(")");

}

info.append("\n");

info.append("PID: ").append(app.pid).append("\n");

if (annotation != null) {

info.append("Reason: ").append(annotation).append("\n");

}

if (parent != null && parent != activity) {

info.append("Parent: ").append(parent.shortComponentName).append("\n");

}

final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,

NATIVE_STACKS_OF_INTEREST);

String cpuInfo = null;

if (MONITOR_CPU_USAGE) {

updateCpuStatsNow();

synchronized (mProcessCpuThread) {

cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);

}

info.append(processCpuTracker.printCurrentLoad());

info.append(cpuInfo);

}

info.append(processCpuTracker.printCurrentState(anrTime));

Slog.e(TAG, info.toString());

......

}

AMS在发生ANR时,会按以下格式将ANR信息记录到logcat日志中:
ActivityManager: ANR in [Process Name]([short component name])

ActivityManager: PID: [Application Pid]

ActivityManager: Reason: [Annotation]

ActivityManager: Parent: [short component name of parent]

[Process CPU state]
其中的Reason信息可以给出ANR产生原因的一些详细信息,

· 输入事件处理引起的ANR,提示信息格式为:“Inputdispatching timed out [anr reason]

· Service处理引起的ANR,提示信息格式为:“Executingservice [Package name]/[Short class name]

· Broadcast处理引起的ANR,提示信息格式为:“Broadcast of [Intent focused byBroadcast receiver]

其中的Process CPU state 信息格式如下:
ActivityManager: Load: [Load1] / [Load5] / [Load15]

ActivityManager: CPU usage from [上次采样与现在的时间差] ms to [当前采样与现在的时间差] ms ago/later:

ActivityManager: [总的CPU时间占用率]% [PID]/[Process Name]: [用户CPU时间占用率]%
user + [系统CPU时间占用率]% kernel + [IO等待CPU时间占用率]% iowait + [硬中断CPU时间占用率]%
irq + [软中断CPU时间占用率]% softirq / faults: [次要页错误/主要页错误]minor/major

ActivityManager: .....

ActivityManager: [CPU时间占用率合计统计]
其中Load1, Load5, Load15分别为CPU 1分钟平均任务负载数,5分钟平均任务负载数,15分钟平均任务负载数,平均任务负载数和CPU占用率并没有必然联系,可以作为参考信息。可以通过分析各进程的CPU时间占用率,来判断是否为某些进程长期占用CPU导致该进程无法获取到足够的CPU处理时间,而导致发生ANR。

这里需要重点关注的是Load1,各进程总的CPU时间占用率,用户CPU时间占用率,系统CPU时间占用率,以及iowait CPU时间占用率。

2.2
通过trace文件分析原因


ActivityManagerService的appNotResponding方法在写入logcat日志的同时,还会将ANR时的stack trace信息写入到trace文件,具体代码流程如下:

final void appNotResponding(ProcessRecord app, ActivityRecord activity,

ActivityRecord parent, boolean aboveSystem, final String annotation) {

......

File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,

NATIVE_STACKS_OF_INTEREST);

......

}

public static File dumpStackTraces(boolean clearTraces, ArrayList<Integer> firstPids,

ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids, String[] nativeProcs) {

//判断系统属性,确认trace文件路径

String tracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null);

if (tracesPath == null || tracesPath.length() == 0) {

return null;

}

//若trace文件不存在则创建,并设置可读写属性

File tracesFile = new File(tracesPath);

try {

File tracesDir = tracesFile.getParentFile();

if (!tracesDir.exists()) {

tracesFile.mkdirs();

if (!SELinux.restorecon(tracesDir)) {

return null;

}

}

FileUtils.setPermissions(tracesDir.getPath(), 0775, -1, -1); // drwxrwxr-x

if (clearTraces && tracesFile.exists()) tracesFile.delete();

tracesFile.createNewFile();

FileUtils.setPermissions(tracesFile.getPath(), 0666, -1, -1); // -rw-rw-rw-

} catch (IOException e) {

Slog.w(TAG, "Unable to prepare ANR traces file: " + tracesPath, e);

return null;

}

//执行dump操作

dumpStackTraces(tracesPath, firstPids, processCpuTracker, lastPids, nativeProcs);

return tracesFile;

}

private static void dumpStackTraces(String tracesPath, ArrayList<Integer> firstPids,

ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids, String[] nativeProcs) {

//设置file observer

// Use a FileObserver to detect when traces finish writing.

// The order of traces is considered important to maintain for legibility.

FileObserver observer = new FileObserver(tracesPath, FileObserver.CLOSE_WRITE) {

@Override

public synchronized void onEvent(int event, String path) { notify(); }

};

try {

observer.startWatching();

//向firstPids进程集合发送关闭信号,虚拟机接收到SIGNAL_QUIT信号后会将进程中各个线程的函//数堆栈信息输出到traces.txt文件中

// First collect all of the stacks of the most important pids.

if (firstPids != null) {

try {

int num = firstPids.size();

for (int i = 0; i < num; i++) {

synchronized (observer) {

Process.sendSignal(firstPids.get(i), Process.SIGNAL_QUIT);

observer.wait(200); // Wait for write-close, give up after 200msec

}

}

} catch (InterruptedException e) {

Log.wtf(TAG, e);

}

}

// Next collect the stacks of the native pids

if (nativeProcs != null) {

int[] pids = Process.getPidsForCommands(nativeProcs);

if (pids != null) {

for (int pid : pids) {

Debug.dumpNativeBacktraceToFile(pid, tracesPath);

}

}

}

//进行CPU占用率采样,并将占用率排在前几位的进程的stack trace信息dump到文件

// Lastly, measure CPU usage.

if (processCpuTracker != null) {

processCpuTracker.init();

System.gc();

processCpuTracker.update();

try {

synchronized (processCpuTracker) {

processCpuTracker.wait(500); // measure over 1/2 second.

}

} catch (InterruptedException e) {

}

processCpuTracker.update();

// We'll take the stack crawls of just the top apps using CPU.

final int N = processCpuTracker.countWorkingStats();

int numProcs = 0;

for (int i=0; i<N && numProcs<5; i++) {

ProcessCpuTracker.Stats stats = processCpuTracker.getWorkingStats(i);

if (lastPids.indexOfKey(stats.pid) >= 0) {

numProcs++;

try {

synchronized (observer) {

Process.sendSignal(stats.pid, Process.SIGNAL_QUIT);

observer.wait(200); // Wait for write-close, give up after 200msec

}

} catch (InterruptedException e) {

Log.wtf(TAG, e);

}

}

}

}

} finally {

observer.stopWatching();

}

}
其中firstPids的Pid内容可以参考appNotResponding方法中的实现逻辑。


3
ANR解决方法

因为ANR主要是因为主线程由于耗时操作被阻塞而产生的,所以常见的解决方法是不在主线程做耗时操作,具体实现时需要注意以下几点:

· 主线程需要做耗时操作时,比如网络访问、数据库操作及位图变换等,必须启动一子线程处理,并利用handler来更新UI;

· 子线程尽量使用Android提供的API,比如HandlerThread,AsyncTask,AsyncQueryHandler等,这些API都提供了对于线程的系统级管理。如果应用直接使用Thread实现的话,则需要对这些子线程进行显式管理,比如线程池及线程周期的控制,以防止系统资源和内存泄漏;

· Broadcast Receiver中如果有耗时操作,可以放到service中来处理;

· 在后台子线程处理耗时操作时,为了提高用户体验,可以在前台界面显示某些动画或者progress bar;


4 参考资料

/article/4071785.html

/article/4071786.html

/article/4071787.html
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: