阅读源码系列：ANR 是怎么产生的

根据日常的经验我们大概知道，如果 app 没有及时消费 MotionEvent，超过 5s 就会弹出 ANR 对话框；那么 ANR 的逻辑肯定是在事件分发过程中产生的，我们从事件的源头找起，看看 input 事件是怎么产生的

input 的分发

事件分发是从线程 InputReaderThread 开始的，它的主要工作是：

从目录 /dev/input 获取 input event
- 看来 android 的输入设备是挂载在 /dev/input 下的，当然会有多个输入设备：屏幕触摸、键盘、手柄等，使用 epoll 监听多个 fd
- EventHub.getEvents() 从 fd 读取 input_event 并转换为 RawEvent（看来输入设备的驱动都需要构造 input_event 给系统）
经过识别分类（按键、手势、手柄、滚轮等）、过滤等一系列操作后，添加到 mInboundQueue（等待 InputDispatcher 分发）
- InputDevice 将 RawEvent 交由各种 InputMapper 处理，例如：KeyboardInputMapper 将 RawEvent 包装为 NotifyKeyArgs，TouchInputMapper 将 RawEvent 包装为 NotifyMotionArgs
- InputDispatcher.notifyXXX 将各种 NotifyXXXArgs 包装为 XXXEntry 放入 mInboundQueue

这里需要补充下 Native Looper 不同于 Java Looper 的地方：提供了监听文件描述符的机制

/**
 * @param fd       需要监听的文件描述符
 * @param ident    表示为当前发生事件的标识符，必须 >= 0，或者为 POLL_CALLBACK(-2) 如果指定了 callback
 * @param events   表示为要监听的文件类型，默认是 EVENT_INPUT
 * @param callback 当有事件发生时，会回调该 callback 函数
 * @param data
 *
 * 主要做了两件事：
 * 1，把输入参数构造成 Request，添加到 mRequests
 * 2，将 fd 添加到 epoll
 *
 */
int addFd(int fd, int ident, int events, Looper_callbackFunc callback, void* data);
int addFd(int fd, int ident, int events, const sp<LooperCallback>& callback, void* data);

它有两种使用方式：

指定 callback 来处理事件 : 当该文件描述符上有事件到来时，该 callback 会被执行；调用 Looper.wake() 也会触发 callback 执行

// pollAll() -> pollOnce() -> pollInner()

int eventCount = epoll_wait(mEpollFd.get(), eventItems, EPOLL_MAX_EVENTS, timeoutMillis)
// ...
for (int i = 0; i < eventCount; i++) {
    int fd = eventItems[i].data.fd;
    uint32_t epollEvents = eventItems[i].events;
    if (fd == mWakeEventFd.get()) {
        // ...
    } else {
        ssize_t requestIndex = mRequests.indexOfKey(fd);
        if (requestIndex >= 0) {
            // ...
            pushResponse(events, mRequests.valueAt(requestIndex));
        } else {
            // ...
        }
    }
}
// ...
for (size_t i = 0; i < mResponses.size(); i++) {
    Response& response = mResponses.editItemAt(i);
    if (response.request.ident == POLL_CALLBACK) {
        int fd = response.request.fd;
        int events = response.events;
        void* data = response.request.data;
        int callbackResult = response.request.callback->handleEvent(fd, events, data);
        if (callbackResult == 0) {
            removeFd(fd, response.request.seq);
        }
        // Clear the callback reference in the response structure promptly because we
        // will not clear the response vector itself until the next poll.
        response.request.callback.clear();
        result = POLL_CALLBACK;
    }
}

通过指定的 ident 来处理事件：当该文件描述符有数据到来时，pollOnce() 会返回一个 ident，调用者会判断该 ident 是否等于自己需要处理的事件 ident，如果是的话，则开始处理事件

// pollAll() -> pollOnce()

for (;;) {
    while (mResponseIndex < mResponses.size()) {
        const Response& response = mResponses.itemAt(mResponseIndex++);
        int ident = response.request.ident;
        if (ident >= 0) {
            int fd = response.request.fd;
            int events = response.events;
            void* data = response.request.data;
            if (outFd != nullptr) *outFd = fd;
            if (outEvents != nullptr) *outEvents = events;
            if (outData != nullptr) *outData = data;
            return ident;
        }
    }
    // ...
}

ui 如何接收和处理 input

在分析事件分发的逻辑之前，我们先看看 ui 线程是怎么接收 input 事件的

可以看到在 native 层打开了一对 socket，server socket fd 给 InputDispatcher 线程，client socket fd 给 ui 线程（ViewRootImpl.mInputChannel），也就是说它们之间通过 socket 双向通讯

接收：在 native 层用 Native Looper 监听 client socket fd（epoll），封装成 InputMessage 传递给 java 层处理
经过一个 InputStage 责任链的处理，最终到达我们最熟悉的 View.dispatchTouchEvent
响应：ui 线程在消费完 input event 后，通过双向的 socket 告知 dispatcher 线程；如果此时 client socket 不可写，则将响应保存起来，等待下次 client socket 可写时
dispatcher - ui 这一段分发过程实际上是异步的，那么整个事件分发的过程也就是异步的，这是 ANR 产生的前提

input 的生命周期

一个 input event 的生命流程大概是这样的：

InputReader 放入 InputDispatcher.mInboundQueue 等待分发
InputDispatcher 将其移入 window 对应的 Connection→outboundQueue 等待发送
发送成功后，移入 Connection→waitQueue 等待 ui 线程的确认应答
收到确认应答，将 input event 移出 Connection→waitQueue

产生 ANR 的逻辑就在 dispatchKeyLocked（分发一个 input event 的过程）

1，findFocusedWindowTargetsLocked 找到 input event 的分发 window 对象，然后 checkWindowReadyForMoreInputLocked 检查 widnow 是否可以接收 input event，这里截取一段检查逻辑：

// outboundQueue 不为空说明此 window 仍有 input event 未发送，waitQueue 不为空说明有 input event 在消费中（未收到消费完成的响应）
// 也就是说 input event 必须是按顺序分发和消费的，不能乱序
if (eventEntry->type == EventEntry::TYPE_KEY) {
    if (!connection->outboundQueue.isEmpty() || !connection->waitQueue.isEmpty()) {
        return StringPrintf("Waiting to send key event because the %s window has not "
                            "finished processing all of the input events that were previously "
                            "delivered to it.  Outbound queue length: %d.  Wait queue length: "
                            "%d.",
                            targetType, connection->outboundQueue.count(),
                            connection->waitQueue.count());
    }
} else {
    if (!connection->waitQueue.isEmpty() &&
        currentTime >= connection->waitQueue.head->deliveryTime + STREAM_AHEAD_EVENT_TIMEOUT) {
        return StringPrintf("Waiting to send non-key event because the %s window has not "
                            "finished processing certain input events that were delivered to "
                            "it over "
                            "%0.1fms ago.  Wait queue length: %d.  Wait queue head age: "
                            "%0.1fms.",
                            targetType, STREAM_AHEAD_EVENT_TIMEOUT * 0.000001f,
                            connection->waitQueue.count(),
                            (currentTime - connection->waitQueue.head->deliveryTime) *
                                    0.000001f);
    }
}

2，最后走到 AppErrors.handleShowAnrUi 里就是弹出 ANR dialog 的地方

总结

三个线程，InputReader 负责监听 input fd，InputDispatcher 负责分发给 ui，ui 消费并反馈给 InputDispatcher
InputReader 和 InputDispatcher 在同一进程，mInboundQueue 加锁使用即可；InputDispatcher 和 ui 在不同的进程，通过 socket 通讯
事件分发是一个异步的过程，所以它会在 mInboundQueue（待分发）、outboundQueue（待发送）和 waitQueue（待响应）之间流转
input event 必须按顺序分发和消费，一个 input event 在分发前必须等待上一个 input event 的响应，如果等待时间超过 5s 则发生 ANR
ANR dialog 是在 AMS 进程弹出的

还学到了什么

epoll（或者说 IO 多路复用）ui 线程是通过 socket 与 InputDispatcher 线程交互的，它既要等待 input event（不能阻塞）又要处理 ui 相关工作，靠的就是 epoll；具体来说是 native Looper，因为 epoll 可以同时监听多个 fd；用一个 wakeUpFd + messageQueue，当 enqueueMessage 时往 wakeUpFd 写入，从而唤醒线程处理 message；添加 socket fd，当 socket 可读时，唤醒线程处理 socket 过来的消息，而且还可以同时处理 message 和 socket

参考

#ANR

图解 Glide 上一篇

一文搞懂事件分发，手势冲突和滑动冲突下一篇