Tuesday, 15 Dec 2020 ｜ 22 min read

Deep dive into React Fiber

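The front end is essentially the interface layer for human-computer interaction, and the UI is the first thing users face directly.

At their core, different front-end frameworks implement different versions of f, the function that turns state into a corresponding, stable UI.

The question is then how to implement f. Frameworks take different paths, but the goals are the same:

Design an interface that fits the framework's own processing model and lowers the mental burden (DX)
Render efficiently (UX)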

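Most screens refresh at 60Hz, so the highest frame rate a web browser can reach is 60 frames per second (FPS). That leaves roughly 16.67ms (1000/60) per frame to run JavaScript, Layout, and Paint (because JavaScript can manipulate the DOM, the GUI rendering thread and the JS thread are mutually exclusive).

In other words, the following work has to finish within each 16.67ms: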

JavaScript ----> Layout ----> Paint

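When JavaScript runs for longer than 16.67ms, the frame has no time left for Layout and Paint.

For a user typing into an input, this shows up as characters being typed without appearing on the page in real time.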
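
To put numbers on the frame budget, here is a small sketch. The function names frameBudgetMs and frameIntervalsConsumed are made up for this illustration, and the model is deliberately simplified: it only counts how many whole frame intervals a single uninterrupted JavaScript task spans.

```javascript
// Time available per frame at a given refresh rate, in milliseconds.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Whole frame intervals that pass while one uninterrupted task of
// `taskMs` milliseconds holds the main thread (hypothetical helper).
function frameIntervalsConsumed(taskMs, refreshRateHz = 60) {
  return Math.floor((taskMs * refreshRateHz) / 1000);
}

console.log(frameBudgetMs(60).toFixed(2)); // 16.67
console.log(frameIntervalsConsumed(10)); // 0
console.log(frameIntervalsConsumed(40)); // 2
console.log(frameIntervalsConsumed(100)); // 6
```

A 10ms task still fits inside one frame, while a 100ms task spans six frame intervals during which the browser cannot lay out or paint.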

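The ways to improve this can be summarized in two points:

1. Optimize speed in the framework implementation, pick the optimal rendering path, and be as fast as possible
2. Respond to user actions first and quickly, so that the experience does not feel janky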

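The first point is the approach many frameworks take today, for example:

Introduce a Virtual DOM to reduce the cost of operating on the DOM directly (how much performance depends on the V-DOM varies by framework)
Optimize the diff strategy
Use templates that are easy to analyze statically, so optimizations can be made ahead of time
Memoization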
...

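The optimizations above can be summarized as targeting the CPU bottleneck.

When handling UI interaction, however, the biggest problem is not that execution is too slow, but that there are too many tasks.

We need a new angle on this problem.

React Fiber is the new scheduling algorithm, also called the Reconciler, that React introduced to address it.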

The old architecture before Fiber

Stack Reconciler

Stack Reconciler implementation notes
React components, elements, and instances
mountComponent is called during mount, and updateComponent during update

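Before React 16, React used a Reconciler known as the Stack Reconciler.

React builds a Virtual DOM tree that corresponds to the real DOM tree by calling function components and the render method of class components; the JSX they return describes the Virtual DOM.

After a state change, React diffs the old and new Virtual DOM trees and tells the platform's Renderer to render the nodes that changed.

It is called the Stack Reconciler because the whole diff is recursive and cannot be interrupted along the way. Recursing through a deep, complex component hierarchy takes more time and, as described above, leaves frames without Layout / Paint. Those are dropped frames, and the user experiences them as jank.

The old architecture could in fact update asynchronously, but at the architectural level it had no way to resume from where it left off after being interrupted.

To solve this, React 16 rebuilt the recursive, uninterruptible update process into an asynchronous, interruptible one.

And so the brand-new Fiber architecture was born.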

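The new architecture: Fiber

Fiber Architecture brings incremental rendering: the rendering work is split into multiple subtasks that are spread across multiple frames.

Conceptually, a Fiber is close to a coroutine.

Fiber Architecture has two phases: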

Render
Commit

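In the Render phase, React maps the real DOM nodes onto a new node structure, the Fiber node. Each type of React Element has a corresponding type of Fiber node to support the way the new architecture works. Inside React, the current DOM tree corresponds to the current Fiber tree, and after a state update the newly built one is the workInProgress Fiber tree. During the diff, the workInProgress tree tries to reuse the previous nodes (the nodes of the two trees are linked through the alternate property). Once the work is committed, the workInProgress tree becomes the current tree (double buffering), which completes the update.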
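
The double-buffering idea can be sketched in a few lines. This is a heavily simplified model, not React's code: the fiber shape (type, props, alternate) and the root object are assumptions made for the illustration, and only the reuse through alternate is shown.

```javascript
// Simplified fiber node with only the fields this illustration needs.
function createFiber(type, props) {
  return { type, props, alternate: null };
}

// Reuse the alternate of `current` as the workInProgress node when it exists,
// otherwise create one and link the two nodes to each other.
function createWorkInProgress(current, nextProps) {
  let workInProgress = current.alternate;
  if (workInProgress === null) {
    workInProgress = createFiber(current.type, nextProps);
    workInProgress.alternate = current;
    current.alternate = workInProgress;
  } else {
    workInProgress.props = nextProps;
  }
  return workInProgress;
}

// A hypothetical root that points at whichever tree is on screen.
const root = { current: createFiber("App", { count: 0 }) };
const original = root.current;

const firstUpdate = createWorkInProgress(root.current, { count: 1 });
root.current = firstUpdate; // swap buffers once the update is committed

const secondUpdate = createWorkInProgress(root.current, { count: 2 });
root.current = secondUpdate;

console.log(root.current.props.count); // 2
console.log(secondUpdate === original); // true
```

The second update allocates nothing: the node that was on screen at the start becomes the workInProgress node again, which is the point of keeping two linked trees.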

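In the Render phase, execution can be interrupted by a higher-priority task or because the current frame has no time left, and after the interruption it can continue from the previously saved progress.

This is done by turning the earlier recursion into a loop and adding a yield check to the loop condition on every iteration, which checks whether a higher-priority task is waiting or whether enough time remains.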

function workLoopConcurrent() {
  // Perform work until Scheduler asks us to yield
  while (workInProgress !== null && !shouldYield()) {
    performUnitOfWork(workInProgress);
  }
}

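Because the Render phase can be interrupted, the number of times its code runs is not deterministic, so React deprecated many of the lifecycle hooks that belong to this phase.

shouldYield is provided by Scheduler, which React implements itself.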
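
To see why a loop with a yield check can pause and resume where recursion cannot, here is a sketch of the same loop shape over a tiny linked tree. The node helper, the makeBudget helper, and the "two units per slice" budget are all invented for this example; unlike the snippet above, performUnitOfWork here returns the next node instead of updating a shared variable itself.

```javascript
// Build a tiny linked tree: each node knows its first child,
// its next sibling, and its parent (stored in `return`).
function node(name, children = []) {
  const fiber = { name, child: null, sibling: null, return: null };
  children.forEach((c, i) => {
    c.return = fiber;
    if (i === 0) fiber.child = c;
    else children[i - 1].sibling = c;
  });
  return fiber;
}

const visited = [];
let workInProgress = null;

// Process one node and return the next one in depth-first order.
function performUnitOfWork(fiber) {
  visited.push(fiber.name);
  if (fiber.child !== null) return fiber.child;
  let next = fiber;
  while (next !== null) {
    if (next.sibling !== null) return next.sibling;
    next = next.return;
  }
  return null;
}

// Same shape as the concurrent work loop, with the yield check passed in.
function workLoop(shouldYield) {
  while (workInProgress !== null && !shouldYield()) {
    workInProgress = performUnitOfWork(workInProgress);
  }
  return workInProgress !== null; // true means work is left for later
}

// Hypothetical budget: allow a fixed number of units per slice.
function makeBudget(units) {
  let left = units;
  return () => left-- <= 0;
}

workInProgress = node("App", [
  node("Header"),
  node("List", [node("Item1"), node("Item2")]),
]);

let slices = 0;
let hasMoreWork = true;
while (hasMoreWork) {
  hasMoreWork = workLoop(makeBudget(2));
  slices++;
}

console.log(visited.join(" > ")); // App > Header > List > Item1 > Item2
console.log(slices); // 3
```

Because the position in the tree lives in workInProgress rather than on the call stack, each slice simply picks up at the node where the previous one stopped.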

Let's dive deeper

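In React 16.10, React uses Scheduler as a standalone package, and Scheduler uses MessageChannel to drive its checks.

The browser APIs were not adopted in the end because:

requestIdleCallback
- Browser compatibility is poor
- It reaches at most 20 FPS, which is not suitable for UI rendering (the figure comes from official Chrome material)
requestAnimationFrame
- It was once one of the implementation approaches, but has since been dropped
- When the page is in the background, Chrome lowers its call frequency

In terms of implementation, Scheduler used to rely on requestAnimationFrame to align its execution with frames, whereas it now calls postMessage at a high frequency (5ms intervals) to guarantee how often it runs.

Compared with the browser-provided APIs, MessageChannel hands control to React, because there is no guarantee of how often requestAnimationFrame is called on different devices.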

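In November, React refactored Scheduler (PR), separating the logic for DOM and non-DOM environments.

All exports still carry the unstable_ prefix, so there may well be many more changes in the future.

Let's look at how React actually implements Scheduler, the core scheduling mechanism of Fiber, which hands high-priority tasks to the Reconciler.

The main logic lives in SchedulerDOM.js.

Let's see how the MessageChannel-based approach works.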

const channel = new MessageChannel();
const port = channel.port2;
channel.port1.onmessage = performWorkUntilDeadline;

function requestHostCallback(callback) {
  scheduledHostCallback = callback;
  if (!isMessageLoopRunning) {
    isMessageLoopRunning = true;
    port.postMessage(null);
  }
}

Instantiate a MessageChannel, which creates two ports
Set the message handler of port1 to performWorkUntilDeadline
requestHostCallback posts a message, which triggers performWorkUntilDeadline

Next, let's look at the implementation of performWorkUntilDeadline.

const performWorkUntilDeadline = () => {
  // scheduledHostCallback is the callback passed to requestHostCallback
  if (scheduledHostCallback !== null) {
    const currentTime = getCurrentTime();
    // Yield after `yieldInterval` ms, regardless of where we are in the vsync
    // cycle. This means there's always time remaining at the beginning of
    // the message event.
    deadline = currentTime + yieldInterval;
    const hasTimeRemaining = true;

    // If a scheduler task throws, exit the current browser task so the
    // error can be observed.
    //
    // Intentionally not using a try-catch, since that makes some debugging
    // techniques harder. Instead, if `scheduledHostCallback` errors, then
    // `hasMoreWork` will remain true, and we'll continue the work loop.
    let hasMoreWork = true;
    try {
      hasMoreWork = scheduledHostCallback(hasTimeRemaining, currentTime);
    } finally {
      if (hasMoreWork) {
        // If there's more work, schedule the next message event at the end
        // of the preceding one.
        port.postMessage(null);
      } else {
        isMessageLoopRunning = false;
        scheduledHostCallback = null;
      }
    }
  } else {
    isMessageLoopRunning = false;
  }
  // Yielding to the browser will give it a chance to paint, so we can
  // reset this.
  needsPaint = false;
};

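yieldInterval is defined as 5ms; it is the length of each time slice before the loop yields
scheduledHostCallback is the callback passed to requestHostCallback
scheduledHostCallback is called and returns whether there is more work to do. If there is, another message is posted so that performWorkUntilDeadline runs again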
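
The interplay between requestHostCallback and performWorkUntilDeadline is easier to follow when it can be stepped through synchronously. The sketch below keeps the same control flow but swaps the real MessageChannel and clock for fakes, so it is an illustration rather than the Scheduler's code: the pendingMessages array, the manual drain loop, the fake `now` clock, and the 12-unit workload are all assumptions of this example, and the try/finally error handling is left out.

```javascript
// Fake clock and fake message port so the sketch runs synchronously anywhere.
let now = 0;
const getCurrentTime = () => now;
const pendingMessages = [];
const port = {
  postMessage: () => pendingMessages.push(performWorkUntilDeadline),
};

const yieldInterval = 5; // ms per time slice
let deadline = 0;
let scheduledHostCallback = null;
let isMessageLoopRunning = false;
let messageEvents = 0;

function performWorkUntilDeadline() {
  messageEvents++;
  if (scheduledHostCallback === null) {
    isMessageLoopRunning = false;
    return;
  }
  const currentTime = getCurrentTime();
  deadline = currentTime + yieldInterval;
  const hasMoreWork = scheduledHostCallback(true, currentTime);
  if (hasMoreWork) {
    port.postMessage(null); // continue in a later message event
  } else {
    isMessageLoopRunning = false;
    scheduledHostCallback = null;
  }
}

function requestHostCallback(callback) {
  scheduledHostCallback = callback;
  if (!isMessageLoopRunning) {
    isMessageLoopRunning = true;
    port.postMessage(null);
  }
}

// Hypothetical workload: 12 units that each take 1ms of fake time.
let unitsLeft = 12;
requestHostCallback(() => {
  while (unitsLeft > 0 && getCurrentTime() < deadline) {
    unitsLeft--;
    now += 1;
  }
  return unitsLeft > 0;
});

// Deliver the queued messages one by one, as the event loop would.
while (pendingMessages.length > 0) pendingMessages.shift()();

console.log(messageEvents); // 3
```

Twelve milliseconds of work end up split over three message events of at most 5ms each. In a browser, the gaps between those events are where painting and input handling get their turn.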

It feels like the key is hidden in scheduledHostCallback. So where does the callback passed to requestHostCallback actually come from?

Let's go back to where Scheduler starts, which is where React hands a task to Scheduler: unstable_scheduleCallback

function unstable_scheduleCallback(priorityLevel, callback, options) {
  var currentTime = getCurrentTime();

  var startTime;
  if (typeof options === 'object' && options !== null) {
    var delay = options.delay;
    if (typeof delay === 'number' && delay > 0) {
      startTime = currentTime + delay;
    } else {
      startTime = currentTime;
    }
  } else {
    startTime = currentTime;
  }

  var timeout;
  switch (priorityLevel) {
    case ImmediatePriority:
      timeout = IMMEDIATE_PRIORITY_TIMEOUT;
      break;
    case UserBlockingPriority:
      timeout = USER_BLOCKING_PRIORITY_TIMEOUT;
      break;
    case IdlePriority:
      timeout = IDLE_PRIORITY_TIMEOUT;
      break;
    case LowPriority:
      timeout = LOW_PRIORITY_TIMEOUT;
      break;
    case NormalPriority:
    default:
      timeout = NORMAL_PRIORITY_TIMEOUT;
      break;
  }

  var expirationTime = startTime + timeout;

  var newTask = {
    id: taskIdCounter++,
    callback,
    priorityLevel,
    startTime,
    expirationTime,
    sortIndex: -1,
  };
  if (enableProfiling) {
    newTask.isQueued = false;
  }

  if (startTime > currentTime) {
    // This is a delayed task.
    newTask.sortIndex = startTime;
    push(timerQueue, newTask);
    if (peek(taskQueue) === null && newTask === peek(timerQueue)) {
      // All tasks are delayed, and this is the task with the earliest delay.
      if (isHostTimeoutScheduled) {
        // Cancel an existing timeout.
        cancelHostTimeout();
      } else {
        isHostTimeoutScheduled = true;
      }
      // Schedule a timeout.
      requestHostTimeout(handleTimeout, startTime - currentTime);
    }
  } else {
    newTask.sortIndex = expirationTime;
    push(taskQueue, newTask);
    if (enableProfiling) {
      markTaskStart(newTask, currentTime);
      newTask.isQueued = true;
    }
    // Schedule a host callback, if needed. If we're already performing work,
    // wait until the next time we yield.
    if (!isHostCallbackScheduled && !isPerformingWork) {
      isHostCallbackScheduled = true;
      requestHostCallback(flushWork);
    }
  }

  return newTask;
}

A quick scan of this code tells us:

expirationTime describes when the task expires, and equals currentTime + delay (optional) + timeout

The timeout is determined by the task's priorityLevel. Scheduler currently defines timeouts for 5 priority levels:

var IMMEDIATE_PRIORITY_TIMEOUT = -1; // Times out immediately
var USER_BLOCKING_PRIORITY_TIMEOUT = 250; // Eventually times out
var NORMAL_PRIORITY_TIMEOUT = 5000;
var LOW_PRIORITY_TIMEOUT = 10000;
var IDLE_PRIORITY_TIMEOUT = maxSigned31BitInt; // Never times out

A new task is created, carrying a few properties that describe it
When startTime > currentTime, the task has been given a delay: task.sortIndex is set to startTime and the task is pushed onto timerQueue
Otherwise the task has no delay: task.sortIndex is set to expirationTime and the task is pushed onto taskQueue
In the common case, requestHostCallback(flushWork) is then called, and flushWork is the callback we were just looking for
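
As a quick check of the arithmetic, the sketch below works out startTime, expirationTime, the target queue, and sortIndex for a task. The timeout values are the ones listed above; the describeTask helper and the string priority keys are made up for this example.

```javascript
const maxSigned31BitInt = 1073741823;

// Timeouts from the listing above, keyed by hypothetical string names.
const TIMEOUTS = {
  immediate: -1,
  userBlocking: 250,
  normal: 5000,
  low: 10000,
  idle: maxSigned31BitInt,
};

// Hypothetical helper mirroring the timing logic of unstable_scheduleCallback.
function describeTask(priority, currentTime, delay = 0) {
  const startTime = delay > 0 ? currentTime + delay : currentTime;
  const timeout = TIMEOUTS[priority] ?? TIMEOUTS.normal;
  const expirationTime = startTime + timeout;
  const isDelayed = startTime > currentTime;
  return {
    startTime,
    expirationTime,
    queue: isDelayed ? "timerQueue" : "taskQueue",
    sortIndex: isDelayed ? startTime : expirationTime,
  };
}

const immediateTask = describeTask("userBlocking", 1000);
console.log(immediateTask.queue, immediateTask.sortIndex); // taskQueue 1250

const delayedTask = describeTask("normal", 1000, 200);
console.log(delayedTask.queue, delayedTask.sortIndex); // timerQueue 1200
console.log(delayedTask.expirationTime); // 6200
```

Note that an immediate-priority task gets an expirationTime one millisecond in the past, so it counts as expired as soon as it is scheduled.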

Next, let's explain timerQueue, taskQueue, and task.sortIndex.

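They are simply two queues, though of a special kind: priority queues, a data type often used when designing prioritized work.

A typical priority queue implementation is based on a binary heap.

A priority queue keeps its data ordered by a specific rule; when new data is pushed, it is placed so that the highest-priority item can always be read first.

In Scheduler's implementation, a min-priority queue based on a min-heap is used; the code is in SchedulerMinHeap.js.

The sortIndex mentioned above is the sort key for timerQueue and taskQueue (see the concrete sorting implementation):

timerQueue -> startTime
taskQueue -> expirationTime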
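
For readers who have not written one before, here is a minimal array-backed min-heap with the same push / peek / pop surface. It is a sketch, not a copy of SchedulerMinHeap.js; in particular, breaking ties between equal sortIndex values by id is an assumption of this example.

```javascript
// Order by sortIndex first, then by id (the tie-breaker is an assumption).
function compare(a, b) {
  const diff = a.sortIndex - b.sortIndex;
  return diff !== 0 ? diff : a.id - b.id;
}

function push(heap, item) {
  heap.push(item);
  let i = heap.length - 1;
  // Sift up: swap with the parent while the parent is larger.
  while (i > 0) {
    const parent = (i - 1) >>> 1;
    if (compare(heap[parent], heap[i]) <= 0) break;
    [heap[parent], heap[i]] = [heap[i], heap[parent]];
    i = parent;
  }
}

function peek(heap) {
  return heap.length === 0 ? null : heap[0];
}

function pop(heap) {
  if (heap.length === 0) return null;
  const first = heap[0];
  const last = heap.pop();
  if (heap.length > 0) {
    heap[0] = last;
    let i = 0;
    // Sift down: swap with the smaller child until order is restored.
    while (true) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < heap.length && compare(heap[left], heap[smallest]) < 0) smallest = left;
      if (right < heap.length && compare(heap[right], heap[smallest]) < 0) smallest = right;
      if (smallest === i) break;
      [heap[smallest], heap[i]] = [heap[i], heap[smallest]];
      i = smallest;
    }
  }
  return first;
}

const demoQueue = [];
push(demoQueue, { id: 1, sortIndex: 5000 });
push(demoQueue, { id: 2, sortIndex: 250 });
push(demoQueue, { id: 3, sortIndex: 250 });
push(demoQueue, { id: 4, sortIndex: -1 });

const order = [];
while (peek(demoQueue) !== null) order.push(pop(demoQueue).id);
console.log(order.join(",")); // 4,2,3,1
```

The heap never sorts the whole array; it only guarantees that the smallest key sits at index 0, which is all peek needs, and both push and pop stay at O(log n).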

Let's return to the callback mentioned earlier, flushWork.

function flushWork(hasTimeRemaining, initialTime) {
  if (enableProfiling) {
    markSchedulerUnsuspended(initialTime);
  }

  // We'll need a host callback the next time work is scheduled.
  isHostCallbackScheduled = false;
  if (isHostTimeoutScheduled) {
    // We scheduled a timeout but it's no longer needed. Cancel it.
    isHostTimeoutScheduled = false;
    cancelHostTimeout();
  }

  isPerformingWork = true;
  const previousPriorityLevel = currentPriorityLevel;
  try {
    if (enableProfiling) {
      try {
        return workLoop(hasTimeRemaining, initialTime);
      } catch (error) {
        if (currentTask !== null) {
          const currentTime = getCurrentTime();
          markTaskErrored(currentTask, currentTime);
          currentTask.isQueued = false;
        }
        throw error;
      }
    } else {
      // No catch in prod code path.
      return workLoop(hasTimeRemaining, initialTime);
    }
  } finally {
    currentTask = null;
    currentPriorityLevel = previousPriorityLevel;
    isPerformingWork = false;
    if (enableProfiling) {
      const currentTime = getCurrentTime();
      markSchedulerSuspended(currentTime);
    }
  }
}

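It resets isHostCallbackScheduled so that new tasks can be scheduled while the flush is running, and cancels any pending host timeout, resetting isHostTimeoutScheduled
flushWork returns the result of workLoop(), so finally let's look at the implementation of workLoop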

function workLoop(hasTimeRemaining, initialTime) {
  let currentTime = initialTime;
  advanceTimers(currentTime);
  currentTask = peek(taskQueue);
  while (
    currentTask !== null &&
    !(enableSchedulerDebugging && isSchedulerPaused)
  ) {
    if (
      currentTask.expirationTime > currentTime &&
      (!hasTimeRemaining || shouldYieldToHost())
    ) {
      // This currentTask hasn't expired, and we've reached the deadline.
      break;
    }
    const callback = currentTask.callback;
    if (typeof callback === 'function') {
      currentTask.callback = null;
      currentPriorityLevel = currentTask.priorityLevel;
      const didUserCallbackTimeout = currentTask.expirationTime <= currentTime;
      markTaskRun(currentTask, currentTime);
      const continuationCallback = callback(didUserCallbackTimeout);
      currentTime = getCurrentTime();
      if (typeof continuationCallback === 'function') {
        currentTask.callback = continuationCallback;
        markTaskYield(currentTask, currentTime);
      } else {
        if (enableProfiling) {
          markTaskCompleted(currentTask, currentTime);
          currentTask.isQueued = false;
        }
        if (currentTask === peek(taskQueue)) {
          pop(taskQueue);
        }
      }
      advanceTimers(currentTime);
    } else {
      pop(taskQueue);
    }
    currentTask = peek(taskQueue);
  }
  // Return whether there's additional work
  if (currentTask !== null) {
    return true;
  } else {
    const firstTimer = peek(timerQueue);
    if (firstTimer !== null) {
      requestHostTimeout(handleTimeout, firstTimer.startTime - currentTime);
    }
    return false;
  }
}

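advanceTimers is called several times in this function; we'll look at its implementation shortly
currentTask = peek(taskQueue) peeks at the highest-priority task in taskQueue (without removing it from the queue)
The loop runs as long as currentTask exists
If the current task has not expired, it checks whether there is time remaining or whether it needs to yield to higher-priority work
- We'll look at shouldYieldToHost in a moment
currentTask.callback is executed, with whether the task has expired passed as the argument. Depending on that state, the callback saves its progress and returns a function that may continue the work later: continuationCallback
- Remember? This is Fiber's interrupt-and-resume mechanism
- The callback may also return a non-function, which means the task is complete and will be popped from taskQueue
- currentTask.callback has to follow the continuationCallback contract, so the function must be designed in a particular way to work properly with Scheduler
Finally, another task is peeked from taskQueue and the loop continues
1. Remember continuationCallback? If the previous task saved its progress, it is not popped afterwards
2. So the task peeked here is still the task that was just interrupted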
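
The continuation contract is the subtle part, so here is a stripped-down sketch of it. Everything in it is simplified for illustration: a plain array stands in for the heap, a shared `budget` counter stands in for the deadline check, and makeChunkedTask is a hypothetical task that keeps its own progress in a closure.

```javascript
const taskQueue = []; // a plain array stands in for the min-heap
const log = [];

// Stand-in for the deadline check: yield once the slice budget is used up.
let budget = 0;
const shouldYield = () => budget <= 0;

// Hypothetical task: processes items one by one and hands back a
// continuation (itself) when it has to stop early.
function makeChunkedTask(items) {
  let index = 0;
  return function work() {
    while (index < items.length) {
      if (shouldYield()) return work; // not finished: return a continuation
      log.push(items[index++]);
      budget--;
    }
    return null; // finished
  };
}

function workLoop() {
  let currentTask = taskQueue.length > 0 ? taskQueue[0] : null;
  while (currentTask !== null) {
    if (shouldYield()) break;
    const callback = currentTask.callback;
    currentTask.callback = null;
    const continuation = callback();
    if (typeof continuation === "function") {
      currentTask.callback = continuation; // task stays at the head
    } else {
      taskQueue.shift(); // task is done, remove it
    }
    currentTask = taskQueue.length > 0 ? taskQueue[0] : null;
  }
  return currentTask !== null; // whether there is more work
}

taskQueue.push({ id: 1, callback: makeChunkedTask(["a", "b", "c", "d", "e"]) });
taskQueue.push({ id: 2, callback: makeChunkedTask(["x"]) });

let slices = 0;
let hasMoreWork = true;
while (hasMoreWork) {
  budget = 2; // each slice may process two items
  hasMoreWork = workLoop();
  slices++;
}

console.log(log.join("")); // abcdex
console.log(slices); // 3
```

Task 1 is interrupted twice and each time stays at the head of the queue, so task 2 only runs once task 1 has returned a non-function.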

We left one loose end earlier: advanceTimers

function advanceTimers(currentTime) {
  // Check for tasks that are no longer delayed and add them to the queue.
  let timer = peek(timerQueue);
  while (timer !== null) {
    if (timer.callback === null) {
      // Timer was cancelled.
      pop(timerQueue);
    } else if (timer.startTime <= currentTime) {
      // Timer fired. Transfer to the task queue.
      pop(timerQueue);
      timer.sortIndex = timer.expirationTime;
      push(taskQueue, timer);
      if (enableProfiling) {
        markTaskStart(timer, currentTime);
        timer.isQueued = true;
      }
    } else {
      // Remaining timers are pending.
      return;
    }
    timer = peek(timerQueue);
  }
}

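Put simply, advanceTimers moves the tasks in timerQueue whose start time has arrived into taskQueue
So it is a function that reorganizes the timed tasks in the two queues according to the current time, and it is called repeatedly while Scheduler runs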
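
A small run makes the hand-off between the two queues visible. In this sketch sorted arrays stand in for the heaps (the insertSorted helper is made up for the example), the profiling branches are dropped, and the four timers are invented sample data.

```javascript
// Sorted arrays stand in for the two heaps in this sketch.
function insertSorted(queue, task) {
  queue.push(task);
  queue.sort((a, b) => a.sortIndex - b.sortIndex || a.id - b.id);
}

const timerQueue = [];
const taskQueue = [];

function advanceTimers(currentTime) {
  while (timerQueue.length > 0) {
    const timer = timerQueue[0];
    if (timer.callback === null) {
      timerQueue.shift(); // timer was cancelled
    } else if (timer.startTime <= currentTime) {
      // Timer fired: move it over and re-key it by expirationTime.
      timerQueue.shift();
      timer.sortIndex = timer.expirationTime;
      insertSorted(taskQueue, timer);
    } else {
      return; // remaining timers are still pending
    }
  }
}

const noop = () => {};
// Delayed tasks are keyed by startTime while they wait in timerQueue.
insertSorted(timerQueue, { id: 1, callback: noop, startTime: 100, expirationTime: 5100, sortIndex: 100 });
insertSorted(timerQueue, { id: 2, callback: noop, startTime: 50, expirationTime: 300, sortIndex: 50 });
insertSorted(timerQueue, { id: 3, callback: noop, startTime: 500, expirationTime: 5500, sortIndex: 500 });
insertSorted(timerQueue, { id: 4, callback: null, startTime: 80, expirationTime: 5080, sortIndex: 80 });

advanceTimers(120);

console.log(taskQueue.map((t) => t.id).join(",")); // 2,1
console.log(timerQueue.map((t) => t.id).join(",")); // 3
```

At time 120 the two timers that have started move to taskQueue, now ordered by expirationTime, the cancelled timer is dropped, and the timer starting at 500 keeps waiting.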

Last of all, let's look at shouldYieldToHost

function shouldYieldToHost() {
  if (
    enableIsInputPending &&
    navigator !== undefined &&
    navigator.scheduling !== undefined &&
    navigator.scheduling.isInputPending !== undefined
  ) {
    const scheduling = navigator.scheduling;
    const currentTime = getCurrentTime();
    if (currentTime >= deadline) {
      // There's no time left. We may want to yield control of the main
      // thread, so the browser can perform high priority tasks. The main ones
      // are painting and user input. If there's a pending paint or a pending
      // input, then we should yield. But if there's neither, then we can
      // yield less often while remaining responsive. We'll eventually yield
      // regardless, since there could be a pending paint that wasn't
      // accompanied by a call to `requestPaint`, or other main thread tasks
      // like network events.
      if (needsPaint || scheduling.isInputPending()) {
        // There is either a pending paint or a pending input.
        return true;
      }
      // There's no pending input. Only yield if we've reached the max
      // yield interval.
      return currentTime >= maxYieldInterval;
    } else {
      // There's still time left in the frame.
      return false;
    }
  } else {
    // `isInputPending` is not available. Since we have no way of knowing if
    // there's pending input, always yield at the end of the frame.
    return getCurrentTime() >= deadline;
  }
}

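First, it checks whether the host environment provides the navigator.scheduling API
- Update 11-19-2020: the API is already available in Chrome 87
- navigator.scheduling.isInputPending() tells whether there is pending user input
If the time slice has run out, control of the main thread is yielded back to the browser so that painting and user input are handled first
Even when there is no such high-priority work, it still yields to the main thread once a threshold is reached: maxYieldInterval
When isInputPending is unavailable, it simply checks whether the current time has passed the deadline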
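
The branches are easier to compare as a pure function. The sketch below passes everything in as arguments instead of reading globals and navigator, so it can run anywhere; the parameter object and the name hardDeadline (standing in for the maxYieldInterval threshold) are assumptions of this example, not names from the Scheduler source.

```javascript
// Pure version of the yield decision. `isInputPending` is optional:
// leave it out to model an environment without the API.
function shouldYieldToHost({ currentTime, deadline, hardDeadline, needsPaint, isInputPending }) {
  if (typeof isInputPending === "function") {
    if (currentTime >= deadline) {
      // Slice is over: yield right away if a paint or input is waiting.
      if (needsPaint || isInputPending()) return true;
      // Otherwise keep going until the hard limit is reached.
      return currentTime >= hardDeadline;
    }
    return false; // still inside the slice
  }
  // No way to detect input: always yield at the end of the slice.
  return currentTime >= deadline;
}

const base = { deadline: 5, hardDeadline: 300, needsPaint: false };

// Inside the slice: keep working.
console.log(shouldYieldToHost({ ...base, currentTime: 3, isInputPending: () => true })); // false
// Slice over and input pending: yield.
console.log(shouldYieldToHost({ ...base, currentTime: 6, isInputPending: () => true })); // true
// Slice over, nothing pending: keep working for now.
console.log(shouldYieldToHost({ ...base, currentTime: 6, isInputPending: () => false })); // false
// Hard limit reached: yield regardless.
console.log(shouldYieldToHost({ ...base, currentTime: 300, isInputPending: () => false })); // true
// API unavailable: yield as soon as the slice is over.
console.log(shouldYieldToHost({ ...base, currentTime: 6 })); // true
```

The practical effect of isInputPending is in the third case: with nothing waiting, work can continue past the 5ms slice instead of giving up the thread for no reason.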

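Taken together, this describes roughly how Scheduler works:

scheduleCallback registers a task: it determines the new task's priority in Scheduler, its sort index, its callback, and other properties, and places it into one of the two queues by sortIndex depending on whether it is delayed
requestHostCallback uses MessageChannel to post a message; performWorkUntilDeadline then runs in a later macrotask, after the browser has had a chance to render, sets the deadline, and executes scheduledHostCallback, which is flushWork
flushWork starts workLoop, which keeps executing tasks from taskQueue in priority order. Before each task it checks whether the current time has passed the deadline, and if so it yields the main thread so that higher-priority work can run. During execution it uses advanceTimers to keep taskQueue and timerQueue up to date. A task's handler can be interrupted, save its progress, and return a continuationCallback to continue later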

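Let's take a breath

The Scheduler implementation is still experimental, and the code is rough in many places, but it helps in understanding how the core scheduling mechanism of the Fiber architecture works.

The Fiber Reconciler on the other side is just as important. Given the length of this article, the next one will introduce that part.

For now, it is enough to know: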

A Fiber is a plain JavaScript object. When setState is called, React adds the update to an update queue; if there are no updates, it just clones the list into the WIP tree
Sebastian Markbåge's overview of the Fiber implementation

Concurrent Mode

Dan Abramov: Beyond React 16 | JSConf Iceland

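The sections above covered how React addresses the CPU bottleneck in its implementation (scheduling, executing, and interrupting tasks of different priorities), which is known as Time-slicing.

Suspense, the other part of Concurrent Mode, may bring a higher-order improvement in experience: how to deal with the IO problem (data fetching from the network).

With Suspense, the UI to display is managed implicitly in memory, so some placeholder or blank screens can optionally be skipped, or the order in which components appear can be arranged freely. In essence this exploits the just-noticeable difference (JND) to improve the user experience.

JND as implemented in React

On the significance of Concurrent Mode, React has a dedicated explanation. The extreme that React is pursuing this time is exactly what most earlier UI libraries did not achieve.

Evan You, the author of Vue, has explained why Vue 3 removed Time-slicing, describing how the differences in implementation principles from React led to different feature roadmaps.

The way Suspense is used is quite interesting. In the React ecosystem there is no need to write an explicit handler: we simply "throw" the asynchronous request, and React handles it "automatically", deciding from the result whether to render the real component or the fallback.

Something like: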

try {
  f(*)
  // Yield
  ...
} handle f() {
  effect()
  // Resume
}

Doesn't that have a taste of Algebraic Effects?
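
JavaScript has no real effect handlers, but the "throw it and let someone else handle it" flavor can be imitated with throw and retry. The sketch below is only an analogy under loud assumptions: as far as I understand, Suspense-style data fetching throws a Promise and React retries the render once it settles, whereas this toy throws a plain request object, resolves it synchronously from a fake in-memory "server", and re-runs the function from the top instead of resuming mid-way. All names here (read, greeting, runWithHandler, fakeServer) are invented.

```javascript
const cache = new Map();

// Hypothetical data source standing in for a network request.
const fakeServer = { "user/1": "Ada", "user/2": "Grace" };

// "Perform an effect": return the cached value, or throw a request
// object describing what is missing.
function read(key) {
  if (cache.has(key)) return cache.get(key);
  throw { type: "fetch", key };
}

// The "component": synchronous-looking code with no explicit handling.
function greeting() {
  return `Hello, ${read("user/1")} and ${read("user/2")}`;
}

// The "handler": catch the request, satisfy it, then run again from the top.
function runWithHandler(fn) {
  let attempts = 0;
  while (true) {
    attempts++;
    try {
      return { result: fn(), attempts };
    } catch (effect) {
      if (!effect || effect.type !== "fetch") throw effect; // a real error
      cache.set(effect.key, fakeServer[effect.key]);
    }
  }
}

const outcome = runWithHandler(greeting);
console.log(outcome.result); // Hello, Ada and Grace
console.log(outcome.attempts); // 3
```

The function runs three times because each missing value aborts one attempt. A true algebraic effect would resume at the point of the throw instead, which is why re-running only works when the function is cheap and free of side effects.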

React has released some experimental Concurrent Mode APIs. You can try the concrete demos in the introduction on the official site and feel the "magic".