Performance endpoints:
Probe | Success? | Criterion | Actual |
---|---|---|---|
COMPOSITE_TIME | ✅ | Median per-user fraction of slow frames < 0.5% (absolute) | 0.16% slow |
CONTENT_FRAME_TIME_VSYNC | ✅ | ≤ 5% regression in median of per-user fraction of slow events | 1.7-3.9% regression |
CONTENT_FULL_PAINT_TIME | ✅ | ≤ 5% regression in median fraction of slow paints (> 16 ms) | 44.8-46.4% improvement |
CONTENT_FULL_PAINT_TIME | ❌ | ≤ 5% regression in median of per-user means | 8.7-9.9% regression |
FX_PAGE_LOAD_MS_2 | ✅ | ≤ 5% regression in median of per-user means | No difference |
FX_TAB_SWITCH_COMPOSITE_E10S_MS | ✅ | ≤ 5% regression in median of per-user means | 2.1-2.7% improvement |
Stability endpoints:
Endpoint | Success | Criterion | Actual |
---|---|---|---|
Overall crash reports | ❌ | ≤ 5% increase in crash rate | 6% increase in crash rate |
OOM crash reports | ✅ | ≤ 5% increase in crash rate | 20% decrease in OOM crashes |
CANVAS_WEBGL_SUCCESS | ✅ | ≤ 5% regression in median of fraction “True” per user | No difference |
DEVICE_RESET_REASON | ✅ | ≤ 5% increase in reset rate | 70% decrease in device resets |
The higher crash rate in the WebRender branch is attributable to an increase in the rate of GPU process crashes. Main and content process crash rates fell.
Retention and engagement metrics were not strongly affected, though week 1 retention was 0.06%-0.45% lower in the WebRender branch.
WebRender is a new technology for getting webpages onto the screen using a GPU. In this experiment, we enabled WebRender for users in the Firefox 66 release channel running Windows 10 with certain GPU chipsets.
We have been running a separate ongoing experiment in the beta and nightly channels to guide development, observing how performance changes on a build-by-build basis. This report does not describe that work.
Before computing results for performance endpoints, user sessions were filtered to ensure that the compositor for the telemetry session matched the enrolled branch. The first telemetry session after a user enrolled was dropped for users in both branches, because the user needs to restart the browser after enrolling for WebRender to be enabled. (The enrollment session was identified as the session with the lowest profile_subsession_counter for each client_id.) Users who unenrolled from the experiment were excluded after unenrollment.
This avoids a minimizing bias that could result from contaminating the results for the treatment branch with results from users who were not exposed to the treatment. The approach may overestimate the actual effect of WebRender on the population if a non-random set of users (e.g. users with poor performance) were more likely to unenroll from the experiment, but this is unlikely because unenrollments were rare and balanced between the experiment and control branches (see “Enrollment” below).
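The session filtering described above can be sketched as follows. This is a minimal illustration, not the ETL used for the study: the session record layout, the `after_unenrollment` flag, and the compositor labels in `EXPECTED_COMPOSITOR` are assumptions made for the example. Only `client_id` and `profile_subsession_counter` are field names taken from the text.

```python
# Hypothetical mapping from enrolled branch to the compositor we expect
# to see in that branch's sessions (labels are assumptions).
EXPECTED_COMPOSITOR = {"WebRender": "webrender", "Gecko": "d3d11"}


def filter_sessions(sessions, expected=EXPECTED_COMPOSITOR):
    """Return the sessions that would be kept for performance analysis.

    Each session is a dict with the (partly hypothetical) keys:
    client_id, profile_subsession_counter, branch, compositor,
    after_unenrollment.
    """
    # The enrollment session is the one with the lowest
    # profile_subsession_counter for each client_id.
    first_counter = {}
    for s in sessions:
        cid = s["client_id"]
        counter = s["profile_subsession_counter"]
        if cid not in first_counter or counter < first_counter[cid]:
            first_counter[cid] = counter

    kept = []
    for s in sessions:
        if s["profile_subsession_counter"] == first_counter[s["client_id"]]:
            continue  # drop the enrollment session in both branches
        if s["after_unenrollment"]:
            continue  # exclude activity after the user unenrolled
        if s["compositor"] != expected[s["branch"]]:
            continue  # compositor does not match the enrolled branch
        kept.append(s)
    return kept
```

Dropping the enrollment session in both branches, rather than only in the WebRender branch, keeps the two branches comparable.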
Metric | Median per-user mean, as WR % of Gecko | 95% CI (low) | 95% CI (high) |
---|---|---|---|
content_full_paint_time | 109.39 | 108.89 | 109.92 |
page_load_ms | 100.44 | 99.93 | 100.96 |
tab_switch_composite | 97.60 | 97.30 | 97.88 |
Median per-user mean values of content_full_paint_time were about 9% higher in the WebRender case. Page load times did not change. Tab switch time decreased by a little more than 2%.
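To make the "median per-user mean, as WR % of Gecko" statistic concrete, here is a minimal sketch of how such a figure and an interval around it could be computed. The report does not state how the intervals in the table were obtained; a percentile bootstrap is assumed here, and all function names are hypothetical. The sketch also assumes raw, positive per-event values rather than telemetry histograms.

```python
import random
import statistics


def per_user_means(events_by_user):
    """Mean metric value for each user that has at least one event."""
    return [statistics.fmean(v) for v in events_by_user.values() if v]


def median_ratio_pct(treatment, control):
    """Median of the treatment values as a percentage of the control median."""
    return 100.0 * statistics.median(treatment) / statistics.median(control)


def bootstrap_ci(treatment, control, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for median_ratio_pct (assumed method)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample users with replacement, independently in each branch.
        t = rng.choices(treatment, k=len(treatment))
        c = rng.choices(control, k=len(control))
        stats.append(median_ratio_pct(t, c))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return stats[lo_idx], stats[hi_idx]
```

A value of 100 means no difference between branches; values above 100 mean the metric was higher with WebRender.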
The following subsections discuss the distribution of each metric in turn; later sections of this report follow the same pattern.
WebRender users tended to have a somewhat higher average CONTENT_FULL_PAINT_TIME, though WebRender users were less likely to have slow (> 16 ms) events (below).
The per-user-mean page load time distributions were essentially identical between WebRender and Gecko users.
The median per-user average tab switch was slightly faster with WebRender. The fastest tab switches took longer with WebRender enabled, but the slowest tab switches took less time.
The criterion for COMPOSITE_TIME was that the median per-user slow fraction should be < 0.5%.
The median fraction of slow composites is much higher in the WebRender branch than in the Gecko branch, but still below the 0.5% threshold:
Branch | Median per-user slow composites (percent) | 95% CI (low) | 95% CI (high) |
---|---|---|---|
WebRender | 0.1610 | 0.159 | 0.1628 |
Gecko | 0.0152 | 0.015 | 0.0153 |
Metric | Median per-user fraction, as WR % of Gecko | 95% CI (low) | 95% CI (high) |
---|---|---|---|
content_frame_time_vsync (> 200) | 102.97 | 101.94 | 104.14 |
content_full_paint_time (> 16) | 54.37 | 53.59 | 55.32 |
The median per-user fraction of slow content_frame_time_vsync events was 2-4% higher in the WebRender branch.
The median per-user fraction of slow content_full_paint_time events was about halved in the WebRender branch.
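The per-user slow fraction behind these figures can be illustrated with a short sketch. It assumes raw event values per user, whereas the real probes are recorded as bucketed histograms, and the function names are hypothetical.

```python
import statistics


def slow_fraction(values, threshold):
    """Fraction of one user's events that exceed the threshold."""
    return sum(v > threshold for v in values) / len(values)


def median_slow_fraction(events_by_user, threshold):
    """Median across users of each user's slow-event fraction."""
    fractions = [
        slow_fraction(values, threshold)
        for values in events_by_user.values()
        if values  # users with no events contribute nothing
    ]
    return statistics.median(fractions)
```

For example, with a 16 ms threshold, a user whose paints took 10, 20, 30 and 12 ms has a slow fraction of 0.5. Because each user contributes one fraction, heavy users do not dominate the summary.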
The median WebRender and Gecko user experienced very similar fractions of slow CONTENT_FRAME_TIME_VSYNCs (> 200% vsync).
The fraction of slow frames was somewhat higher for the slowest half of WebRender users than it was for the slowest half of Gecko users.
The median WebRender user experienced considerably fewer slow paints (> 16 ms) than the median Gecko user.
The worst-performing 20% of users in the WebRender and Gecko branches had similar slow paint fractions.
Sessions were filtered for stability in the same manner as for performance.
Despite a clear increase in GPU process crashes, the overall crash rate was only slightly higher because the number of main and content process crashes fell.
OOM crashes are a subset of main process crashes. They were less common in the WebRender branch.
Failure to create a WebGL canvas was rare in either branch. This is reflected in the per-user average fraction of canvas creation successes:
branch | average_success_fraction |
---|---|
Gecko | 0.9996119 |
WebRender | 0.9996374 |
Retention and engagement metrics were observed for all enrolled users from the moment of enrollment; filtering was not performed to ensure that the compositor matched the enrolled branch, and enrollment sessions were not discarded.
Total URIs visited in each branch were indistinguishable.
Bootstrapped 95% confidence intervals for the difference between the distributions, computed along the curve, all contain zero.
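One possible reading of "along the curve" is a comparison at a grid of quantiles. The sketch below bootstraps the difference between the two branches at each quantile. The quantile grid, the simple index-based quantile, and the function names are all assumptions for illustration; the report does not specify the procedure it used.

```python
import random


def quantile(sorted_vals, q):
    """Simple index-based quantile of an already sorted list."""
    idx = min(len(sorted_vals) - 1, int(q * len(sorted_vals)))
    return sorted_vals[idx]


def quantile_diff_cis(a, b, qs=(0.1, 0.25, 0.5, 0.75, 0.9),
                      n_boot=500, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for quantile(a) - quantile(b) at each q."""
    rng = random.Random(seed)
    diffs = {q: [] for q in qs}
    for _ in range(n_boot):
        # Resample users with replacement within each branch.
        ra = sorted(rng.choices(a, k=len(a)))
        rb = sorted(rng.choices(b, k=len(b)))
        for q in qs:
            diffs[q].append(quantile(ra, q) - quantile(rb, q))
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    cis = {}
    for q in qs:
        d = sorted(diffs[q])
        cis[q] = (d[lo_idx], d[hi_idx])
    return cis
```

An interval that contains zero at a given quantile means the data do not distinguish the branches at that point of the distribution.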
The distribution of per-user active time was similar between branches:
Active time may have decreased slightly for WebRender branch users among less avid users.
Distribution of total browser-open time was similar between branches.
Similar to active time, less-avid users may have used the browser slightly less in the WebRender branch.
Retention was similar between the study branches.
Retention was slightly lower for the WebRender branch at 1 week but not at 2 weeks. The 95% confidence interval for the week 1 difference runs from 0.06% to 0.45% lower in the WebRender branch.
Daily enrollment and unenrollment were symmetric between branches.
Unenrollments were minimal and distributed equally between branches.
Although the per-user mean CONTENT_FULL_PAINT_TIME increased, the number of measurements greater than 16 ms (≈ 1/60 Hz) actually decreased. Because most users have a 60 Hz refresh rate, this may not be a generally user-visible regression.
The webrender-performance-66 experiment enrolled users in Firefox 66 who met the normandy.telemetry.main.environment.system.gfx.features.wrQualified.status == 'available' criterion. At the time of the study, this enrolled users running Windows 10 on systems without a battery and having one of a list of whitelisted graphics cards.
ETL was computed by two notebooks: