Chrome User Experience Report (CrUX)

The Chrome User Experience Report (also known as the Chrome UX Report, or CrUX for short) is a dataset that reflects how real-world Chrome users experience popular destinations on the web.

CrUX data is collected from real browsers around the world, based on certain browser options which determine user eligibility. A set of dimensions and metrics are collected which allow site owners to determine how users experience their sites.

The data collected by CrUX is available publicly through a number of Google tools and third-party tools and is used by Google Search to inform the page experience ranking factor.

Not all origins or pages are represented in the dataset. There are separate eligibility criteria for origins and pages, primarily that they must be publicly discoverable and there must be a large enough number of visitors in order to create a statistically significant dataset.

API

In the following text, we document access to CrUX via Google BigQuery.

  1. Register a Google BigQuery account. The first 1 TB of data processed by queries each month is free, and the quota resets monthly. You would have to download the monthly CrUX list more than a hundred times to exhaust it. Be careful with your SQL statements, though: if you query historic data, you can quickly use up the free quota, and BigQuery can get expensive.
    1. Follow the BigQuery documentation to create a project, set up billing if prompted, and enable the BigQuery API.
  2. Find more details in the CrUX API documentation, or skip to the example below.
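
To make the quota arithmetic from step 1 concrete, here is a minimal sketch that estimates how often a query fits into the free monthly allowance. The constant FREE_TIER_BYTES (taken here as 1 TiB) and the figure of 8 GiB per query are assumptions for illustration only, not measured values. The real number of bytes a query would process can be obtained from a BigQuery dry run: a job configured with dry_run=True reports total_bytes_processed without executing the query.

```python
# Assumption: the free tier covers 1 TiB of processed query data per month.
FREE_TIER_BYTES = 2**40


def runs_within_free_tier(bytes_per_query: int, free_bytes: int = FREE_TIER_BYTES) -> int:
    """Return how many times a query can run before the free quota is exhausted."""
    if bytes_per_query <= 0:
        raise ValueError("bytes_per_query must be positive")
    return free_bytes // bytes_per_query


def quota_share(bytes_per_query: int, free_bytes: int = FREE_TIER_BYTES) -> float:
    """Return the fraction of the monthly free quota that one query consumes."""
    if bytes_per_query < 0:
        raise ValueError("bytes_per_query must not be negative")
    return bytes_per_query / free_bytes


if __name__ == "__main__":
    hypothetical_query_bytes = 8 * 2**30  # 8 GiB, an illustrative figure only
    print(runs_within_free_tier(hypothetical_query_bytes))  # 128
    print(f"{quota_share(hypothetical_query_bytes):.2%}")   # 0.78%
```

Checking the dry-run estimate against this budget before running a query over several months of data is a cheap way to avoid an unexpected bill.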

Code Example

The following Python code queries the November 2024 CrUX data across all countries. The LIMIT constant restricts the result to the first five rows; raise or remove it to download the full list. The code expects an environment variable holding the path to the credentials JSON file: export GOOGLE_APPLICATION_CREDENTIALS="/your/path/to/creds.json". It also expects the google-cloud, google-api-python-client, google-cloud-bigquery[pandas], and pandas Python packages to be installed.

crux.py
import pandas as pd  # to load it into Pandas DataFrame
 
from google.cloud import bigquery
 
YYYYMM = 202411  # let's download only November 2024 data
LIMIT = 5  # limit the query to only first 5 results to reduce risk
 
client = bigquery.Client()  # this will fail if you have not set GOOGLE_APPLICATION_CREDENTIALS
 
query = f"SELECT * FROM `chrome-ux-report.experimental.country` WHERE yyyymm = {YYYYMM} LIMIT {LIMIT}"
 
df = client.query(query).to_dataframe()
print('Dataframe:')
print(df)
print('Dataframe columns:')
print(df.columns)
df.to_csv(f'crux_all_{YYYYMM}.csv')

Example output:

Dataframe:
  country_code  yyyymm  ...                          interaction_to_next_paint                                   navigation_types
0           es  202411  ...  {'histogram': {'bin': [{'start': 0, 'end': 25,...  {'navigate': {'fraction': 0.682}, 'navigate_ca...
1           es  202411  ...  {'histogram': {'bin': [{'start': 0, 'end': 25,...                                               None
2           es  202411  ...  {'histogram': {'bin': [{'start': 0, 'end': 25,...                                               None
3           es  202411  ...  {'histogram': {'bin': [{'start': 0, 'end': 25,...                                               None
4           es  202411  ...  {'histogram': {'bin': [{'start': 0, 'end': 25,...                                               None

[5 rows x 15 columns]
Dataframe columns:
Index(['country_code', 'yyyymm', 'origin', 'effective_connection_type',
       'form_factor', 'first_paint', 'first_contentful_paint',
       'dom_content_loaded', 'onload', 'first_input', 'layout_instability',
       'largest_contentful_paint', 'experimental', 'interaction_to_next_paint',
       'navigation_types'],
      dtype='object')
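
The metric columns in the output above hold nested records rather than plain numbers, so they usually need a small helper before they can be analysed. The following sketch sums the share of page loads whose histogram bin ends at or below a threshold. It assumes that each bin carries a density field next to start and end (the field is cut off in the truncated output above, so treat the name as an assumption), and the sample record is made up for illustration.

```python
from typing import Optional


def share_below(metric: Optional[dict], threshold: float) -> Optional[float]:
    """Sum the density of all histogram bins that end at or below threshold.

    Returns None when the metric is missing for the row (as with the
    navigation_types column in the output above).
    """
    if metric is None:
        return None
    total = 0.0
    for hist_bin in metric["histogram"]["bin"]:
        end = hist_bin.get("end")
        density = hist_bin.get("density") or 0.0  # "density" is an assumed field name
        if end is not None and end <= threshold:
            total += density
    return total


if __name__ == "__main__":
    # Hypothetical record shaped like one cell of the DataFrame.
    sample_metric = {
        "histogram": {
            "bin": [
                {"start": 0, "end": 25, "density": 0.25},
                {"start": 25, "end": 50, "density": 0.5},
                {"start": 50, "end": 75, "density": 0.125},
            ]
        }
    }
    print(share_below(sample_metric, 50))  # 0.75
    print(share_below(None, 50))           # None
```

Applied to a DataFrame column, this could be used as df['interaction_to_next_paint'].apply(lambda m: share_below(m, 200)), where 200 is an arbitrary example threshold.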

Columns

Dataset Size

Number of records in 202210 (collected by multiple queries like SELECT count(DISTINCT origin) as num_orig FROM `chrome-ux-report.experimental.country` WHERE yyyymm = 202210 AND country_code = 'de'):

Other Useful Code

To select top 10k US websites in November 2024:

SELECT DISTINCT country_code, origin, experimental.popularity.rank AS rank
FROM `chrome-ux-report.experimental.country`
WHERE yyyymm = 202411
    AND country_code = 'us'
    AND experimental.popularity.rank <= 10000
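
If the same selection is needed for other countries or months, the query text can be generated instead of edited by hand. The sketch below is one possible way to do that; the function name build_top_sites_query and its validation rules are hypothetical and not part of any CrUX or BigQuery tooling. The WHERE clause filters on the full field path experimental.popularity.rank, because BigQuery does not resolve SELECT-list aliases inside WHERE.

```python
import re

TABLE = "chrome-ux-report.experimental.country"  # table used throughout this page


def build_top_sites_query(country_code: str, yyyymm: int, max_rank: int) -> str:
    """Build the SQL text selecting origins up to max_rank for one country and month."""
    if not re.fullmatch(r"[a-z]{2}", country_code):
        raise ValueError("country_code must be two lowercase letters, e.g. 'us'")
    if not (1 <= yyyymm % 100 <= 12) or yyyymm < 100001:
        raise ValueError("yyyymm must look like 202411")
    if max_rank <= 0:
        raise ValueError("max_rank must be positive")
    return (
        "SELECT DISTINCT country_code, origin, experimental.popularity.rank AS rank\n"
        f"FROM `{TABLE}`\n"
        f"WHERE yyyymm = {int(yyyymm)}\n"
        f"    AND country_code = '{country_code}'\n"
        f"    AND experimental.popularity.rank <= {int(max_rank)}"
    )


if __name__ == "__main__":
    print(build_top_sites_query("us", 202411, 10000))
```

The inputs are validated before interpolation so that a malformed value cannot change the shape of the query. The google-cloud-bigquery client also supports query parameters (bigquery.ScalarQueryParameter passed through QueryJobConfig), which avoids string interpolation altogether.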

To sample websites from EU and EFTA countries (plus the United Kingdom) for privacy studies, use the following code. It samples up to 500 origins for each rank × country combination.

sample_crux.py
from random import sample  # draws the per-group samples below

import pandas as pd  # to load it into Pandas DataFrame
 
from google.cloud import bigquery
 
YYYYMM = 202411  # let's download only November 2024 data
RANKS = (1000, 5000, 10000, 50000, 100000, 500000, 1000000, 5000000)  # all ranks
COUNTRIES = (
'at','be','bg','hr','cy','cz','dk','ee','fi','fr','de','gr','hu','ie','it','lv','lt','lu','mt','nl','pl','pt','ro','sk','si','es','se',  #  EU states
'is','li','no','ch',  # EFTA states
'gb' # Other states
)
SITES_N = 500
 
client = bigquery.Client()  # this will fail if you have not set GOOGLE_APPLICATION_CREDENTIALS
query = f"SELECT DISTINCT country_code, origin, experimental.popularity.rank as rank FROM `chrome-ux-report.experimental.country` WHERE yyyymm = {YYYYMM}"
 
df = client.query(query).to_dataframe()
df.to_csv(f'crux_all_{YYYYMM}.csv')  # store crux original data
 
# this takes a while to process
websites = {
    country: {
        rank: set(df[(df['country_code'] == country) & (df['rank'] == rank)]['origin'].values)
        for rank in RANKS
    } 
    for country in COUNTRIES
}
 
sampled_websites = set()
for rank in RANKS:
    for country in COUNTRIES:
        source = websites[country][rank]
        sampled = set(sample(tuple(source), min(SITES_N, len(source))))
        sampled_websites = sampled_websites | sampled
 
with open(f'sampled_urls_{YYYYMM}.txt', 'w') as wf:
    for url in sampled_websites:
        wf.write(url + '\n')
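
The grouping step above filters the whole DataFrame once per rank × country combination, which is why it takes a while. As a sketch of an alternative, the same stratified sampling can be done in a single pass over the rows, with a seeded random generator so that the sample is reproducible. The function name stratified_sample, the seed parameter, and the example rows are hypothetical; unlike the script above, the sketch samples every group present in its input, so rows should be restricted to the wanted countries and ranks beforehand.

```python
import random
from collections import defaultdict


def stratified_sample(rows, per_group, seed=0):
    """Sample up to per_group origins from every (country, rank) group.

    rows is an iterable of (country_code, rank, origin) tuples.
    """
    groups = defaultdict(set)
    for country, rank, origin in rows:  # single pass over the data
        groups[(country, rank)].add(origin)

    rng = random.Random(seed)  # seeded for reproducible samples
    sampled = set()
    for key in sorted(groups):
        population = sorted(groups[key])  # fixed order, independent of set hashing
        sampled.update(rng.sample(population, min(per_group, len(population))))
    return sampled


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example_rows = [
        ("de", 1000, "https://a.example"),
        ("de", 1000, "https://b.example"),
        ("de", 1000, "https://c.example"),
        ("fr", 1000, "https://d.example"),
    ]
    print(len(stratified_sample(example_rows, per_group=2)))  # 3
```

With the DataFrame from the script above, the rows could be produced by df[['country_code', 'rank', 'origin']].itertuples(index=False, name=None).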

More Details