privacy:cookies

Differences

This shows you the differences between two versions of the page.

privacy:cookies [2025/01/09 11:20] – fixed headers karelkubicek
privacy:cookies [2025/03/04 13:53] (current) – [First- and Third-Party Cookies] highlighting common misconception karelkubicek
Line 7: Line 7:
 ===== First- and Third-Party Cookies =====
  
-First-party cookies are set by the domain the user is directly visiting, while all other cookies are from third parties. A common misconception is that first-party cookies are always benign and third-party cookies are always intrusive. However, first-party cookies can also track users or even be set by third parties using CNAME cloaking. First-party cookies are restricted to the website's context, while third-party cookies can track users across multiple websites.
+<WRAP important>A common misconception in research is that first-party cookies are always benign and third-party cookies are always intrusive!</WRAP>
+
+First-party cookies are set by the domain the user is directly visiting, while all other cookies are considered third-party cookies. Although third-party cookies are used for tracking significantly more often than first-party cookies, it is wrong to claim that first-party cookies are always benign and third-party cookies are always intrusive. First-party cookies can also track users, or even be set by third parties using [[https://arxiv.org/abs/2102.09301|CNAME cloaking]], and many third-party cookies serve necessary functionality such as single sign-on (SSO).
+
+The only difference is from the browser's perspective. First-party cookies are accessible only from the first-party website's context, while third-party cookies are accessible across all websites that embed the same third party. This behavior depends on the browser, however: [[https://webkit.org/blog/8943/privacy-preserving-ad-click-attribution-for-the-web/|Safari]] and [[https://blog.mozilla.org/en/products/firefox/firefox-rolls-out-total-cookie-protection-by-default-to-all-users-worldwide/|Firefox]] keep separate third-party storage for every website.
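
The browser-side distinction described above can be illustrated with a minimal sketch. It is not how any browser implements cookie storage: the helper names (''registrable_domain'', ''cookie_party'', ''jar_key'') are hypothetical, and the "last two labels" rule is a simplifying assumption that a real crawler would replace with a Public Suffix List lookup.

```python
def registrable_domain(host: str) -> str:
    """Naive registrable domain: keep the last two labels.

    Assumption for illustration only; this is wrong for suffixes such as
    co.uk, where a Public Suffix List lookup would be needed.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def cookie_party(page_host: str, cookie_domain: str) -> str:
    """Label a cookie relative to the site the user is directly visiting."""
    same_site = registrable_domain(page_host) == registrable_domain(cookie_domain)
    return "first-party" if same_site else "third-party"


def jar_key(page_host: str, cookie_domain: str, partitioned: bool) -> tuple:
    """Toy model of which storage bucket a cookie lands in.

    Without partitioning, a third party has one bucket shared by every
    embedding site; with partitioning, it gets one bucket per visited site.
    """
    owner = registrable_domain(cookie_domain)
    if partitioned:
        return (registrable_domain(page_host), owner)
    return (owner,)


if __name__ == "__main__":
    print(cookie_party("www.example.com", ".example.com"))
    print(cookie_party("www.example.com", "tracker.net"))
    print(jar_key("a.com", "tracker.net", partitioned=False))
    print(jar_key("a.com", "tracker.net", partitioned=True))
```

Note that a domain comparison like this labels a CNAME-cloaked tracker as first-party, which is one reason the label alone says little about whether a cookie tracks users.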
  
 Munir et al. {[shaoor2023cookiegraph]} observed that 89.86% of the top-million websites use first-party tracking cookies. Of these, 96.61% are ghostwritten by third-party scripts embedded in the first-party context, and some are set by fingerprinting scripts.
Line 30: Line 34:
 Using datasets of cookies or online classification services has significant disadvantages: they cannot classify unseen data or assign one cookie multiple classes based on dynamic content. However, they offer advantages over ML methods, such as post-crawl classification of detected cookies.
  
-We discuss issues with dynamic cookie names, publicly released datasets, and two main online classification services: Cookiepedia and Cookiedatabase.
-
-==== Dynamic Cookie Names ====
+<WRAP info>
 Some websites deviate from the typical key-value (cookie name and cookie value) scheme by storing data directly in the cookie name. There are several cases, explained by the following examples:
 
   * ''_gat_UA-<ID>'' and ''_ga_<ID>'' (Google Analytics cookies), where the ID is unique to the Google Analytics configuration but not dynamic per user.
-  * ''AMCV_<ID>@<host>'' (Adobe Experience Cloud Identity Service cookie), where the ID is unique per user. Such cookie names cannot be found in databases due to their dynamic nature.
+  * ''AMCV_<ID>@<host>'' (Adobe Experience Cloud Identity Service cookie), where the ID is unique per user. Such cookie names cannot be found in databases due to their dynamic nature, except when the database stores patterns.
+</WRAP>
+
+We discuss publicly released datasets and two main online classification services: Cookiepedia and Cookiedatabase.
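
As a rough illustration of how a database that stores patterns could still resolve dynamic cookie names, the sketch below matches names against regular expressions instead of exact strings. The pattern table and the ''lookup'' helper are hypothetical and are not taken from Cookiepedia, Cookiedatabase, or any other service; only the three name schemes come from the examples above.

```python
import re
from typing import Optional

# Hypothetical pattern table: each entry pairs a compiled regex with a label.
# The name schemes mirror the examples in the text; everything else is assumed.
PATTERNS = [
    (re.compile(r"_gat_UA-.+"), "Google Analytics"),
    (re.compile(r"_ga_.+"), "Google Analytics"),
    (re.compile(r"AMCV_.+@.+"), "Adobe Experience Cloud Identity Service"),
]


def lookup(cookie_name: str) -> Optional[str]:
    """Return the label of the first pattern matching the whole cookie name."""
    for pattern, label in PATTERNS:
        if pattern.fullmatch(cookie_name):
            return label
    return None  # an exact-name or pattern database has no entry for this name


if __name__ == "__main__":
    for name in ["_ga_ABC123", "_gat_UA-12345-1", "AMCV_1234@AdobeOrg", "sessionid"]:
        print(name, "->", lookup(name))
```

An exact-name lookup would miss all three dynamic names here, since the ID part differs between configurations or users.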
  
 ==== OneTrust and CookieBot Dataset ====
Line 81: Line 86:
   - Preferences Cookies (in some jurisdictions known as functionality)
  
 +==== Cookiesearch ====
 +
 +https://cookiesearch.org/ (no experience)
 ===== Machine-Learning Classification =====
  
privacy/cookies.1736421607.txt.gz · Last modified: 2025/01/09 11:20 by karelkubicek