privacy:cookies
Differences
This shows you the differences between two versions of the page.
privacy:cookies [2025/01/07 17:51] – formatting fix karelkubicek | privacy:cookies [2025/03/04 13:53] (current) – [First- and Third-Party Cookies] highlighting common misconception karelkubicek | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Classifying Cookies ====== | ====== Classifying Cookies ====== | ||
- | Browser cookies are still the most commonly used method for tracking the session state of websites and the identity of visitors. According to prior studies, between 80% in 2012 {[roesner2012_detecting]} and 90% in 2019 {[solomos2019_clash, | + | Browser cookies are still the most commonly used method for tracking the session state of websites and the identity of visitors. According to prior studies, between 80% in 2012 {[roesner2012_detecting]} and 90% in 2019 {[solomos2019_clash, |
Below, we discuss two main classification methods: looking cookies up in datasets of labeled cookies, and using machine learning (ML) to classify cookies based on their context and request URL. But first, some preliminaries. | Below, we discuss two main classification methods: looking cookies up in datasets of labeled cookies, and using machine learning (ML) to classify cookies based on their context and request URL. But first, some preliminaries. | ||
Line 7: | Line 7: | ||
===== First- and Third-Party Cookies ===== | ===== First- and Third-Party Cookies ===== | ||
- | First-party cookies are set by the domain the user is directly visiting, while all other cookies are from third parties. A common misconception | + | <WRAP important> |
+ | |||
+ | First-party cookies are set by the domain the user is directly visiting, while all other cookies are considered | ||
+ | |||
+ | The only difference is from the browser perspective. First-party cookies are accessible only from the first-party | ||
Munir et al. {[shaoor2023cookiegraph]} observed that 89.86% of the top-million websites use first-party tracking cookies. Of these, 96.61% are ghostwritten by third-party scripts embedded in the first-party context, and some are set by fingerprinting scripts. | Munir et al. {[shaoor2023cookiegraph]} observed that 89.86% of the top-million websites use first-party tracking cookies. Of these, 96.61% are ghostwritten by third-party scripts embedded in the first-party context, and some are set by fingerprinting scripts. | ||
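The browser-side distinction can be sketched as a domain comparison. The following is a minimal illustration, not a method from the cited studies: the function name `is_first_party` is hypothetical, and the plain suffix check follows the domain-matching idea of RFC 6265 instead of a full registrable-domain comparison based on the public suffix list, which a real crawler would likely need.

```python
def is_first_party(cookie_domain: str, page_host: str) -> bool:
    """Return True if the cookie's domain covers the host the user is visiting.

    Simplifying assumption: a plain suffix match stands in for a comparison
    of registrable domains based on the public suffix list.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    page_host = page_host.rstrip(".").lower()
    return page_host == cookie_domain or page_host.endswith("." + cookie_domain)


# Hypothetical cookies observed while visiting www.example.com
print(is_first_party(".example.com", "www.example.com"))         # True
print(is_first_party("tracker.example.net", "www.example.com"))  # False
```

A `True` result only reflects the cookie's domain. As the figures above show, such a cookie may still have been written by a third-party script running in the first-party context.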
Line 15: | Line 19: | ||
The law((ePrivacy Directive, Article 5(3))) recognizes only two categories of cookies: those strictly necessary for the service and all others. Industry, however, has developed various categorization schemes: | The law((ePrivacy Directive, Article 5(3))) recognizes only two categories of cookies: those strictly necessary for the service and all others. Industry, however, has developed various categorization schemes: | ||
- | * [[https:// | + | * [[https:// |
* Strictly-necessary cookies: | * Strictly-necessary cookies: | ||
* Required to enable essential functions of the website, such as registration or shopping carts. They are always enabled to allow for a smooth and problem-free browsing experience. | * Required to enable essential functions of the website, such as registration or shopping carts. They are always enabled to allow for a smooth and problem-free browsing experience. | ||
Line 30: | Line 34: | ||
Using datasets of cookies or online classification services has significant disadvantages: | Using datasets of cookies or online classification services has significant disadvantages: | ||
- | We discuss issues with dynamic cookie names, publicly released datasets, and two main online classification services: Cookiepedia and Cookiedatabase. | + | <WRAP info> |
- | + | ||
- | ==== Dynamic Cookie Names ==== | + | |
Some websites deviate from the typical key-value (cookie name and cookie value) scheme by storing data directly in the cookie name. There are several cases, illustrated by the following examples: | Some websites deviate from the typical key-value (cookie name and cookie value) scheme by storing data directly in the cookie name. There are several cases, illustrated by the following examples: | ||
* '' | * '' | ||
- | * '' | + | * '' |
+ | </ | ||
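One way to handle such names before a dataset lookup is to collapse the dynamic part into a placeholder. The sketch below is only an illustration: the patterns and the `normalize_cookie_name` helper are hypothetical and are not taken from the examples above or from any of the datasets discussed here.

```python
import re

# Hypothetical patterns: each maps a family of dynamic names to one stable key.
DYNAMIC_NAME_PATTERNS = [
    (re.compile(r"^(session_)[0-9a-f]{8,}$"), r"\1<hex>"),  # hex identifier after a fixed prefix
    (re.compile(r"^(visitor)-\d+$"), r"\1-<num>"),          # numeric suffix
]


def normalize_cookie_name(name: str) -> str:
    """Replace the dynamic part of a cookie name with a placeholder, if a pattern matches."""
    for pattern, replacement in DYNAMIC_NAME_PATTERNS:
        if pattern.match(name):
            return pattern.sub(replacement, name)
    return name  # static names pass through unchanged


print(normalize_cookie_name("session_0123abcd9f"))  # session_<hex>
print(normalize_cookie_name("PHPSESSID"))           # PHPSESSID
```

With names normalized this way, a lookup table keyed on the placeholder form can match cookies whose exact names were never seen before, at the cost of maintaining the pattern list by hand.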
+ | |||
+ | We discuss publicly released datasets and two main online classification services: Cookiepedia and Cookiedatabase. | ||
==== OneTrust and CookieBot Dataset ==== | ==== OneTrust and CookieBot Dataset ==== | ||
Line 81: | Line 86: | ||
- Preferences Cookies (in some jurisdictions known as functionality) | - Preferences Cookies (in some jurisdictions known as functionality) | ||
+ | ==== Cookiesearch ==== | ||
+ | |||
+ | https:// | ||
===== Machine-Learning Classification ===== | ===== Machine-Learning Classification ===== | ||
Using machine learning to classify cookies, rather than relying on static datasets, addresses the limitation of classifying unseen data. Research indicates that ML methods may even outperform human classification. However, practical deployment of ML-based approaches faces challenges similar to those in ML-based advertising blocking: they are prone to adversarial attacks, may disrupt website functionality, | Using machine learning to classify cookies, rather than relying on static datasets, addresses the limitation of classifying unseen data. Research indicates that ML methods may even outperform human classification. However, practical deployment of ML-based approaches faces challenges similar to those in ML-based advertising blocking: they are prone to adversarial attacks, may disrupt website functionality, | ||
- | ==== CookieBlock ==== | + | ==== CookieBlock |
In **Automating Cookie Consent and GDPR Violation Detection** {[bollinger2022automating]}, | In **Automating Cookie Consent and GDPR Violation Detection** {[bollinger2022automating]}, | ||
Line 218: | Line 226: | ||
* '' | * '' | ||
- | ==== CookieGraph ==== | + | ==== CookieGraph |
**CookieGraph: | **CookieGraph: |
privacy/cookies.1736272315.txt.gz · Last modified: 2025/01/07 17:51 by karelkubicek