Data is in high demand for research and innovation, and data about humans and their activities is particularly sought after. Such data undoubtedly has commercial value — but it also has a different kind of value for those to whom it pertains. That value has typically been articulated in non-monetary terms, focusing on the importance of personal data to an individual’s autonomy and dignity. However, in an increasingly data-driven society, how data about humans and their activities is categorized and valued is changing. Arguments about the importance of both group privacy and collective privacy broaden the rights-based focus from individuals to larger groups. In addition, the kinds of data in which individuals have an interest — and the nature of those interests — are also evolving. This essay is about those changing interests, as well as emerging data rights that give individuals — and, perhaps, communities — more control over both the personal and non-personal data that they generate.
Personal and Anonymized Data
The social and economic importance of personal data is well understood. Personal data — typically defined as “information about an identifiable individual” — is central to a person’s identity and can reveal intimate details about them. For businesses and governments that collect personal data, this information is often necessary to provide goods or services to specific individuals. However, personal data can also be used in the design and creation of goods or services. For example, such data can be used to create profiles or to drive targeted advertisements. Personal data is used in massive quantities to train artificial intelligence (AI) systems; it is also important to a variety of research activities.
Data protection laws place limits on the use of personal data, including on its sharing with others. Organizations seeking to use data in new ways, or to monetize their stores of personal data, often run up against these laws, making the use of personal data legally complex. As a result, organizations have turned to anonymization as a means of freeing up data for reuse. Anonymization, which can be defined as “irreversibly and permanently modify[ing] personal information, in accordance with generally accepted best practices, to ensure that no individual can be identified from the information, whether directly or indirectly, by any means,”1 is used where data about people is required, but where it is unnecessary for that data to be linked to identifiable individuals. By breaking the link to the individual, it can free the data from governance under data protection or privacy laws. The European Union’s General Data Protection Regulation (GDPR), for example, states that data protection law “should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.”2 Similarly, both section 6(5) of Canada’s proposed Consumer Privacy Protection Act (CPPA) and section 1798.140 (v)(3) of the California Consumer Privacy Act would place anonymized data out of scope of the legislation. This “out of scope” nature of anonymized data lasts only so long as the data remains anonymous. If re-identification takes place, the data falls once again within the definition of personal information. The growing risk of re-identification — due to vast quantities of data, growing compute power and sophisticated algorithms — means that many data protection laws now impose penalties for the deliberate re-identification of anonymized data.
From Individual to Group Privacy
Data protection laws privilege the protection of personal data because of its link to rights-bearing individuals. An individualist notion of privacy — tied to a person’s autonomy and dignity — is the foundation for its protection. Data protection laws set rules for the collection and processing of personal data, for exceptions to those rules in the public interest, and for compromises that attempt to align the expectations of individuals with certain data practices seen as necessary or socially beneficial. Anonymization fits within these frameworks as a means of freeing data for reuse.
However, the growing power of analytics techniques has raised concerns about the use of the data of many to interpret and shape the lives of both groups and individuals. Anonymized data permits inferences and generalizations that can have as powerful an effect on both individuals and groups as personal data — regardless of accuracy. Much of this activity is known as “profiling” (Hildebrandt 2008).
These deterministic uses of data have spurred calls for greater attention to the concept of “group privacy” — a theory that challenges data protection law’s individualistic approach to privacy rights. The concept of group privacy highlights how anonymized data can be used in the profiling of individuals and the creation of ad hoc groups (Floridi 2014; Mittelstadt 2017). An ad hoc group serves particular needs (such as targeting of advertising) and might be defined in terms of a cluster of presumed shared characteristics using profiles and inferences. Leveraging the group privacy concept, a broader human rights-based approach to data protection would be less exclusively concerned with how particular data points relate to a specific individual and what rights of control the individual has, and more concerned with how data — even if anonymized — impacts the lives and choices of people more generally. Group privacy is an uneasy fit with data protection law. The privacy dimensions of profiling (which can impact dignity and autonomy) are not properly addressed by individual consent and control. They require a greater focus on regulating activities or outcomes.
Collective Interests in Data
In addition to this “group privacy” approach to data, which shifts the narrative from the identifiable individual’s control over their personal information to group and individual harms, there have been growing claims in various contexts for collective rights to some categories of data that are about both individuals and the communities to which they belong. One example is the growing Indigenous data sovereignty movement, where Indigenous communities assert sovereign rights over data about the members of the community (Kukutai and Taylor 2016; Walter and Russo Carroll 2021). Such claims are founded on principles of self-determination. There are also other collective rights claims to data that are based on empowerment and enfranchisement concerns. For example, Ontario’s Black Health Equity Working Group (2021) makes a strong case for a form of collective governance of the health data of Black communities in Ontario. In the context of the failed Sidewalk Labs smart cities proposal for the Quayside land in Toronto, there were proposals for community governance of data that would have been collected in and about the development (Scassa 2020). Taking an even broader definition of community data rights, the Government of Ontario (2022, section 3.3.1) has explored the principles that might inform decisions to provide access for research and innovation to the province’s significant stores of health administrative data. These discussions have included considering how public benefit might be derived from the sharing (for example, in the form of intellectual property, royalties, access to medical treatments and so on), as well as social licence (Paprica, Nunes de Melo and Schull 2019).
Claims to collective rights in data emphasize collective interests in the data about a community and the right of the community to control and benefit from it. To the extent that such data is also personal data, it may be separately protected under data protection law; the collective rights claim can be layered on top of individual rights to personal information, and it can also attach to the anonymized data of the collective.
Human-Derived Data
Another category of data — human-derived data — is linked to humans because it is derived from them or their activities, but it may never have been personal data and, as a result, it is also not anonymized data — at least in the sense of having been processed to achieve anonymity (Scassa 2023). An example is data extracted from wastewater — a practice that became much more widespread during the COVID-19 pandemic. Such data reveals the kinds and volumes of excreted virus found in wastewater and has proven to be extremely useful in understanding the presence and trajectory of the virus. Wastewater testing is also used to detect other diseases or substances (Scassa, Robinson and Mosoff 2022). When such data is collected from public wastewater systems, it is generally not linked to individuals (although it is not impossible for the location and methods of collection and correlation to create a risk of identification). Human-derived data can also be used to make decisions that impact groups; for example, the presence of certain banned substances in wastewater from particular communities could lead public authorities to increase policing in those communities — or to increase public health interventions. Although on one level this may seem like basic data-driven decision making, it is arguable that the extraction of human-derived data from public infrastructure brings with it, at least, obligations of public notice and engagement (ibid.).
The Co-creation of Data
Increasingly, laws may recognize additional rights or interests of individuals in data generated by their activities. An early example is the introduction, in the European Union’s GDPR, of a new right of control over personal data — a data portability right. A similar right is included in section 1798.130 of the California Consumer Privacy Act. In Canada, Bill C-27 recognizes a new data mobility right (section 72), and section 27 of Quebec’s newly amended private sector data protection law provides for data portability. Data portability is only in its very early stages in Canada. If passed, the CPPA will enable the mobility of data on a carefully curated, sector-by-sector basis. The first experiment will be with open banking (or consumer-directed finance), which has been promised for early 2025 (FinTech Global 2024). In the case of both Canada’s CPPA and Quebec’s Loi 25, the right is limited to a subset of personal data. For example, under section 72 of the CPPA, it will apply to personal data that is collected from the individual. Under section 27 of Loi 25, it applies to “computerized personal information collected from the applicant, and not created or inferred using personal information concerning him.” Data portability rights thus make a subset of personal data portable in the hands of the data subject. This important new right is more closely linked to competition and consumer rights than it is to privacy per se.
The EU Data Act extends the right of control reflected in data portability rights beyond personal data to include non-personal data generated through human interaction with digital products and services. Thus, data about one’s connected car (for example, about its road or engine performance) is generated through the driver’s activities, but it is not personal data. Yet the Data Act gives the individual the right to obtain that data — or to port it to a service provider of their choice. Rights under the Data Act can also be exercised by organizations whose activities generate data. These rights — under both the GDPR and the Data Act — recognize the value of data and give greater control to those who are essentially its co-creators. According to recital 15 of the Data Act, “the data represent the digitisation of user actions and events and should accordingly be accessible to the user.”3
Conclusion
Although the link between personal data and the individual is grounded in clearly recognized privacy rights, there may be rights in data that go beyond those of the individuals who are the original source. Interests may also extend beyond privacy rights, with the ability to exercise control over data offering a growing range of benefits for individuals or groups. These may include better or more competitive services (as where porting one’s data permits access to different service providers) or downstream benefits (as where the exercise of collective interests in data gives a community the ability to insist on some form of give-back).
While data protection laws have typically also balanced privacy rights against competing interests in the use of personal data, new legislated approaches to data are beginning to recognize the interests of individuals not just as data subjects, but as co-creators of data in certain contexts. This adds a different type of weight in any balancing of interests — indeed, it alters the nature of the interests. None of this should come as a surprise. As the social and economic importance of data continues to grow, it is natural that how we understand and negotiate the different interests in that data will also change. The signs of this change are becoming evident in both emerging legal frameworks and in novel claims by individuals, groups and communities. As these changes begin to shape domestic approaches, they may also pose new challenges to international data governance frameworks.