Different Perspectives on Notions of Value

Data as Representation: Fiduciary Models as Relational Valuation Frameworks

November 12, 2024

This essay is part of The Role of Governance in Unleashing the Value of Data, an essay series that considers aspects of data governance and the value of data.

There are many ways to understand the value of data, economic and otherwise, and they vary substantially in both method and purpose (see, for example, Beauvisage and Mellet 2020; Birch, Cochrane and Ward 2021; Parsons and Viljoen 2024). This essay does not try to start from the process of valuation. Instead, it reverse engineers the way we assign value to data in context and makes two critical observations. First, data’s primary value comes from its use in a particular context, when it is used as a representation. Second, the value of data as a representation is conditioned, if not determined, by the supply chain of relationships that connect its creation to its use and, critically, by the appropriateness of the parties using data to act as representatives of its subjects in that context. In other words, the value of using data as a representation is significantly determined by whether the user is an appropriate representative.

This essay proposes an approach that focuses on: the articulation and limitation of animating purpose — essentially, the rubric for determining the legitimacy of use in a context; the relationships between the scope of representation, standards of expertise and care, and boundaries of available representative actions; and the responsibility to support, if not provide, means of independent oversight and accountability. These broad dynamics will not address a commodifying approach to data governance — rather, they will provide ways to contextually assess the value and risks of the different approaches to building data supply chains, based on the highest-integrity models implemented in relevant contexts.

Data as a Representation, by a Relationship, in a Context

This analysis starts from a few assumptions worth articulating upfront. The first and most important is that the primary purpose of data use is to inform a decision. Data is never the product of immaculate conception; it is created, transformed, mobilized and reconfigured at specific sites, each featuring particular actors, using particular instruments and for specific purposes (see, for example, Baker and Millerand 2007; D’Ignazio and Klein 2020; Leonelli and Tempini 2020).

All data is not equal; what makes some data more valuable than other data in a particular context is the perceived quality of the supply chain by which the data is produced and through which it travels. The more sensitive the context, the more important it is for decision makers to have confidence in the data they use as representations, which, critically, includes confidence in the legitimacy, technical capacity and accountability of the supply chain that produces it.

However, the entire premise of “big data” as a novel form of knowledge production is predicated on the mobility and reuse of data beyond its context of origin (Bates, Lin and Goodale 2016; Borgman, Scharnhorst and Golshan 2019; Thylstrup et al. 2022). When data is then concatenated, munged or otherwise combined, knowledge of its embedded limitations and subjectiveness is frequently irreversibly lost (Benjamin 2019; Bowker 2005; Chun and Barnett 2021). This digital political economy has previously been referred to as “the supply chain shredder” (Gansky and McDonald 2022). Industries for which representational integrity is paramount exhibit different structures and practices of accountability and data maintenance. Rooting the valuation of data in use has a number of important secondary effects. Most notably, it shifts the focus from a universal abstraction to the value of the underlying decision, the impact of the data representation on that decision and the contextualizing role of the interests of the representative.

This section takes fitness for high-impact and high-value use as a framing assumption for the analysis and uses existing models for high-integrity representation relationships to reverse engineer the core characteristics and operational requirements for valuing data supply chains.

Data Is Not Fungible

The fundamental difference between data and other units of exchange is that data is information that is, predominantly, used in a specific context to affect a specific decision. The value of data is non-fungible because the decisions that data influences are not fungible. The value of that information cannot be separated from its relationship to the context and role of its use and, as a result, cannot be meaningfully abstracted to a unit that standardizes that value acontextually.

This basic fact has in recent years been somewhat confused by the popularization of discursive framings of data (for example, as oil, sand or plutonium; cf. Doctorow 2008; O’Reilly 2021) that metaphorize data as a fungible market commodity. These framings are partly responsible for engendering a wave of failed consumer-to-business data projects (Beauvisage and Mellet 2020), to say nothing of the misapprehension of data as “personal” rather than social (for a historical account of this discourse, see Igo [2018]; for a rigorous theoretical account, see Viljoen [2021] and Parsons and Viljoen [2024]). Critiques of this kind of data-as-commodity thinking have more closely examined the actual practices of data-entangled corporations (Birch, Cochrane and Ward 2021) and non-commercial projects alike (Vertesi and Dourish 2011) to demonstrate that sets of data are, in fact, highly differentiated and their exchanges conditioned by cultural and infrastructural, as well as economic, factors. Said more plainly, data is not fungible, regardless of total volume, because the decisions data is meant to influence are not fungible, nor are the supply chains that meet the requirements of high-integrity, use-based markets.

Data as a Representation

This analysis starts from the “endpoint” of data use, rather than attempting to derive contextual significance from the characteristics of data. Doing so brings into focus the characteristics of the underlying relationships, the rights holders’ relationship to the context of use and the ways in which the digital transformation of acts of representation can and does impact their value. While data may be valuable as a representation because of its “truth-value,” the criteria for “truth” in one context are often different from those in another (Gressin 2023). Understanding the value of data use as a representation requires examining both the “technical” quality of data and its claims (toward usability and veracity) and the quality of the supply chain of relationships that produce it (Ferryman 2017; Viljoen 2021).

The integrity of a relationship is not based only on the two people involved; it is also based on the relationship of each party to the context in which it happens. You may have totally appropriate relationships with your doctor and your lawyer, for example, but that does not mean it is appropriate for your doctor to represent you in court or your lawyer to make decisions about your medical care. The appropriateness of any use of data, or exchange of data, is related to whether the people involved are fulfilling their expected and aligned role within the context of their relationship to the data subject. In most high-impact contexts, we limit who gets to act on behalf of another person: typically, we require representatives to be certified experts in the subject matter and to have direct, individual accountability to the person being impacted.

The specific qualities of the assertions made by a given data set or data stream are conditioned by these supply chains. Over the past two decades, scholars have developed a body of evidence and theory converging on a number of shared propositions regarding the politics and epistemology of data practices and infrastructures. This is often a necessary step in achieving the purposes for which the data is intended (Edwards 2010). In other words, when attempting to value data, the question of whether the person or organization using the data is an appropriate representative of the data subjects in that situation is as, if not more, important than its technical or substantive characteristics.

Data Use as an Act of Representation

One of the primary differences between the existence of data and its use as a representation, and especially a digital representation, is that the latter acknowledges the context, the intended impact and the associated liabilities for that representation. Data has created an explosion in the kinds of behaviours that can be observed, as well as increasing the number of actors and contexts implicated by the creation and use of data. The exchange of data raises questions not only about the validity of the facts asserted, but also about how and why the parties exchanging those facts are the appropriate actors to be doing so. The role and interests of the representative are fundamental, and categorically undervalued, where not outright ignored, in mapping data supply chains and economies. Perhaps the most clarifying advantage of framing a data valuation through the context of representation is that it starts from the recognition that representations, especially those made on the behalf of others, require a legitimate basis. People are not entitled to represent you — especially in high-impact, rights-affecting contexts — simply because they purport to hold information about you.

But we do not always apply the same limitations to the use and sharing of data, even when the data in question comes from regulated relationship models. In particular, duty-bearing professions — those that are regulated by public and private institutions — provide models for the way that we might govern digital representation relationships; they are realized by institutionally regulated, tangible, operational infrastructures designed to ensure the integrity, equity and symmetry of power in inherently asymmetrical representation relationships (Balkin 2020; Richards and Hartzog 2015, 2021). At a basic level, for example, for any data used by a fiduciary representative to meet the standards created by their duties, the production supply chain needs to be both explicit and accessible (Gansky and McDonald 2022).

While there is a significant range of practice, both within and between duty-bearing professions, there are common governance design patterns that offer valuable guidance for those attempting to design integrity measures for data and digitally intermediated relationships. The role of a fiduciary representative is to represent another person’s or group’s interests in a defined context. To be a fiduciary representative, a professional must be able to understand their client’s interests, triangulate the data and resources available to advance those interests, and describe their representation to both the client and the decision-making context. In other words, for data to be suitable as a representation, by a representative, in a high-value context, the data itself not only needs to be fit for purpose, but it also has to come from a representative source with a legitimate basis to make that assertion.

Courts do not allow, for example, anyone to wander in and, with no relationship to the parties, the court or the subject matter, make argumentation or submit evidence. That is not because we assume that people will not try; it is because the physical, procedural and practical design of legal systems makes it difficult to do so. The integrity of the representatives and representations made in rights-affecting contexts is protected by the context of use, not by the supply chain of production. The model of relationship designed for high-impact situations — especially across power asymmetry — is called a fiduciary relationship. The value of a fiduciary relationship is almost exclusively predicated on how well the representative understands and pursues the best interests of the person they are representing. The core characteristics of fiduciary relationships are a blueprint of the relational requirements for data supply chains that lead to data use as an act of representation in high-value contexts (and thus markets); they are also useful as a foundational framework for identifying the characteristics of data’s value as a representation.

Fiduciary Models as Valuation Frameworks

The term “fiduciary” can seem nebulous or abstract, but in very concrete terms, it is a legal term for relationships where one person represents another’s interests. While there is a lot of contextual variance in application, the design, oversight and enforcement of fiduciary relationships highlight a number of core elements of a high-integrity representation relationship — as well as the appropriate and legitimate basis for using representations to make high-impact decisions. The highest-impact decisions are often the most valuable — consider how every major tech company has tried, and mostly failed, to enter medicine and medical informatics (Foley 2019; Garcia 2019; Lomas 2022). Participation in high-value decisions, especially those influenced by data- and computation-intensive processes, is valuable but also difficult to evaluate independent of a wide range of subjective and contextual factors.

Fiduciary models rely on three things: duties of care, duties of loyalty and independent oversight. Duties are different from standards: they require active, case-by-case consideration in ways that do not appeal to universalizable rules that abstract to the technical level/layer. Being a fiduciary representative is more art than science, but the duties that fiduciaries fulfill offer a functional model and set of system requirements that can be used to evaluate the integrity, quality and, thus, perhaps, the value of a data supply chain as a representative relationship and data use as an act of representation.

Representative Purpose Limitation: Defining and Reverse Engineering Value from Context

Perhaps the single most important characteristic of fiduciary representation as a model for data is that it both recognizes the role of “interests” and compels those involved to clearly articulate, limit and be held accountable to achieving those interests on behalf of a specific person. That liability means that while fiduciaries must know the interests of the people they represent, they also need to have enough information about the context, and the tools they use in representing those interests, to be able to explain how and why they made the decisions they did. One measure of data’s value, especially relative to its fitness in high-impact contexts, is the degree to which its technical and legal format supports understanding its relationship to the interests of those involved in its production.

Relationship Definitions, Agency and Limitation

Another critical, if counterintuitive, difference between data and representations is that data is often produced and designed in order to maximize reuse, whereas representations are designed for a specific context and conducted inside the bounds of a defined, limited relationship. Fiduciary representatives have explicitly defined responsibilities that are limited in a range of common ways (for example, for a fixed period of time or in relation to a specific subject matter). As a vehicle for establishing the value of data, especially in the context of making a contextual representation, use-based limitations are often an indicator of specialization, fitness for purpose and, as a result, value.

Direct and Independent Governance and Oversight

One of the greatest indicators of integrity in any system is that the producers do not ask you to take their word for it — they make it easy for you to hold them to their word. Most data and digital systems are architected, whether by virtue of dependence or as a proactive means of arbitrage, in ways that explicitly avoid liability (for example, through disclaiming warranties, using open licensing and publishing to avoid transactional liabilities, and/or working in ambiguously defined jurisdictions). And yet, in order for a fiduciary to be able to explain and justify their use of data in a particular context, they need to be able to actively resolve disputes arising from its use — meaning they need to be able to identify the relevant parties and the relevant dispute resolution system and be able to compel the parties involved to accept the decision of that body. In other words, in order for data to be fit for purpose as a fiduciary representation, that data also needs to be transparently governed by an explicitly articulated, relevant authority.

Ultimately, the value of data is conditioned by the integrity, accessibility and ongoing oversight of the supply chains that produce and mobilize data from their point of origin to their use as representations to influence decision making. The particularities of what makes such supply chains fit for purpose depend on the context in which data as representations are articulated. Fiduciary models are by no means a universal or perfect solution (Khan and Pozen 2019). They do, however, provide an additional vehicle and mechanism for rights holders to participate in the governance and oversight of those that represent their interests. The characteristics of that governance and oversight vary, but they provide for explicit reporting structures, time- and context-bounded relationship definitions, authority definitions, continuous burdens of reporting and proof, awareness of conflicts of interest, and mechanisms for administering disputes and contests.

Conclusion

We cannot regulate or establish value for data as a fungible object because, as an assertion of fact informing a decision, data is differently significant and valuable depending on the context in which it is used. We can, however, use the functional requirements of fiduciary relations as both a framework to understand the value of data (as representations) in context and the characteristic requirements for data production systems to maximize the opportunity for contextual value. Understanding the value of data as assertions informing high-impact decisions requires aligning our data valuation frameworks toward the quality and integrity indicators that representatives (for example, lawyers, medical professionals and accountants) use to manage high-impact supply chains.

The mapping, definition and bounding of the constituent interests embedded in data supply chains — from the situated perspective of the representatives using data to inform decisions impacting individuals and populations — is a relatively novel economic, legal and operational project. For the fiduciary, the value of data is predicated on the extent to which it enables them to make representations that adhere to their duties of loyalty and care, under the oversight of independent forms of accountability. This valuation logic is instructive; it points away from standardizable, universalizable valuations, toward granular, situated and justifiable understandings of data’s value, directly linked to the supply chains that produce them.

Works Cited

Baker, Karen S. and Florence Millerand. 2007. “Articulation Work Supporting Information Infrastructure Design: Coordination, Categorization, and Assessment in Practice.” 2007 40th Annual Hawaii International Conference on System Sciences (HICSS’07), 242a–242a. https://doi.org/10.1109/HICSS.2007.88.

Balkin, Jack M. 2020. “The Fiduciary Model of Privacy.” SSRN, November 18. https://papers.ssrn.com/abstract=3700087.

Bates, Jo, Yu-Wei Lin and Paula Goodale. 2016. “Data journeys: Capturing the socio-material constitution of data objects and flows.” Big Data & Society 3 (2). https://doi.org/10.1177/2053951716654502.

Beauvisage, Thomas and Kevin Mellet. 2020. “Datassets: Assetizing and Marketizing Personal Data.” In Assetization: Turning Things into Assets in Technoscientific Capitalism, edited by Kean Birch and Fabian Muniesa. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/12075.001.0001.

Benjamin, Ruha. 2019. Race After Technology: Abolitionist Tools for the New Jim Code. Cambridge, UK: Polity Press.

Birch, Kean, D. T. Cochrane and Callum Ward. 2021. “Data as asset? The measurement, governance, and valuation of digital personal data by Big Tech.” Big Data & Society 8 (1). https://doi.org/10.1177/20539517211017308.

Borgman, Christine L., Andrea Scharnhorst and Milena Golshan. 2019. “Digital data archives as knowledge infrastructures: Mediating data sharing and reuse.” Journal of the Association for Information Science and Technology 70 (8): 888–904. https://doi.org/10.1002/asi.24172.

Bowker, Geoffrey C. 2005. Memory Practices in the Sciences. Cambridge, MA: MIT Press.

Chun, Wendy Hui Kyong and Alex Barnett. 2021. Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition. Cambridge, MA: MIT Press.

Doctorow, Cory. 2008. “Personal data is as hot as nuclear waste.” The Guardian, January 15. www.theguardian.com/technology/2008/jan/15/data.security.

D’Ignazio, Catherine and Laura F. Klein. 2020. Data Feminism. Cambridge, MA: MIT Press.

Edwards, Paul N. 2010. A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming. Cambridge, MA: MIT Press.

Ferryman, Kadija. 2017. “Reframing Data as a Gift.” SSRN, July 22. https://doi.org/10.2139/ssrn.3000631.

Foley, Mary Jo. 2019. “Microsoft is closing its HealthVault patient-records service on November 20.” ZDNET, April 5. www.zdnet.com/article/microsoft-is-closing-its-healthvault-patient-records-service-on-november-20/.

Gansky, Ben L. and Sean M. McDonald. 2022. “CounterFAccTual: How FAccT Undermines Its Organizing Principles.” In 2022 ACM Conference on Fairness, Accountability, and Transparency, 1982–92. https://doi.org/10.1145/3531146.3533241.

Garcia, Ahiza. 2019. “Google’s ‘Project Nightingale’ center of federal inquiry.” CNN Business, November 15. www.cnn.com/2019/11/12/tech/google-project-nightingale-federal-inquiry/index.html.

Gressin, Seena. 2023. “FTC lawsuit insists on FCRA compliance and transparency from background report providers.” Federal Trade Commission Business Blog, September 11. www.ftc.gov/business-guidance/blog/2023/09/ftc-lawsuit-insists-fcra-compliance-transparency-background-report-providers.

Igo, Sarah E. 2018. “Me and My Data.” Historical Studies in the Natural Sciences 48 (5): 616–26. https://doi.org/10.1525/hsns.2018.48.5.616.

Khan, Lina M. and David E. Pozen. 2019. “A Skeptical View of Information Fiduciaries.” Harvard Law Review 133 (2): 497–541. https://harvardlawreview.org/print/vol-133/a-skeptical-view-of-information-fiduciaries/.

Leonelli, Sabina and Niccolò Tempini, eds. 2020. Data Journeys in the Sciences. Cham, Switzerland: Springer International. https://doi.org/10.1007/978-3-030-37177-7.

Lomas, Natasha. 2022. “Google faces new suit over DeepMind NHS patient data scandal.” TechCrunch, May 16. https://techcrunch.com/2022/05/16/google-deepmind-nhs-misuse-of-private-data-lawsuit/.

O’Reilly, Tim. 2021. “Data Is the New Sand.” The Information, February 24. www.theinformation.com/articles/data-is-the-new-sand.

Parsons, Amanda and Salomé Viljoen. 2024. “Valuing Social Data.” Columbia Law Review 124: 993–1079. https://columbialawreview.org/content/valuing-social-data/.

Richards, Neil M. and Woodrow Hartzog. 2015. “Taking Trust Seriously in Privacy Law.” SSRN, September 5. https://doi.org/10.2139/ssrn.2655719.

———. 2021. “A Duty of Loyalty for Privacy Law.” Washington University Law Review 99: 961–1021. https://doi.org/10.2139/ssrn.3642217.

Thylstrup, Nanna Bonde, Kristian Bondo Hansen, Mikkel Flyverbom and Louise Amoore. 2022. “Politics of data reuse in machine learning systems: Theorizing reuse entanglements.” Big Data & Society 9 (2). https://doi.org/10.1177/20539517221139785.

Vertesi, Janet and Paul Dourish. 2011. “The value of data: Considering the context of production in data economies.” CSCW ’11: Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work, 533–42. https://doi.org/10.1145/1958824.1958906.

Viljoen, Salomé. 2021. “A Relational Theory of Data Governance.” Yale Law Journal 131 (2): 573–654. www.yalelawjournal.org/feature/a-relational-theory-of-data-governance.

The opinions expressed in this article/multimedia are those of the author(s) and do not necessarily reflect the views of CIGI or its Board of Directors.