Own Data? Ethical Reflections on Data Ownership

  • Research Article
  • Open access
  • Published: 15 June 2020
  • Volume 34, pages 545–572 (2021)


  • Patrik Hummel (ORCID: orcid.org/0000-0001-9668-0810),
  • Matthias Braun (ORCID: orcid.org/0000-0002-6687-6027) &
  • Peter Dabrock


In discourses on digitization and the data economy, it is often claimed that data subjects shall be owners of their data. In this paper, we provide a problem diagnosis for such calls for data ownership: a large variety of demands are discussed under this heading. It thus becomes challenging to specify what—if anything—unites them. We identify four conceptual dimensions of calls for data ownership and argue that these help to systematize and to compare different positions. In view of this pluralism of data ownership claims, we introduce, spell out and defend a constructive interpretative proposal: claims for data ownership are charitably understood as attempts to call for the redistribution of material resources and the socio-cultural recognition of data subjects. One consequence of this reading, we argue, is that it misses the point to reject claims for data ownership on the grounds that property in data does not exist. Instead, data ownership brings to attention a claim to renegotiate such aspects of the status quo.


1 Introduction

Data seem to be produced on unprecedented scales. Sensors, wearables, and devices continuously translate physical movements and states of affairs into data points. When browsing the internet or using social media, analytic tools process each and every click. Shopping interests and behaviours feed into tailor-made adverts and products. Networked cars and autonomous driving rest on large-scale gathering and processing of vehicle and traffic data. Precision medicine aims to search for patterns and correlations in huge sets of patient data, and promises to personalize prevention, diagnostics, and treatments to the specific characteristics and circumstances of individual patients. Industry 4.0 datafies and automates steps in manufacturing and production. The Internet of Things extends digitization, datafication, and networked objects even further. The recurring observation is that data processing will become increasingly pervasive and powerful. Already now, we witness transformations in how we perceive, frame, think, value, communicate, negotiate, work, coordinate, consume, keep information confidential, and make it transparent.

One—if not the—pressing question in the context of digitization is whether foundational rights of individual subjects are respected, and what it takes to safeguard them against interferences. One frequently discussed suggestion is that these questions arise against the backdrop of contested relations of ownership, i.e. the relation between an owner and her property. There is a set of expectations associated with data ownership. “It gives hope to those wishing to unlock the potential of the data economy and to those trying to re-empower individuals that have lost control over their data” (Thouvenin et al. 2017, 113). In this spirit, calls for data ownership demand that data can be property. Yet, commentators caution that “[s]implified versions of ownership […] may create compelling soundbites but provide little direction in practice” (The British Academy and The Royal Society 2017, 32). While data have tangible aspects, such as their relation to technical-material infrastructures, they also seem to differ from ordinary resources and tangible property (Prainsack 2019b, 5). In a digitized and datafied lifeworld, claims to data are indispensable for claiming fundamental rights and freedoms. These preliminary observations prompt us to clarify what exactly data ownership means, how it is justified, what it tries to achieve, and whether it succeeds in promoting its aims.

The present paper explores the content of claims for data ownership. It has two goals: first, it provides an in-depth analysis of different notions of data ownership and uncovers inherent conceptual tensions and puzzles. As we will argue, a variety of considerations are put forward under the heading of data ownership. The notion is ambiguous and even paradoxical: it is used to articulate, and taken to support, claims that stand in tension with one another or are mutually incompatible.

Second, we argue that all of these dimensions of data ownership matter for informational self-determination, understood as the ability of data subjects to shape how datafication and data-driven analytics affect their lives, to safeguard a personal sphere from others, and to weave informational ties with their environment. Specifically, and drawing on a debate between Axel Honneth and Nancy Fraser, we demonstrate how the meanings of data ownership raise both issues of material ownership (pertaining to the sphere of distribution) and issues of socio-cultural ownership (pertaining to the sphere of recognition). Our proposal is that important entanglements between both spheres get overlooked if we merely focus on one of them. For informational self-determination, both are relevant. Thus, we need to take seriously the full range of dimensions of data ownership in order to understand how data subjects exercise informational self-determination, and how such exercises can be facilitated and promoted. We discuss these challenges under the heading of data sovereignty (German Ethics Council 2017, 2018; Hummel et al. 2018; Hummel et al. 2019, 26–27).

In order to pursue these goals, we begin by briefly considering data ownership from a legal perspective (2.). As we will show, there is a debate about the compatibility between data ownership and current legal frameworks. Moreover, a number of rationales that typically discourage the institutionalization of data ownership are beginning to disintegrate in contemporary big data environments, which we take to suggest that the case for data ownership is worth debating. We then go on to provide interpretative contributions on what it would mean to establish or to maintain data ownership (3.). As it turns out, the substantive demands and goals vary significantly across discussants. We distinguish four dimensions of data ownership. Each of them is debated by reference to a pair of conceptual poles: the institutionalization of property versus cognate notions of quasi-property (3.1), the marketability versus the inalienability of data (3.2), the protection of data subjects versus their participation and inclusion into societal endeavours (3.3), and individual versus collective claims and interests in data and their processing (3.4). We propose that this characterization explains why different proposals on data ownership articulate diverging or even mutually incompatible demands, and helps us to get a grip on what is at stake when the notion is invoked. Drawing on Honneth and Fraser (4.), we go on to argue that all of these dimensions are vital for informational self-determination. Statements on data ownership touch upon and go back and forth between two different spheres: the redistribution of material resources and the socio-cultural recognition of data subjects. In view of our findings, the notion of data ownership can be understood as an expressive resource for articulating and negotiating claims concerning both spheres.

In the following, we use the terms ‘ownership’ and ‘property rights’ as follows: “Property rights […] are the rights of ownership. In every case, to have a property right in a thing is to have a bundle of rights that defines a form of ownership” (Becker 1980, 189–190). ‘Property’ refers to the thing(s) to which property rights apply.

2 Legal Frameworks

The dominant view in legal theory tends to be that data cannot be owned: “Individual ownership of data […] is contrary to well-established legal precedent in the United States, United Kingdom, and many other jurisdictions, which for good reasons do not recognize property interests in mere facts or information” (Contreras et al. 2018). European frameworks like the Convention for the Protection of Human Rights are typically understood as presenting data-related rights as an extension or subset of fundamental human rights, which suggests that they are inalienable and unsuitable for propertization, commodification, and commercialization (Purtova 2010, 202–204; Harbinja 2017, 2019, 103–104; Prainsack 2019a, 18). In the US discourse, some commentators have been amenable to data ownership (Purtova 2009). Overall, national legal frameworks differ considerably in the extent to which they leave room for some form of data ownership (Osborne Clarke 2016).

Still, even with regard to the European framework, the nature of rights on data is discussed controversially. Pearce (2018) argues that according to the GDPR, data protection rights could prima facie be regarded as either personal or proprietary rights. Both kinds of rights share their orientation towards economic efficiency, civil liberties and avoidance of unjust interferences. But while proprietary rights do so on the basis of assigning transferable rights in external things, personality rights are internal, not directed at external things, relate to an individual’s personhood, character, and identity, and are inalienable. Pearce points to Victor (2013), who argues that the GDPR treats data very similarly to property: data subjects enjoy default entitlements to personal data (against this, cf. Purtova 2015, 88–99) and thus an authority similar to the one that owners have over their property, given that the consent of the data subject is needed for the processing of personal data (Art. 6). Moreover, the individual has rights to access (Art. 15), erasure (Art. 17), and data portability (Art. 20). The latter can be seen as an analogue to owners’ ability to demand the return of their property if it is being used and accessed by others (cf. also Thouvenin 2017, 27). That these rights remain in place even once others gain access to personal data constitutes “one of the most property-like features of the new regime” (Victor 2013, 525). Finally, legal remedies for violations mirror those typically used for property protections. Still, Pearce (2018, 201–202) ends up discarding Victor’s suggestions. He denies that the GDPR assigns default entitlements to data subjects: the GDPR does allow the processing of personal data without consent, e.g. if it is carried out “by a natural person in the course of a purely personal or household activity” (Art. 2 (c)). Moreover, member states can restrict GDPR provisions for the sake of national security (Art. 23). Another serious problem with a proprietary reading is the scope of the term ‘personal’: appropriately for big data contexts in which data sets are de- and recontextualized frequently, the GDPR uses a wide notion on which data that initially appear to be non-personal become personal if they can in principle be related to an individual. But the more expansively ‘personal’ is understood, the less plausible it is that personal data can be owned, commodified, and controlled. Since Pearce also has arguments against a personal rights reading of the GDPR—e.g. personal rights typically cannot exist independently from their owner, whereas the dead do retain certain rights to data protection (Harbinja 2017, 2019)—neither the proprietary nor the personal rights paradigm sits well with the GDPR, and Pearce thus speaks of a “Conceptual Muddle of Data Protection Rights under EU Law” (ibid.).

These observations illustrate some variance in opinions on the question of what present legal frameworks actually make of data ownership. Most authors maintain that current legal frameworks and the idea of data ownership are incompatible. Others claim that these legislations are actually somewhat unclear on the status of the notion. And as we have just seen, some even think that certain frameworks already assign some kind of ownership in data. The latter position is defended less frequently, but it does have proponents.

Besides data protection law, there are further legal frameworks that govern data. For example, contract law defines the scope and limitations of contracts that determine the rights of data originators and processors who negotiate access and use of data, e.g. by exchanging services or levying charges. However, while contract law undoubtedly has an important functional role to play, it also leaves certain foundational issues unaddressed. For example, “[contract law] does not settle the question to whom data is being ascribed originally, i.e. who owns them. […] Only those who acquire something that they did not own at the outset can be expected to provide something in return” (Ensthaler and Haase 2016, 1460, our translation). In this picture, contract law presupposes pre-legal ownership relations rather than establishing them. Unlike property law governing tangible objects, intellectual property law does assign ownership rights to entities that—just like data—are non-rivalrous, non-excludable and non-depletable. However, these protections are tied to acts of creation. Data ownership claims are intended to apply even if no act of creation was involved in generating the data (Ensthaler and Haase 2016; Cohen 2018, 212–213).

One primary reason for legal scholars to be critical of data ownership is that there are important differences between data and paradigm cases of property (Zech 2012, 117–119). First, unlike tangible entities, possession of data does not imply that one is the sole possessor and exclusive user. Data can be duplicated, and several people can use them at once. Second, it is difficult if not impossible to exclude third parties. Unless one manages to keep data secret, they can be duplicated and used by others. Due to being non-rival and non-excludable, data are public goods (as the term is used in economic theory, where it contrasts with private goods, club goods, and common-pool resources). Moreover, data are non-depletable: they can be used more than once without losses in quality. Because of these disanalogies, Zech argues that possession or ownership of data are misnomers that presuppose tangibility. For information, access is the right category.

Intellectual property does pertain to a resource that is prima facie non-rival and non-excludable (prior to propertization). In principle, it could thus serve as a fruitful blueprint for data ownership. However, this would raise at least two challenges: first, data ownership concerns data that relate to individuals, but unlike intellectual property, such data are typically not created or invented by them. Second, there is a debate within legal theory whether intellectual property is or should be genuine property, or actually just confers monopoly rights (e.g. Lessig 2004; Hughes 2005; Lemley and Weiser 2007). If so, similar reservations would apply to data ownership proposals.

If one decides to protect information, this can happen at a variety of levels (Lessig 2002a, 23; Wiebe 2016, 67–68; Specht 2016, 290–291; Thouvenin et al. 2017, 120–121): first, at the syntactic level or code layer, which refers to the code that expresses the information; second, at the semantic level or content layer, which refers to the meaning of the data; third, at the structural level or physical layer, which refers to the physical embodiment of information; fourth, at the pragmatic level, which refers to the effects, uses, and purposes of information. These distinctions raise the need to specify which level a proposed ownership right in data would govern. Moreover, they bring out that de lege lata, certain aspects of these different layers are already protected. For example, across different legislations, data protection law contains provisions that pertain to personal data and hence relate to the semantic level.

Regarding the question whether, besides existing legal mechanisms, new provisions should be introduced, one point of interest for legal scholars is whether such an introduction would help to alleviate market failures and thus contribute towards more efficient allocations (Becker 1980, 193). For example, we might expect that people are better off when resources are governed by property regimes. Otherwise, exploitation of resources looms, whereas assigning property sets an incentive for owners to plan and to maintain them. This helps to avoid a tragedy of the commons as coined by Hardin (1968), although his influential discussion has also been criticized for conflating commons and open access regimes (e.g. Ostrom 1990; Rieser 1999). Either way, both legal scholars and economists maintain that the burdens of introducing and enforcing additional legal mechanisms to govern resources should be proportional to the societal benefits these mechanisms generate. This rationale is discussed for data ownership as well (Murphy 1996; Solove 2001, 1445–1455; Purtova 2009, sect. 3.1; Thouvenin 2017). As one example of a critical take on this issue, Thouvenin et al. (2017, 115–116) distinguish a market failure in a narrow sense from a market failure in a wider sense. In a narrow sense, a market failure arises if the good would not be produced or used unless there were property titles in it. In a wider sense, a market failure arises if the transaction costs are not as low and allocations not as efficient as they could be. Thouvenin et al. argue that data ownership is not necessary to address market failures in the narrow sense; even without data ownership, data would be produced, used, licenced, traded, etc. And with regard to market failure in the wider sense, they object that there is insufficient empirical and conceptual evidence for the superiority of the ownership paradigm.

Such criticisms of data ownership often involve the following two premises. First, data are non-rivalrous (for an overview, cf. Purtova 2015, 101). Second, this speaks against the idea that data can be the property of individuals.

Against the second claim, note with Moore and Himma (2017, sect. 4.2) that the inference of a requirement of maximal access from the supposition of non-rivalry is dubious. Sensitive personal information, but also content like snuff films, obscene pornography, or information related to national security, is non-rivalrous. Yet, moral claims to maximal access are unreasonable. This does not establish that individuals can or should have information as property, but it does challenge the suggestion that mere non-rivalry of information speaks against data ownership.

As for the first claim, data are typically seen as non-rivalrous in the sense that one agent’s access does not prevent others from accessing and using the very same data, too. Nevertheless, proponents of virtual property argue that there are examples of immaterial goods that are rivalrous and thus deserve distinctive legal mechanisms beyond intellectual property law (Berberich 2010). Fairfield discusses the examples of “domain names, URLs (uniform resource locators), websites, email accounts, and entire virtual worlds” that are rivalrous and persistent, i.e. “do not go away when you turn your computer off” (Fairfield 2005, 1049), and people can interact with them. These resources thus “mimic physical properties” (Fairfield 2005, 1053) of tangible goods (for criticism, cf. Glushko 2007; Nelson 2009).

Purtova (2015, 107–109) argues that rivalry is still a matter of concern for the simple reason that there is competition and rivalry amongst the platforms which extract data. Focusing on individual data points and finding no rivalry in their access and use is short-sighted. Google, Microsoft, Facebook and the like require large-scale ‘user livestock’ to extract meaningful quantities of information. There is rivalry in the sense that once such ‘user livestock’ is gathered by a ‘digital giant’, this information cannot be used by other organizations or individuals. That is, rivalry exists at the structural level of the ways in which data are harvested. According to Purtova, it is here that grave power imbalances will be generated and reinforced unless ownership regimes are implemented.

We need not fully subscribe to Fairfield’s and Purtova’s analyses to recognize that they highlight important respects in which purely legal perspectives on data ownership could reach their limits. Data might be rivalrous or not, and have or fail to have further features that matter with regard to whether law can currently encode property titles in data. What becomes apparent, however, is that these questions are not solely answerable from within law or legal theory. Whether data or their usage are persistent, scarce, or rivalrous, and which significance this should have for justified ownership claims, are questions that have implications for law and relate to legal frameworks, but that importantly go beyond legal inquiry and discourse.

Finally, their analyses demonstrate that a debate on data ownership concerns much more than just data. Both Fairfield and Purtova highlight that data have real-world impacts on the lives of data subjects. There might not be rivalry when accessing and using the good itself; I can use my data even if someone else does, too. But consequences and inferences from someone else’s usage can affect my individual freedom and the options I enjoy in social space. The flow of sensitive information can have implications for employment, insurance status and eligibility, or the prospects of receiving fair, unbiased treatment in courts and trials. What this suggests is that data ownership need not be motivated primarily by considerations about the effective management of the resource of data—a suggestion that would appear to be undercut if all data were non-rivalrous. If anything, the resource to be managed efficiently by institutions of data ownership is not only the resource of data itself, but societal resources of justice, privacy, self-determination, fairness, inclusion, and the like. On this view, it is the significance of data in our lifeworlds that motivates the institution of data ownership as a way to exercise control over fundamental categories of coordination in datafied societies. This, in turn, suggests that doubts about the existence of data ownership relative to existing legal frameworks should not prevent us from exploring the notion in further detail.

3 Aspects and Indeterminacies of Data Ownership

Becker (1980, 187) distinguishes three questions about the justification of property. The general question is: why should there be property? The specific question is: what kind(s) of property rights should there be? And the particular question is: who should have a title to a specific kind of property?

Discussions of the different questions about the justification of property often refer to Locke:

Though the Earth […] be common to all Men, yet every Man has a Property in his own Person. This no Body has any Right to but himself. The Labour of his Body, and the Work of his Hands, we may say, are properly his. Whatsoever then he removes out of the State that Nature hath provided, and left it in, he hath mixed his Labour with, and joined to it something that is his own, and thereby makes it his Property. (1689, ch. 5, sect. 27)

For Locke, the starting point is the self-ownership inherent to personhood. The result of the person mixing labour with natural resources is that the property in her own person is extended to the product.

There are alternatives to the Lockean starting point for justifying property. For example, Becker (1980, 193) characterizes utilitarian approaches according to which property is beneficial towards human happiness, for example by promoting stability and efficiency. What he calls “Personality Theory” (ibid. 209) regards property acquisition as necessary to maintain and promote personhood. Examples include Aristotle’s suggestion that some virtues presuppose property. And against the backdrop of quite different theoretical frameworks, Kant and Hegel agree that property manifests personhood, agency, and the legal implementation and recognition of freedom (cf. also Radin 1982).

With regard to Becker’s specific and particular questions, the Lockean picture is actually a double-edged sword for data ownership. On the one hand, it can be taken to motivate data ownership as an extension of personal self-ownership (Solove 2008, 26). On the other hand, it also discourages such a view.

Consider the criterion of mixing one’s labour with resources. On reflection, this suggestion undermines the idea that individuals own data about them (e.g. Thouvenin 2017, 25). While I might have “invest[ed] bodily samples” (Montgomery 2017, 83) but no labour (Cohen 2018, 212–213) in biomedical data about me, it is the medical service provider who analyses specimens and data, compiles them into evidence bases, and generates value based on the raw materials I am providing. If labour is any indication, then “[i]f anyone may claim proprietary rights over the information on the labour theory of property, it would seem to be the health professionals or service for which they work” (Montgomery 2017, 84). Similarly, Solove notes that “[p]ersonal information is often formed in relationship with others” (2008, 27), and illustrates that web-browsing data is a joint feat of the user and the service provider.

These observations illustrate some initial challenges with the concept of data ownership. Returning to Becker’s terminology: even with a Lockean answer to the general question about property, we could still answer the specific question of whether data should be owned in the negative. Even if we affirm it, we might have to answer the particular question in a way that renders data subjects mere co-owners at best. For the moment, we sidestep these issues in order to point out that even if they can be addressed, ambiguities and indeterminacies in the notion of data ownership remain.

3.1 Property versus Quasi-Property Rights

Although there are legal provisions that provide data protection and control rights, data do not straightforwardly and indisputably fall under the categories of property and ownership (2.).

One way to understand calls for data ownership is that they express dissatisfaction with provisions de lege lata. As things stand, legal frameworks fall short of assigning individuals appropriate authority over their data, and perceived shortcomings of the status quo have implications de lege ferenda. For example, in his recent, innovative proposal, Fezer (2017, 2018) argues for the introduction of genuine, sui generis property rights in data. He accepts that individuals do not invest labour in the generation of data, but denies that this affects their entitlement to data ownership. He highlights that what he calls citizen data are behaviour-generated: they are the result of citizen interactions and communication. This applies independently of whether these data are anonymized or personal. Behaviour-generated data do not reduce to mere code but reflect a form of legally significant cultural agency (rechtserhebliches Kulturhandeln), which makes them the appropriate object of dedicated legal governance. Human dignity, informational self-determination, cultural significance and expressiveness of data must be the departure points for global discourses on human and foundational rights in digitization. This concerns “nothing less than the civil-societal status of the citizen as a sovereign in constitutional democracy” (Fezer 2018, 27, our translation).

Fezer emphasizes that property is the legal institutionalization of personal rights to freedom: it provides a legally mediated space to enact freedom (“freiheitlicher Gestaltungsraum durch Recht” (Fezer 2018, 48)). This functional role of property and its inherent connection to the freedom of its owner should convince us that the notion of property is open and should not be a priori limited to material goods and intellectual property. If circumstances call for it, we need to reflect on new forms of property to secure the sovereignty of citizens. In particular, he proposes that future data ownership law should take behaviour-generatedness as the counterpart of acts of creation in intellectual property law. Finally, at the heart of Fezer’s proposal is the suggestion that for pragmatic reasons, citizens’ impact on data governance via data ownership must be representative, i.e. enacted and articulated through macro-level governance bodies. In this way, he arrives at an interesting unification of individual data ownership and the collective representation of data subjects.

Unlike Fezer, other calls for data ownership do not necessarily demand the introduction of new forms of ownership, but something rather different. To see why, note that ownership can be taken as a proxy for certain access, usage and control rights. For example, in his seminal discussion, Honoré describes 11 rights and duties which he takes ownership to comprise: “the right to possess, the right to use, the right to manage, the right to the income of the thing, the right to the capital, the right to security, the rights or incidents of transmissibility and absence of term, the prohibition of harmful use, liability to execution, and the incident of residuarity” (Honoré 1961, 370). Honoré himself regards these rights and duties to be jointly sufficient for ownership, but maintains that they need not be individually necessary. Satisfying all of them would give rise to full ownership. But perfect satisfaction of all these conditions is not necessary to speak of (less-than-full) ownership, which can already result if only some of these conditions are satisfied to some degree. Consequently, less-than-full ownership can come in different forms as it can comprise a range of different rights and combinations thereof.

This leads to the following three observations. First, ownership is compatible with varying degrees of satisfying Honoré-style conditions. Instead of necessitating unconditional and unrestricted control, control and usage rights can be constrained, e.g. by collective rules, public institutions, and state agencies. Ownership thus conveys only inexact information on how much control the owner has over her resource. Waldron (2017, ch. 1) illustrates that one might own a building in a historic district. While this confers entitlements to control and to use the resource, restrictions apply. The owner could not, for example, tear it down and replace it with a skyscraper. In a similar vein, Evans (2011, 79–80) points out that even if health data were patient-owned, the state would maintain certain rights to access it non-consensually, for example if such access is in the legitimate interest of the public. Ownership is not a single right but comprises a bundle of rights, and it is helpful to be clear about which right is being ascribed and asserted in a given case.

Second, the question arises what exactly is put forward in calls for data ownership. It is possible that the point of these calls is not so much the demand that Honoré-style conditions are satisfied to a degree that entitles us to speak of ownership, let alone full ownership. The point of these calls could be these particular conditions themselves. For some, it might be a vital question whether in a given case, we have enough of these conditions preserved to speak of ownership. But for others, the question of ownership might be moot. What matters is whether enough of these conditions is preserved to promote certain ends, including but not limited to the integrity of persons, self-determination, and participation in social endeavours. For example, Thouvenin highlights the pragmatic question whether data ownership should be introduced alongside existing regulation or replace it. The former option could yield inconsistencies between old and new regimes, the latter far-reaching transformations, e.g. of data protection principles. Thouvenin thus prefers a more cautious governance of small steps: not a full-blown introduction of data ownership, but more reflection on transferability and admissible contractual agreements between individuals and private industry on data access and use—while maintaining a systematic place for a personal sphere that cannot be affected by contractual agreements. He thus cares about the particular conditions typically implicated by ownership, not about ownership itself, which he rejects for largely pragmatic reasons.

This directly leads to a third suggestion. Calls for data ownership are not only focused on Honoré-style rights. They are also put forward with an eye on the outcomes that assigning such rights to data owners will promote. As a point in favour of this suggested reading, note that criticisms of data ownership deny that it succeeds in bringing about certain ends better than alternative governance designs. For example, Evans argues that status quo regulations (in her case: US federal regulatory protections) and proposed data ownership models do not differ much with regard to balancing privacy with public access and non-consensual data use: “Creating property rights in data would produce a new scheme of entitlements that is substantively similar to what already exists, thus perpetuating the same frustrations all sides have felt with the existing federal regulations” (Evans 2011, 75). She concludes: “The right question is not who owns health data […]. Instead, the debate should be about appropriate public uses of private data and how best to facilitate these uses while adequately protecting individuals’ interests” (Evans 2011, 77).

As we have just seen, ownership does not necessarily confer unconditional control. Sometimes, what matters are the particular rights typically implicated by ownership, not ownership per se. And sometimes, it is primarily outcomes that are negotiated and criticized by means of ownership terminology. We thus put forward the following conciliatory suggestion in view of the reservations and scepticism against data ownership. One possibility is to shift the focus away from legally codified ownership and towards the desideratum that discussants pursue: ownership in the sense of gaining and maintaining control over one’s data. To this end, we understand quasi-ownership as the relation that supervenes on components of a Honoré-style bundle while leaving it conceptually open whether enough of them is instantiated for ownership, let alone full ownership. Quasi-ownership would primarily seek to put individuals in a position to distribute, retract, shield, but also share their data for a variety of purposes, including but not limited to personalized medicine, algorithmic applications, biomedical research, and their own clinical care within a patient-centred health system. In other words: what is initially framed as data ownership concerns primarily controllability, i.e. the availability of effective means for data subjects to exercise control over their data.

This marks an important dialectical point: the terminology of ownership is dispensable; what matters are the particular conditions that ownership typically advances. If so, maybe calls for data ownership should actually eschew the language of ownership and property for the sake of clarity. Instead, proponents could make explicit that they are concerned with certain (bundles of) rights, regardless of whether or not these qualify as property rights, let alone give rise to full ownership. This would even build on some common ground in the debate. For example, opponents of data ownership argue that all that this view can charitably be taken to mean is “property-like rights through contract” (Contreras et al. 2018). Proponents could take this as a cue to highlight that whether contractual or not, it is precisely such rights pertaining to access, usage, and exclusion that are important. Moreover, Pearce (2018, 205–208), after deeming EU data protection law conceptually muddled (2.), proposes that it fits best with quasi-property rather than a personal rights or full-blown property paradigm, for example because the exclusionary rights specified by the GDPR are not unconditional but contingent on interactions that establish “a data subject/data controller relationship” (Pearce 2018, 207).

Sceptics could still object that even quasi-property will constrain data access because individuals can restrain others from accessing their data, which might “result in less-effective research and flawed health policy” (Contreras et al. 2018). For example, Rodwin cautions that “patient data appear to be an example of private ownership that preclude downstream inventions and benefits for individual owners and society” (Rodwin 2009, 87). Patient data ownership is akin to private industry organizations owning patents in gene sequences: such ownership constrains access to building blocks for innovation and “monopolizes raw material needed for research” (ibid.).

On the one hand, regardless of whether property or quasi-property in data is in question, such effects on research, development, and policy form important considerations. However, as a proposed reason to reject data ownership proposals, we can confront these lines of thinking with at least two challenges. First, framing the issue as a choice between data ownership and efficient research presents a false dilemma. It remains to be demonstrated whether considerations about the availability of data for research speak against data (quasi-)ownership tout court, or whether they primarily urge us to design research processes in the right ways. Whether research will be hampered is not only a function of whether data ownership is in place, but also of how easily and effectively data owners can share their data, whether data interoperability is achieved, and whether individuals trust research processes. If data owners can control data flows, contribute data to research, but also retract it if necessary or desired, then the hypothesis that individuals would consistently withhold their data is hardly plausible, both empirically (Mikk et al. 2017) and conceptually (Hummel et al. 2019).

Second, suppose we were convinced that (quasi-)ownership comes at a cost for research. Even then, serious questions remain about whether an ethical rationale for systematically bypassing the informed will of the individuals whose data is channelled into research and development could withstand scrutiny. Rodwin- and Contreras-style considerations might well be the outcome of a normative debate in which they are evaluated in the light of public benefits that are allegedly undercut. But they cannot preempt such a debate. Individuals are not isolated, independent and unaffected subjects, but always find themselves embedded in social contexts in which datafication and analytics, while being able to generate benefits and efficiencies, can drive inferences that constrain self-determination. Negotiating claims to data (quasi-)ownership could be one step towards safeguarding these spaces.

To recap the foregoing, we have suggested that calls for data ownership might have less to do with property rights or full ownership than it appears. None of the components of Honoré’s bundle are individually necessary for ownership, and there can be ownership without instantiating each and every component of the bundle. Are these calls aiming at ownership or merely quasi-ownership? Our constructive proposal is: it might be merely the latter. If these are de lege ferenda claims at all, they are much less drastic than Fezer’s proposal. And we have argued that debating them is timely and essential for ethically sound data governance. Data (quasi-)ownership is a live option and deserves to be taken seriously.

3.2 Marketability versus Inalienability

Amongst the components in a Honoré-style bundle is the right to transfer one’s property to somebody else, usually in return for another good, such as money or services. The conviction that such transfers involving data should be possible is one important motivation of data ownership regimes. “The raison d’etre of property is alienability” (Litman 2000, 1295). Indeed, while data are hailed as the ‘new oil’ and the most important resource of the 21st century, no straightforward mechanisms for attributing and transferring data exist (Thouvenin 2017, 26). Data ownership could address this gap by enabling data subjects to introduce usage and access rights to their data into the market. Ideally, this facilitates transactions and results in more efficient allocations. For example, with regard to ownership of health data, Kish and Topol believe that “[w]ithout ownership, there can be no trusted exchange. […] To build a truly thriving health data economy, we need to harness the power of data ownership” (2015, 923).

For a moment, let us sidestep our result above that quasi-ownership might be enough to achieve such ends, and that genuine ownership might not be necessary. The present idea is that propertization and the option to market data enhance the control and power of data subjects: “if the essence of a ‘property right’ is that the person who wants it must negotiate with its holder before he can take it, propertizing privacy would also reinforce the power of an individual to refuse to trade or alienate his privacy” (Lessig 2002b, 261). As Purtova (2015) highlights, losses of control loom in the absence of propertization: access and usage of data will become a matter of mere de facto power of data market participants. Thus, “the right question to ask is not whether there should be property in personal data, but whose property it should be” (Purtova 2015, 84).

Besides control, marketability could also put individuals in a position to gain a share in the economic value that is generated by processing their data. Lanier highlights that private firms do not compensate the individuals whose data drives their business. His proposed solution is to

[p]ay people for information gleaned from them if that information turns out to be valuable. If observation of you yields data that makes it easier for a robot to seem like a natural conversationalist, or for a political campaign to target voters with its message, then you ought to be owed money for the use of that valuable data. It wouldn’t exist without you, after all. […] An amazing number of people offer an amazing amount of value over networks. But the lion’s share of wealth now flows to those who aggregate and route those offerings, rather than those who provide the ‘raw materials.’ A new kind of middle class, and a more genuine, growing information economy, could come about if we could break out of the ‘free information’ idea and into a universal micropayment system. We might even be able to strengthen individual liberty and self-determination even when the machines get very good. (Lanier 2014, 9)

Lanier does not merely demand that data ownership be recognized. He is also presupposing certain (non-legal) ownership relations and entitlements when he argues that as of now, people do not receive fair shares of the value their data helps to generate, and that this shall be addressed by means of his proposed micropayment system that compensates individuals appropriately.

The idea that individuals should be able to market their data receives a variety of criticisms. First, it could have undesirable consequences if data subjects were in a position to introduce their data into market transactions. Commercialization of data appears to empower the initial data owner, but could also result in losses of control and participation. Once access and usage rights have been sold or traded against services, individuals cease to have control and authority over their data. Intuitions about entitlements one retains after one’s data has been marketed thus speak against ownership as marketability. As Montgomery points out with regard to personal health data, “information ‘about me’ does not cease to be connected to my privacy when I give (or sell) it to others” (Montgomery 2017, 82). Since even after having sold their data, individuals retain justified claims to privacy, Montgomery takes it to be implausible that data fall under the category of private property. Alienability could further encourage marketization that ends up undermining rather than strengthening privacy. “The concept of alienable ownership rights in personal data is disturbing, because the opportunities to alienate are ubiquitous. […] The market in personal data is the problem. Market solutions based on a property rights model will not cure it; they’ll only legitimize it” (Litman 2000, 1299–1301).

Second, a further worry relates to the attitudes and expectations that are raised by data ownership. Consider what exactly individuals would own if they had property rights in their data. If we abstract from the labour others contribute towards the generation and processing of data, then not much of value is left. “Data propertization proposals fail because patients’ raw health information is not in itself a valuable data resource, in the sense of being able to support useful, new applications. Creating useful data resources requires significant inputs of human and infrastructure services, and owning data is fruitless unless there is a way to acquire the necessary services” (Evans 2011, 75). According to a joint report by the British Academy and the Royal Society, data ownership might “create an expectation of compensation for use of data” even though “value is typically derived from the combination and use of data rather than from individual data points” (2017, 32). The report does acknowledge the economic importance of ownership notions towards “extracting the commercial value of data” and “protecting [data] as an asset and realising its value” (ibid.). But it cautions against misunderstandings that could flow from overly simplistic models of data ownership, transferability, and the value of data.

Third, a further consideration against data ownership as marketability is that data differ from other goods that can be marketed. Unlike paradigm cases of property, transfers of data do not imply that the vendor or donor loses anything. Data can be transmitted but not withdrawn, be possessed simultaneously by many (Solove 2008, 27), and be in several places at the same time (Prainsack 2019b). Floridi notes that acquisition and usage of information are lossless; “contrary to other things that one owns, one’s personal information is not lost when acquired by someone else” (Floridi 2014a, 118). These challenges to marketization might not be insurmountable. For example, one way to accommodate them is to say that rather than transfers of data, what happens upon marketization is simply the suspension of certain privacy claims against a return. Still, data remain an unusual commodity with distinctive characteristics.

These criticisms target data ownership as marketability as undesirable, at odds with privacy intuitions, and unsuitable in view of the features of data. But another strand in the debate takes a different route. It agrees that data ownership—if understood properly—should be recognized, but on these grounds strongly opposes the idea that data should be marketable. Its starting point is the importance of data for the constitution and integrity of persons.

Floridi’s account of personhood rests on an “informational interpretation of the self. The self is a complex informational system, made of consciousness activities, memories, and narratives. From such a perspective, you are your own information” (Floridi 2014a, 69). He goes on to defend the view that the primary importance of privacy flows from our status as “informational organisms (inforgs), mutually connected and embedded in an informational environment (the infosphere)” (Floridi 2014a, 94). Because of the significance of information for the self-constitution of inforgs, privacy breaches infringe upon their identity. This picture, however, leads Floridi to reject an ownership-based interpretation of privacy according to which “[a] person is said to own his or her information […] and therefore to be entitled to control its whole life cycle, from generation to erasure through usage” (Floridi 2014a, 116). Agents do not just own information; they are constituted by it. Floridi thus calls for “understanding a breach of one’s informational privacy as a form of aggression towards one’s personal identity” (2014a, 119). As a result,

one may still argue that an agent ‘owns’ his or her information, yet no longer in the metaphorical sense just seen, but in the precise sense in which an agent is her or his information. ‘Your’ in ‘your information’ is not the same ‘your’ as in ‘your car’ but rather the same ‘your’ as in ‘your body’, ‘your feelings’, ‘your memories’, ‘your ideas’, ‘your choices’, and so forth. It expresses a sense of constitutive belonging, not of external ownership, a sense in which your body, your feelings, and your information are part of you but are not your (legal) possessions. (Floridi 2014a, 121)

In other words, Floridi would criticize the language of data ownership only to emphasize that it actually concerns instances of self-ownership in the most literal sense. Because of this intimate entanglement between information and the informational organisms it constitutes, Floridi demands that the protection of information shall be grounded directly in the normative status of the latter. For us, this means that

[t]he protection of privacy should be based directly on the protection of human dignity, not indirectly, through other rights such as that to property or to freedom of expression. In other words, privacy should be grafted as a first-order branch to the trunk of human dignity, not to some of its branches, as if it were a second-order right. (Floridi 2016, 308)

One upshot of this is that data become unsuitable for market transactions. Indeed, Floridi suspects that if he is right that “personal information is […] a constitutive part of someone’s personal identity and individuality, then one day it may become strictly illegal to trade in some kinds of personal information” (Floridi 2014a, 122).

Overall, the relation between data ownership and marketability is complicated. On the one hand, some call for data ownership in order to allow individuals to market their data. On the other hand, others worry that data ownership would open the door to stripping individuals of something inalienable, and that data are so ingrained with individual personhood that ownership is too external to capture this relation. From the perspective of the individual, the resource is indispensable. Data ownership properly understood thus precludes rather than motivates alienability and marketization.

3.3 Protection versus Participation

Other calls for data ownership share the conviction that certain fundamental resources need to be available to individuals for self-constitution, forming and implementing life plans, and participating in communal lifeforms. They also agree that data ownership is vital towards articulating claims to access these basic resources. But they differ in their understanding of which kinds of resources are needed.

On one end of the spectrum, data ownership can be a merely defensive, protective concept. Individuals are in need of a sphere of secrecy, and authority over access and use of their data is what allows them to protect this sphere from the state, corporations, and others. For example, consider Lessig’s position that property rights and rhetoric are instrumentally valuable since they promote and reinforce privacy: if my data was my property, it would become intuitively clear how taking, using, or selling it without my consent is wrong. “If people see a resource as property, it will take a great deal of converting to convince them that companies like Amazon should be free to take it. Likewise, it will be hard for companies like Amazon to escape the label of thief” (Lessig 2002b, 255). Property rights can be used to delineate a personal sphere that others may not interfere with. “Property talk is often resisted because it is thought to isolate individuals. It may well. But in the context of privacy, isolation is the aim. Privacy is all about empowering individuals to choose to be isolated” (Lessig 2002b, 257). Similarly, Westin’s claim that “personal information, thought of as the right of decision over one’s private personality, should be defined as a property right” (Westin 1967, 324) seems to rest on an instrumental consideration: rather than an end in itself, propertization is an effective means. Its value derives from its function of facilitating and enabling individual control and the ability to safeguard privacy.

Besides such instrumental claims, we could further consider an explanatory proposal: breaching privacy is wrong because of ownership. For example, Thomson (1975) argues that the right to privacy consists in a cluster of more specific rights, such that:

the wrongness of every violation of the right to privacy can be explained without ever once mentioning it. […] Someone looks at your pornographic picture in your wall-safe? He violates your right that your belongings not be looked at, and you have that right because you have ownership rights—and it is because you have them that what he does is wrong. (Thomson 1975, 313)

Along these lines, we could imagine that at least for some privacy breaches, e.g. the ones Lessig is referring to, data ownership is what explains their wrongness. Admittedly, one initial challenge is that Thomson speaks about tangible objects like pictures in a wall-safe, whereas data, as mentioned earlier (2.), differ in important respects from such objects. But suppose we are willing to entertain the suggestion that data can be owned, and that such ownership can ground privacy claims. Thomson’s discussion then leads to a second challenge: we can think of alternative normative resources to motivate privacy. It seems to be up for debate whether it is ownership that explains the wrongness of privacy breaches. For example, taking up Thomson’s own cue, we could instead focus on personal rights:

Someone uses an X-ray device to look at you through the walls of your house? He violates your right to not be looked at, and you have that right because you have rights over your person analogous to the rights you have over your property—and it is because you have these rights that what he does is wrong. (ibid.)

The wrongness of privacy breaches could also consist in the right of persons not to be harmed, or not to be treated merely as a means. Once this becomes clear, the question arises whether data ownership is necessary and appropriate for a normative grounding of privacy. A related objection is levelled by Scanlon (1975), who argues that the relevance of ownership is actually merely incidental. It happens to convey information about the conventional boundary of our zone of privacy. But even without any ownership involved in the wall-safe case, there would be rights and interests in play which determine and motivate justified privacy claims—which shows that reference to ownership is dispensable. For our purposes, we need not take a stance in this debate. We merely note that ownership can in principle be taken to motivate protective rights that exclude others from an individual’s personal informational sphere.

Even then, a third challenge looms: which claims to data ownership, if any, are justified? Recall that the Lockean labour criterion suggests: not the ones Lessig thinks. If labour is any indication, it is not clear whether data subjects rather than data processors should be regarded as data owners. Here, we would like to propose a possible response to this challenge, which to our knowledge has not been noted in the literature. Interestingly, when discussing Locke in connection with data ownership, commentators appear to assume that labour is the only Lockean criterion for ownership. Locke, however, proposes the labour criterion in the context of a discussion of original acquisition, i.e. acquisition of a resource that was previously unowned. This raises two questions for authors who refer to the labour criterion to claim that data subjects do not (or should not) have data ownership.

First, the Lockean criterion for original acquisition is compatible with there being further criteria that are subsequent to original acquisition, such as inheritance, reparation for injuries, alienation through gift, sale, or trade, and especially an individual’s entitlement to the surplus of the goods of others if this is necessary for the satisfaction of her basic needs (Simmons 1992, 224–225, 327–328).

As Justice gives every Man a Title to the product of his honest Industry, and the fair Acquisitions of his Ancestors descended to him; so Charity gives every Man a Title to so much out of another’s Plenty, as will keep him from extream want, where he has no means to subsist otherwise; and a Man can no more justly make use of another’s necessity, to force him to afford to the wants of his Brother, than he that has more strength can seize upon a weaker, master him to his Obedience, and with a Dagger at his Throat offer him Death or Slavery. (Locke 1689, I, 42)

There is some debate amongst Locke interpreters about the significance of need and charity for property rights, for example whether subsistence must be provided through property, or whether Locke is in fact neutral on how subsistence is guaranteed (Waldron 1988, 139). What matters for our purposes is that there are other Lockean criteria besides labour. The insight that it is not the individual but corporations, states, or hospitals that invest labour and resources into generating and processing data then falls short of determining data ownership. In particular, if we are inclined to justify privacy via ownership, and note that certain levels of privacy can be seen as a basic need, a fundamental presupposition for leading a fulfilled life, then the relevant forms of ownership might not depend on labour.

Second, emphasizing that the labour criterion concerns original acquisition, and recalling the mutual shaping between Locke’s view and the colonial policies of his times (Arneil 1996), reminds us of the significance of who owns what in the status quo. Indeed, Zuboff, in her recent and rich analysis of the societal and economic consequences of digitization, compares the relation between contemporary large private-sector data processors and data subjects to the relation between sixteenth-century conquerors and indigenous populations. Like the conquerors who purported to embody the authority of God, the pope, and their king, and on these grounds claimed indigenous lives and land, the large players of surveillance capitalism “claim human experience as raw material free for the taking”, and purport to possess a “right to own the behavioural data derived from human experience”—declarations that effectively render the age of surveillance capitalism “an age of conquest” (Zuboff 2019, ch. 6). The analogy is Zuboff’s, not ours, and we are not assessing its adequacy or possible limitations. What matters for our purposes is the suggestion that it might be misguided to regard consumer data as unoccupied territory. Like the conquerors, large private-sector organizations claim resources they falsely presume to be unowned, and neglect claims that precede their acquisition. Thus, in order to clarify whether the labour that data processors invest into the generation of data confers ownership, we must know more about antecedent claims that frame this process. Before data are generated, they might not be owned by anyone because they do not exist. What does exist, however, are the lives, behaviours, and features of data subjects, including their claims to have these processes protected, and to shape them autonomously in ways that allow them to participate in societal endeavours. How these claims are weighed against subsequent claims by data processors is a question whose answer a labour criterion does not by itself address.

The foregoing was primarily concerned with negative, protective claims that keep others outside one’s personal informational space. However, stances on the role of data ownership can be informed further by one’s theory of selves—what constitutes them, and whether we regard them primarily as occupying social roles as citizens or members of particular communities. The hint we can then take from Floridi is twofold. On the one hand, his notion of the self highlights the importance of protecting information pertaining to the personal sphere and the integrity of persons. On the other hand, he can also be taken to suggest that shielding is important but not enough. Individuals qua inforgs are deeply interwoven with their personal information and its embeddedness in the infosphere. Since inforgs weave informational ties across the infosphere, we might argue that controllable, localized retentions of the shielding of information are what allow them to interact with others and to participate in communal and societal activities.

This means that data ownership will not always be tied to presumed entitlements and mechanisms to constrain data flows. Sometimes, individuals will claim their data and seek to share them in certain ways (Hummel et al. 2018). For inforgs, data ownership as isolation cannot be enough. It must also allow participation in societal endeavours mediated through the infosphere. As mentioned, proponents of data ownership as marketability (3.2) argue that the relevant form of participation is receiving one’s share in the generation of economic value that is driven by the processing of individuals’ data. But beyond seeking economic returns, individuals are shaped by ascriptions of recognition of and to others. The intention to contribute to a common good can lead them to redistribute parts of their informational resources through giving and donating data (Hummel et al. 2019). It is a platitude that I can only legally give what is mine, and thus one potential aporia of donating and sharing data is that these attitudes seem to presuppose some form of data ownership that legal frameworks might not recognize (2.). But drawing on our earlier suggestion (3.1), the ‘mine’ in the platitude need not signify genuine ownership, but can take the form of quasi-ownership. Insisting on not only protective but also participatory ways of making use of one’s data thus need not rest on presumptions that are unattractive or unfounded from a legal perspective.

3.4 Individual versus Collective Claims and Interests

There are different frameworks for distinguishing kinds of property. For example, Waldron (1988, 38–42; 2017, ch. 1) offers the following definitions: in private property, the resource is under the decisional authority of particular individuals (or families or firms). In collective property, governance proceeds “by reference to the collective interests of society as a whole” (Waldron 1988, 40), with these interests determined through mechanisms of collective decision-making. Common property refers to a resource being governed by rules whose point is to make it available for use by all or any members of a society. Here, Waldron explicitly links common property with conceptions of justice: “In the case of finite resources, or resources which cannot be used simultaneously by everyone who wants to use them, the operation of a system of common property requires procedures for determining a fair allocation of use to individual wants. This is the task of a theory of justice” (Waldron 1988, 41). Both collective and common property typically involve governance by the state.

As these distinctions illustrate, conceptualizations of different kinds of property have a variety of parameters on which they can encode substantive commitments. To begin with, they call for specifications of who the owner is. Property need not be owned by single individuals. A multitude of individuals can own a resource. Moreover, in some cases, the agent(s) authorized to govern the particular good or resource by setting rules and restricting access might be the de facto owner(s), but actually remain(s) answerable to certain claims. For example, on the one hand, the state’s governance of collective property is “the effect of a decision by a sovereign authority” (Waldron 1988, 41). On the other hand, the state is answerable to society’s claim to access for everyone. Moreover, different kinds of property reflect different goals and principles that shall guide governance, e.g. whether the point of property is making it accessible to everyone, to preserve it, or to govern it in accordance with collective decision-making. Last but not least, implicit in notions of commonality and collectives is a presumption of membership relations. Those not standing in these relations are outsiders from the perspective of the group or society in question. Crucially, these membership relations are linked with usage entitlements to the respective resource.

Besides highlighting the question of who is plausibly regarded as the data owner, these points demonstrate the importance of terminological clarity of which kind of property is envisioned. Waldron’s definitions are certainly not the only way to conceptualize different kinds of property, but they demonstrate that merely calling for data ownership is an underspecified demand. If there is—or should be—data ownership, the question immediately arises who owns which data, and what the point of such ownership is.

As outlined, one first response is that individuals shall have their data as private property. This sense of data ownership is put forward when it is proposed that patients shall have ownership over their health data (Kish and Topol 2015; Mikk et al. 2017), when individuals shall be able to market their data (Lanier 2014), and also when legal frameworks are perceived to assign proprietary rights to data (2.).

However, we have also seen that one salient alternative is to affirm that data can be property, but to deny that individual data subjects are the owners. First, as consideration of the Lockean position indicates, private-sector organizations, too, can articulate justified claims to ownership of the data they generate and process (e.g. Ensthaler and Haase 2016, 1460–1461). Second, we can even dispute whether private property in Waldron’s sense is applicable to data. For example, Montgomery argues that if we really want to regard genetic and genomic information as property, it should not be considered private: “it is perhaps more convincing to consider individual genetic information as a form of common property belonging simultaneously to a group of people but with outsiders excluded” (Montgomery 2017, 85). He also endorses “[p]ublic ownership of genomic information” (ibid.), whereas private property would implausibly enable patients “to appropriate to themselves material that is biologically common to others” (ibid.).

There are further authors who call for data ownership and envision the owner to be a collective, society, or mankind as a whole, for example when recognition and protection of a data commons is demanded. For Prainsack, a data commons could be key to addressing power asymmetries between data subjects and data processors, typically large private-sector organizations, provided careful reflection is devoted to the dynamics of inclusion and exclusion surrounding this pool of resources. Inclusion involves being able to enter data into the digital commons, to use information from the commons, to enjoy benefits from the commons, and to participate in its governance (Prainsack 2019b, 8). Across these instances, exclusion need not always be unjust, but it prompts further consideration of how the respective individuals and populations are affected by it. In the end, harm mitigation mechanisms (McMahon et al. 2019) are necessary both for those who incur damages due to their inclusion and for those who incur damages from being excluded.

In these proposals, data ownership held by populations, society or mankind need not preclude further ownership or control rights at the individual level. However, tensions and competition between individual and collective as well as local and global interests are conceivable. For example, consider Cohen’s view (2018, 212–216) that healthcare data can be shared and processed for the public good without patient consent because a patient’s healthcare data is not her property. In this case, a denial of data ownership is tied to a denial of entitlements to control and consent to data processing, or in Cohen’s words, a “duty to share healthcare data” (2018, 209). He seems to assume that ownership is a necessary condition for control, whereas amongst our points above (3.1) was that we could conceive of control rights independently of whether or not they constitute ownership. At any rate, his reasoning demonstrates that there is an interplay between individual and collective claims; if they are in tension or opposition, then discounting one can be perceived as strengthening the comparative plausibility of the other.

Considerations promoting the common good undoubtedly carry some normative weight, particularly against the background of an interdependent relation between self and other. But how are we to think of the coexistence of individual and collective interests when algorithms informing litigation are fed with information on a particular defendant, when data of a particular individual is used to tailor marketing content that taps into her needs, desires, and preferences, when an individual’s credit application is assessed on the basis of ratings derived from data sets harvested across different domains, or when health data is tracked by wearables and subsequently shared with insurers, employers, and other third parties, thereby giving away a variety of an individual’s lifestyle decisions? The point is not merely that conflicts can arise between individual and collective interests. The point is also that it is up for debate what counts as a concordance of both, which considerations take priority in particular cases of conflict, and which compromises of individual interests and freedom count as sufficiently valuable additions to the data commons and provide public value and enhancements of the common good.

Just as prioritizing the common good can constrain individual self-determination, individual interests might involve delimiting the authority of others and enforcing claims against them, potentially with the consequences for the common good that Rodwin, Contreras, and others concerned about impediments to research fear. Our suggestion is not that there is a zero-sum game between individual and collective interests, i.e. that promotions of one are only to be had at the expense of the other. It is just that there are interdependencies and sometimes trade-offs between both domains. For this reason, it is an oversimplification to suggest that we can seamlessly envision individual control rights alongside collective-level ownership. Attributing data ownership to the public and maintaining a data commons will inevitably interact with individual freedom, and claims for data ownership gesture towards the need for an inclusive societal debate on how these two poles are being harmonized and mediated.

We have suggested that calls for data ownership position themselves along four dimensions, specified by four pairs of poles: property versus quasi-property, marketability versus inalienability, protection versus participation, and individual versus collective claims and interests. These results are presented in Table 1. For the respective positions on these dimensions, different claims turned out to be relevant. The first dimension concerns incidents of (quasi-)property. These are expected to enable (quasi-)owners to control data flows and to impact the outcomes of data processing. The second dimension focuses on the individual’s relation to data. It asks whether or not there is, or should be, an entitlement of the data subject to market her data. Ideally, this allows her to benefit from her resource. However, it could also amount to the alienation and commercialization of certain core aspects of her as a self or subject. The third addresses the significance of the resource for individual constitution, flourishing, and integrity, and negotiates which combination of protection and participation is needed to advance them. The fourth pair of poles concerns the interplay between individual and collective interests, needs, and preferences, and raises the question of how these could be harmonized and aligned. The proposed scheme is not intended to present an antecedently determined standard on what a comprehensive call for or against data ownership shall cover. Neither is it intended to exhaust all possible meanings. Instead, it is inductive as it rests on inferences from specific, paradigmatic instances of discussions of data ownership.

4 Data Ownership and the Prerequisites for Informational Self-determination in Digitization

We conclude from the foregoing that the notion of data ownership is rife with tensions and perplexities. They arise independently of whether or not the reasons for or against data ownership prevail, and concern the question of what it would mean to recognize data ownership.

Some proposals and rejections of data ownership concern genuine property rights, whereas others concern certain control rights regardless of whether these qualify as property rights (3.1). And while some think that the point of data ownership is to put individuals in a position to market their data (3.2), others maintain that the relation between individuals and their data actually motivates the exact opposite: inalienability. Moreover, on some understandings, recognition of data ownership involves assigning protective rights as well as mechanisms to safeguard and to enforce these rights (3.3). But on other proposals, this is insufficient. Data ownership is not restricted to protective rights, but involves much more: to put data owners in a position to enjoy participation and inclusion in societal endeavours. Finally, there is disagreement on whether data is owned by individual data subjects, data processors, and/or collectives like society as a whole (3.4).

These observations illuminate what is at stake when data ownership is being claimed and disputed. So far, our enquiries were descriptive: they focused on what discussants mean with data ownership. But reflecting on these meanings invites an additional, substantive claim: all these dimensions of data ownership are vital to informational self-determination.

In order to develop this suggestion, we turn to a debate between Honneth and Fraser (2003) on the relation between distribution and recognition. Fraser distinguishes two folk paradigms of social justice. On the one hand, social injustice can be rooted in economic structures, taking the form of exploitation, economic marginalization, and deprivation. But it can also be understood as a cultural phenomenon, taking the form of cultural domination, nonrecognition, and disrespect (Fraser 2003, 12–13). She argues that a one-sided economism or culturalism is inadequate and demands that a comprehensive account shall consider and integrate “both the standpoint of distribution and the standpoint of recognition, without reducing either one of these perspectives to the other” (Fraser 2003, 63). Her normative target is “the notion of parity of participation. According to this norm, justice requires social arrangements that permit all (adult) members of society to interact with one another as peers” (Fraser 2003, 36).

Honneth agrees with Fraser that both redistribution and recognition matter for social justice, but further maintains that recognition is “the fundamental, overarching moral category”, while distribution is derivative (Fraser and Honneth 2003, 2–3). In his view, the paradigm of recognition—which he stratifies further into love, respect, and esteem—fits particularly well with the lived experiences of economically marginalized individuals, who not only have unmet material needs but also suffer from disappointed expectations of social recognition. What counts as achievement and work, the relevant targets of recognition as esteem, is always guided by cultural commitments. Economic allocations and principles of distribution are thus not value-free. “It would be wrong to speak, with Luhmann and Habermas, of capitalism as a ‘norm-free’ system of economic processes since material distribution takes place according to certainly contested but nevertheless always temporarily established value principles having to do with respect, with the social esteem of members of society” (Honneth 2003, 142). Instead of aiming directly at parity of participation, Honneth proposes to “move from individual autonomy first to the goal of the most intact possible identity-formation, in order to then bring in principles of mutual recognition as that goal’s necessary presupposition. […] [T]his formulation is equivalent to saying that enabling individual self-realization constitutes the real aim of the equal treatment of all subjects in our societies” (Honneth 2003, 176). Consequently, “[t]he justice or well-being of a society is proportionate to its ability to secure conditions of mutual recognition under which personal identity-formation, hence individual self-realization, can proceed adequately” (Honneth 2003, 174).

For our purposes, we need not take a stance in the debate between Fraser and Honneth. Instead, we avail ourselves of their conceptual tools in order to highlight how the identified dimensions of data ownership (3.1–3.4) emphasize different aspects in the spectrum of redistribution and recognition. Our proposal is: through data ownership, data subjects call for the satisfaction of material needs and entitlements, and thus articulate and negotiate claims on the redistribution of resources. And for social beings for whom a fulfilled life depends on intersubjective prerequisites, data ownership further encodes expectations and demands on the recognition of proclaimed data owners. Both of these spheres need to be considered to grasp claims related to data ownership, and negotiating them is necessary to facilitate informational self-determination of data subjects.

Specifically, when property and quasi-property are discussed, the question concerns the extent to which the redistribution of data resources and power is called for, whether it shall take the form of assigning genuine ownership, or whether—and in which cases—this kind of redistribution stands in tension with commitments on recognition, for example the ascription of achievement to data processors. When marketability and inalienability are claimed, the question becomes what recognition of a person as a person involves and precludes. When protection and participation are addressed, the focus shifts to the material and social enabling conditions for the self-constitution of agents, and the question becomes whether a closed-off private sphere suffices, or whether outward strands of connectedness and involvement with others are just as central. When individual and collective claims and interests in data are articulated, the focus shifts towards how recognition and redistribution relate the interests of individual data subjects to the common good. Overall, the dimensions of data ownership articulate stances on—and shift between matters of—distributive justice and cultural recognition.

Even if recognition is conceptually prior to redistribution, surely recognition and self-formation do not unfold in a vacuum. Recognizing others involves acknowledging their entitlements to exercise autonomy. There are material, distributive, and social conditions and structures that set the stage for self-formation and exercises of freedom. In such contexts, autonomy as a fundamental feature of persons can become practical as self-determination, harnessing concrete enabling conditions of autonomy. Within the context of digitization and an onlife world (Floridi 2014b), data form a crucial part of the structural enabling conditions for self-determination, but can also impede it. Through data ownership, discussants refer to and spell out a variety of goals and normative target notions. When debating property and quasi-property, one possible key issue is whether there is a match between the immanent societal values that frame economic processes and the way data resources and rights are actually distributed, allocated, and controlled. Proponents of data ownership argue that redistributions are necessary, whereas opponents maintain that from the cultural perspective of recognition and achievement, status quo distributions are appropriate. When debating whether data shall be marketable or inalienable, the normative reference point appears to be the identity-formation and integrity of persons. When mediating between protective and participatory claims, we seem to be concerned with the agency and (parity of) participation of individuals, reflecting both material and cultural enabling conditions for self-determination. Finally, when individual and collective interests in data are addressed and mediated, the goal appears to be to govern the resource in accordance with social justice as harmonizing individual and common good, and to arrive at rationales for managing potential trade-offs between both spheres.

Data ownership thus expresses stances on the redistribution of material resources and socio-cultural recognition of data subjects in a datafied and data-driven lifeworld. It is mindful of the fact that the perspective of recognition, even if perceived to be prior, is entangled with distributive conditions, on which it can encode commitments as well. According to some critics, marketability and propertization are instances where this dialectic tips in a one-sided manner towards a kind of economization that can raise tensions with constraints from the cultural sphere in which it is embedded. Informational self-determination requires both redistribution and recognition, and the different dimensions of data ownership overlap with and across these spheres.

5 Conclusion

We have advanced two claims. First, data ownership encodes a variety of different concerns. Our analysis serves a need for conceptual clarification not only because of the somewhat contested legal status of data ownership (2.), but also because commentaries on the topic oscillate between poles that concern primarily pre- or para-legal questions: whether data is rivalrous, whether property is the right category to articulate and recognize the claims of data subjects or whether surrogate notions are more suitable (3.1), whether individuals should be in a position to market their data (3.2), whether and how we shall consider both protective and participatory claims (3.3), and how the claims of individual data subjects relate to collective claims and interests (3.4).

Second, we have proposed an explanation of what unifies this variety of concerns: they articulate and negotiate claims on the relation between redistribution and recognition in an increasingly datafied and data-driven lifeworld. Data ownership marks a domain of self-determination and concerns the full extent of prerequisites that need to be upheld to maintain the integrity of this domain in times of digitization and big data analytics.

What do our results imply in practice? How should data governance be shaped? We have not taken a stance on whether there should be data ownership and in which form. Part of the present paper was the demonstration that significant reconstructive work on the notion is in order first, and that debates so far have proceeded without making key guiding commitments and reference points explicit. Data ownership can come in many different forms and is compatible with a variety of normative targets and background assumptions. Nevertheless, the following demands can be articulated.

First, different understandings of data ownership are unified in calling for a variety of modes to control how data is used. Different conceptual and methodological angles (Table 1) shape how exactly controllability is spelled out and put into practice: as the ability to own data in a (quasi-)property sense, to market or to refrain from alienating intimate core features, to protect data but also to participate in data-driven endeavours, and to use data for one’s own benefit or the benefit of others. But taking any of these understandings seriously requires mechanisms to channel, constrain, and facilitate the flow of data.

Second, with regard to the marketization and commodification of data, ownership has turned out to be a double-edged sword (3.2). It can be taken to motivate marketization, to harness the economic potential of data, and to put data subjects in a position to sell their data and thereby to receive a share in the value that gets generated from it. But at the same time, at least some versions of data ownership also pose significant constraints on this undertaking. Two worries are that intimate things are being alienated, and that losses of control loom for data subjects once their data is introduced into the market. As a consequence, care should be taken that marketization aligns with—rather than undercuts—the ideal of controllability.

Third, the poles relating to participation and the common good highlight that data ownership touches upon a dialectic between mine, yours, and ours. The individual as a relational self is dependent on informational ties with others, and its own good cannot be understood in abstraction from the social fabric in which it is embedded. Taking data ownership seriously and recognizing the full extent of the notion thus involves reflection not only on how data subjects could shield their data, but also on how they could share it in suitable ways (3.3).

Fourth, if our reading captures what is at stake, dismissals of data ownership on the grounds that strictly speaking, legal frameworks do not assign property in data to data subjects miss the mark. Some, like Fezer, make an explicit de lege ferenda claim: legal frameworks should change. But others appear to require less (3.1). For them, it might be secondary whether we term the institution that is called for ‘property’, and render individuals ‘owners’ of their data. Data subjects claiming data ownership are sometimes concerned primarily with issues of controlling data flows, inferences drawn about them, and the impact of data processing on their lives. Even if legal frameworks preclude genuine ownership in data, there remains room to debate whether they can and should accommodate such forms of quasi-ownership.

Overall, these distinctions demonstrate how calls for data ownership are less unified than one might hope. Rationales differ, and sometimes the terminology of ownership almost seems to distract from what authors are really concerned about. Besides scoping out the conceptual space and highlighting how the dimensions of data ownership are entangled, we intend the present contribution as an invitation to relate endorsements and rejections of data ownership more explicitly and systematically to the proposed reference points of recognizing data subjects and redistributing resources throughout the datafied and data-driven lifeworld.

There is a further important issue on which we have stayed neutral. Suppose I can and should own my data. Which data is mine? What secures the link between data and owner? Is it the data being about the subject? If so, in what sense? Is the notion of personal data in the GDPR appropriate? Should we instead refer to alternative concepts like Fezer’s behaviour-generatedness? Which difference does and should it make whether the subject is explicitly represented or mentioned in the data? Could differences in explicitness, e.g. anonymized versus personal data, make a difference in ownership rights, and why (not)? While we can highlight these questions as marking potential aporias inherent to the notion of data ownership, our proposal is not intended to preempt a substantive answer. How the relation between data and their owner(s) is specified is precisely what positions on data ownership seek to negotiate.

References

Arneil, B. (1996). The wild Indian's venison: Locke's theory of property and English colonialism in America. Political Studies, 44(1), 60–74. https://doi.org/10.1111/j.1467-9248.1996.tb00764.x

Becker, L. C. (1980). The moral basis of property rights. Nomos XXII: Property, 22, 187–220.

Berberich, M. (2010). Virtuelles Eigentum. Tübingen: Mohr Siebeck.

Cohen, I. G. (2018). Is there a duty to share healthcare data? In I. G. Cohen, H. F. Lynch, E. Vayena, & U. Gasser (Eds.), Big data, health law, and bioethics (pp. 209–222). Cambridge: Cambridge University Press.

Contreras, J. L., Rumbold, J., & Pierscionek, B. (2018). Patient data ownership. JAMA, 319(9), 935.

Ensthaler, J., & Haase, M. S. (2016). Industrie 4.0 – Datenhoheit und Datenschutz. In H. C. Mayr & M. Pinzger (Eds.), Informatik 2016. Lecture Notes in Informatics (Vol. P-259, pp. 1459–1472). Bonn: Gesellschaft für Informatik.

Evans, B. J. (2011). Much ado about data ownership. Harvard Journal of Law & Technology, 25(1), 69–130.

Fairfield, J. A. T. (2005). Virtual property. Boston University Law Review, 85, 1047–1102.

Fezer, K.-H. (2017). Ein originäres Immaterialgüterrecht sui generis an verhaltensgenerierten Informationsdaten der Bürger. Zeitschrift für Datenschutz, 3(2017), 99–105.

Fezer, K.-H. (2018). Repräsentatives Dateneigentum. Ein zivilgesellschaftliches Bürgerrecht. Sankt Augustin and Berlin: Konrad-Adenauer-Stiftung.

Floridi, L. (2014a). The fourth revolution: How the infosphere is reshaping human reality. Oxford: Oxford University Press.

Floridi, L. (2014b). The onlife manifesto. Cham: Springer.

Floridi, L. (2016). On human dignity as a foundation for the right to privacy. Philosophy & Technology, 29, 307–312.

Fraser, N. (2003). Social justice in the age of identity politics: Redistribution, recognition, and participation (J. Golb, J. Ingram, & C. Wilke, Trans.). In N. Fraser & A. Honneth (Eds.), Redistribution or recognition? A political-philosophical exchange (pp. 7–109). London and New York: Verso.

Fraser, N., & Honneth, A. (2003). Redistribution or recognition? A political-philosophical exchange (J. Golb, J. Ingram, & C. Wilke, Trans.). London and New York: Verso.

German Ethics Council. (2017). Big Data und Gesundheit – Datensouveränität als informationelle Freiheitsgestaltung. Berlin: German Ethics Council.

German Ethics Council. (2018). Big data and health – Data sovereignty as the shaping of informational freedom. Berlin: German Ethics Council.

Glushko, B. (2007). Tales of the (virtual) city: Governing property disputes in virtual worlds. Berkeley Technology Law Journal, 22, 507–532.

Harbinja, E. (2017). Legal aspects of transmission of digital assets on death. Doctoral dissertation, University of Strathclyde.

Harbinja, E. (2019). Posthumous medical data donation: The case for a legal framework. In J. Krutzinna & L. Floridi (Eds.), The ethics of medical data donation (pp. 97–113). Cham: Springer.

Hardin, G. (1968). The tragedy of the commons. Science, 162(3859), 1243–1248. https://doi.org/10.1126/science.162.3859.1243

Honneth, A. (2003). Redistribution as recognition: A response to Nancy Fraser (J. Golb, J. Ingram, & C. Wilke, Trans.). In N. Fraser & A. Honneth (Eds.), Redistribution or recognition? A political-philosophical exchange (pp. 110–197). London and New York: Verso.

Honoré, A. M. (1961). Ownership. In A. G. Guest (Ed.), Oxford essays in jurisprudence (pp. 107–147). Oxford: Oxford University Press.

Hughes, J. (2005). Copyright and incomplete historiographies: Of piracy, propertization, and Thomas Jefferson. Southern California Law Review, 79, 993–1084.

Hummel, P., Braun, M., Augsberg, S., & Dabrock, P. (2018). Sovereignty and data sharing. ITU Journal: ICT Discoveries, (2).

Hummel, P., Braun, M., & Dabrock, P. (2019). Data donations as exercises of sovereignty. In J. Krutzinna & L. Floridi (Eds.), The ethics of medical data donation (pp. 23–54). Cham: Springer.

Kish, L. J., & Topol, E. J. (2015). Unpatients—why patients should own their medical data. Nature Biotechnology, 33, 921. https://doi.org/10.1038/nbt.3340

Lanier, J. (2014). Who owns the future? New York: Simon & Schuster.

Lemley, M. A., & Weiser, P. J. (2007). Should property or liability rules govern information? Texas Law Review, 85(4), 783–841.

Lessig, L. (2002a). The future of ideas: The fate of the commons in a connected world. New York: Random House.

Lessig, L. (2002b). Privacy as property. Social Research: An International Quarterly, 69(1), 247–269.

Lessig, L. (2004). Free culture: How big media uses technology and the law to lock down culture and control creativity. New York: The Penguin Press.

Litman, J. (2000). Information privacy/information property. Stanford Law Review, 52(5), 1283–1313. https://doi.org/10.2307/1229515

Locke, J. (1689). Two treatises of government. Cambridge: Cambridge University Press, 1988.

McMahon, A., Buyx, A., & Prainsack, B. (2019). Big data governance needs more collective agency: The role of harm mitigation in the governance of data-rich projects. Under review.

Mikk, K. A., Sleeper, H. A., & Topol, E. J. (2017). The pathway to patient data ownership and better health. JAMA, 318(15), 1433–1434.

Montgomery, J. (2017). Data sharing and the idea of ownership. The New Bioethics, 23(1), 81–86. https://doi.org/10.1080/20502877.2017.1314893

Moore, A., & Himma, K. (2017). Intellectual property. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University.

Murphy, R. S. (1996). Property rights in personal information: An economic defense of privacy. Georgetown Law Journal, 84, 2381–2417.

Nelson, J. W. (2009). The virtual property problem: What property rights in virtual resources might look like, how they might work, and why they are a bad idea. McGeorge Law Review, 41, 281.

Osborne Clarke LLP. (2016). Legal study on ownership and access to data. European Commission, Directorate-General of Communications Networks, Content & Technology.

Ostrom, E. (1990). Governing the commons: The evolution of institutions for collective action. Cambridge: Cambridge University Press.

Pearce, H. (2018). Personality, property and other provocations: Exploring the conceptual muddle of data protection rights under EU law. European Data Protection Law Review, 4, 190–208.

Prainsack, B. (2019a). Data donations: How to resist the iLeviathan. In J. Krutzinna & L. Floridi (Eds.), The ethics of medical data donation (pp. 9–22). Cham: Springer.

Prainsack, B. (2019b). Logged out: Ownership, exclusion and public value in the digital data and information commons. Big Data & Society, 6(1).

Purtova, N. (2009). Property rights in personal data: Learning from the American discourse. Computer Law & Security Review, 25(6), 507–521. https://doi.org/10.1016/j.clsr.2009.09.004

Purtova, N. (2010). Property in personal data: A European perspective on the instrumentalist theory of propertisation. European Journal of Legal Studies, 2(3), 193–208.

Purtova, N. (2015). The illusion of personal data as no one's property. Law, Innovation and Technology, 7(1), 83–111.

Radin, M. J. (1982). Property and personhood. Stanford Law Review, 34(5), 957–1015. https://doi.org/10.2307/1228541

Rieser, A. (1999). Prescriptions for the commons: Environmental scholarship and the fishing quotas debate. Harvard Environmental Law Review, 23, 393–421.

Rodwin, M. A. (2009). The case for public ownership of patient data. JAMA, 302(1), 86–88. https://doi.org/10.1001/jama.2009.965

Scanlon, T. (1975). Thomson on privacy. Philosophy & Public Affairs, 4(4), 315–322.

Simmons, A. J. (1992). The Lockean theory of rights. Princeton: Princeton University Press.

Solove, D. J. (2001). Privacy and power: Computer databases and metaphors for information privacy. Stanford Law Review, 53. https://doi.org/10.2139/ssrn.248300

Solove, D. J. (2008). Understanding privacy. Cambridge, Massachusetts: Harvard University Press.

Specht, L. (2016). Ausschließlichkeitsrechte an Daten – Notwendigkeit, Schutzumfang, Alternativen. Computer und Recht, 32(5), 288–296. https://doi.org/10.9785/cr-2016-0504

The British Academy & The Royal Society. (2017). Data management and use: Governance in the 21st century.

Thomson, J. J. (1975). The right to privacy. Philosophy & Public Affairs, 4(4), 295–314.

Thouvenin, F. (2017). Wem gehören meine Daten? Zu Sinn und Nutzen einer Erweiterung des Eigentumsbegriffs. Schweizerische Juristen-Zeitung, 113, 21–32.

Thouvenin, F., Weber, R. H., & Früh, A. (2017). Data ownership: Taking stock and mapping the issues. In M. Dehmer & F. Emmert-Streib (Eds.), Frontiers in data science (pp. 111–145). Boca Raton: CRC Press.

Victor, J. M. (2013). The EU general data protection regulation: Toward a property regime for protecting data privacy. Yale Law Journal, 123(2), 513–528. https://doi.org/10.2307/23744289

Waldron, J. (1988). The right to private property. Oxford: Clarendon Press.

Waldron, J. (2017). Property and ownership. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University.

Westin, A. (1967). Privacy and freedom. New York: Atheneum.

Wiebe, A. (2016). Protection of industrial data – a new property right for the digital economy? Journal of Intellectual Property Law & Practice, 12(1), 62–71. https://doi.org/10.1093/jiplp/jpw175

Zech, H. (2012). Information als Schutzgegenstand. Tübingen: Mohr Siebeck.

Zuboff, S. (2019). The age of surveillance capitalism. London: Profile.


Acknowledgements

We are grateful to Hannah Bleher, Tabea Ott, Hannah Schickl, Stephanie Siewert, Max Tretter, and Ulrich von Ulmenstein for their helpful feedback on earlier versions of this paper. Special thanks to both anonymous reviewers for their thorough comments and critique.

Open access funding provided by Projekt DEAL. This work is part of the research project DABIGO (ZMV/1–2517 FSB 013), which has been funded by the German Ministry for Health, as well as the research project vALID (01GP1903A), which has been funded by the German Ministry of Education and Research.

Author information

Authors and Affiliations

Friedrich-Alexander University Erlangen-Nürnberg (FAU), Kochstraße 6, 91054, Erlangen, Germany

Patrik Hummel, Matthias Braun & Peter Dabrock


Corresponding author

Correspondence to Patrik Hummel.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Hummel, P., Braun, M. & Dabrock, P. Own Data? Ethical Reflections on Data Ownership. Philos. Technol. 34, 545–572 (2021). https://doi.org/10.1007/s13347-020-00404-9


Received: 13 September 2019

Accepted: 13 May 2020

Published: 15 June 2020

Issue Date: September 2021

DOI: https://doi.org/10.1007/s13347-020-00404-9


Keywords

  • Data ownership
  • Recognition
  • Redistribution
  • Participation
  • Alienability
  • Open access
  • Published: 01 April 2023

The dilemma and countermeasures of educational data ethics in the age of intelligence

  • Xiu Guan   ORCID: orcid.org/0000-0003-0566-5135 1 ,
  • Xiang Feng   ORCID: orcid.org/0000-0003-2251-2587 1 , 2 &
  • A.Y.M. Atiquil Islam   ORCID: orcid.org/0000-0002-5430-8057 2 , 3  

Humanities and Social Sciences Communications, volume 10, Article number: 138 (2023)

5684 Accesses

7 Citations

1 Altmetric

Metrics details

  • Science, technology and society

With the advent of the era of intelligent education, artificial intelligence and other technologies are being integrated into education, so that ever more educational data can be collected, processed, and analyzed. However, concerns about educational data ethics are an important factor hindering the application of educational data, and it is therefore important to ensure privacy and security through the reasonable use of such data. This research uses bibliometric analysis to identify the hotspots, developmental trends, and problems of educational data ethics in existing research. Based on this in-depth analysis, the study proposes targeted problem-solving strategies and a future direction for learner-centered educational data ethics in the era of intelligent education. Three major problems in educational data ethics were found: (a) violations of privacy during data collection, storage, and sharing; (b) the deprivation of educational subjects’ ability to make independent choices through predictions based on educational data; and (c) the lack of a “forgetting ability” when data serve as the evaluation standard, which restricts the development of educational subjects. This research proposes that the corresponding problem-solving strategies in China should be learner-centered and combined with technologies such as blockchain, 5G, and federated learning to form targeted solutions at different levels. First, it is necessary to establish a standard system and related platforms for educational data at the macro level. Second, efforts should be made through dual research-practice channels to build a new ecology of educational data. Third, schools and teachers must apply educational data appropriately in evaluation. Finally, this study provides direction and guidance for other countries and regions researching educational data ethics.


Introduction

With the advent of the intelligent education era, emerging technologies such as learning analysis, data mining, cloud computing, and artificial intelligence have been integrated into education. Data are the basis on which these technologies combine with education, from tracking behavior to threat assessments (CDT, 2021a; Ge et al., 2021). However, data ethics, which focuses on how to collect, manage, share, and use data in a secure and equitable way that avoids harm to individuals and public interests, is a widely debated issue when it comes to the combination of technology and education (CDT, 2021b). Recently, many proposals and documents about data ethics have been published by international organizations and institutions. For example, the United Nations has realized that information and communication technologies (ICTs) create opportunities for illegal data collection, surveillance, interception, and other violations of human privacy and rights. Therefore, a proposal on how to protect data privacy and ethics in the digital age was adopted (United Nations, 2020). UNESCO also published recommendations related to artificial intelligence ethics (UNESCO, 2021), asserting that artificial intelligence raises new ethical problems in business, media, and education. In particular, there is an urgent need for countries and related organizations to enact laws on data ethics covering data collection, usage, sharing, storage, and deletion.

The 2021 EDUCAUSE Horizon Report (Information Security Edition) analyzed six key technologies and practices that may affect higher education information security in the future (EDUCAUSE, 2021). Cloud Vendor Management mainly requires cloud vendors to provide ethical data security protection for software tools, resources, and other aspects. Endpoint Detection and Response is the monitoring and protection of the data privacy and security of terminal equipment. The purpose of Multifactor Authentication (MFA)/Single Sign-On (SSO) is to ensure data security in user login and identity verification. Preserving Data Authenticity/Integrity mainly emphasizes guaranteeing the authenticity and integrity of data in transmission, storage and processing. Research Security is aimed at ensuring the security of data generated by learners’ participation in teaching research or experiments. Student Data Privacy and Governance concerns data privacy and security protection regarding the collection and use of learners’ personal information. This report thus discusses the security and governance of student data privacy and other content related to educational data. China has also participated in the formation of the Beijing Consensus (UNESCO, 2019) and other proposals. Overall, data ethics is an important factor when technology is combined with education.

For research purposes, it is necessary to define the concept of educational data ethics, and several definitions exist in the literature. For example, the Center for Democracy & Technology (CDT) defines data ethics as the evolving principles for the collection, management, sharing and use of educational data in a safe and fair manner that avoids harming individual learners or the public interest (CDT, 2021b). The General Services Administration’s (GSA) definition differs from the former: it does not target a specific field or situation, but addresses how humans can collect, manage, or use data as freely as possible and maximize public interest while avoiding risks; furthermore, it treats data ethics as a basis for judgment and accountability (General Services Administration, 2020). At the Open Data Institute (ODI), data ethics is considered a branch of ethics, with emphasis on its evaluative function and its value for data application, such as data collection, sharing, and use; it can also constrain data applications that may affect individuals and society (Open Data Institute, 2021). Data ethics is also seen as a constraint on behavior in the process of applying data, for example in how scholars’ decision-making is affected by their concepts of educational data ethics, as reviewed by Mandinach and Jimerson (2022), who maintain that educational data ethics can ensure the rational use of data and the correct analysis and interpretation of conclusions. In general, educational data ethics refers to the principles to be followed in the process of collecting, managing, sharing and using educational data, which can help to restrain behavior, assist decision-making and evaluate practices, so that the application of educational data can securely maximize individual and public interests on the basis of equality.

Furthermore, many countries and regions have focused on data ethics according to their actual cultural situations. America’s General Services Administration (GSA) defined the concept of data ethics in its draft Data Ethics Framework (General Services Administration, 2020). The Government Digital Service (GDS) in the United Kingdom also published a Data Ethics Framework (Government Digital Service, 2020). The Open Data Institute (ODI) developed the Data Ethics Canvas to help assess the influence of data collection, data sharing and data usage (Open Data Institute, 2021). Since data is an important cornerstone of science and technology, the Ministry of Science and Technology of China has strengthened the governance of science and technology ethics (Ministry of Science and Technology of the People’s Republic of China, 2021). Meanwhile, China has also promulgated laws to provide legal protection for data ethics, such as the Data Security Law (People’s Republic of China, 2021a), the Personal Information Protection Law (People’s Republic of China, 2021b), and the Internet Data Security Management Regulations (Cyberspace Administration of China, 2021). In general, the field of education is only a small part of these efforts, and many problems remain unsolved in the actual implementation of the related principles and strategies, especially in education. It is therefore particularly important to focus on educational data ethics, as learners are the main objects of data collection and include vulnerable groups such as children (CDT, 2021b). Additionally, the relevant principles and laws of educational data ethics are at present rather loose with regard to implementation in actual situations, so it is difficult to obtain detailed guidance for solving educational data ethics problems in a given situation (Rosa et al., 2022). Different countries and regions have different cultural, political, and economic backgrounds, and international principles of privacy and data protection need to be adapted accordingly (Hoel & Chen, 2018). There is thus an urgent need for a systematic review of relevant international research to help propose specific and systematic solutions to educational data ethics problems in China, which is a developing and government-oriented nation. In doing so, the purpose of this study is to review and analyze the international literature using bibliometric analysis, which can provide more information and implications for solving problems related to educational data ethics in China. This study can thereby clarify the research hotspots, trends, and evolution of educational data ethics, and the urgent problems and corresponding solutions can then be summarized and proposed.

Overall, existing studies have proposed horizontal solutions or countermeasures to the ethical issues of educational data from different perspectives; these remain relatively one-sided and lack systematization. Therefore, this study aims to propose solutions for educational data ethics in the Chinese context through bibliometric analysis.

The research questions are as follows:

What are the dilemmas and solutions of educational data ethics in the international context?

What do dilemmas and solutions tell us about the development of educational data ethics in the Chinese context?

It is hoped that the results can be learner-centered and, combined with the actual situation in China, form solutions to problems related to educational data ethics. The study has three parts: (1) a bibliometric analysis of the existing literature; (2) based on this analysis and a review of key literature, a synthesis of the dilemmas and solutions of educational data ethics in the Chinese context; and (3) the conclusions and implications that can be drawn from this study.

Research methods

Data sources

The purpose of this research is to clarify the main problems regarding educational data ethics and to formulate solutions to them with learner-centered thinking. The researchers chose keywords, all related to “Data”, “Education” and “Ethics”, as constraints on the literature’s themes. The keywords are taken from a literature review of educational data ethics covering work up to 2019 (Hakimi et al., 2021), and the literature search formula is:

“digital trace data OR digital data OR digital footprint OR learning analytics OR education data mining OR big data OR artificial intelligence OR predictive analytics OR adaptive learning OR critical data OR datafication OR education analytics OR educational data OR data science OR data-driven (Theme) and ethics OR ethic* OR privacy OR surveillance OR data protection OR data ownership OR dataveillance OR data sharing OR bias OR fairness OR accountability OR agency OR autonomy OR vulnerability OR anonymity OR inequality OR justice OR at risk OR governance OR ownership (Theme) and learn* or educat* or school or university or MOOC or distance learning or pre-school or primary school or prekindergarten or kindergarten or junior school or high school or secondary school or college or student or classroom or education technology or early years or instructional systems (Theme) and 2023 or 2022 or 2021 or 2020 or 2019 (Published years) and Thesis or Meeting or Published online or Review paper or Books (Type of literature) and Web of Science Core Collection (Database)”.

There were 88,764 search results obtained from the Web of Science for 2019 to 2023 (2023 was included because some e-prints had already been published online). To improve the efficiency of literature screening, we used the ASReview software, which has been used in many review articles, to perform screening based on machine learning (van de Schoot et al., 2021 ). ASReview uses active learning: based on the researcher’s annotation of papers against the filtering and inclusion criteria, it re-ranks the remaining literature by relevance in real time and pushes the most relevant papers to the front for priority annotation. When 20 consecutive papers are marked as not relevant to the topic, the rest of the literature is taken to be irrelevant and screening can be stopped. After filtering, 385 papers were included in the final analysis. The filtering process is shown in Fig. 1 .
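To make this screening loop concrete, the sketch below mimics it in plain Python with scikit-learn: rank the unlabeled pool by predicted relevance, ask the researcher to annotate the top paper, and stop after 20 consecutive irrelevant labels. This is a minimal illustration of the heuristic, not the actual ASReview API; the function names, the oracle `label_fn`, and the TF-IDF/naive Bayes model are our assumptions.

```python
# Minimal sketch of an active-learning screening loop with a
# "stop after 20 consecutive irrelevant papers" rule.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

def screen(abstracts, label_fn, stop_after=20):
    """abstracts: list of texts; label_fn: oracle returning 1 (relevant) / 0."""
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(abstracts)
    labeled, labels = [], []
    pool = list(range(len(abstracts)))
    consecutive_irrelevant = 0
    while pool and consecutive_irrelevant < stop_after:
        if len(set(labels)) < 2:
            idx = pool[0]                      # cold start: take the next paper
        else:
            clf = MultinomialNB().fit(X[labeled], labels)
            scores = clf.predict_proba(X[pool])[:, 1]
            idx = pool[int(scores.argmax())]   # most likely relevant first
        y = label_fn(abstracts[idx])           # researcher annotates this paper
        labeled.append(idx); labels.append(y); pool.remove(idx)
        consecutive_irrelevant = 0 if y == 1 else consecutive_irrelevant + 1
    return [i for i, y in zip(labeled, labels) if y == 1]
```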

figure 1

Literature filtering based on the filtering criteria and the screening tool ( https://asreview.nl/ ).

In order to effectively identify relevant research on educational data ethics from the perspective of intelligent education, the filtering and inclusion criteria for papers were as follows:

It should contain ethical issues about educational data, such as students’ privacy.

It can be a theoretical article on educational data ethics, such as proposing a framework to avoid the emergence of issues of educational data ethics.

It can also be an article that uses technology to solve the ethical problems of educational data, such as using unbiased algorithms to analyze educational data to achieve fairer teaching decisions.

It can also be a case article on solution paths for educational data ethics issues, such as exploring the data collection and analysis specifications of digital education applications based on data privacy protection laws or regulations.

This study used bibliometric methods to analyze the included articles. CiteSpace (6.1.R4), a visual bibliometric analysis tool for academic literature review developed by Professor Chen Chaomei of Drexel University, was used to show the research hotspots and trends related to educational data ethics. CiteSpace can analyze the hotspots, evolution, and development trends of a discipline or research field (Wang et al., 2016 ). By combining the bibliometric results with deep reading and analysis of key literature, the current dilemmas and solutions regarding educational data ethics were identified.

Analysis framework

In order to effectively respond to the two research questions, the analysis of this study can be divided into two main phases, as shown in Fig. 2 .

figure 2

Research framework for the entire study.

The first phase is the bibliometric analysis of the articles to identify research hotspots, research trends, and important scholars and research institutions. The second phase is to read and analyze the related literature in depth, based on the results of the bibliometric analysis, and to summarize the dilemmas and solution strategies of educational data ethics.

The bibliometric analysis of educational data ethics based on CiteSpace has three sections. First, the analysis of research hotspots confirms the precision and feasibility of the purpose of this research. This conclusion is drawn mainly from the keyword co-occurrence network, formed by clustering keywords according to how frequently pairs of them appear together in the literature. Second, the analysis of the evolution and development trends of related research lays the foundation for the subsequent countermeasures. This is based mainly on the timeline chart, which shows when each keyword first appeared in the literature included in the bibliometric analysis. Finally, important scholars and institutions were identified, which clarified whose publications should be analyzed further. This was done mainly by drawing the collaboration networks of authors and institutions.

The literature was then explored further to summarize the dilemmas of educational data ethics in existing research and practice as well as the corresponding solutions, and to put forward adaptive solution strategies for educational data ethics problems in the Chinese context.

Bibliometric analysis of educational data ethics

Research hotspots analysis

In this study, the keyword co-occurrence network formed by CiteSpace was used to mine the research hotspots of educational data ethics in the past 5 years, as shown in Fig. 3 ; they include “learning analytics”, “data science”, “systematic review”, “artificial intelligence (ai)”, “big data”, “artificial intelligence literacy”, “gender bias”, “attitude” and “educational data analytics”. The keyword co-occurrence network groups the literature into categories based on how frequently keywords appear together in different articles. The size, homogeneity, and average publication year of the clusters are presented in detail in Table 1 .

figure 3

Keyword co-occurrence network from 2018 to 2023 (top 10 clusters).

Learning analytics is based on the collection and examination of learning data to mine the patterns of learning and education, which can then be used to improve learners’ performance (Baker & Inventado, 2014 ). As learning analytics has become a hot research topic, the ethical issues raised by the collection, processing, and analysis of sensitive and private data cannot be ignored in its application (Jones, 2019 ). This can also be seen in Fig. 3 and Table 1 , which show that “learning analytics” has been the hottest topic in educational data ethics research over the past 5 years. “# 0 learning analytics” is the largest cluster, covering the largest number of keywords. Moreover, “learning analytics” is the node that lies most often on the shortest paths between other nodes, which indicates that it is an important research turning point and a bridge between different clusters; CiteSpace marks such nodes of high betweenness centrality with purple circles in the keyword co-occurrence network. In Table 2 , the centrality value of “learning analytics” (0.24) is the highest among all keywords, and “learning analytics” is also the most frequently occurring keyword. As mentioned earlier, learning analytics rests on the collection, storage, processing, and analysis of learning-related data, so its application inevitably involves issues of educational data ethics.
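For intuition about this metric, the sketch below builds a small keyword co-occurrence network and computes betweenness centrality with networkx; the keyword lists are toy data standing in for the study’s corpus, and the computation illustrates the measure CiteSpace reports rather than CiteSpace’s own implementation.

```python
# Illustrative keyword co-occurrence network and betweenness centrality.
from itertools import combinations
import networkx as nx

papers = [
    ["learning analytics", "privacy", "higher education"],
    ["learning analytics", "big data", "ethics"],
    ["artificial intelligence", "ethics", "learning analytics"],
    ["big data", "data science", "privacy"],
]

G = nx.Graph()
for kws in papers:
    for a, b in combinations(sorted(set(kws)), 2):
        # Edge weight counts how often two keywords co-occur in a paper.
        w = G.get_edge_data(a, b, {}).get("weight", 0) + 1
        G.add_edge(a, b, weight=w)

# Fraction of shortest paths that pass through each node.
centrality = nx.betweenness_centrality(G)
for kw, c in sorted(centrality.items(), key=lambda x: -x[1])[:3]:
    print(f"{kw}: {c:.2f}")
```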

Data science is a foundation of educational data ethics. On the one hand, it provides theoretical and practical support; on the other hand, the ethics component of data science education lays the groundwork for avoiding educational data ethics issues. Figure 3 and Table 1 show that data science is also a research hotspot in educational data ethics (# 1 data science), along with # 5 big data, because big data research, especially that which deals with ethical issues, can be applied to educational big data.

Intelligent education is an important trend in the future development of education, and ethical issues are inevitable in the integration of artificial intelligence and education. “Artificial intelligence” refers to the issues of educational data ethics involved in educational applications of artificial intelligence, such as educational inequity due to the inherent bias of intelligent algorithms. “Artificial intelligence literacy” refers to articles that address the ethical issues of educational data by improving and cultivating artificial intelligence literacy. The appearance of the two clusters “# 3 artificial intelligence” and “# 6 artificial intelligence literacy” in the keyword co-occurrence network underscores the importance of solving the ethical issues that arise during the integration of AI and education. Table 3 shows that the former focuses on the issues arising from the application of AI in education, while the latter concentrates on how to carry out AI education so that students have the literacy and ability to deal with issues of educational data ethics.

Some additional observations can be made. “Higher education” indicates that some studies focused on educational data ethics in the context of higher education; Table 2 shows that researchers have attended to this context in the past 5 years (Freq = 46). Gender bias is one type of issue in educational data ethics: “gender bias” refers to bias in the processing and analysis of educational data that stems from the biases of the algorithm designers themselves, such as the assumption that girls are bad at physics. Table 1 shows that “gender bias” is an emerging research hotspot, as the average publication year of this cluster is 2022 and the homogeneity of its literature is relatively high (Silhouette = 0.995).

Research development evolution and trend analysis

The development trend of research related to educational data ethics is shown in Fig. 4 . Based on this figure, the bursts of keywords in the last 5 years can be divided into three stages.

figure 4

Keywords: the keywords listed in the articles; Year: the average year of the literature; Strength: the strength of the keyword’s burst (the larger the value, the stronger the burst); Begin/End: the year the keyword started/stopped bursting.

First, the keywords “information”, “big data analytics”, “architecture”, and “academic library” emerged between 2019 and 2020. “Information” and “architecture” point to ethical issues caused by extracting structured information from educational data. “Big data analytics” represents the educational data ethics issues that arise when the big data analytics paradigm is applied to educational data research, such as leakage of privacy or sensitive data. “Academic library” refers to educational data ethics in the context of academic libraries, such as permissions for collecting personal reading records. In this period, research on educational data ethics focused on the ethical issues of extracting information from data.

Second, between 2020 and 2021, the keywords “privacy principle” and “university” emerged. “Privacy principle” means that researchers focused on formulating privacy principles that can be used to protect students’ privacy, and “university” means that studies focused on university contexts. Figure 4 shows that, in this time frame, studies began to focus on how to apply privacy- and ethics-related principles and strategies to issues of educational data ethics, especially in higher education.

Third, between 2021 and 2023, the keywords “systematic review”, “user acceptance”, “educational data mining”, “teacher”, and “decision-making” emerged, which foreshadows future research trends in educational data ethics. “Systematic review” indicates that a critical mass of research has accumulated over enough time to support drawing common conclusions from the available studies. “User acceptance” marks the beginning of research on the impact that educational data ethics may have on learners’ acceptance and experience. “Educational data mining” represents research that applies big data mining and analysis to educational contexts and attends to the ethical issues that can accompany it. “Teacher” represents research on what teachers need to do, and what responsibilities they bear, in protecting educational data ethics. “Decision-making” refers to the ethical issues involved in applying educational data to decision-making; for example, teachers may rely too heavily on the results of educational data analysis and ignore learners’ capacity to develop. As Fig. 4 shows, researchers in this stage began to systematically review past studies and to look for paths or cases that could be applied to resolving educational data ethics issues.

Moreover, the emergence of these keywords shows that research at this stage focuses more on issues of educational data ethics in actual educational contexts, a trend that will continue. Topics explored include the acceptance, by teachers and students as educational subjects, of educational data collection; the sensitive and private information involved in mining educational data; and the explainability and trustworthiness of educational data analysis results used for decision-making. These studies focus on educational practice and can inform future problem-solving and solutions.
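As a rough intuition for what a keyword “burst” means in the analysis above, the sketch below flags years in which a keyword’s frequency jumps well above its running baseline. This is a deliberate simplification rather than the Kleinberg state-machine algorithm that CiteSpace builds on, and the counts and threshold are hypothetical.

```python
# Simplified stand-in for keyword burst detection.
def detect_bursts(yearly_counts, threshold=2.0):
    """yearly_counts: {year: frequency}; returns years flagged as bursts."""
    years = sorted(yearly_counts)
    bursts = []
    for i, year in enumerate(years[1:], start=1):
        # Baseline: average frequency over all earlier years.
        baseline = sum(yearly_counts[y] for y in years[:i]) / i
        if yearly_counts[year] >= threshold * max(baseline, 1):
            bursts.append(year)
    return bursts

# Hypothetical counts for a keyword such as "systematic review".
counts = {2019: 2, 2020: 3, 2021: 4, 2022: 11, 2023: 13}
print(detect_bursts(counts))  # -> [2022, 2023]
```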

Analysis of research scholars and institutions

The main goal of analyzing the collaboration among research institutions and scholars is to identify those with important influence, which supports further conclusions. To identify scholars with significant influence on educational data ethics in the last 5 years, this study formed an author collaboration network with CiteSpace, as shown in Fig. 5 , and counted the top 12 scholars’ publications over that period, as shown in Table 4 .

figure 5

Note: The larger the node circle, the more citations.

The largest collaborative network of authors focuses on the application of learning analytics in higher education, where issues of educational data ethics are considered one of the hindering factors; for example, the ethical challenges encountered in learning analytics applications in higher education contexts have been explored (Alzahrani et al., 2022 ). Scholars often associated with Jones focus on the ethical and privacy issues involved in learning analytics (Jones et al., 2020 ). Scholars who cooperate with Prinsloo are more concerned with how to solve the ethical problems of educational data and protect students’ privacy. Scholars working with David Fonseca and Daniel Amo focus on the ethical issues encountered when implementing learning analytics in web contexts. Simon Knight and colleagues have mainly explored the pedagogue’s perspective on educational data ethics (Shibani, Knight, & Shum, 2020 ), whereas Viberg and colleagues have explored these issues from the learners’ perspective (Viberg, Engstrom, Saqr, & Hrastinski, 2022 ).

To identify research institutions that have had a significant impact on educational data ethics in the last 5 years, this study formed an institution collaboration network through CiteSpace (as shown in Fig. 6 ) and counted the top 11 research institutions’ publications over that period (as shown in Table 5 ). Combining Fig. 6 and Table 5 , Monash University is the institution with the most publications on educational data ethics in the last 5 years. Its collaborations have focused on the educational data ethics issues involved in learning analytics, particularly in higher education and classroom contexts; for example, one study explored how educational data ethics issues are addressed and resolved from the perspective of pre-service teachers (Prestigiacomo et al., 2020 ). In collaboration with Beijing Normal University, it reviewed the current state and dilemmas of integrating technology into education and found educational data ethics to be one of the causes of those dilemmas (Tlili et al., 2021 ). In collaboration with the University of Eastern Finland and the KTH Royal Institute of Technology, it mainly addressed the educational data ethics issues arising from the integration of AI and education, with particular emphasis on the need to train designers of ethical AI applications (Vanhee & Borit, 2022 ). Institutions working with Indiana University have focused not only on the ethical issues raised by integrating AI into educational contexts (Morley et al., 2021 ), but also on data ethics and privacy issues in other educational settings such as libraries (Jones et al., 2020 ).

figure 6

Institution collaboration network related to educational data ethics.

In general, the research hotspot of educational data ethics research in the past 5 years has been the data ethics issues related to learning analytics, concentrated mainly in higher education. The studies also show a tendency toward educational data ethics issues in specific learning contexts, such as those arising from the application of artificial intelligence in education. In addition, the study identified researchers and research institutions of significant standing in educational data ethics over the past 5 years. These bibliometric results lay the foundation for summarizing educational data ethics issues and proposing ways to avoid and resolve the corresponding dilemmas in the Chinese context.

Based on the above analysis, the reviews of studies on educational data ethics in the past 5 years can be divided into two main groups. The first group focuses on ethical or privacy issues related to educational data in specific educational contexts. Most of these reviews address the ethical and privacy issues arising from learning analytics, such as a review of the current state and research trends of educational data ethics in learning analytics (Tzimas & Demetriadis, 2021 ). The second group focuses on the state of research on the integration of specific technologies into education, such as a review of research and practice on AI in educational contexts, where ethical issues are one of the challenges posed by such applications (Zhai et al., 2021 ). In contrast to these reviews, the present study is not limited to specific educational contexts or specific technologies but includes all theoretical and practical articles related to traditional offline classrooms as well as online education.

Few studies have specifically addressed the issues of educational data ethics, although one study addresses the issues related to the collection, processing, and analysis of digital trace data in education (Hakimi, Eynon, & Murphy, 2021 ). In contrast to that review, the present study analyzes articles on educational data ethics between 2019 and 2023 to examine the research hotspots, research trends, and important researchers and institutions. On this basis, we read the key literature, identified the dilemmas caused by educational data ethics, and developed adaptive strategies to avoid or solve these issues in the Chinese educational context.

This helps to analyze the research related to educational data ethics more comprehensively and, in turn, provides more valuable insights for avoiding and solving educational data ethics issues in the Chinese context. We therefore further reviewed the related literature to summarize the current ethical problems of educational data and their solutions.

Current educational data ethics issues

Current educational data ethics issues can be summarized as follows: (a) privacy is violated during the collection, storage, and sharing of educational data; (b) the predictive function of educational data deprives learners and teachers of their ability to choose independently; and (c) the application of educational data leads to a preference for evaluation by data standards while lacking the ability to “forget”. To overcome these barriers, this study uses China’s context as a backdrop for proposing solutions to educational data ethics.

Violation of privacy during data collection, storage, and sharing

In the application of educational data, the privacy of educational subjects is violated in the processes of data collection, processing, storage, and sharing. Some questions still need to be addressed: Why are data collected? Is it really necessary to collect them? What is the purpose of their use? As these questions have not been sufficiently answered, the best means of protecting the collected data remains unclear, and practice lacks systematic and effective norms and guidance.

Specifically, violations of the privacy of educational subjects fall into three aspects. First, realizing personalized teaching requires collecting a large amount of information from learners in real time, yet the informed consent process before data collection suffers from problems such as deception and ambiguity (Rubel & Jones, 2016 ). Second, there are hidden dangers in data storage, such as the lack of anonymization of sensitive information (Kularski & Martin, 2021 ). Storage without effective protection measures may leak private information, rendering teachers and learners “transparent” and posing a huge threat to them. Third, privacy may leak during data sharing: when data and analysis results are shared, the private information of learners and teachers can be fully exposed through continuous cross-validation of data from various sources by data mining and related artificial intelligence technologies and algorithms (Yang & Liang, 2017 ). This threatens the privacy and security of learners and educators. Therefore, researchers and practitioners should attend to the ethical issues of educational data collection, storage, and sharing and prevent privacy leakage in educational data applications.
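To make the storage-stage concern concrete, the minimal sketch below shows one common safeguard: keyed pseudonymization of student identifiers before records are stored or shared. The field names and key handling are illustrative assumptions; real deployments also need proper key management and must treat quasi-identifiers (e.g., via k-anonymity), which a keyed hash alone does not address.

```python
# Minimal sketch of pseudonymizing student identifiers before storage.
import hashlib
import hmac
import os

# Hypothetical key source; in practice, use a managed secret store.
SECRET_KEY = os.environ.get("PSEUDONYM_KEY", "change-me").encode()

def pseudonymize(student_id: str) -> str:
    # Keyed hash (HMAC) so identifiers cannot be re-derived by anyone
    # without the key, unlike a plain unsalted hash.
    return hmac.new(SECRET_KEY, student_id.encode(), hashlib.sha256).hexdigest()[:16]

record = {"student_id": "s2021-0042", "quiz_score": 87}
stored = {**record, "student_id": pseudonymize(record["student_id"])}
print(stored)  # the raw identifier never reaches shared storage
```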

Prediction of educational data deprives people of the ability to choose independently

Prediction is one of the main functions of educational data analysis. However, using the prediction function requires correct attitudes and perspectives: the prediction, clustering, and anomaly detection functions provided by educational applications should be seen as aids to educational and teaching decisions, not as replacements for the decisions of educators and learners. Excessive dependence on the results of educational data mining and analysis will deprive learners and educators of the ability to decide independently.

For students, the possible impact of educational data ethics issues has two aspects. First, prediction results reduce learners’ opportunities for trial and error in the learning process. Although this greatly improves learning efficiency and predictive accuracy, it also greatly weakens learners’ independent thinking and trial-and-error innovation during learning (Zhong & Tang, 2018 ), hindering the development of innovative thinking skills. Second, data-driven mass dissemination creates an information cocoon, narrowing the horizons of learners and teachers and causing them to become engrossed only in topics they are already interested in, because algorithmically recommended information lacks diversity (Li, 2021 ). In turn, this can hinder the fostering of morally sound values and outlooks in young people.

The predictive function of educational data may free educators from some teaching concerns, but it pushes them into another constraint: educators must deal with various intelligent teaching systems, analyze various data, and obtain and judge prediction results. Because learning data cannot be exchanged between different systems and platforms, it is difficult for teachers to integrate data across them to gain a comprehensive understanding of learners, which demands higher skills and literacy from teachers (Holstein et al., 2019 ) and hinders normal teaching activities. Similarly, if teachers rely too much on the prediction function, their deep thinking is reduced, which may eventually affect their professional development (Magdy & Dony, 2020 ) and indirectly affect learners’ performance. Furthermore, based on the results of educational data processing and analysis such as learning analytics, educators’ decision-making can carry unconscious or conscious biases towards learners (Rubel & Jones, 2016 ), which leads to biased or even mistaken decisions.

Using data as an evaluation standard but lacking the ability to “forget”

Another ethical problem with the application of educational data is that evidence-based evaluations built on such data may become overly dataistic, while the ability to “forget” is lacking because data storage is distributed and permanent (Mayer-Schönberger, 2011 ).

First of all, the results of educational data analysis cannot fully ensure the accuracy of prediction, because the educational data used for prediction are incomplete: it is difficult to include all the factors that affect the accuracy of the prediction result (Schouten, 2017 ). Further consideration is therefore needed as to whether the prediction and evaluation of educational data must be accurate, and whether the algorithm can be held responsible for errors in its predictions (Essa, 2019 ). As in other fields, data-driven decision-making can lead to discrimination because of the private information contained in the data (Mittelstadt et al., 2016 ). The results of prediction and evaluation contain many private details about the learner, such as health status and family economic status, and this information is likely to bias teachers or lead them to discriminate against learners. These are unresolved ethical problems of educational data-based evaluation. Moreover, predicting and customizing future learning trajectories on the basis of educational data analysis forecloses the possibility of learners changing in unexpected ways (Mayer-Schönberger & Cukier, 2014 ). As learners are still developing, we cannot box them into fixed evaluations, labels, and categories.

Second, comprehensive collection and permanent storage of educational data mean that learners’ labels become solidified. As Richards and King ( 2015 ) proposed in their identity paradox, although data analysis can help identify and classify identities, it also affects and constrains the formation and change of identities. Since records of educational data are stored for a long time, the misbehavior of learners in their immature period is also kept for a long time, which may amplify the negative influence of early mistakes on their future performance and may even destroy their motivation for further development (Tang & Zhang, 2020 ).

Therefore, the integration of technology and education should yield human-friendly products and services in the pursuit of intelligent education. Evaluators need to judge learners from a developmental perspective and show appropriate tolerance for learners’ faults. This corresponds to the main idea of this paper: educational data should be learner-centered.

Strategies and recommendations for ethical issues in Educational Data

To solve the ethical problems faced by educational data, we conducted an in-depth analysis of the relevant literature. In general, it is necessary to coordinate the efforts of all parties to achieve collaborative innovation and thereby form a new, mutually beneficial ecology of educational data. Specifically, the strategies and suggestions for resolving the current ethical dilemmas of educational data fall into three directions:

First, at the macro level, government departments need to lead the construction and implementation of systematic standards for educational data. On this basis, constraints on the collection, storage, and sharing of educational data can be realized, the construction of educational data centers and educational digital bases can be promoted, and a foundation for educational data applications can be provided. Second, efforts should be made through the dual channel of “research-practice”, coordinating the effective forces of all parties so that educational data applications can play their full role and show their value in compliance with ethical norms. Third, education on educational data ethics is needed, mainly to establish correct concepts of and attitudes towards the use of educational data. This means not only accepting educational data with an open mind, but also fully guaranteeing autonomy and innovation.

Establish systematic standard systems or platforms of educational data from the macro level

The role of educational data in personalized learning is unquestionable, but data security is an important premise. First, different industries and companies collect and use data with varying purposes, methods, and standards. To ensure data exchange, government departments must formulate and promote educational data standards for data collection so that educational data can be standardized and systematized, which will promote the further development of educational data applications; the successful establishment of data standards in Kentucky, USA, is one example (Wang & Dan, 2021 ). At the same time, government departments can establish strong and effective protection frameworks and supervision mechanisms to monitor and constrain educational data so that they are used under controllable conditions (Ma et al., 2017 ). Relevant policies and documents already exist around the world, so ethical issues of educational data have been addressed to a certain extent; for example, the UNESCO Institute for Information Technology in Education jointly issued the “Personal Data Security Technical Guide for Online Education Platforms” (IITE, 2020 ). Yet this is far from enough to meet the needs of the era of intelligent education. This study calls on government departments to speed up the establishment of educational data standards, so that the collection, storage, and sharing of educational data can be rationalized and placed on a lawful footing.

This section analyzes the value of three kinds of infrastructure for educational data: digital bases, data centers, and education cloud platforms. By establishing data centers and digital bases, government departments can conduct convenient and fast data aggregation, governance, and application; combined with an education cloud platform, safe and efficient aggregation and governance can be achieved. The establishment of digital bases can meet the basic functional requirements of educational data collection, transmission, and computation, and these bases can also support the development of various educational digital applications.

However, because digital bases tend toward public-good provision and natural monopoly, their construction and operation need to be led by government departments (Gao, 2021 ). The establishment of data centers can standardize data management, promote integration between business modules, simplify the sharing and exchange of educational data, and improve efficiency (Li, Shu et al., 2021 ), which in turn supports accurate, intelligent, and personalized decision-making based on educational big data. It is particularly important that data centers can build a safe and credible data system to ensure the secure circulation of educational data by technological means such as disaster recovery backup and rights management (Gu & Li, 2021 ). The education cloud platform mainly provides services such as single sign-on, unified identity authentication, a unified portal, unified interfaces, and a unified data center, which help realize effective integration and management of access rights to educational data. At the same time, the education cloud platform can integrate software and applications, enabling seamless switching between multiple applications (Yang & Yu, 2015 ).

Additionally, 5G+ can provide the communication foundation for constructing digital bases, data centers, and education cloud platforms, so the vigorous development and construction of 5G+ education is needed. For example, 5G+ can be used to build multi-interactive intelligent communication platforms offering better interaction, learning support, and immersive experiences (Wang & Wang, 2020 ). Owing to the large scale of the relevant construction, it must rest on standardized and unified data standards, such as interface specifications. Furthermore, the construction and management of 5G+ education involve high technical requirements, complex structures, extensive coverage, and long cycles, so they should be led and carried out by the relevant construction departments.

Construction of a new ecology of educational data by the dual path of “research-practice”

With corresponding policy support from states and government departments, there are two ways to solve the real ethical problems faced by educational data: the practice of applying educational data and academic research. In other words, it is necessary to fully transform research outcomes into practical applications. This coincides with the international call for education stakeholders to collaboratively address ethical issues related to educational data (Siemens, 2019 ).

On the one hand, managers of educational data must strengthen their own education in data ethics. They need not only to cultivate their awareness and understanding of data ethics concepts, but also to improve their professional quality and data management literacy (Onorato, 2013 ). Suppliers of educational data-related products and services should also receive professional training; for example, from the very beginning of product design and development they should consider how to effectively protect data privacy and security (Robinson & Gran, 2018 ), and they need to attend to and investigate users’ data privacy needs. Relevant data service personnel likewise need to improve their awareness and knowledge of educational data ethics (Jones, 2019 ) and must ensure that the storage, use, and sharing of data are open and transparent. Especially in the informed consent process, wording that users can understand should be preferred over overly technical agreement content (Siemens, 2019 ). It is also necessary to control data disclosure to some extent when providing products and services built on educational data (Li, Chen et al., 2021 ), which helps avoid hidden dangers or negative impacts on users’ privacy.

On the other hand, it is also necessary to research the disciplines related to the ethical issues of educational data, which is a main task for researchers and practitioners. Research departments and institutions should explore the relevant ethical questions on the basis of educational data, for instance by investigating the positive relationship between the standardization of educational data and the utilization rate of educational data applications (Pentland, 2014 ). Education intelligence, distributed blockchain storage, key encryption, and other technologies can protect the privacy of educators and learners (Zeng et al., 2020 ) and are widely considered important in educational research. The decentralized nature of the blockchain in particular enables credible, reliable, and highly private data storage and sharing.

Many blockchain application scenarios in education have been studied to build relatively complete architectures and implementation paths, such as the combination of blockchain and credit banks (Yuan, 2021 ). The roles of the technologies involved are as follows. First, chained, timestamped blocks and hash values ensure the traceability and tamper-resistance of data. Distributed ledger technology helps realize decentralization, flat system architectures, and consensus mechanisms; on this basis, all participants can directly access educational data under appropriate permissions. In addition, smart contract technology can automate the credit conversions specified in a contract. Furthermore, the combination of blockchain and federated learning can solve the problem of “data silos” between institutions and organizations, that is, realize data sharing between organizations under specific policies or standards, mainly through the decentralization of the blockchain. At the same time, the blockchain can also support the sharing of differentially private multiparty data models to ensure communication security and privacy protection in the process of data sharing and use (Li, Yuan et al., 2021 ).
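For intuition about the hash-chaining mechanism behind this traceability and tamper-resistance, the toy sketch below links each block of educational records to its predecessor’s hash, so any later edit breaks the chain. It illustrates the mechanism only; record fields are hypothetical, and a production ledger would also need consensus, signatures, and distribution.

```python
# Toy sketch of hash-chained educational records.
import hashlib
import json
import time

def make_block(record: dict, prev_hash: str) -> dict:
    body = {"record": record, "prev_hash": prev_hash, "timestamp": time.time()}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body

def verify(chain: list) -> bool:
    for i, block in enumerate(chain):
        # Recompute each block's hash over everything except the hash itself.
        expected = {k: v for k, v in block.items() if k != "hash"}
        digest = hashlib.sha256(json.dumps(expected, sort_keys=True).encode()).hexdigest()
        if digest != block["hash"]:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = [make_block({"course": "math", "credit": 2}, prev_hash="0" * 64)]
chain.append(make_block({"course": "ethics", "credit": 1}, chain[-1]["hash"]))
chain[0]["record"]["credit"] = 10  # tampering with an earlier record...
print(verify(chain))               # ...is detected: False
```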

At present, many other technologies can be combined with blockchain to ensure data security and privacy protection, such as Secure Multi-Party Computation (SMC), zero-knowledge proofs, ring signatures, homomorphic encryption, and Trusted Execution Environments (TEE) (Zhang, Wang, & Li, 2021 ).

Moderate application and forgetting of educational data

It is also important to consider how educational data ethics can be guaranteed and realized in practice. This can start with data ethics education. On the one hand, the cultivation of educational data ethics should be strengthened in learner-centered education and teaching so as to ensure the appropriate application of educational data. On the other hand, for the data records themselves, reasonable algorithms should be used to achieve moderate forgetting in data-based and evidence-based educational evaluation.

First of all, teachers and learners are not only the direct beneficiaries of educational data applications but also those most directly threatened by the ethical issues of educational data. To effectively enhance awareness and ethical cultivation around educational data, it is therefore necessary to provide ethics education about educational data, which can improve teachers’ and learners’ control over their own data. On the one hand, such education should enhance subjects’ awareness of educational data protection (Chen et al., 2018 ); it is especially important to avoid unintentional violations of privacy when teachers and learners use products and services in the educational process. On the other hand, to help educational subjects understand the correct way to use educational data, it is necessary to know what educational data are and why we need to use them; only by fully understanding the uses and value of educational data can they be implemented well (Hazelbaker, 2016 ). Moreover, educational subjects must develop a correct understanding of how to use educational data appropriately, that is, to control and moderate its use. Learners must accurately understand their own position as learners; it is especially important to enhance learners’ ability to withstand psychological stress and to strengthen their psychological resilience in the face of educational data applications (Zhou & Tang, 2020 ). Additionally, teachers should not be too data-driven or too strict in implementing learning evaluations. One-sided dataism should be avoided and data-based, evidence-based evaluation standards appropriately relaxed so as to achieve fair and just evaluation of learners and implement the ethical concept of learner-centered educational data.

Second, given the accessibility, permanence, and comprehensiveness of digital memory in the digital age, a lack of moderate forgetting turns digital records into a panopticon for teachers and learners. That is to say, without moderate forgetting, educational data confine the educational subject in a digital cage, analogous to the iron cage of social development proposed by Weber in The Protestant Ethic and the Spirit of Capitalism. In this situation, a “chilling effect” is easily triggered: learners reduce their activities and avoid conscious or unconscious mistakes in the learning process because these will be recorded and permanently retained, and their opportunities for normal learning behavior are thereby curtailed. Moderate forgetting of educational data can be realized both technically and ethically. On the technical side, the core idea of data forgetting is that, as data fade, the evaluated subject should tend to return to the state before the data were generated (Yang, 2020 ), and algorithms guided by this idea can help achieve moderate forgetting. Ethically, forgetting should be regarded as a virtue in the digital age. With the advent of the self-media era, the release, dissemination, and diffusion of information are often fast and difficult to monitor, so it is important to form corresponding community norms (Mayer-Schönberger, 2011 ). Overall, the ethical regulation of all producers and users of educational data is crucial.
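As one illustration of how such a forgetting algorithm might look, the sketch below exponentially down-weights older records so that an evaluation drifts back toward a neutral prior as evidence fades, echoing the idea that the evaluated subject returns to a pre-data state. The half-life and neutral score are hypothetical parameters, not values drawn from the literature reviewed here.

```python
# Illustrative "moderate forgetting" via exponential time decay.
def decayed_score(records, now, half_life_days=180, neutral=0.5):
    """records: list of (timestamp_days, score in [0, 1])."""
    weighted, total_weight = 0.0, 0.0
    for t, score in records:
        w = 0.5 ** ((now - t) / half_life_days)  # older records weigh less
        weighted += w * score
        total_weight += w
    # Blend toward the neutral prior as the evidence fades.
    return (weighted + neutral) / (total_weight + 1)

records = [(0, 0.2), (300, 0.9)]  # an early mistake, later improvement
print(round(decayed_score(records, now=400), 2))  # early mistake barely counts
```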

In general, with the continuous integration of emerging technologies such as artificial intelligence and big data into education, educational data have become central to the balanced development of intelligent education, and educational data ethics is a common dilemma faced by researchers in related fields. Through bibliometric analysis and an in-depth literature review, this study has analyzed the research hotspots, evolution, and development trends of the field and confirmed that issues of educational data ethics are important factors affecting educational informatization and intelligence. At the same time, through detailed reading of the literature, the study has identified the three main dilemmas in current research on educational data ethics and the corresponding strategies.

The dilemmas are as follows: (a) the privacy of educational subjects is violated during data collection, storage, and sharing; (b) the prediction function of educational data deprives educational subjects of their ability to choose independently; and (c) data are used as an evaluation standard but lack the ability to be forgotten. Three learner-centered strategies provide research directions and foundations for researchers and practitioners in related fields: (a) establish systematic educational data standards and related platforms at the macro level; (b) build a new educational data ecology through the dual “research-practice” channels; and (c) implement appropriate ethics education and moderate application and forgetting of educational data during evaluation. On the one hand, stakeholders must have a correct understanding of data ethics; on the other hand, intelligent technology must be able to safeguard data ethics automatically. These results can inspire future research and practice in educational data ethics in China.

However, this study also has limitations: it relies mainly on bibliometric analysis and in-depth literature review to support its arguments and lacks the backing of specific empirical practice. Future research is therefore expected to make breakthroughs in the practical resolution of the ethical problems of educational data, contributing both theoretically and practically to the application of educational data and helping to break the ethical barriers in education.

Problems of educational data ethics

With the continuous integration of technology and education, the problems of educational data ethics have received extensive attention from researchers, though usually with a focus on specific educational technologies or methods. First, some researchers have examined the ethical issues related to the use of video in education (Peters et al., 2021 ); Peters et al. emphasized the necessity of carefully considering the questions video raises: Why is video needed? How can consent and anonymity be achieved? How can videos be processed to protect data privacy? Second, learning analytics is based mainly on the analysis of digital traces or footprints generated in the learning process, so it raises significant issues of data privacy and security, for example in the prediction of learning trends (Mathrani et al., 2021 ). Based on the literature published from 2011 to 2018, Tzimas and colleagues summarized the data ethics problems of learning analytics under three headings: teaching intervention, the tension between learners’ needs and privacy and security, and the mismatch between technological change and legal regulation (Tzimas & Demetriadis, 2021 ). Third, big data also involves many ethical issues. Baig et al.’s ( 2020 ) review of 40 primary studies published from 2014 to 2019 shows that ethics is an important direction of development. Research from several countries and regions on the ethics of big data suggests, on the one hand, that many people lack an accurate understanding of big data, especially its predictive ability, and, on the other hand, that stakeholders disregard morality and politics when they collect and use large amounts of data through social networks, mobile applications, and other means (Chen & Quan-Haase, 2018 ). In general, there are many problems of educational data ethics, but most studies address them only with respect to a specific technology combined with education.

Solutions to educational data ethics

To solve the problems of educational data ethics, existing researchers have proposed solutions from different perspectives, but these lack systematization. From the perspective of educational leaders and researchers, some have explored how to use AI and other technologies in education fairly, ethically, and effectively, stressing that it is crucial for all parties to join together to develop strong educational ethics (Roschelle et al., 2020 ). Owing to the particularity of China’s political, economic, and cultural background, central government policies play an important role in solving the ethical issues of educational data there (Knox, 2020 ). From the perspective of researchers and developers of educational data-related technologies, some have considered how to protect educational data ethics; Shum and colleagues explore how developers can avoid or respond to data ethics issues during their work (Shum & Luckin, 2019 ). Furthermore, from the perspective of educational technology companies, some researchers have focused on the challenges, solutions, and needs these companies face regarding data ethics; for example, Kousa and Niemi ( 2022 ) argue that the research and development of AI education products should be preventive, safe, explainable, and equal. Meeting the data ethics challenges posed by artificial intelligence likewise requires the collaboration of multiple stakeholders, including companies, consumers, educational institutions, researchers, funders, and managers. In any case, research on educational data ethics must be learner-centered (Tzimas & Demetriadis, 2021 ), must effectively safeguard data ethics, and must take cultural backgrounds into account while collaborating with stakeholders in different situations. In general, there is an urgent need for a more systematic educational data ethics solution focused on a specific country or context.

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Alzahrani AS, Tsai Y-S, Iqbal S, Moreno Marcos PM, Scheffel M, Drachsler H, Kloos CD, Aljohani N, Gasevic D (2022) Untangling connections between challenges in the adoption of learning analytics in higher education. Educ Inform Technol. https://doi.org/10.1007/s10639-022-11323-x

Baig MI, Shuib L, Yadegaridehkordi E (2020) Big data in education: a state of the art, limitations, and future research directions. Int J Educ Technol High Educ 17(1):44. https://doi.org/10.1186/s41239-020-00223-0


Baker RS, Inventado PS (2014) Educational data mining and learning analytics. In: Larusson J., White B. (eds) Learning analytics. Springer, New York, NY

CDT (2021a) Technological school safety initiatives: considerations to protect all students. Retrieved June 24, 2022 from https://www.brennancenter.org/sites/default/files/analysis/20190524schoolsafety.pdf

CDT (2021b) Data ethics in education and the social sector: what does it mean and why does it matter? Retrieved June 24, 2022 from https://cdt.org/insights/report-data-ethics-in-education-and-the-social-sector-what-does-it-mean-and-why-does-it-matter/

Chen W, Quan-Haase A (2018) Big data ethics and politics: toward new understandings. Soc Sci Comput Rev 38(1):3–9. https://doi.org/10.1177/0894439318810734

Chen W, Quan-Haase A, Park YJ (2018) Privacy and data management: the user and producer perspectives. Am Behav Sci 0002764218791287. https://doi.org/10.1177/0002764218791287

Cyberspace Administration of China (2021) Regulations on the Administration of Network Data Security (Draft for Comment). Retrieved June 24, 2022 from http://www.cac.gov.cn/2021-11/14/c_1638501991577898.htm

EDUCAUSE (2021) EDUCAUSE horizon report | Information security edition. Retrieved June 24, 2022 from https://library.educause.edu/resources/2021/2/2021-educause-horizon-report-information-security-edition

Essa A (2019) Is data dark? Lessons from Borges’s “Funes the Memorious”. J Learn Anal 6(3):35–42. https://doi.org/10.18608/jla.2019.63.7

Gao X (2021) Establishing a new form of government governance to adapt to the digital age. Exp Free View 04:141–146+179–180. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=TSZM202104021&DbName=DKFX2021


Ge H, Li X, Liu X (2021) Public understanding of and participation in ethical events relating to science and technology: Comparing China with other countries. Cult Sci 4(2):90–96

General Services Administration (2020) Data Ethics Framework. Retrieved June 24, 2022 from https://strategy.data.gov/assets/docs/data-ethics-framework-action-14-draft-2020-sep-2.pdf

Government Digital Service (2020) Data Ethics Framework. Retrieved June 24, 2022 from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/923108/Data_Ethics_Framework_2020.pdf

Gu X, Li S (2021) Artificial intelligence education brain: a technical framework for data-driven educational governance and teaching innovation. CET China Educ Technol 01:80–88. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=ZDJY202101012&DbName=CJFQ2021

Hakimi L, Eynon R, Murphy VA (2021) The ethics of using digital trace data in education: a thematic review of the research landscape. Rev Educ Res 91(5):671–717. https://doi.org/10.3102/00346543211020116

Hazelbaker C (2016) Rowing together, vendors and CIOs navigate tricky relationships. https://er.educause.edu/articles/2016/2/rowing-together-vendors-and-cios-navigate-tricky-relationships

Hoel T, Chen W (2018) Privacy and data protection in learning analytics should be motivated by an educational maxim—towards a proposal. Res Pract Technol Enhanc Learn 13(1):20. https://doi.org/10.1186/s41039-018-0086-8


Holstein K, McLaren BM, Aleven V (2019) Co-designing a real-time classroom orchestration tool to support teacher-AI complementarity. J Learn Anal 6(02):27–52. https://doi.org/10.18608/jla.2019.62.3

IITE (2020) Personal Data Security Technical Guide for Online Education Platforms. UNESCO IITE. Retrieved June 24, 2022 from https://iite.unesco.org/news/personal-data-security-technical-guide-for-online-education-platforms/#:~:text=The%20new%20Personal%20Data%20Security%20Technical%20Guide%20for,a%20long-standing%20issue%20with%20development%20of%20online%20learning

Jones KML (2019) ‘Just because you can doesn’t mean you should’: practitioner perceptions of learning analytics ethics. Libr Acad 3(19):407–428

Jones KML, Briney KA, Goben A, Salo D, Asher A, Perry MR (2020) A comprehensive primer to library learning analytics practices, initiatives, and privacy issues. Coll Res Libr 81(3):570–591. https://doi.org/10.5860/crl.81.3.570

Knox J (2020) Artificial intelligence and education in China. Learn Media Technol 45(3):298–311. https://doi.org/10.1080/17439884.2020.1754236

Koondhar MA, Shahbaz M, Memon KA, Ozturk I, Kong R (2021) A visualization review analysis of the last two decades for environmental Kuznets curve “EKC” based on co-citation analysis theory and pathfinder network scaling algorithms. Environ Sci Pollut Res 28:16690–16706. https://doi.org/10.1007/s11356-020-12199-5


Kousa P, Niemi H (2022) AI ethics and learning: EdTech companies’ challenges and solutions. Interact Learn Environ 1–12. https://doi.org/10.1080/10494820.2022.2043908

Kularski CM, Martin F (2021) Online student privacy in higher education: a systematic review of the research. Am J Dist Educ 1–15. https://doi.org/10.1080/08923647.2021.1978784

Li A, Shu H, Gu X (2021) Building an educational artificial intelligence brain: the technical architecture and implementation path of the educational data middle platform. Educ Res 27(03):96–103. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=JFJJ202103011&DbName=CJFQ2021

Li L, Yuan S, Jin Y (2021) Review of blockchain-based federated learning. Appl Res Comput 38(11):3222–3230. https://doi.org/10.19734/j.issn.1001-3695.2021.04.0094

Li Y (2021) The integration of the spirit of combating the COVID-19 epidemic into ideological and political education in universities under the background of the Internet. 2021 International Conference on Internet, Education and Information Technology (IEIT), pp. 304–308. https://doi.org/10.1109/IEIT53597.2021.00074 . https://ieeexplore.ieee.org/document/9525478/

Li Y, Chen X, Sun D, Zhu Y, Zhai X (2021) From “Transparent People” to “Practitioner”: Challenges and Responses to Information Security in Higher Education: implications from 2021 EDUCAUSE Horizon Report (Information Security Edition). J Dist Educ 39(03):11–19. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=YCJY202103003&DbName=CJFQ2021

CAS   Google Scholar  

Ma Y, Bai M, Zhou Z (2017) Development of artificial intelligence in education in China in smart education era——An interpretation and enlightenment of preparing for the future of artificial intelligence. e-Educ Res 38(03):123–128. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=DHJY201703021&DbName=CJFQ2017

Magdy A, Dony C (2020) 1st ACM SIGSPATIAL Workshop on Geo-Computational Thinking in Education (GeoEd 2019): Chicago, Illinois, USA–November 5, 2019. SIGSPATIAL Special 11(3):12–13. https://doi.org/10.1145/3383653.3383657

Mandinach EB, Jimerson JB (2022) Data ethics in education: a theoretical, practical, and policy issue. Stud Paedagog 26(4):9–26. https://doi.org/10.5817/sp2021-4-1

Mathrani A, Susnjak T, Ramaswami G, Barczak A (2021) Perspectives on the challenges of generalizability, transparency and ethics in predictive learning analytics. Comput Educ Open 2:100060. https://doi.org/10.1016/j.caeo.2021.100060

Mayer-Schönberger V (2011) Delete: the virtue of forgetting in the digital age. Princeton University Press

Mayer-Schönberger V, Cukier K (2014) Learning with big data: the future of education. Harper Business

Ministry of Science and Technology of the People’s Republic of China, M (2021) Guiding Opinions on Strengthening the Ethical Governance of Science and Technology (Draft for Comment). http://www.most.gov.cn/tztg/202107/t20210728_176136.html

Mittelstadt B, Allo P, Taddeo M, Wachter S, Floridi L (2016) The Ethics of Algorithms: Mapping the Debate. Big Data Soc (in press). https://doi.org/10.1177/2053951716679679

Morley J, Elhalal A, Garcia F, Kinsey L, Mokander J, Floridi L (2021) Ethics as a service: a pragmatic operationalisation of AI ethics. Minds Mach 31(2):239–256. https://doi.org/10.1007/s11023-021-09563-w

Onorato M (2013) Transformational leadership style in the educational sector: an empirical study of corporate managers and educational leaders. Acad Educ Leadersh J 17(1):33–47. https://www.proquest.com/scholarly-journals/transformational-leadership-style-educational/docview/1368593704/se-2?accountid=10659

MathSciNet   Google Scholar  

Open Data Institute (2021) Data Ethics Canvas. Retrieved June 24, 2022 from https://theodi.org/article/the-data-ethics-canvas-2021/

Pentland A (2014) Social physics: How good ideas spread-the lessons from a new science. Penguin Press

People’s Republic of China (2021a) Data Security Law of the People’s Republic of China. Retrieved June 24, 2022 from http://en.npc.gov.cn.cdurl.cn/2021-06/10/c_689311.htm

People’s Republic of China (2021b) Personal Information Protection Law of the People’s Republic of China. Retrieved June 24, 2022 from http://en.npc.gov.cn.cdurl.cn/2021-12/29/c_694559.htm

Peters MA, White EJ, Besley T, Locke K, Redder B, Novak R, Gibbons A, O’Neill J, Tesar M, Sturm S (2021) Video ethics in educational research involving children: literature review and critical discussion. Educ Philos Theor 53(9):863–880. https://doi.org/10.1080/00131857.2020.1717920

Prestigiacomo R, Hunter J, Knight S, Martinez-Maldonado R, Lockyer L (2020) Data in practice: a participatory approach to understanding pre-service teachers’ perspectives. Australas J Educ Technol 36(6):107–119. https://doi.org/10.14742/ajet.6388

Richards NM, King JH (2015) Three paradoxes of big data. Appl Mech Mater 743:603–606

Robinson L, Gran B (2018) No kid is an Island: privacy scarcities and digital inequalities. Am Behav Sci 62:000276421878701. https://doi.org/10.1177/0002764218787014

Rosa MJ, Williams J, Claeys J, Kane D, Bruckmann S, Costa D, Rafael JA (2022) Learning analytics and data ethics in performance data management: a bench learning exercise involving six European universities. Qual High Educ 28(1):65–81. https://doi.org/10.1080/13538322.2021.1951455

Roschelle J, Lester J, Fusco J (2020) AI and the future of learning: expert panel report. https://circls.org/wp-content/uploads/2020/11/CIRCLS-AI-Report-Nov2020.pdf

Rubel A, Jones KML (2016) Student privacy in learning analytics: an information ethics perspective. Inform Soci 32(2):143–159. https://doi.org/10.1080/01972243.2016.1130502

Schouten G (2017) On meeting students where they are: teacher judgment and the use of data in higher education. Theor Res Educ 15(3):321–338. https://doi.org/10.1177/1477878517734452

Article   MathSciNet   Google Scholar  

Shibani A, Knight S, Shum SB (2020) Educator perspectives on learning analytics in classroom practice. Internet High Educ 46. https://doi.org/10.1016/j.iheduc.2020.100730

Shum SJB, Luckin R (2019) Learning analytics and AI: Politics, pedagogy and practices. Br J Educ Technol 50(6):2785–2793. https://doi.org/10.1111/bjet.12880

Siemens G (2019) Learning analytics and open, flexible, and distance learning. Dist Educ 40(3):414–418. https://doi.org/10.1080/01587919.2019.1656153

Tang H, Zhang J (2020) Limits of big data application in education. J East China Norm Univ (Educ Sci) 38(10):60–68. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=HDXK202010005&DbName=CJFQ2020

Tlili A, Chang M, Moon J, Liu Z, Burgos D, Chen N-S, Kinshuk (2021) A systematic literature review of empirical studies on learning analytics in educational games. Int J Interact Multimedia Artif Intell 7(2):250–261. https://doi.org/10.9781/ijimai.2021.03.003

Tzimas D, Demetriadis S (2021) Ethical issues in learning analytics: a review of the field. Educ Technol Res Dev 69(2):1101–1133. https://doi.org/10.1007/s11423-021-09977-4

UNESCO (2019) BEIJING CONSENSUS on artificial intelligence and education. Retrieved June 24, 2022 from http://www.moe.gov.cn/jyb_xwfb/gzdt_gzdt/s5987/201908/W020190828311234688933.pdf

UNESCO (2021) Recommendation on the Ethics of Artificial Intelligence. Retrieved June 24, 2022 from https://unesdoc.unesco.org/ark:/48223/pf0000380455

United Nations (2020) Promotion and protection of human rights: human rights questions, including alternative approaches for improving the effective enjoyment of human rights and fundamental freedoms. United Nations. Retrieved June 24, 2022 from https://undocs.org/en/A/RES/75/176

van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, Kramer B, Huijts M, Hoogerwerf M, Ferdinands G, Harkema A, Willemsen J, Ma Y, Fang Q, Hindriks S, Tummers L, Oberski DL (2021) An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell 3:125–133. https://doi.org/10.1038/s42256-020-00287-7

Vanhee L, Borit M (2022) Viewpoint: ethical by designer-how to grow ethical designers of artificial intelligence. J Artif Intell Res 73:619–631. https://doi.org/10.1613/jair.1.13135

Article   MathSciNet   MATH   Google Scholar  

Viberg O, Engstrom L, Saqr M, Hrastinski S (2022) Exploring students’ expectations of learning analytics: a person-centered approach. Educ Inform Technol 27(6):8561–8581. https://doi.org/10.1007/s10639-022-10980-2

Wang J, Chen S, Wang L, Yang X (2016) The analysis of research hot spot and trend on big data in education based on CiteSpace. Mod Educ Technol 26(02):5–13. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=XJJS201602002&DbName=CJFQ2016

Wang S, Wang Y (2020) The connotation, key characteristics and communication model of 5G+ education. ChongQing High Educ Res 8(02):35–47. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=YXXY202002004&DbName=CJFQ2020

Wang Z, Dan J (2021) How to construct the education data governance system—Taking the successful experience of Kentucky as an example. Mod Dist Educ Res 33(01):77–86. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=XDYC202101010&DbName=CJFQ2021

Yang J, Liang R (2017) Challenges and reform of university classroom teaching model in the age of big data. e-Educ Res 38(08):111–115. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=DHJY201708020&DbName=CJFQ2017

Yang Q (2020) Health QR code, human deep datafication and construction of forgetting ethics. Expl Free View 09:123-129+160–161. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=TSZM202009026&DbName=DKFX2020

Yang X, Yu S (2015) The architecture and key support technologies of smart education. China Educ Technol 01:77-84+130, https://kns.cnki.net/kcms/detail/detail.aspx?FileName=ZDJY201501015&DbName=CJFQ2015

Yuan Y (2021) Study on the platform design of the national credit bank of vocational education based on “Internet Plus”. China Educ Technol 04:84–90. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=ZDJY202104011&DbName=CJFQ2021

Zeng Z, Li Y, Cao Y, Zhao Y, Zhong J, Sidorov D, Zeng X (2020) Blockchain technology for information security of the energy internet: Fundamentals, features, strategy and application. Energies 13:881. https://doi.org/10.3390/en13040881

Zhai X, Chu X, Chai CS, Jong MSY, Istenic A, Spector M, Liu J, Yuan J, Li Y (2021) A review of artificial intelligence (AI) in education from 2010 to 2020. Complexity, 2021. https://doi.org/10.1155/2021/8812542

Zhang H, Wang M, Li G (2021) The development status of frontier technology of blockchain security and privacy protection. Inform Technol Netw Secur 40(05):7–12. https://kns.cnki.net/kcms/detail/detail.aspx?FileName=WXJY202105002&DbName=CJFQ2021

Zhong S, Tang Y (2018) Research on the orientation and route of educational innovative development in the age of artificial intelligence. e-Educ Res 39(10):15-20+40, https://kns.cnki.net/kcms/detail/detail.aspx?FileName=DHJY201810004&DbName=CJFQ2018

Zhou X, Tang W(2020) Research on the ethical consciousness of university education management under the background of big data J Qiqihar Univ (Philos Soc Sci Edn) 10:166–169 https://kns.cnki.net/kcms/detail/detail.aspx?FileName=QQHD202010044&DbName=CJFQ2020

Download references

Acknowledgements

This work was supported by the Science and Technology Commission of Shanghai Municipality [grant number 21ZR1419100]. Xiu Guan and Xiang Feng contributed equally to this article and should be considered joint first authors. Xiang Feng and A.Y.M. Atiquil Islam should be considered corresponding authors.

Author information

Authors and Affiliations

Shanghai Engineering Research Center of Digital Educational Equipment, East China Normal University, Shanghai, China

Xiu Guan & Xiang Feng

Department of Education Information Technology, East China Normal University, Shanghai, China

Xiang Feng & A.Y.M. Atiquil Islam

School of Teacher Education, Jiangsu University, Zhenjiang, China

A.Y.M. Atiquil Islam


Corresponding authors

Correspondence to Xiang Feng or A.Y.M. Atiquil Islam.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Guan, X., Feng, X. & Islam, A.A. The dilemma and countermeasures of educational data ethics in the age of intelligence. Humanit Soc Sci Commun 10, 138 (2023). https://doi.org/10.1057/s41599-023-01633-x


Received: 22 August 2022

Accepted: 16 March 2023

Published: 01 April 2023

DOI: https://doi.org/10.1057/s41599-023-01633-x


The digital traveller: implications for data ethics and data governance in tourism and hospitality

Journal of Consumer Marketing

ISSN: 0736-3761

Article publication date: 3 September 2021

Issue publication date: 7 February 2023

Purpose

Big data and analytics are being increasingly used by tourism and hospitality organisations (THOs) to provide insights and to inform critical business decisions. Particularly in times of crisis and uncertainty, data analytics supports THOs in acquiring the knowledge needed to ensure business continuity and the rebuilding of the tourism and hospitality sectors. Despite being recognised as an important source of value creation, big data and digital technologies raise ethical, privacy and security concerns. This paper aims to suggest a framework for ethical data management in tourism and hospitality designed to facilitate and promote effective data governance practices.

Design/methodology/approach

The paper adopts an organisational and stakeholder perspective through a scoping review of the literature to provide an overview of an under-researched topic and to guide further research in data ethics and data governance.

Findings

The proposed framework integrates an ethics-based approach which expands beyond mere compliance with privacy and protection laws to include other critical facets regarding privacy and ethics, an equitable exchange of travellers’ data, and THOs’ ability to demonstrate a social license to operate by building trusting relationships with stakeholders.

Originality/value

This is one of the first studies to consider the development of an ethical data framework for THOs and a platform for further refinement in future conceptual and empirical research on such data governance frameworks. It contributes to the advancement of the body of knowledge in data ethics and data governance in tourism and hospitality and other industries, and it is also beneficial to practitioners, as organisations may use it as a guide in data governance practices.

Keywords

  • Hospitality
  • Data governance
  • Data ethics
  • Digital privacy

Yallop, A.C., Gică, O.A., Moisescu, O.I., Coroș, M.M. and Séraphin, H. (2023), "The digital traveller: implications for data ethics and data governance in tourism and hospitality", Journal of Consumer Marketing, Vol. 40 No. 2, pp. 155-170. https://doi.org/10.1108/JCM-12-2020-4278


Copyright © 2021, Emerald Publishing Limited


5 Principles of Data Ethics for Business


16 Mar 2021

Data can be used to drive decisions and make an impact at scale. Yet, this powerful resource comes with challenges. How can organizations ethically collect, store, and use data? What rights must be upheld? The field of data ethics explores these questions and offers five guiding principles for business professionals who handle data.

What Is Data Ethics?

Data ethics encompasses the moral obligations of gathering, protecting, and using personally identifiable information and how it affects individuals.

“Data ethics asks, ‘Is this the right thing to do?’ and ‘Can we do better?’” Harvard Professor Dustin Tingley explains in the Harvard Online course Data Science Principles.

Data ethics is of the utmost concern to analysts, data scientists, and information technology professionals. Anyone who handles data, however, must be well-versed in its basic principles.

For instance, your company may collect and store data about customers’ journeys from the first time they submit their email address on your website to the fifth time they purchase your product. If you’re a digital marketer, you likely interact with this data daily.

While you may not be the person responsible for implementing tracking code, managing a database, or writing and training a machine-learning algorithm, understanding data ethics can allow you to catch any instances of unethical data collection, storage, or use. By doing so, you can protect your customers' safety and save your organization from legal issues.

Here are five principles of data ethics to apply at your organization.


5 Principles of Data Ethics for Business Professionals

1. Ownership

The first principle of data ethics is that an individual has ownership over their personal information. Just as it’s considered stealing to take an item that doesn’t belong to you, it’s unlawful and unethical to collect someone’s personal data without their consent.

Some common ways you can obtain consent are through signed written agreements, digital privacy policies that ask users to agree to a company’s terms and conditions, and pop-ups with checkboxes that permit websites to track users’ online behavior with cookies. Never assume a customer is OK with you collecting their data; always ask for permission to avoid ethical and legal dilemmas.
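Consent also needs to be recorded in a way you can audit later. The sketch below is illustrative only (the `ConsentRecord` type and its field names are hypothetical, not from any particular framework); it shows one way to track what a user agreed to and whether that consent is still active:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    """One user's consent grant; all field names are illustrative."""
    user_id: str
    purposes: list[str]  # e.g. ["analytics", "marketing_emails"]
    granted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    withdrawn_at: Optional[datetime] = None

    def is_active(self, purpose: str) -> bool:
        # Consent counts only for purposes the user explicitly granted,
        # and only while it has not been withdrawn.
        return purpose in self.purposes and self.withdrawn_at is None

record = ConsentRecord(user_id="u-123", purposes=["analytics"])
assert record.is_active("analytics")
assert not record.is_active("marketing_emails")  # never granted, so never assumed
```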

2. Transparency

In addition to owning their personal information, data subjects have a right to know how you plan to collect, store, and use it. When gathering data, exercise transparency.

For instance, imagine your company has decided to implement an algorithm to personalize the website experience based on individuals’ buying habits and site behavior. You should write a policy explaining that cookies are used to track users’ behavior and that the data collected will be stored in a secure database and used to train an algorithm that provides a personalized website experience. It’s a user’s right to have access to this information so they can decide to accept your site’s cookies or decline them.

Withholding or lying about your company’s methods or intentions is deception, and it is both unlawful and unfair to your data subjects.

3. Privacy

Another ethical responsibility that comes with handling data is ensuring data subjects’ privacy. Even if a customer gives your company consent to collect, store, and analyze their personally identifiable information (PII), that doesn’t mean they want it publicly available.

PII is any information linked to an individual’s identity. Some examples of PII include:

  • Street address
  • Phone number
  • Social Security card
  • Credit card information
  • Bank account number
  • Passport number

To protect individuals’ privacy, ensure you’re storing data in a secure database so it doesn’t end up in the wrong hands. Data security methods that help protect privacy include dual-authentication password protection and file encryption.
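As a minimal sketch of encrypting data at rest, assuming the widely used Python `cryptography` package is installed (the payload below is made up):

```python
from cryptography.fernet import Fernet

# Generate the key once and keep it in a secrets manager,
# never in the same place as the encrypted data.
key = Fernet.generate_key()
fernet = Fernet(key)

plaintext = b"name,email,phone"  # hypothetical PII export
ciphertext = fernet.encrypt(plaintext)

# Only holders of the key can recover the original bytes.
assert fernet.decrypt(ciphertext) == plaintext
```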

Even professionals who regularly handle and analyze sensitive data can make mistakes. One way to prevent slip-ups is by de-identifying a dataset. A dataset is de-identified when all pieces of PII are removed, leaving only anonymous data. This enables analysts to find relationships between variables of interest without attaching specific data points to individual identities.
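In practice, de-identification can start as simply as dropping the direct identifiers before analysts ever see the data. A minimal sketch with pandas (the DataFrame and column names are hypothetical):

```python
import pandas as pd

# Hypothetical raw survey data.
raw = pd.DataFrame({
    "name": ["Ada Lovelace", "Alan Turing"],
    "email": ["ada@example.com", "alan@example.com"],
    "age": [36, 41],
    "satisfaction": [4, 5],
})

PII_COLUMNS = ["name", "email"]  # direct identifiers to remove

# Analysts work only with the de-identified view.
deidentified = raw.drop(columns=PII_COLUMNS)
print(deidentified)
```

A real pipeline would go further: quasi-identifiers such as age or location can still re-identify people in combination, so they often need to be generalized or suppressed as well.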

Related: Data Privacy: 4 Things Every Business Professional Should Know

4. Intention

When discussing any branch of ethics, intentions matter. Before collecting data, ask yourself why you need it, what you’ll gain from it, and what changes you’ll be able to make after analysis. If your intention is to hurt others, profit from your subjects’ weaknesses, or any other malicious goal, it’s not ethical to collect their data.

When your intentions are good—for instance, collecting data to gain an understanding of women’s healthcare experiences so you can create an app to address a pressing need—you should still assess your intention behind the collection of each piece of data.

Are there certain data points that don’t apply to the problem at hand? For instance, is it necessary to ask if the participants struggle with their mental health? This data could be sensitive, so collecting it when it’s unnecessary isn’t ethical. Strive to collect the minimum viable amount of data, so you’re taking as little as possible from your subjects while making a difference.

Related: 5 Applications of Data Analytics in Health Care

5. Outcomes

Even when intentions are good, the outcome of data analysis can cause inadvertent harm to individuals or groups of people. This is called a disparate impact, which is outlined in the Civil Rights Act as unlawful.

In Data Science Principles, Harvard Professor Latanya Sweeney provides an example of disparate impact. When Sweeney searched for her name online, an advertisement came up that read, “Latanya Sweeney, Arrested?” She had not been arrested, so this was strange.

“What names, if you search them, come up with arrest ads?” Sweeney asks in the course. “What I found was that if your name was given more often to a Black baby than to a white baby, your name was 80 percent more likely to get an ad saying you had been arrested.”

It’s not clear from this example whether the disparate impact was intentional or a result of unintentional bias in an algorithm. Either way, it has the potential to do real damage that disproportionately impacts a specific group of people.

Unfortunately, you can’t know for certain the impact your data analysis will have until it’s complete. By considering this question beforehand, you can catch any potential occurrences of disparate impact.
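One common way to quantify disparate impact once results are available is the four-fifths rule: a selection rate for one group below 80 percent of the most favored group's rate is a red flag. A minimal sketch (the group counts are hypothetical):

```python
def impact_ratio(selected_a: int, total_a: int, selected_b: int, total_b: int) -> float:
    """Ratio of group A's selection rate to group B's selection rate."""
    return (selected_a / total_a) / (selected_b / total_b)

# Hypothetical outcomes of an automated screening step.
ratio = impact_ratio(selected_a=30, total_a=100,   # group A: 30% selected
                     selected_b=60, total_b=120)   # group B: 50% selected
print(f"Impact ratio: {ratio:.2f}")  # 0.60 < 0.80 -> flag for human review
```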

Ethical Use of Algorithms

If your role includes writing, training, or handling machine-learning algorithms, consider how they could potentially violate any of the five key data ethics principles.

Because algorithms are written by humans, bias may be intentionally or unintentionally present. Biased algorithms can cause serious harm to people. In Data Science Principles, Sweeney outlines the following ways bias can creep into your algorithms:

  • Training: Because machine-learning algorithms learn based on the data they’re trained with, an unrepresentative dataset can cause your algorithm to favor some outcomes over others.
  • Code: Although any bias present in your algorithm is hopefully unintentional, don’t rule out the possibility that it was written specifically to produce biased results.
  • Feedback: Algorithms also learn from users’ feedback. As such, they can be influenced by biased feedback. For instance, a job search platform may use an algorithm to recommend roles to candidates. If hiring managers consistently select white male candidates for specific roles, the algorithm will learn and adjust and only provide job listings to white male candidates in the future. The algorithm learns that when it provides the listing to people with certain attributes, it’s “correct” more often, which leads to an increase in that behavior.

“No algorithm or team is perfect, but it’s important to strive for the best,” Tingley says in Data Science Principles. “Using human evaluators at every step of the data science process, making sure training data is truly representative of the populations who will be affected by the algorithm, and engaging stakeholders and other data scientists with diverse backgrounds can help make better algorithms for a brighter future.”


Using Data for Good

While the ethical use of data is an everyday effort, knowing that your data subjects’ safety and rights are intact is worth the work. When handled ethically, data can enable you to make decisions and drive meaningful change at your organization and in the world.

Are you interested in furthering your data literacy? Download our Beginner’s Guide to Data & Analytics to learn how you can leverage the power of data for professional and organizational success.


Data ethics: What it means and what it takes

Now more than ever, every company is a data company. By 2025, individuals and companies around the world will produce an estimated 463 exabytes of data each day (Jeff Desjardins, “How much data is generated each day?,” World Economic Forum, April 17, 2019), compared with less than three exabytes a decade ago (“Dimitri Kanevsky translating big data,” IBM Research Blog, IBM Research Editorial Staff, March 5, 2013).

With that in mind, most businesses have begun to address the operational aspects of data management—for instance, determining how to build and maintain a data lake or how to integrate data scientists and other technology experts into existing teams. Fewer companies have systematically considered and started to address the ethical aspects of data management, which could have broad ramifications and responsibilities. If algorithms are trained with biased data sets or data sets are breached, sold without consent, or otherwise mishandled, for instance, companies can incur significant reputational and financial costs. Board members could even be held personally liable (Leah Rizkallah, “Potential board liability for cybersecurity failures under Caremark law,” CPO Magazine, February 22, 2022).

So how should companies begin to think about ethical data management? What measures can they put in place to ensure that they are using consumer, patient, HR, facilities, and other forms of data appropriately across the value chain—from collection to analytics to insights?

We began to explore these questions by speaking with about a dozen global business leaders and data ethics experts. Through these conversations, we learned about some common data management traps that leaders and organizations can fall into, despite their best intentions. These traps include thinking that data ethics does not apply to your organization, that legal and compliance have data ethics covered, and that data scientists have all the answers—to say nothing of chasing short-term ROI at all costs and looking only at the data rather than their sources.

In this article, we explore these traps and suggest some potential ways to avoid them, such as adopting new standards for data management, rethinking governance models, and collaborating across disciplines and organizations. This list of potential challenges and remedies is not exhaustive; our research base was relatively small, and leaders could face many other obstacles, beyond our discussion here, to the ethical use of data. But what’s clear from our research is that data ethics needs both more and sustained attention from all members of the C-suite, including the CEO.

Potential challenges for business leaders

What is data ethics?

We spoke with about a dozen business leaders and data ethics experts. In their eyes, these are some characteristics of ethical data use:

It preserves data security and protects customer information. The practitioners we spoke with tend to view cybersecurity and data privacy as part and parcel of data ethics. They believe companies have an ethical responsibility (as well as legal obligations) to protect customers’ data, defend against breaches, and ensure that personal data are not compromised.

It offers a clear benefit to both consumers and companies. “The consumer’s got to be getting something” from a data-based transaction, explained an executive at a large financial-services company. “If you’re not solving a problem for a consumer, you’ve got to ask yourself why you’re doing what you’re doing.” The benefit to customers should be straightforward and easy to summarize in a single sentence: customers might, for instance, get greater speed, convenience, value, or savings.

It offers customers some measure of agency. “We don’t want consumers to be surprised,” one executive told us. “If a customer receives an offer and says, ‘I think I got this because of how you’re using my data, and that makes me uncomfortable. I don’t think I ever agreed to this,’ another company might say, ‘On page 41, down in the footnote in the four-point font, you did actually agree to this.’ We never want to be that company.”

It is in line with your company’s promises. In data management, organizations must do what they say they will do—or risk losing the trust of customers and other key stakeholders. As one senior executive pointed out, keeping faith with stakeholders may mean turning down certain contracts if they contradict the organization’s stated data values and commitments.

There is a dynamic body of literature on data ethics. Just as the methods companies use to collect, analyze, and access data are evolving, so will definitions of the term itself. In this article, we define data ethics as data-related practices that seek to preserve the trust of users, patients, consumers, clients, employees, and partners. Most of the business leaders we spoke to agreed broadly with that definition, but some have tailored it to the needs of their own sectors or organizations (see sidebar, “What is data ethics?”). Our conversations with these business leaders also revealed the unintended lapses in data ethics that can happen in organizations. These include the following:

Thinking that data ethics doesn’t apply to your organization

While privacy and ethical considerations are essential whenever companies use data (including artificial-intelligence and machine-learning applications), they often aren’t top of mind for some executives. In our experience, business leaders are not intentionally pushing these thoughts away; it’s often just easier for them to focus on things they can “see”—the tools, technologies, and strategic objectives associated with data management—than on the seemingly invisible ways data management can go wrong.

In a 2021 McKinsey Global Survey on the state of AI, for instance, only 27 percent of some 1,000 respondents said that their data professionals actively check for skewed or biased data during data ingestion. Only 17 percent said that their companies have a dedicated data governance committee that includes risk and legal professionals. In that same survey, only 30 percent of respondents said their companies recognized equity and fairness as relevant AI risks. AI-related data risks are only a subset of broader data ethics concerns, of course, but these numbers are striking.
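Checking for skewed data during ingestion does not have to be elaborate. As a minimal sketch (the column name and threshold are illustrative), a single function can flag a dataset in which one category dominates:

```python
import pandas as pd

def flag_imbalance(series: pd.Series, threshold: float = 0.8) -> bool:
    """True if the most frequent category exceeds the given share."""
    return series.value_counts(normalize=True).iloc[0] > threshold

# Hypothetical training data: 90% of one class.
training = pd.DataFrame({"gender": ["m"] * 90 + ["f"] * 10})
if flag_imbalance(training["gender"]):
    print("Warning: heavily skewed column; review before training any model.")
```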

Thinking in silos: Legal, compliance, or data scientists have data ethics covered

Companies may believe that just by hiring a few data scientists, they’ve fulfilled their data management obligations. The truth is data ethics is everyone’s domain, not just the province of data scientists or of legal and compliance teams. At different times, employees across the organization—from the front line to the C-suite—will need to raise, respond to, and think through various ethical issues surrounding data. Business unit leaders will need to vet their data strategies with legal and marketing teams, for example, to ensure that their strategic and commercial objectives are in line with customers’ expectations and with regulatory and legal requirements for data usage.

As executives navigate usage questions, they must acknowledge that although regulatory requirements and ethical obligations are related, adherence to data ethics goes far beyond the question of what’s legal. Indeed, companies must often make decisions before the passage of relevant laws. The European Union’s General Data Protection Regulation (GDPR) went into effect only in May 2018, the California Consumer Privacy Act has been in effect only since January 2020, and federal privacy law is only now pending in the US Congress. Years before these and other statutes and regulations were put in place, leaders had to set the terms for their organizations’ use of data—just as they currently make decisions about matters that will be regulated in years to come.

Laws can show executives what they can do. But a comprehensive data ethics framework can guide executives on whether they should, say, pursue a certain commercial strategy and, if so, how they should go about it. One senior executive we spoke with put the data management task for executives plainly: “The bar here is not regulation. The bar here is setting an expectation with consumers and then meeting that expectation—and doing it in a way that’s additive to your brand.”

Chasing short-term ROI

Prompted by economic volatility, aggressive innovation in some industries, and other disruptive business trends, executives and other employees may be tempted to make unethical data choices—for instance, inappropriately sharing confidential information because it is useful—to chase short-term profits. Boards increasingly want more standards for the use of consumer and business data, but the short-term financial pressures remain. As one tech company president explained: “It’s tempting to collect as much data as possible and to use as much data as possible. Because at the end of the day, my board cares about whether I deliver growth and EBITDA.… If my chief marketing officer can’t target users to create an efficient customer acquisition channel, he will likely get fired at some point—or at least he won’t make his bonus.”

Looking only at the data, not at the sources

Ethical lapses can occur when executives look only at the fidelity and utility of discrete data sets and don’t consider the entire data pipeline. Where did the data come from? Can this vendor ensure that the subjects of the data gave their informed consent for use by third parties? Do any of the market data contain material nonpublic information? Such due diligence is key: one alternative data provider was charged with securities fraud for misrepresenting to trading firms how its data were derived. In that case, companies had provided confidential information about the performance of their apps to the data vendor, which did not aggregate and anonymize the data as promised. Ultimately, the vendor had to settle with the US Securities and Exchange Commission (“SEC charges App Annie and its founder with securities fraud,” US Securities and Exchange Commission, September 14, 2021).

A few important building blocks

These data management challenges are common—and they are by no means the only ones. As organizations generate more data, adopt new tools and technologies to collect and analyze data, and find new ways to apply insights from data, new privacy and ethical challenges and complications will inevitably emerge. Organizations must experiment with ways to build fault-tolerant data management programs. These seven data-related principles, drawn from our research, may provide a helpful starting point.

Set company-specific rules for data usage

Leaders in the business units, functional areas, and legal and compliance teams must come together to create a data usage framework for employees—a framework that reflects a shared vision and mission for the company’s use of data. As a start, the CEO and other C-suite leaders must also be involved in defining data rules that give employees a clear sense of the company’s threshold for risk and which data-related ventures are OK to pursue and which are not.

Leaders must come together to create a data usage framework that reflects a shared vision and mission for the company’s use of data.

Such rules can improve and potentially speed up individual and organizational decision making. They should be tailored to your specific industry, even to the products and services your company offers. They should be accessible to all employees, partners, and other critical stakeholders. And they should be grounded in a core principle—for example, “We do not use data in any way that we cannot link to a better outcome for our customers.” Business leaders should plan to revisit and revise the rules periodically to account for shifts in the business and technology landscape.

Communicate your data values, both inside and outside your organization

Once you’ve established common data usage rules, it’s important to communicate them effectively inside and outside the organization. That might mean featuring the company’s data values on employees’ screen savers, as the company of one of our interview subjects has done. Or it may be as simple as tailoring discussions about data ethics to various business units and functions and speaking to their employees in language they understand. The messaging to the IT group and data scientists, for instance, may be about creating ethical data algorithms or safe and robust data storage protocols. The messaging to marketing and sales teams may focus on transparency and opt-in/opt-out protocols.

Organizations also need to earn the public’s trust. Posting a statement about data ethics on the corporate website worked for one financial-services organization. As an executive explained: “When you’re having a conversation with a government entity, it’s really helpful to be able to say, ‘Go to our website and click on Responsible Data Use, and you’ll see what we think.’ We’re on record in a way that you can’t really walk back.” Indeed, publicizing your company’s data ethics framework may help increase the momentum for powerful joint action, such as the creation of industry-wide data ethics standards.

" "

Why digital trust truly matters

Build a diverse data-focused team

A strong data ethics program won’t materialize out of the blue. Organizations large and small need people who focus on ethics issues; it cannot be a side activity. The work should be assigned to a specific team or attached to a particular role. Some larger technology and pharmaceutical companies have appointed chief ethics or chief trust officers in recent years. Others have set up interdisciplinary teams, sometimes referred to as data ethics boards, to define and uphold data ethics. Ideally, such boards would include representatives from, for example, the business units, marketing and sales, compliance and legal, audit, IT, and the C-suite. These boards should also have a range of genders, races, ethnicities, classes, and so on: an organization will be more likely to identify issues early on (in algorithm-training data, for example) when people with a range of different backgrounds and experiences sit around the table.

One multinational financial-services corporation has developed an effective structure for its data ethics deliberations and decision making. It has two main data ethics groups. The major decisions are made by a group of senior stakeholders, including the head of security and other senior technology executives, the chief privacy officer, the head of the consulting arm, the head of strategy, and the heads of brand, communications, and digital advertising. These are the people most likely to use the data.

Governance is the province of another group, which is chaired by the chief privacy officer and includes the global head of data, a senior risk executive, and the executive responsible for the company’s brand. Anything new concerning data use gets referred to this council, and teams must explain how proposed products comply with the company’s data use principles. As one senior company executive explains, “It’s important that both of these bodies be cross-functional because in both cases you’re trying to make sure that you have a fairly holistic perspective.”

As we’ve noted, compliance teams and legal counsel should not be the only people thinking about a company’s data ethics, but they do have an important role to play in ensuring that data ethics programs succeed. Legal experts are best positioned to advise on how your company should apply existing and emerging regulations. But teams may also want to bring in outside experts to navigate particularly difficult ethical challenges. For example, a large tech company brought in an academic expert on AI ethics to help it figure out how to navigate gray areas, such as the environmental impact of certain kinds of data use. That expert was a sitting but not voting member of the group because the team “did not want to outsource the decision making.” But the expert participated in every meeting and led the team in the work that preceded the meetings.

Engage champions in the C-suite

Some practitioners and experts we spoke with who had convened data ethics boards pointed to the importance of keeping the CEO and the corporate board apprised of decisions and activities. A senior executive who chaired his organization’s data ethics group explained that while it did not involve the CEO directly in the decision-making process, it brought all data ethics conclusions to him “and made sure he agreed with the stance that we were taking.” All these practitioners and experts agreed that having a champion or two in the C-suite can signal the importance of data ethics to the rest of the organization, put teeth into data rules, and support the case for investment in data-related initiatives.

Indeed, corporate boards and audit committees can provide the checks needed to ensure that data ethics are being upheld, regardless of conflicting incentives. The president of one tech company told us that its board had recently begun asking for a data ethics report as part of the audit committee’s agenda, which had previously focused more narrowly on privacy and security. “You have to provide enough of an incentive—a carrot or a stick to make sure people take this seriously,” the president said.

Consider the impact of your algorithms and overall data use

Organizations should continually assess the effects of the algorithms and data they use—and test for bias throughout the value chain. That means thinking about the problems organizations might create, even unwittingly, in building AI products. For instance, who might be disadvantaged by an algorithm or a particular use of data? One technologist we spoke with advises asking the hard questions: “Start your meetings about AI by asking, ‘Are the algorithms we are building sexist or racist?’”

Certain data applications require far greater scrutiny and consideration. Security is one such area. A tech company executive recalled the extra measures his organization took to prevent its image and video recognition products and services from being misused: “We would insist that if you were going to use our technology for security purposes, we had to get very involved in ensuring that you debiased the data set as much as possible so that particular groups would not be unfairly singled out.” It’s important to consider not only what types of data are being used but also what they are being used for—and what they could potentially be used for down the line.

Think globally

The ethical use of data requires organizations to consider the interests of people who are not in the room. Anthropologist Mary Gray, the senior principal researcher at Microsoft Research, raises questions about global reach in her 2019 book, Ghost Work. Among them: Who labeled the data? Who tagged these images? Who kept violent videos off this website? Who weighed in when the algorithm needed a steer?

Today’s leaders need to ask these sorts of questions, along with others about how such tech work happens. Broadly, leaders must take a 10,000-foot view of their companies as players in the digital economy, the data ecosystem, and societies everywhere. There may be ways they can support policy initiatives or otherwise help to bridge the digital divide, support the expansion of broadband infrastructure, and create pathways for diversity in the tech industry. Ultimately, data ethics requires leaders to reckon with the ongoing rise in global inequality—and the increasing concentration of wealth and value both in geographical tech hubs and among AI-enabled organizations (for more on the concentration of value among AI-enabled firms, see Marco Iansiti and Karim R. Lakhani, Competing in the Age of AI: Strategy and Leadership When Algorithms and Networks Run the World, Boston: Harvard Business Review Press, 2020).

Embed your data principles in your operations

It’s one thing to define what constitutes the ethical use of data and to set data usage rules; it’s another to integrate those rules into operations across the organization. Data ethics boards, business unit leaders, and C-suite champions should build a common view (and a common language) about how data usage rules should link up to both the company’s data and corporate strategies and to real-world use cases for data ethics, such as decisions on design processes or M&A. In some cases, there will be obvious places to operationalize data ethics—for instance, data operations teams, secure-development operations teams, and machine-learning operations teams. Trust-building frameworks for machine-learning operations can ensure that data ethics will be considered at every step in the development of AI applications.

Regardless of which part of the organization the leaders target first, they should identify KPIs that can be used to monitor and measure its performance in realizing their data ethics objectives. To ensure that the ethical use of data becomes part of everyone’s daily work, the leadership team also should advocate, help to build, and facilitate formal training programs on data ethics.

Data ethics can’t be put into practice overnight. As many business leaders know firsthand, building teams, establishing practices, and changing organizational culture are all easier said than done. What’s more, upholding your organization’s data ethics principles may mean walking away from potential partnerships and other opportunities to generate short-term revenues. But the stakes for companies could not be higher. Organizations that fail to walk the walk on data ethics risk losing their customers’ trust and destroying value.

Alex Edquist is an alumna of McKinsey’s Atlanta office; Liz Grennan is an associate partner in the Stamford, Connecticut, office; Sian Griffiths is a partner in the Washington, DC, office; and Kayvaun Rowshankish is a senior partner in the New York office.

The authors wish to thank Alyssa Bryan, Kasia Chmielinski, Ilona Logvinova, Keith Otis, Marc Singer, Naomi Sosner, and Eckart Windhagen for their contributions to this article.

This article was edited by Roberta Fusaro, an editorial director in the Waltham, Massachusetts, office.



Honors Theses

Ethics, Privacy and Data Collection: A Complex Intersection

Matthew S. Brown, Bucknell University

Date of Thesis

Spring 2020

Description

The technology around us enables incredible abilities such as high-resolution video calls and the ability to stay connected with everyone we care about through social media. This technology also comes with a hidden cost in the form of data collection.

This work explores what privacy means and how users understand what data social media companies collect and monetize. This thesis also proposes a more ethical business model that addresses privacy concerns from an individual perspective.

Keywords

privacy, social media, data collection, ethics, computer science

Access Type

Honors Thesis

Degree Type

Bachelor of Science

Major

Computer Science

First Advisor

L. Felipe Perrone

Recommended Citation

Brown, Matthew S., "Ethics, Privacy and Data Collection: A Complex Intersection" (2020). Honors Theses. 546. https://digitalcommons.bucknell.edu/honors_theses/546


AI & Data Ethics

The AI and Data Ethics initiative aims to develop a robust ethics ecosystem for responsible development and use of autonomous, computational, and data-driven systems through foundational research, translational research focused on policy and practice, education and training programs, and public scholarship.


Ethical Considerations in Research | Types & Examples

Published on October 18, 2021 by Pritha Bhandari. Revised on May 9, 2024.

Ethical considerations in research are a set of principles that guide your research designs and practices. Scientists and researchers must always adhere to a certain code of conduct when collecting data from people.

The goals of human research often include understanding real-life phenomena, studying effective treatments, investigating behaviors, and improving lives in other ways. What you decide to research and how you conduct that research involve key ethical considerations.

These considerations work to

  • protect the rights of research participants
  • enhance research validity
  • maintain scientific or academic integrity

Table of contents

  • Why do research ethics matter?
  • Getting ethical approval for your study
  • Types of ethical issues
  • Voluntary participation
  • Informed consent
  • Anonymity
  • Confidentiality
  • Potential for harm
  • Results communication
  • Examples of ethical failures
  • Other interesting articles
  • Frequently asked questions about research ethics

Why do research ethics matter?

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe for research subjects.

You’ll balance pursuing important research objectives with using ethical research methods and procedures. It’s always necessary to prevent permanent or excessive harm to participants, whether inadvertent or not.

Defying research ethics will also lower the credibility of your research because it’s hard for others to trust your data if your methods are morally questionable.

Even if a research idea is valuable to society, it doesn’t justify violating the human rights or dignity of your study participants.


Getting ethical approval for your study

Before you start any study involving data collection with people, you’ll submit your research proposal to an institutional review board (IRB).

An IRB is a committee that checks whether your research aims and research design are ethically acceptable and follow your institution’s code of conduct. They check that your research materials and procedures are up to code.

If successful, you’ll receive IRB approval, and you can begin collecting data according to the approved procedures. If you want to make any changes to your procedures or materials, you’ll need to submit a modification application to the IRB for approval.

If unsuccessful, you may be asked to re-submit with modifications or your research proposal may receive a rejection. To get IRB approval, it’s important to explicitly note how you’ll tackle each of the ethical issues that may arise in your study.

Types of ethical issues

There are several ethical issues you should always pay attention to in your research design, and these issues can overlap with each other.

You’ll usually outline ways you’ll deal with each issue in your research proposal if you plan to collect data from participants.

  • Voluntary participation: Your participants are free to opt in or out of the study at any point in time.
  • Informed consent: Participants know the purpose, benefits, risks, and funding behind the study before they agree or decline to join.
  • Anonymity: You don’t know the identities of the participants. Personally identifiable data is not collected.
  • Confidentiality: You know who the participants are, but you keep that information hidden from everyone else. You anonymize personally identifiable data so that it can’t be linked to other data by anyone else.
  • Potential for harm: Physical, social, psychological, and all other types of harm are kept to an absolute minimum.
  • Results communication: You ensure your work is free of plagiarism or research misconduct, and you accurately represent your results.

Voluntary participation

Voluntary participation means that all research subjects are free to choose to participate without any pressure or coercion.

All participants are able to withdraw from, or leave, the study at any point without feeling an obligation to continue. Your participants don’t need to provide a reason for leaving the study.

It’s important to make it clear to participants that there are no negative consequences or repercussions to their refusal to participate. After all, they’re taking the time to help you in the research process, so you should respect their decisions without trying to change their minds.

Voluntary participation is an ethical principle protected by international law and many scientific codes of conduct.

Take special care to ensure there’s no pressure on participants when you’re working with vulnerable groups of people who may find it hard to stop the study even when they want to.


Informed consent

Informed consent refers to a situation in which all potential participants receive and understand all the information they need to decide whether they want to participate. This includes information about the study’s benefits, risks, funding, and institutional approval.

You make sure to provide all potential participants with all the relevant information about

  • what the study is about
  • the risks and benefits of taking part
  • how long the study will take
  • your supervisor’s contact information and the institution’s approval number

Usually, you’ll provide participants with a text for them to read and ask them if they have any questions. If they agree to participate, they can sign or initial the consent form. Note that this may not be sufficient for informed consent when you work with particularly vulnerable groups of people.

If you’re collecting data from people with low literacy, make sure to verbally explain the consent form to them before they agree to participate.

For participants with very limited English proficiency, you should always translate the study materials or work with an interpreter so they have all the information in their first language.

In research with children, you’ll often need informed permission for their participation from their parents or guardians. Although children cannot give informed consent, it’s best to also ask for their assent (agreement) to participate, depending on their age and maturity level.

Anonymity means that you don’t know who the participants are and you can’t link any individual participant to their data.

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, and videos.

In many cases, it may be impossible to truly anonymize data collection . For example, data collected in person or by phone cannot be considered fully anonymous because some personal identifiers (demographic information or phone numbers) are impossible to hide.

You’ll also need to collect some identifying information if you give your participants the option to withdraw their data at a later stage.

Data pseudonymization is an alternative method where you replace identifying information about participants with pseudonymous, or fake, identifiers. The data can still be linked to participants but it’s harder to do so because you separate personal information from the study data.
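
For example, here’s a minimal Python sketch of pseudonymization (the column names, file paths, and example values are hypothetical, not from any specific study): you swap direct identifiers for random codes and store the code-to-identity key separately from the study data.

    import secrets
    import pandas as pd

    # Hypothetical raw data: direct identifiers alongside a study variable.
    raw = pd.DataFrame({
        "name": ["A. Smith", "B. Jones"],
        "email": ["a@example.com", "b@example.com"],
        "score": [42, 37],
    })

    # Give each participant a random pseudonymous identifier.
    raw["participant_id"] = [secrets.token_hex(8) for _ in range(len(raw))]

    # The working dataset keeps only the pseudonym and the study variables.
    raw[["participant_id", "score"]].to_csv("study_data.csv", index=False)

    # The linking key (pseudonym to identity) is stored separately,
    # under stricter access controls than the working dataset.
    raw[["participant_id", "name", "email"]].to_csv("linking_key.csv", index=False)

Because re-identification now requires access to both files, you can keep the key under much tighter access controls than the working dataset.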

Confidentiality means that you know who the participants are, but you remove all identifying information from your report.

All participants have a right to privacy, so you should protect their personal data for as long as you store or use it. Even when you can’t collect data anonymously, you should secure confidentiality whenever you can.

Some research designs aren’t conducive to confidentiality, but it’s important to make all attempts and inform participants of the risks involved.

As a researcher, you have to consider all possible sources of harm to participants. Harm can come in many different forms.

  • Psychological harm: Sensitive questions or tasks may trigger negative emotions such as shame or anxiety.
  • Social harm: Participation can involve social risks, public embarrassment, or stigma.
  • Physical harm: Pain or injury can result from the study procedures.
  • Legal harm: Reporting sensitive data could lead to legal risks or a breach of privacy.

It’s best to consider every possible source of harm in your study as well as concrete ways to mitigate them. Involve your supervisor to discuss steps for harm reduction.

Make sure to disclose all possible risks of harm to participants before the study to get informed consent. If there is a risk of harm, prepare to provide participants with resources or counseling or medical services if needed.

Some survey questions may bring up negative emotions, so you inform participants about the sensitive nature of the survey in advance and assure them that their responses will be confidential.

The way you communicate your research results can sometimes involve ethical issues. Good science communication is honest, reliable, and credible. It’s best to make your results as transparent as possible.

Take steps to actively avoid plagiarism and research misconduct wherever possible.

Plagiarism means submitting others’ works as your own. Although it can be unintentional, copying someone else’s work without proper credit amounts to stealing. It’s an ethical problem in research communication because you may benefit by harming other researchers.

Self-plagiarism is when you republish or re-submit parts of your own papers or reports without properly citing your original work.

This is problematic because you may benefit from presenting your ideas as new and original even though they’ve already been published elsewhere in the past. You may also be infringing on your previous publisher’s copyright, violating an ethical code, or wasting time and resources by doing so.

In extreme cases of self-plagiarism, entire datasets or papers are sometimes duplicated. These are major ethical violations because they can skew research findings if taken as original data.

For example, you might notice that two published studies have similar characteristics even though they are from different years. Their sample sizes, locations, treatments, and results are highly similar, and the studies share one author in common, which is a red flag for duplicate publication.

Research misconduct

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement about data analyses.

Research misconduct is a serious ethical issue because it can undermine academic integrity and institutional credibility. It leads to a waste of funding and resources that could have been used for alternative research.

A notorious example is Andrew Wakefield’s retracted study claiming a link between the MMR vaccine and autism. Later investigations revealed that he and his co-authors fabricated and manipulated their data to show a nonexistent link between vaccines and autism. Wakefield also neglected to disclose important conflicts of interest, and his medical license was taken away.

This fraudulent work sparked vaccine hesitancy among parents and caregivers. The rate of MMR vaccinations in children fell sharply, and measles outbreaks became more common due to a lack of herd immunity.

History is littered with research scandals involving ethical failures, and some took place not that long ago.

Some scientists in positions of power have historically mistreated or even abused research participants to investigate research problems at any cost. These participants were prisoners, patients under their care, or people who otherwise trusted these scientists to treat them with dignity.

To demonstrate the importance of research ethics, we’ll briefly review two research studies that violated human rights in modern history.

The first is the set of medical experiments that Nazi doctors performed on concentration camp prisoners during World War II. These experiments were inhumane and resulted in trauma, permanent disabilities, or death in many cases.

After some Nazi doctors were put on trial for their crimes, the Nuremberg Code of research ethics for human experimentation was developed in 1947 to establish a new standard for human experimentation in medical research.

The second is the Tuskegee syphilis study, which began in 1932 and recruited Black men in Alabama with the promise of free medical care. In reality, the actual goal was to study the effects of the disease when left untreated, and the researchers never informed participants about their diagnoses or the research aims.

Although participants experienced severe health problems, including blindness and other complications, the researchers only pretended to provide medical care.

When treatment became possible in 1943, 11 years after the study began, none of the participants were offered it, despite their health conditions and high risk of death.

Ethical failures like these resulted in severe harm to participants, wasted resources, and lower trust in science and scientists. This is why all research institutions have strict ethical guidelines for performing research.


Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others.

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.


You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.
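
For example, here’s a minimal sketch of aggregate reporting in Python (assuming pandas; the variables and the suppression threshold are hypothetical): you publish group-level summaries and suppress any group too small to report safely.

    import pandas as pd

    responses = pd.DataFrame({
        "age_group": ["18-24", "18-24", "25-34", "25-34", "25-34", "35-44"],
        "score": [12, 15, 9, 11, 10, 14],
    })

    # Report group means and counts rather than individual rows.
    summary = responses.groupby("age_group")["score"].agg(["mean", "count"])

    # Suppress cells with fewer than k participants to reduce re-identification risk.
    k = 3
    summary.loc[summary["count"] < k, "mean"] = None
    print(summary)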



Research ethics and data protection

When planning for the management of data collected from research participants, it is essential to consider issues of research ethics and data protection from the outset, because how you handle the information and consent processes may affect your ability to share data later on. You also have an ethical and legal responsibility to ensure that you store and share confidential and personal data securely and do not disclose them to unauthorised persons.

In most cases, data collected from human subjects can be made accessible to others, either publicly or (where necessary) on a restricted basis, but you will need to ensure that you use appropriate consent procedures, and that you plan for anonymisation or minimisation of identifiable data prior to sharing.

Applications to Research Ethics Committees

REC DMP Template (docx)

REC DMP Guidance (PDF)

REC DMP Examples (docx)

The DMP template, guidance and examples of suitable responses provided here are specifically for use as part of a REC process. For general data management planning, please refer to guidance on writing a data management plan .

University Research Ethics Committee (UREC)

Applications for ethical approval submitted to the University Research Ethics Committee must be accompanied by a DMP prepared using the above template. Applicants should refer to the guidance document when completing the DMP. The DMP will be reviewed by the Information Management and Policy Services (IMPS) office and the Research Data Manager. Comments on the DMP will be returned to the applicant with the UREC opinion, and any conditions that must be met for a favourable opinion to be granted will be specified.

School Research Ethics Committees (SRECs)

UREC encourages School Research Ethics Committees to require the submission of a DMP using the above template as part of an application for ethical approval. Where this requirement operates, SRECs should review the DMP with reference to the DMP Assessment Guide for School RECs  (PDF). We provide a Checklist (docx) that can be used to assist in the review process and a DMP Review Form (docx) that can be completed and returned to the candidate with any requirements or advice. IMPS and the Research Data Service will not review DMPs submitted to SRECs as part of standard procedure, but can be contacted by SRECs for advice on specific questions or concerns raised by an application.

Research ethics

You have an ethical obligation to protect the confidentiality of personal information provided to you by research participants. Any research involving human subjects will need to receive approval from your School's or the University's Research Ethics Committee, and you will be required to obtain documented consent to participation in the research from your participants. Guidance on the process of seeking ethical approval for research projects can be found on the Research ethics web pages.

In your application for ethical approval and in the information you provide to participants, you should not undertake to destroy research data collected from the participants, or undertake not to share such data outside the project, as this may prevent you from sharing data in the future. Research data that have been anonymised are no longer confidential and can therefore be shared.

Personal data should be destroyed when no longer required, and it is acceptable to tell your participants this, but you should clearly distinguish between the personal data that will be held in confidence and ultimately destroyed, and the anonymised research data that will be retained indefinitely and made available to others.

It is wise to avoid making a specific commitment to destroy personal data by a set time: under data protection law, personal data can be retained as long as a valid reason for their retention exists, and personal data held for public interest archiving, scientific or historical research, or statistical purposes may be retained indefinitely. For example, you may wish to retain details of participants on an internal database to enable you to undertake follow-up studies.

Data protection

You must comply with data protection law if you collect and process personal data. Where personal data are processed in jurisdictions outside the European Economic Area, they should be handled to the standards prescribed by UK data protection law.

If you will be processing personal data in your research, you are advised to consult University guidance on Data Protection and Research . Here you can find a Data Protection Checklist for researchers , which you should use as part of your planning process. A sample information sheet and consent form are also provided. You should acquaint yourself with the University's Data Protection, Encryption and Remote Working policies, which can be found on the Information Compliance Policies web page.

Personal data is any information relating to an identified or identifiable natural person. These data enjoy statutory protection under the General Data Protection Regulation 2016 and the Data Protection Act 2018. Under this legislation any personal data collected by you must be processed fairly and lawfully. Among other things you will be required to issue a privacy notice to your research participants, which explains the purpose(s) for which the data are being collected, your lawful basis for processing the data, who the data will be disclosed to, and the rights of the individuals in respect of their personal data. For certain kinds of research, for example involving the processing of sensitive data or human genetic data, you will need to complete a Data Protection Impact Assessment under the advice of the University Information Management & Policy Services Officer.

You must ensure that personal data are kept secure and are not disclosed to unauthorised persons. You should use a locked storage container such as a filing cabinet in a locked office for paper-based personal data; for digital data, password-protected or, preferably, encrypted storage. This particularly applies in the case of special category sensitive personal data, which include information about an individual's: race; ethnic origin; politics; religion; trade union membership; genetics; biometrics (where used for ID purposes); health; sex life; or sexual orientation. Such personal data should be encrypted, and not stored or shared by means of cloud services other than a University OneDrive account, or transferred via unencrypted channels (e.g. via email). You can securely transfer data between individuals and devices using OneDrive; and from off-campus to a location on the University network using VPN, which provides an encrypted channel.
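
As a hedged illustration of encrypting a file of personal data before storage or transfer, here is a minimal sketch using the Python cryptography package (the file names are hypothetical, and approved University tools and channels should take precedence):

    from cryptography.fernet import Fernet

    # Generate a key once and store it securely, separate from the data.
    key = Fernet.generate_key()
    fernet = Fernet(key)

    # Encrypt a file containing personal data before it is stored or transferred.
    with open("interview_notes.txt", "rb") as f:
        token = fernet.encrypt(f.read())
    with open("interview_notes.txt.enc", "wb") as f:
        f.write(token)

    # Decryption requires the key, so the encrypted file alone
    # does not disclose the personal data it contains.
    original = fernet.decrypt(token)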

You will need to consider issues related to data protection and secure processing of information when you use instruments to collect data from research participants, including any online software services such as survey tools. Any third-party service provider collecting personal survey data on your behalf is acting in the capacity of a data processor as defined under the Data Protection Act. Whenever a data controller uses a data processor, a written contract must be in place so that both parties understand their responsibilities and liabilities. Among other things the data processor would need to store data in the European Economic Area or under conditions that provide equivalent protections for personal data. Visit the online survey tools web page to find out about approved survey tools available through the University. If you want to use a particular data collection tool and are not sure whether it is suitable, contact us for advice.

Working procedures should be designed to minimise the risk of inappropriate disclosure. Data can often be pseudonymised for purposes of processing and analysis, with the personally-identifying information and their linked IDs stored separately from the working dataset. When the study is complete and if there is no further need to link individuals to data, the linking key can be destroyed, so that the data become fully anonymised.
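
A minimal sketch of this final step (the file names are hypothetical): during the study, identifiers live only in a separately held key file, and destroying that key at study close removes the means of re-identification.

    import os

    KEY_FILE = "linking_key.csv"   # pseudonym to identity, restricted access
    DATA_FILE = "study_data.csv"   # pseudonym and study variables only

    # At study close, if no further linkage to individuals is needed, destroy the key.
    if os.path.exists(KEY_FILE):
        # For higher assurance, use secure-deletion tooling rather than plain removal.
        os.remove(KEY_FILE)

    # With no linking key surviving, the working dataset can be
    # treated as fully anonymised.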

You can retain personal data indefinitely for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes. You do not need to commit to destroy personal data at a set time, but they should be managed under a retention schedule that specifies periodic reviews, so that they can be securely destroyed when no longer needed.
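
A retention schedule can be as simple as a dated register that is checked periodically. A toy sketch (the dataset names and review dates are hypothetical):

    from datetime import date

    holdings = [
        {"dataset": "follow_up_contacts", "next_review": date(2025, 1, 1)},
        {"dataset": "consent_forms", "next_review": date(2024, 6, 1)},
    ]

    # Flag personal data holdings due for review, so each can be retained
    # with a documented reason or securely destroyed.
    due = [h["dataset"] for h in holdings if h["next_review"] <= date.today()]
    print(due)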

Consent and anonymisation

Most data collected from human subjects can be anonymised for public sharing. This applies to both quantitative and qualitative data. A valid reason for restricting access to such data exists only if it is not possible to anonymise the data (biometric data, for example) or if disclosure carries a significant risk of causing harm or distress. The UK Data Service provides guidance on anonymisation of both quantitative and qualitative data; the MRC also provides information about anonymisation and pseudonymisation.

Be aware that in order for publicly-disclosed data to be anonymised, any means of linking them to participant records stored internally would need to be destroyed. If you have used key codes (pseudonymous identifiers) in your dataset which are linked to separately-held internal participant records, the key codes would have to be removed for the dataset to be anonymised.

Even anonymised data that are considered higher risk, or data containing personal or confidential information, may be shared under certain conditions, with appropriate consent and under secure management. Some data repositories can manage data under a controlled access protocol, e.g. the UK Data Service ReShare repository has a 'safeguarded' option for higher-risk anonymised data, and the European Genome-phenome Archive manages access to identifiable human data under a rigorous approval process and subject to a data access agreement. The University's Research Data Archive can also offer a restricted access option. Contact us if you wish to discuss this.

If data collected from human subjects have been fully anonymised, you do not need consent to share them, but it is good practice to inform your research participants how the data you collect from them will be used. Your information sheet should address this and your consent form should specifically allow the participant to indicate they have understood your intentions and agree to data sharing, by checking a statement such as this:

'I understand that the data collected from me in this study will be preserved and made available in anonymised form, so that they can be consulted and re-used by others.'

This statement is suitable where anonymised data will be made available as open data, i.e. without restriction. If access to data will be controlled by the data repository, then a suitable consent formula would be:

'I understand that the data collected from me in this study will be preserved, and subject to safeguards will be made available to other authenticated researchers.’

Further guidance on consent and anonymisation is provided in the University's Data Protection and Research guide.

Robert Darby , Research Data Manager

[email protected]

0118 378 6161




Thesis Defense 

Emilia Emerson

“The Effect of Hydraulic Retention Time on Recoverable Ammonia, Virus-Particle Association, and Bacterial, Fungal, and Viral Populations in a Bench Scale Activated Sludge Municipal Wastewater Treatment System”

Thursday, August 8, 2024

9:00 AM – 10:00 AM EST

Farrall Hall room 208

Zoom: https://msu.zoom.us/j/91202303218

Committee Members

Dr. Wei Liao, Department of Biosystems & Agricultural Engineering (Chair)

Dr. Joan Rose, Department of Fisheries and Wildlife (Co-Chair)

Dr. Yan (Susie) Liu, Department of Biosystems & Agricultural Engineering


Media Center | July 30, 2024 | Corbin McGuire

Swimming in Data: How Kate Douglass’ statistical studies have fueled Olympic dreams

Virginia swimmer merges data analytics with elite training to shatter records ahead of Paris Games.

Kate Douglass' time at Virginia has uniquely prepared her for what could be a breakout performance at the 2024 Paris Olympics. 

This is true on several levels, including one directly tied to her bachelor's degree in statistics, which she finished in 2023, as well as the master's degree in the same field she's working on. 

A 15-time NCAA champion in individual and relay events, Douglass has worked with Ken Ono, a renowned mathematician and professor at Virginia, to apply statistical analysis to improve her swimming techniques. This innovative approach has paid off with faster times in every event and helped her earn spots at the Paris Olympics in the 200-meter breaststroke and 200-meter individual medley, as well as on the 4x100-meter relay team that claimed silver Saturday and set an American record. Douglass co-authored an academic paper, "Swimming in Data," which detailed how these techniques helped her refine her breaststroke and break a 12-year-old American record in the 200-meter breaststroke.

"Data analytics has kind of started to become big around swimming," Douglass said. "We use an accelerometer device that wraps around our waist when we're swimming, and it can test our acceleration, whether we're accelerating or decelerating in certain spots of our stroke."

Douglass, a two-time Olympian and one of the most decorated NCAA swimmers ever, has thrived in both the pool and the classroom, thanks to the supportive and competitive environment in Charlottesville.

"I think kind of being a part of the team culture, and just having older girls to look up to in practice, definitely helped me figure out how to reach my full potential in practice," Douglass said. "It took kind of my teammates and coaches in college to get me out of my comfort zone and realize that feeling uncomfortable in practice was how you got better."


Douglass' collegiate career has been nothing short of stellar. 

Her freshman year was marked by an impressive performance at the 2020 Atlantic Coast Conference championships, where she won five conference titles. Although the 2020 NCAA championships were canceled due to the COVID-19 pandemic, Douglass' sophomore year was historic. She set Virginia records in five different events and won an individual title in the 50-yard free at the 2021 NCAA championships, helping the Cavaliers claim their first NCAA team trophy. 

"Every time we've come together as a team and won a national championship, it was a really awesome moment. I definitely think the first one was the most special because COVID was my first year and it was a hard year for all of us," she said. "So that first national championship was really for all those fourth-year girls who didn't get a chance to do it. It felt really good to do it for them."

Douglass followed it up with what she described as a surprise in qualifying for the Tokyo Olympics in the 200-meter individual medley. She went on to win a bronze medal. Both moments were pivotal in her career.  

"Going into Tokyo, I wasn't really prepared to make the team. I didn't really know what was going to happen," she said. "I definitely think going to Tokyo helped boost my confidence"

Returning to Virginia, Douglass continued to dominate. At the 2022 NCAA championships, she became the first swimmer — male or female — to win individual races in three different strokes (50-yard free, 100-yard butterfly and 200-yard breaststroke). In 2023, she added three more individual titles, with Virginia winning relays both years, bringing her total to seven individual and relay titles in each of her last two NCAA championships. Her victories helped propel the Cavaliers to NCAA team titles in both 2022 and 2023.

Her success continued on the world stage, too. In the past three world championships, Douglass won 14 medals. Douglass attributes much of her success to her teammates and the challenging environment at Virginia, specifically noting Ella Nelson and Maddie Donohoe's impact.

"I definitely think the girls in my class have been super instrumental most of my career," Douglass said. "We kind of just went through everything together."

Douglass has continued to train at Virginia under Cavalier head coach Todd DeSorbo, who is also the U.S. Olympic women's swimming coach. Part of what kept her at Virginia is the balanced approach to life it offers. She said the joy she's found in both her academic and athletic endeavors has been vital for her mental well-being. 

"Outside of swimming, I really just enjoy being a UVA student and embracing the culture of the University of Virginia," she said. "On days when I don't have practice, it's nice to kind of just walk to class and walk around the grounds and just kind of take in how beautiful it is, and just pretend I'm a normal college student for once." 



Making sense of data ethics. The powers behind the data ethics debate in European policymaking

Introduction

January 2018: The tweet hovered over my head: “Where are the ethicists?” I was on a panel in Brussels about data ethics, and this wasn’t the first time such a panel or initiative had been questioned. The foundations weren’t proper; the right expertise was not included: the ethicists were missing, the humanists were missing, the legal experts were missing. The results, outcomes and requirements of these initiatives were unclear. Would they water down the law? I understood the critiques, though. How could we talk about data ethics when a law had just been passed following a lengthy negotiation process on this very topic? What was the function of these discussions? If we were not there to acknowledge a consensus, that is, the legal solution, what then was the point?

In the slipstream of sweeping data protection law reform in Europe, discussions regarding data ethics have gained traction in European public policy-making. Numerous data ethics public policy initiatives have been created, moving beyond issues of mere compliance with data protection law to increasingly focus on the ethics of big data, especially concerning private companies’ and public institutions’ handling of personal data in digital forms. Reception in public discourse has been mixed. Although gaining significant public attention and interest, these data ethics policy initiatives have also been depicted as governmental “toothless wonders” (e.g., Hill, 24 November 2017) and a waste of resources, and have been criticised for drawing attention away from public institutions’ mishandling of citizens’ data (e.g., Ingeniøren’s managing panel, op-ed, 16 March 2018) and for potential “ethics washing” (Wagner, 2018), questioning the expertise and interests involved in the initiatives, as well as their normative ethics frameworks.

This article constitutes an analytical investigation of the various dimensions and actors that shape definitions of data ethics in European policy-making. Specifically, I explore the role and function of European data ethics policy initiatives and present an argument regarding how and why they took shape in the context of a European data protection regulatory reform. The explicit use of the term “ethics” calls for a philosophical framework; the term “data” for a contemporary perspective of the critical role of information in a digitalised society; and the policy context for consensus-making and problem solving. Together, these views on the role of the data ethics policy initiatives are highly pertinent. However, taken separately they each provide a one-sided kaleidoscopic insight into their role and function. For example, a moral philosophical view concerning data ethics initiatives (in public policy-making as well as in the private industry) might not be attentive to the embedded interests and power relations; pursuit of actionable policy results may overlook their function as spaces of negotiation and positioning; while viewing data ethics initiatives as something radically new in the age of big data can lose sight of their place in and relation to history and governance in general.

In my analysis, I therefore adopt an interdisciplinary approach that draws on methods and theories from different subfields within applied ethics, political science, sociology, culture and infrastructure/STS studies. A central thesis of this article is that we should perceive data ethics policy initiatives as open-ended spaces of negotiation embedded in complex socio-technical dynamics, which respond to multifaceted governance challenges extended over time. Thus, we should not view data ethics policy initiatives as solutions in their own right. They do not replace legal frameworks such as the European General Data Protection Regulation (GDPR). Rather, they complement existing law and may inspire, guide and even set in motion political, economic and educational processes that could foster an ethical “design” of the big data age, covering everything from the introduction of new laws, the implementation of policies and practices in organisations and companies and the development of new engineering standards, to awareness campaigns among citizens and educational initiatives.

In the following, I first outline a cross-disciplinary conceptualisation of data ethics, presenting what I define as an analytical framework for a data ethics of power. I then describe the data ethics public policy focus in the context of the GDPR. I recognise that ethics discussions are implicit in legislative processes. Nevertheless, in this article I do not specifically focus on the regulation’s negotiation process as such, but rather on policymakers’ explicit use of the term “data ethics”, and especially on the emergence of formal data ethics policy initiatives (for instance, committees, working groups, stated objectives and results), many of which followed the adoption of the GDPR. I subsequently move on to an analysis of data ethics as described in public policy reports, statements, interviews and events in the period 2015–2018. In conclusion, I take a step back and review the definition of data ethics. Today, data ethics is an idea, concept and method that is used in policy-making, but which has no shared definition. While more aligned conceptualisations of data ethics might provide a guiding step for a collective vision for actions in law, business and society in general, an argument that runs through this article is that no definition of data ethics in this space is neutral with respect to values and politics. Therefore, we must position ourselves within a context-specific type of ethical action.

This article is informed by a study that I am conducting on data ethics in governance and technology development in the period 2017-2020. In that study and this article, I use an ethnographically informed approach based on active and embedded participation in various data protection/internet governance policy events, working groups and initiatives. Qualitative embedded research entails an immersion of the researcher in the field of study as an active and engaged member to achieve thorough knowledge and understanding (Bourdieu, 1997; Bourdieu & Wacquant 1992; Goffman, 1974; Ingold, 2000; Wong, 2009). Thus, essential to my understanding of the underlying dimensions of the topic of this article is my active participation in the internet governance policy community. I was, for example, part of the Danish government’s data ethics expert committee (2018) and am part of the European Commission’s Artificial Intelligence High Level Expert group (2018-2020). I am also the founder of the non-profit organisation DataEthics.eu, which is active in the field.

In this article, I also draw on ideas, concepts and opinions generated in interaction with nine active players (decision-makers, policy advisors and civil servants) who contributed to my understanding of the policy-making dynamics by sharing their experiences with data ethics in European policy-making (see further in references). The interviewees were informed about the study and that they would not be represented by name and institution in any publications, as I wanted them to be minimally influenced by institutional interests and requirements in their accounts.

Section 1: What is data ethics? A data ethics of power

In this section I introduce the emerging field of data ethics as the cross-disciplinary study of the distribution of societal powers in the socio-technical systems that form the fabric of the “Big Data Society”. Based on theories, practices and methods within applied ethics, legal studies and cultural studies, social and political sciences, as well as a movement within policy and business, I present an analytical framework for a “data ethics of power”.

As a point of departure, I define a data ethics of power as an action-oriented analytical framework concerned with making visible the power relations embedded in the “Big Data Society” and the conditions of their negotiation and distribution, in order to point to design, business, policy, social and cultural processes that support a human-centric distribution of power. In a previous book (Hasselbalch and Tranberg, 2016) we described data ethics as a social movement of change and action: “Across the globe, we’re seeing a data ethics paradigm shift take the shape of a social movement, a cultural shift and a technological and legal development that increasingly places the human at the centre” (p. 10). Thus, data ethics can be viewed as a proactive agenda concerned with shifting societal power relations and with the aim to balance the powers embedded in the Big Data Society. This shift is evidenced in legal developments (such as the GDPR negotiation process) and in new citizen privacy concerns and practices such as the rise in use of ad blockers and privacy enhancing services, etc. In particular, new types of businesses emerge that go beyond mere compliance with data protection legislation when incorporating data ethical values in collection and processing of data, as well as their general innovation practices, technology development, branding and business policies.

Here, I use the notion of “Big Data Society” to reflectively position data ethics in the context of a recent data (re)evolution of the “Information Society”, enabled by computer technologies and dictated by a transformation of all things (and people) into data formats (“datafication”) in order to “quantify the world” (Mayer-Schonberger & Cukier, 2013, p. 79) to organise society and predict risks. I suggest that this is not an arbitrary evolution: it can also be viewed as an expression of negotiations between different ontological views on the status of the human being and the role of science and technology. As the realisation of a prevailing ideology of modernist scientific practices to command nature and living things, the critical infrastructures of the Big Data Society may therefore very well be described as modernity embodied in a “lived reality” (Edwards, 2002, p. 191) of control and order. From this viewpoint, a data ethics of power can be described as a type of post-modernist, or in essence vitalist, call for a specific kind of “ethical action” (Frohmann, 2007, p. 63) to free the living/human being from the constraints of the practices of control embedded in the technological infrastructures of modernity that at the same time reduce the value of the human being. It is here valuable to understand current calls for data ethical action in extension of the philosopher Henri Bergson’s vitalist arguments at the turn of the last century against the scientific rational intellect that provides no room for, or special status to, the living (Bergson, 1988, 1998). In a similar ethical framework, Gilles Deleuze, who was also greatly inspired by Bergson (Deleuze, 1988), later described over-coded “Societies of Control” (Deleuze, 1992), which reduce people (“dividuals”) to a code marking their access and locking their bodies in specific positions (p. 5). More recently, Spiekerman et al. (2017) in their Anti-Transhumanist Manifesto directly oppose a vision of the human as merely information objects, no different than other information objects (that is, non-human informational things), which they describe as “an expression of the desire to control through calculation. Their approach is limited to reducing the world to data-based patterns suited for mechanical manipulation” (p. 2).

However, a data ethics of power should also be viewed as a direct response to the power dynamics embedded in and distributed via our very present and immediate experiences of a “Liquid Surveillance Society” (Lyon, 2010). Surveillance studies scholar David Lyon (2014) envisions an “ethics of Big Data practices” (2014, p. 10) to renegotiate what is increasingly exposed to be an unequal distribution of power in the technological big data infrastructures. Within this framework we do not only pay conventional attention to the state as the primary power actor (of surveillance), but also include new stakeholders that gain power through accumulation and access to big data. For example, in the analytical framework of a data ethics of power, changing power dynamics are progressively more addressed in the light of the information asymmetry between individuals and the big data companies that collect and process data in digital networks (Pasquale, 2015; Powles, 2015–2018; Zuboff, 5 March 2016, 9 September 2014, 2019).

Beyond this fundamental theoretical framing, a data ethics of power can be explored in an interdisciplinary field addressing the distribution of power in the Big Data Society in diverse ways.

For instance, in a computer ethics perspective, power distributions are approached as ethical dilemmas or as implications of the very design and practical application of computer technologies. Indeed, technologies are never neutral; they embody moral values and norms (Flanagan, Howe, & Nissenbaum, 2008), hence power relations can be identified through analysing how technologies are designed in ethical or ethically problematic ways. Information science scholars Batya Friedman and Helen Nissenbaum (1996) have illustrated different types of bias embedded in existing computer systems that are used for tasks such as flight reservations and the assignment of medical graduates to their first job, and have presented a framework for such issues in the design of computer systems. From this perspective, we can also describe data ethics as what the philosophy and technology scholar Philip Brey terms a “Disclosive Computer Ethics”, identifying moral issues such as “privacy, democracy, distributive justice, and autonomy” (Brey, 2000, p. 12) in opaque information technologies. Phrased differently, a data ethics of power presupposes that technology has “politics” or embedded “arrangements of power and authority” (Winner, 1980, p. 123). Case studies of specific data processing software and their use can be defined as data ethics case studies of power, notably the “Machine Bias” study (Angwin et al., 2016), which exposed discrimination embedded in data processing software used in United States courts to score defendants’ risk of reoffending, and Cathy O’Neil’s (2016) analysis of the social implications of the math behind big data decision making in everything from getting insurance or credit to getting and holding a job.

Nevertheless, data systems are increasingly ingrained in society in multiple forms (from apps to robotics) and have limitless and wide-ranging ethical implications (from price differentiation to social scoring), necessitating that we look beyond design and computer technology as such. Data ethics as a recent designation represents what philosophers Luciano Floridi and Mariateresa Taddeo (2016, p. 3) describe as a primarily semantic shift within a computer and information ethics philosophical tradition from a concern with the ethical implications of the “hardware” to one with data and data science practices. However, looking beyond applied ethics in the field of philosophy to a data ethics of power, our theorisation of the Big Data Society is more than just semantic. The conceptualisation of a data ethics of power can also be explored in a legal framework, as an aspect of the rule of law and protection of citizens’ rights in an evolving Big Data Society. Here, redefining the concept of privacy (Cohen, 2013; Solove, 2008) in a legal studies framework, addresses the ethical implications of new data practices and configurations that challenge existing laws, and thereby the balancing of powers in a democratic society. As legal scholars Neil M. Richards and Jonathan King (2014) argue: “Existing privacy protections focused on managing personally identifying information are not enough when secondary uses of big data sets can reverse engineer past, present, and even future breaches of privacy, confidentiality, and identity” (p. 393). Importantly, these authors define big data “socially, rather than technically, in terms of the broader societal impact they will have,” (Richards & King, 2014, p. 394) providing a more inclusive analysis of a “big data ethics” (p. 393) and thus pointing to the ethical implications of the empowerment of institutions that possess big data capabilities at the expense of “individual identity” (p. 395).

Looking to the policy, business and technology field, the ethical implications of the power of data and data technologies are framed as an issue of growing data asymmetry between big data institutions and citizens in the very design of data technologies. For example, the conceptual framework of the “Personal Data Store Movement” (Hasselbalch & Tranberg, 27 September 2016) is described by the non-profit association MyData Global Movement as one in which “[i]ndividuals are empowered actors, not passive targets, in the management of their personal lives both online and offline – they have the right and practical means to manage their data and privacy” (Poikola, Kuikkaniemi, & Honko, 2018). In this evolving business and technology field, the emphasis is on moving beyond mere legal data protection compliance, implementing values and ethical principles such as transparency, accountability and privacy by design (Hasselbalch & Tranberg, 2016), and ethical implications are mitigated by values-based approaches to the design of technology. One example is the IEEE P7000 series of ethics and AI standards, which seeks to develop ethics-by-design standards and guiding principles for the development of artificial intelligence (AI). A values-based design approach is also revisited in recent policy documents such as section 5.2, “Embedded values in technology – ethical-by-design”, of the European Parliament’s “Resolution on Artificial Intelligence and Robotics” adopted in February 2019.

A key framework for data ethics is the human-centric approach that we increasingly see included within ethics guidelines and policy documents. For example, the European Parliament’s (2019, V.) resolution states that “whereas AI and robotics should be developed and deployed in a human-centred approach with the aim of supporting humans at work and at home…”. The EC High Level Expert Group on Artificial Intelligence’s draft ethics guidelines also stress how the human-centric approach to AI is one that “strives to ensure that human values are always the primary consideration” (working document, 18 December 2018, p. iv), and directly associate it with the balance of power in democratic societies: “political power is human centric and bounded. AI systems must not interfere with democratic processes” (p. 7). The human-centric approach in European policy-making is framed in a European fundamental rights framework (as for example extensively described in the European Commission’s AI High Level Expert group’s draft ethics guidelines) and/or with an emphasis on the human being’s interests prevailing over “the sole interests of society or science” (article 2, “Oviedo Convention”). Practical examples of the human-centric approach can also be found in technology and business developments that aim to preserve the specific qualities of humans in the development of information processing technologies. Examples include the Human in the Loop (HITL) approach to the design of AI, The International Organization for Standardization (ISO) standards on human-centred design (HCD) and the Personal Data Store Movement, which is defined as “A Nordic Model for human-centered personal data management and processing.” (Poikola et al., 2018)

Section 2: European data ethics policy initiatives in context

Policy debates that specifically address ethics in the context of technological developments have been ongoing in Europe since the 1990s. The debate has increasingly sought to harmonise national laws and approaches in order to preserve a European value framework in the context of rapid technological progress. For instance, the Council of Europe’s “Oviedo Convention” was motivated by what Wachter (1997, p. 14) describes as “[t]he feeling that the traditional values of Europe were threatened by rapid and revolutionary developments in biology and medicine”. Data ethics per se gained momentum in pan-European politics in the final years of the negotiation of the GDPR, through the establishment of a number of initiatives directly referring to data and/or digital ethics. Thus, the European Data Protection Supervisor (EDPS) Digital Ethics Advisory Group (2018, p. 5) describes its work as being carried out against “a growing interest in ethical issues, both in the public and in the private spheres and the imminent entry into force of the General Data Protection Regulation (GDPR) in May 2018”.

Examining the differences in scope and in the stakeholders involved in, respectively, the development of the 1995 Data Protection Directive and the negotiation of the GDPR, which began with the European Commission’s proposal in 2012, provides some insight into the evolution of the focus of data ethics. The 1995 Directive was developed by a European working party of privacy experts and national data protection commissioners in a process that excluded business stakeholders (Heisenberg, 2005). Nevertheless, the group of actors influencing and participating in the GDPR process progressively expanded, with new stakeholders comprising consumer and civil liberty organisations and American industry representatives and policymakers. The GDPR was generally described as one of the most lobbied EU regulations (Warman, 8 February 2012). At the same time, the public increasingly scrutinised the ethical implications of a big data era, with numerous news stories published on data leaks and hacks, algorithmic discrimination and data-based voter manipulation.

Several specific provisions of the GDPR were discussed inside and outside the walls of European institutions. For example, the “right to erasure” proposed in 2012 was heavily debated by industry and civil society organisations, especially in Europe and the USA, and was frequently described in the media as a value choice between privacy and freedom of expression. In 2013, the transfer of data to third countries (including those covered by the EU-US Safe Harbour agreement) engendered a wider public debate between certain EU parliamentarians and US politicians regarding mass surveillance and the role of large US technology companies. Another example was the discussion of an age limit of 16. This called civil society advocates into action (Carr, Should I laugh, cry or emigrate?, 13 December 2015) and led to new alliances with US technology companies regarding young people’s right to “educational and social opportunities” (Richardson, “European General Data Protection Regulation draft: the debate”, 10 December 2015). A last-minute decision rendered it possible to lower the age limit to 13 in member states.

These intertwined debates and negotiations illustrate how the data protection field was transformed within a global information technology infrastructure. It took shape as a negotiation of competing interests and values between economic entities, EU institutions, civil society organisations, businesses and third country national interests. We can also perceive these spaces of negotiation of rights, values and responsibilities and the creation of new alliances to have a causal link with the emergence of data ethics policy initiatives in European policy-making. In the years following the first communication of the reform, data protection debates were extended, with the concept of data ethics increasingly included in meeting agendas, debates in public policy settings and reports and guidelines. Following the adoption of the GDPR, the list of European member states or institutions with established data or digital ethics initiatives and objectives rapidly grew. Examples included the UK government’s announcement of a £9 million Centre for Data Ethics and Innovation with the stated aim to “advise government and regulators on the implications of new data-driven technologies, including AI” (Digital Charter, 2018). The Danish government appointed a data ethics expert committee in March 2018 with a direct economic incentive to create data ethics recommendations to Danish industry and to turn responsible data sharing into a competitive advantage for the country (Danish Business Authority, 12 March 2018). Several member states’ existing and newly established expert and advisory groups and committees began to include ethics objectives in their work. For example, the Italian government established an AI Task Force in April 2017, publishing its first white paper in 2018 (AI Task Force/Italy, 2018) with an explicit section on ethics. The European Commission’s communication on an AI strategy, published in April 2018, also included the establishment of an AI High Level Expert Group, whose responsibilities included publishing ethics guidelines for AI in Europe the following year.

Section 3: Data ethics - policy vacuums

“I’m pretty convinced that the ethical dimension of data protection and privacy protection is going to become a lot more important in the years to come” (in ‘t Veld, 2017). These words of a European parliamentarian in a public debate in 2017 referred to the evolution of policy debates regarding data protection and privacy. You can discuss legal data protection provisions, she claimed, but then there is “a kind of narrow grey area where you have to make an ethical consideration and you say what is more important” (in ‘t Veld, 2017). What did she mean by her use of the term “ethics” in this context?

In an essay entitled “What is computer ethics?” (1985), the moral philosophy scholar James H. Moor described the branch of applied ethics that studies the ethical implications of computer technologies. Writing only a few years after Acorn, the first IBM personal computer, was introduced to the mass market, Moor was interested in computer technologies per se (what is special about computers), as well as in the policies required in specific situations where computers alter the state of affairs and create something new. But he also predicted a more general societal revolution (Moor, 1985, p. 268) due to the introduction of computers that would “leave us with policy and conceptual vacuums” (p. 272). Policy vacuums, he argued, would present core problems and challenges, revealing “conceptual muddles” (p. 266), uncertainties and the emergence of new values and alternative policies (p. 267).

If we view data ethics policy initiatives according to Moor’s framework, they can be described as moments of sense-making and negotiation created in response to the policy vacuums that arise when situations and settings are amended by computerised systems. In an interview conducted at the Internet Governance Forum (IGF) in 2017, a Dutch parliamentarian described how in 2013, policy-makers in her country rushed to tackle the transformations instigated by digital technologies that were going “very wrong” (Interview, IGF 2017). In response, she proposed the establishment of a national commission to consider the ethical challenges of the digital society: “it’s very hard to get the debate out of the trenches, you know, so that people stop saying, ‘well this is my position and this is my position’, but to just sit back and look at what is happening at the moment, which is going to be so huge, so incredible, we have no idea what is going to happen with our society and we need people to think about what to do about all of this, not in the sense you know, ‘I don’t want it’, but more in the sense, ‘are there boundaries?’ ‘Do we have to set limits to all of these possibilities that will occur in the coming years?’” Similarly, in another interview conducted at the same event, a representative of a European country involved in the information policy of the Committee of Ministers of the Council of Europe discussed how the results of the evolution of the Information Society included “violations”, “abuses” and recognition of the internet’s influence on the economy. Concluding, she stated: “We need to slow down a little bit and to think about where we are going”.

In reviewing descriptions of data ethics initiatives, we can note an implicit acknowledgement of the limits of data protection law in capturing all of the ethical implications of a rapidly evolving information and data infrastructure. Data ethics thus becomes a means to make sense of emerging problems and challenges and to evaluate various policies and solutions. For example, a 2015 report from the European Data Protection Supervisor (EDPS) states: “In today’s digital environment, adherence to the law is not enough; we have to consider the ethical dimension of data processing” (p. 4). It continues by describing how different EU law principles (such as data minimisation and the concepts of sensitive personal data and consent) are challenged by big data business models and methods.
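
As a concrete illustration of the data minimisation principle mentioned above, a minimal sketch (the field names and purpose below are hypothetical, not taken from the EDPS report) could restrict stored records to the fields required for a stated purpose:

```python
# Toy illustration of the data minimisation principle: store only the
# fields needed for a stated purpose, not everything that was collected.
# All field names here are hypothetical.

REQUIRED_FOR_NEWSLETTER = {"email", "language"}

def minimise(record: dict, required_fields: set) -> dict:
    """Return a copy of the record containing only the required fields."""
    return {k: v for k, v in record.items() if k in required_fields}

raw_record = {
    "email": "jane@example.org",
    "language": "da",
    "location": "55.6761,12.5683",   # not needed for a newsletter
    "browsing_history": ["..."],      # not needed for a newsletter
}

stored = minimise(raw_record, REQUIRED_FOR_NEWSLETTER)
print(stored)  # {'email': 'jane@example.org', 'language': 'da'}
```

The tension the report points to is visible even in this toy case: big data business models derive their value precisely from the fields that minimisation would discard.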

The policy vacuums described in such reports and statements highlight the uncertainties and questions that exist regarding the governance of a socio-technical information infrastructure that increasingly shapes not only personal, but also social, cultural and economic activities.

In the same year as Moor’s essay was published, communications scholar Joshua Meyrowitz’s No Sense of Place (1985) portrayed the emergence of “information systems” that modify our physical settings via new types of access to information, thereby restructuring our social relations by transforming situations. As Meyrowitz (1985, p. 37) argued, “[w]e need to look at the larger, more inclusive notion of ‘patterns of information’”, illustrating how our information realities have real qualities that shape our social and physical realities. Accordingly, European policymakers emphasise the real qualities of information and data. They see digital data processes as meaningful components of social power dynamics. Information society policy-making thus becomes an issue of the distribution of resources and of social and economic power, as the EU Commissioner for Competition stated at a DataEthics.eu event on data as power in Copenhagen in 2016: “I’m very glad to have the chance to talk with you about how we can deal with the power that data can give” (Vestager, 9 September 2016). Data ethics policy debates have thus moved beyond the negotiation of a legal data protection framework, increasingly involving a general focus on information society policy-making, in which different sectoral policy areas are intertwined. As the Commissioner elaborated at the same event: “So competition is important. It keeps the pressure on companies to give people what they want. And that includes security and privacy. But we can’t expect competition enforcement to solve all our privacy problems. Our first line of defence will always be rules that are designed specifically to guarantee our privacy”.

Section 4: Data ethics - culture and values

According to Moor, the policy vacuums that emerge when existing policies clash with technological evolution force us to “discover and make explicit what our value preferences are” (1985, p. 267). He proposes that the computer-induced societal revolution will occur in two stages, marked by the questions that we ask. In the first “Introduction Stage”, we ask functional questions: how well does a given technology serve its purpose? In the second “Permeation Stage”, when institutions and activities are transformed, Moor argues that we will begin to ask questions regarding the nature and value of things (p. 271). Such second-stage questions are echoed in the European policy debate of 2017. As one Member of the European Parliament (MEP) who was heavily involved in the GDPR negotiation process argued in a public debate: “[this is] not any more a technical issue, it’s a real life long important learning experience” (Albrecht, 2017). Another MEP claimed in the same debate: “The GDPR is not only a legislative piece, it’s like a textbook, which is teaching us how to understand ourselves in this data world and how to understand what are the responsibilities of others and what are the rules which is governing in this world” (Lauristin, 2017).

Consequently, the technicalities of new data protection legislation are transformed into a general discussion about the norms and values of a big data age. Philip Brey describes values as “idealized qualities or conditions in the world that people find good”, ideals that we can work towards realising (2010, p. 46). However, values are not just personal ideals; they are also culturally situated. The cultural theorist Raymond Williams (1958, p. 6) famously defined culture as a “shape”, a set of purposes and common meanings expressed “in institutions, and in arts and learning”, which emerge in a social space of “active debate and amendment under the pressures of experience, contact and discovery”. Culture is thus traditional as well as creative, consisting of prescribed dominant meanings and their negotiation (Williams, 1958). Similarly, the anthropologist James Clifford (1997) replaced the metaphor of “roots” (an image of the original, authentic and fixed cultural entity) with “routes”: intervals of negotiation and translation between the fixed cultural points of accepted meaning. Values are advanced in groups with shared interests and culture, but they exist in spaces of constant negotiation. In an interview conducted at the IGF 2017, one policy advisor to an MEP, asked about the role of values in the GDPR negotiations, described privacy as a value shared by a group of individuals involved in the reform process: “I think a group of core players shared that value (…) all the way from people who wrote the proposal at the Commission, to the Commissioner in charge to the rapporteur from the EU Parliament, they all (…) to some extent shared this value, and I think that they managed to create a compromise closer to their value than to others”. He also explained how discussions about values emerged in processes of negotiation between diverse and at times contradictory interests: “the moment you see a conflict of interest, that is when you start looking at the values (…) normally it would be a discussion about different values (…) an assessment of how much one value should go before another value (…) so some people might say that freedom of information might be a bigger value or the right to privacy might be a bigger value”.

Accordingly, ethics in practice, or what Brey refers to as “the act of valuing something, or finding it valuable (…) to find it good in some way” (2010, p. 46), is in essence never merely a subjective practice, but neither is it a purely objective construct. If we investigate the meaning of data ethics and ethical action in European data protection policy-making, we can see the points of negotiation. That is, if we look at what happens in the “intervals” between established value systems and their renegotiation in new contexts, we discover clashes of values and negotiation, as well as the contours of cultural positioning.

Section 5: Data ethics - power and positioning

Philosophy and media studies scholar Charles Ess (2014) has illustrated how culture plays a central role in shaping our ethical thinking about digital technologies. For instance, he argues that people in Western societies place ethical emphasis on “the individual as the primary agent of ethical reflection and action, especially as reinforced by Western notions of individual rights” (p. 196). Such cultural positioning in a global landscape can also be identified in the European data ethics policy debate. An example is the way in which one participant in the 2017 MEP debate discussed above described the GDPR with reference to the direct lived experiences of specific European historical events: “It is all about human dignity and privacy. It is all about the conception of personality which is really embedded in our culture, the European culture (...) it came from the general declaration of human rights. But there is a very very tragic history behind war, fascism, communism and totalitarian societies and that is a lesson we have learned in order to understand why privacy is important” (Lauristin, 2017).

Values such as human dignity and privacy are formally recognised in frameworks of European fundamental rights and data protection law, and, conscious of their institutionalised roots in the European legal framework, European decision-makers reference them when asked about the values of “their” data ethics. Awareness of data ethics thus becomes a cultural endeavour, transferring European cultural values into technological development. As stated in an EDPS report from 2015: “The EU in particular now has a ‘critical window’ before mass adoption of these technologies to build the values into digital structures which will define our society” (p. 13).

When exploring European data ethics policy initiatives as spaces of value negotiation, a specific cultural arrangement emerges. In this context, policy and decision-makers position themselves against a perceived threat to a specifically European set of values and ethics, a threat that is pervasive, opaque and embedded in technology. In particular, a concern with a new opponent to state power emerges. In an interview conducted in 2018 at a European institution, a project officer reflected on her previous work in a European country’s parliament and government, where concerns with the alternative form of power that the internet represents had surfaced. The internet is the place where discussions are held and decisions are made, she said, before recalling the policy debates concerning “GAFA” (the acronym for the four giant technology companies Google, Apple, Facebook and Amazon). Such a clash in values has been directly addressed by European policymakers in public speeches and debates, increasingly naming the technology industry stakeholders they deem responsible. The values embedded in technology innovation are a “wrecking ball”, aiming not simply to “play with the way society is organised but instead to demolish the existing order and build something new in its place”, argued the then President of the European Parliament in a speech in 2016 (Schulz, 2016). Values and ethics are hence directly connected with a type of cultural power that is built into technological systems. As the Director for Fundamental Rights and Union Citizenship at the European Commission’s DG Justice claimed in a 2017 public debate: “the challenge of ethics is not in the first place with the individual, the data subject; the challenge is with the controllers, which have power, they have power over people, they have power over data, and what are their ethics? What are the ethics they instil in their staff? In house compliance ethics? Ethics of engineers?” (Nemitz, 2017).

Section 6: Data ethics - spaces of negotiation

When dealing with the development of technical systems, we are inclined towards points of closure and stabilisation (Bijker et al., 1987) that will guide the governance, control and risk mitigation of those systems. Relatedly, we can understand data ethics policy initiatives as end results with the objective “to formulate and support morally good solutions (e.g., right conducts or right values)” (Floridi & Taddeo, 2016, p. 1), emphasising algorithms (or technologies) that may not be “ethically neutral” (Mittelstadt et al., 2016, p. 4). On this view, the initiatives appear as solutions to the ethical problems raised within the very design of technologies, in the data processing activities of algorithms, or in the collection and dissemination of data. However, I would like to address data ethics policy initiatives in their contexts of interest and value negotiation. For instance, where does morality begin and end in a socio-technical infrastructure that extends across jurisdictions and continents, cultural value systems and societal sectors?

The technical does indeed, in its very design, represent forms of order, as the political theorist Langdon Winner reminded us (1980, p. 123). That is, it is “political” and thus has ethical implications when it creates by design “wonderful breakthroughs by some social interests and crushing setbacks by others” (Winner, 1980, p. 125). To provide an example, the Facebook APIs that facilitated the collection of the user data later reused and processed by Cambridge Analytica were specifically designed to track users and share data en masse with third parties, hence directly enabling the mass collection, storage and processing of data. However, these design issues of the technical are also “inextricably bound up into an organic whole” with economic, social, political and cultural problems (Callon, 1987, p. 84). An analysis of data ethics as it is evolving in the European policy sphere demonstrates the complexity of governance challenges arising from the infrastructure of the information age being “shaped by multiple agents with competing interests and capacities, engaged in an indefinite set of distributed interactions over extended periods of time” (Harvey et al., 2017, p. 26). Governance in this era is, as highlighted by internet governance scholars Jeanette Hofmann et al., a “heterogeneous process of ordering without a clear beginning or endpoint” (2016, p. 1412). It consists of actors engaged in “fields of struggle” (Pohle et al., 2016) of meaning-making and competing interpretations of policy issues that are “continuously produced, negotiated and reshaped by the interaction between the field and its actors” (p. 4). I propose that we also explore, as essential components of our data ethics endeavours, the complex dynamics of how power is distributed and how interests are met in spaces of negotiation.
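
To make the point about design-embedded order concrete, consider a deliberately simplified sketch of the mechanism at issue. The data structures and function below are invented for illustration and are not Facebook's actual API; they merely show how a permissive platform design can fan one user's consent out to many people's data:

```python
# Hypothetical sketch: a platform endpoint where a single user's consent
# also returns their friends' profiles, so one consent yields many records.

SOCIAL_GRAPH = {
    "alice": {"likes": ["jazz"],  "hometown": "Aarhus", "friends": ["bob", "carol"]},
    "bob":   {"likes": ["chess"], "hometown": "Ghent",  "friends": ["alice"]},
    "carol": {"likes": ["film"],  "hometown": "Lyon",   "friends": ["alice"]},
}

def fetch_profile_with_friends(consenting_user: str) -> dict:
    """One user consents, but the design also returns their friends' profiles."""
    profile = dict(SOCIAL_GRAPH[consenting_user])
    profile["friends_data"] = {
        friend: SOCIAL_GRAPH[friend] for friend in profile["friends"]
    }
    return profile

# A single consent ("alice") yields data on three people.
harvested = fetch_profile_with_friends("alice")
print(len(harvested["friends_data"]) + 1, "people's data from one consent")
```

The ethically salient choice is not in any single line of code but in the design decision to return `friends_data` at all: the "politics" of the artifact, in Winner's sense, sit in what the interface makes possible by default.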

Evidently, we must also recognise data ethics policy initiatives as components of the rhythm of a general infrastructural development rather than as settled ethical solutions and isolated events. We can understand them as the kind of negotiation posts that repeatedly occur throughout the course of a technological system’s development (Bijker et al., 1987), and as segments of a process of standardisation and consensus-building within a complex general technological evolution of our societies that “contain[s] messy, complex, problem-solving components” (Hughes, 1987, p. 51). The technological systems of modernity are like the architecture of mundane buildings. They reside, as Edwards (2002, p. 185) claims, in a “naturalised background, ordinary as trees, daylight, and dirt”. Silently they represent, constitute and are constituted by both our material and imagined modern societies and the distribution of power within them. They remain unnoticed until they fail (Edwards, 2002). But when they do fail, we see them in all their complexity. An example is the US intelligence officers’ PowerPoint presentations detailing the “PRISM program”, leaked by Edward Snowden in 2013 (The Guardian, 2013), which provide a detailed map of an information and data infrastructure characterised by intricate interconnections between a state organisation of mass surveillance, laws, jurisdictions and budgets, and the technical design of the world wide web and social media platforms. Technological infrastructures are indeed like communal buildings, with doors that we never give a second thought until the day we find one of them locked.

October 2018: “These are just tools!” one person exclaimed. We were at a working group meeting where an issue with using Google Docs for the practical work of the group was raised and discussed at length. While some were arguing for an official position on the use of the online service, mainly with reference to what they described as Google’s insufficient compliance with European data protection law, others saw the discussion as a waste of time. Why spend valuable work time on this issue?

What is data ethics? Currently, the reply is shrill, formally framed in countless statements, documents and mission statements from a multitude of sources, including governments, intergovernmental organisations, consultancy firms, companies, non-governmental organisations, independent experts and academics. But it also emerges when least expected, in “non-allocated” moments of discussion. Information technologies that permeate every aspect of our lives today, from micro work settings to macro economics and politics, are increasingly discussed as “ethical problems” (Introna, 2005, p. 76) that must be solved. Their pervasiveness sparks moments of ethical thinking, negotiated in terms of moral principles, values and ideal conditions (Brey, 2010). In allocated or unallocated spaces of negotiation, moments of pause and sense-making (Moor, 1985), we discuss the values (Flanagan et al., 2008) and politics (Winner, 1980) of the business practices, cultures and legal jurisdictions that shape them. These spaces of negotiation encompass very concrete discussions regarding specific information technology tools, but increasingly they also evolve into reflections concerning general challenges to established legal frameworks, individuals’ agency and human rights, as well as questions regarding the general evolution of society. As one Danish minister said at the launch of a national data ethics expert group: “This is about what society we want” (Holst, 11 March 2018).

In this article, I have explored data ethics in the context of a European data protection legal reform. In particular, I have sought to answer the question: “What is data ethics?” with the assumption that the answer will shape how we perceive the role and function of data ethics policy initiatives. Based on a review of policy documents, reports and press material, alongside analysis of the ways in which policymakers and civil servants make sense of data ethics, I propose that we recognise these initiatives as open-ended spaces of negotiation and cultural positioning.

This approach to ethics might be criticised as futile in the context of policy and action. However, I propose that understanding data ethics policy initiatives as spaces of negotiation does not prevent action. Rather, it forces us to make apparent our point of departure: the social and cultural values and interests that shape our ethical action. We can thus create the potential for a more transparent negotiation of ethical action in the “Big Data Era”, enabling us to acknowledge the macro-level data ethics spaces of negotiation that are currently emerging not only in Europe but globally.

This article’s analytical investigation of European data ethics policy initiatives as spaces of cultural value negotiation has revealed a set of actionable thematic areas. It has illustrated a clash of values and an emerging concern with the alternative forms of power and control embedded in our technological environment, which exert pressure on individuals in particular. Here, a data ethics of power that takes its point of departure in Gilles Deleuze’s description of computerised Societies of Control (1992) enables us to think about the ethical action that is necessary today. Ethical action could, for example, concern the empowerment of individuals to challenge the laws and norms of opaque algorithmic computer networks, as we have noted in debates on the right to explanation and the accountability and interpretability of algorithms. Ethical action may also strive towards ideals of freedom in order to break away from coding, to become indiscernible to the “Weapons of Math Destruction” (O’Neil, 2016) that increasingly define, shape and limit us as individuals, as seen for instance in the digital self-defence movement (Heuer & Tranberg, 2013). Data ethics missions such as these are rooted in deeply personal experiences of living in coded networks, but they are also based on growing social and political movements and sentiments (Hasselbalch & Tranberg, 2016).
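
To give a concrete flavour of what an "explanation" of an algorithmic decision can amount to in the simplest case, consider a linear score decomposed into per-feature contributions. The model, weights and feature names below are invented purely for illustration, not drawn from any system discussed in this article:

```python
# Toy sketch of algorithmic interpretability: attribute a linear model's
# score to individual input features. All weights and names are invented.

WEIGHTS = {"income": 0.4, "debt": -0.7, "years_at_address": 0.2}
BIAS = 0.1

def explain(features: dict) -> tuple:
    """Return the score and each feature's contribution, ranked by magnitude."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    score = BIAS + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return score, ranked

score, reasons = explain({"income": 1.2, "debt": 2.0, "years_at_address": 0.5})
print(f"score={score:.2f}")
for name, contribution in reasons:
    print(f"  {name}: {contribution:+.2f}")
```

Even this trivial decomposition shows why the policy debate is hard: the explanation is only as legible as the model is simple, and the opaque networks at issue in these debates rarely reduce to a handful of weighted features.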

Much remains to be explored and developed regarding the power dynamics embedded in the evolving data ethics debate, not only in policy-making, but also in business, technology and public discourse in general. This article seeks to open up a more inclusive and holistic discussion of data ethics in order to advance investigation and understanding of the ways in which values are negotiated, rights and authority are distributed, and conflicts are resolved.

Acknowledgements

Clara; Francesco Lapenta for the many informed discussions regarding the sociology of data ethics; Jens-Erik Mai for insightful comments on the drafts of this article; the team at DataEthics.eu for inspiration.

References

Albrecht, J. P. (2017, January 26). MEP debate: The regulation is here! What now? [video file]. Retrieved from https://www.youtube.com/watch?v=28EtlacwsdE

Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias - There’s software used across the country to predict future criminals. And it’s biased against blacks. ProPublica. Retrieved from https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

Bergson, H. (1988). Matter and Memory (N. M. Paul & W. S. Palmer, Trans.) New York: Zone Books.

Bergson, H. (1998). Creative Evolution (A. Mitchell, Trans.). Mineola, NY: Dover Publications.

Bijker, W. E., Hughes, T. P., & Pinch, T. (1987). General introduction. In W. E. Bijker, T. P. Hughes, & T. Pinch. (Eds.), The Social Construction of Technological Systems (pp. 1-7). Cambridge, MA: MIT Press.

Bourdieu, P. (1997). Outline of a Theory of Practice. Cambridge: Cambridge University Press.

Bourdieu, P., & Wacquant, L. (1992). An Invitation to Reflexive Sociology. Cambridge: Polity Press.

Brey, P. (2000). Disclosive computer ethics. Computer and Society, 30(4), 10-16. doi: 10.1145/572260.572264

Brey, P. (2010). Values in technology and disclosive ethics. In L. Floridi (Ed.), The Cambridge Handbook of Information and Computer Ethics (pp. 41-58). Cambridge: Cambridge University Press.

Callon, M. (1987). Society in the making: the study of technology as a tool for sociological analysis. In Wiebe E. Bijker, Thomas P. Hughes, & Trevor Pinch (Eds.), The Social Construction of Technological Systems (pp. 83-103). Cambridge, MA: MIT Press.

Carr, J. (2015, December 13). Should I laugh, cry or emigrate? [Blog post]. Retrieved from Desiderata https://johnc1912.wordpress.com/2015/12/13/should-i-laugh-cry-or-emigrate/

Clifford, J. (1997).  Routes: Travel and Translation in the Late Twentieth Century . Cambridge: Harvard University Press.

Cohen, J. E. (2013). What privacy is for. Harvard Law Review , 126 (7). Retrieved from https://harvardlawreview.org/2013/05/what-privacy-is-for/

Danish Business Authority. (2018, March 12). The Danish government appoints new expert group on data ethics [Press release]. Retrieved from https://eng.em.dk/news/2018/marts/the-danish-government-appoints-new-expert-group-on-data-ethics

Deleuze, G. (1992). Postscript on the societies of control. October, 59, 3-7. Retrieved from http://www.jstor.org/stable/778828

Deleuze, G. (1966). Bergsonism (H. Tomlinson, Trans.). New York: Urzone Inc.

Edwards, P. (2002). Infrastructure and modernity: scales of force, time, and social organization in the history of sociotechnical systems. In Misa, T. J., Brey, P., & A. Feenberg (Eds.), Modernity and Technology (pp. 185-225). Cambridge, MA: MIT Press.

Ess, C. M. (2014). Digital Media Ethics . Cambridge, UK: Polity Press

Flanagan, M., Howe, D. C., & Nissenbaum, H. (2008). Embodying values in technology – theory and practice. In J. van den Hoven, & J. Weckert (Eds.), Information Technology and Moral Philosophy (pp. 322-353). Cambridge, UK: Cambridge University Press.

Floridi, L., & Taddeo, M. (2016). What is data ethics?. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences , 374 (2083). doi: 10.1098/rsta.2016.0360

Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information Systems , 14 (3), 330-347. doi: 10.1145/230538.230561

Frohmann, B. (2007). Foucault, Deleuze, and the ethics of digital networks. In R. Capurro, J. Frühbauer, & T. Hausmanninger (Eds.), Localizing the Internet. Ethical Aspects in Intercultural Perspective (pp. 57-68). Munich: Fink.

Goffman, E. (1974). Frame Analysis: An Essay on the Organization of Experience . Boston, MA: Northeastern University Press

Harvey, P., Jensen, C. B., & Morita, A. (2017). Introduction: infrastructural complications. In P. Harvey, C. B. Jensen, & A. Morita (Eds.), Infrastructures and Social Complexity: A Companion (pp. 1-22). London: Routledge.

Hasselbalch, G., & Tranberg, P. (2016, December 1). The free space for data monopolies in Europe is shrinking [Blog post]. Retrieved from Opendemocracy.net https://www.opendemocracy.net/gry-hasselbalch-pernille-tranberg/free-space-for-data-monopolies-in-europe-is-shrinking

Hasselbalch, G., & Tranberg, P. (2016, September 27). Personal data stores want to give individuals power over their data [Blog post]. Retrieved from DataEthics.eu https://dataethics.eu/personal-data-stores-will-give-individual-power-their-data/

Hasselbalch, G., & Tranberg, P. (2016). Data Ethics. The New Competitive Advantage . Copenhagen: Publishare.

Heisenberg, D. (2005). Negotiating Privacy: The European Union, The United States and Personal Data Protection . Boulder, CA: Lynne Reinner Publishers.

Heuer, S., & Tranberg, P. (2013). Fake It! Your Guide to Digital Self-Defense . Copenhagen: Berlingske Media Forlag.

Hill, R. (2017, November 24). Another toothless wonder? Why the UK.gov’s data ethics centre needs clout. The Register. Retrieved from https://www.theregister.co.uk/2017/11/24/another_toothless_wonder_why_the_ukgovs_data_ethics_centre_needs_some_clout/

Hofmann, J., Katzenbach, C., & Gollatz, K. (2016). Between coordination and regulation: finding the governance in Internet governance. New Media & Society, 19 (9), 1406-1423. doi: 10.1177/1461444816639975

Holst, H. K. (2018, March 11). Regeringen vil lovgive om dataetik: det handler om, hvilket samfund vi ønsker [The government will legislate on data ethics: it is about what society we want]. Berlingske. Retrieved from https://www.berlingske.dk/politik/regeringen-vil-lovgive-om-dataetik-det-handler-om-hvilket-samfund-vi-oensker

Hughes, T. P. (1987). The evolution of large technological systems. In W. E. Bijker, T. P. Hughes, & T. Pinch (Eds.), The Social Construction of Technological Systems (pp. 51-82). Cambridge, MA: MIT Press.

Ingold, T. (2000). The Perception of the Environment: Essays in Livelihood, Dwelling and Skill. London: Routledge.

Introna, L. D. (2005). Disclosive ethics and information technology: disclosing facial recognition systems. Ethics and Information Technology , 7 (2), 75-86. doi: 10.1007/s10676-005-4583-2

Ingeniøren. (2018, March 16). Start nu med at overholde loven Brian Mikkelsen [Now start complying with the law, Brian Mikkelsen]. Version 2 . Retrieved from https://www.version2.dk/artikel/leder-start-nu-med-at-overholde-loven-brian-mikkelsen-1084631

in ‘t Veld, S. (2017, January 26). European Privacy Platform [video file]. Retrieved from https://www.youtube.com/watch?v=8_5cdvGMM-U

Lauristin, M. (2017, January 26). MEP debate: The regulation is here! What now? [video file] Retrieved from: https://www.youtube.com/watch?v=28EtlacwsdE

Lyon, D. (2014). Surveillance, Snowden, and big data: capacities, consequences, critique. Big Data & Society , 1 (2). doi: 10.1177/2053951714541861

Lyon, D. (2010). Liquid surveillance: the contribution of Zygmunt Bauman to surveillance studies. International Political Sociology, 4(4), 325-338. doi: 10.1111/j.1749-5687.2010.00109.x

Mayer-Schönberger, V., & Cukier, K. (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think. London: John Murray.

Meyrowitz, J. (1985). No Sense of Place: The Impact of the Electronic Media on Social Behavior . Oxford: Oxford University Press.

Mittelstadt, B. D., Allo, P., Taddeo, M., Wachter, S., & Floridi, L. (2016). The ethics of algorithms: mapping the debate. Big Data & Society, 3(2), 1-21. doi: 10.1177/2053951716679679

Moor, J. H. (1985). What is computer ethics? Metaphilosophy, 16 (4), 266-275. doi: 10.1111/j.1467-9973.1985.tb00173.x

Nemitz, P. (2017, January 26) European Privacy Platform [video file]. Retrieved from: https://www.youtube.com/watch?v=8_5cdvGMM-U

O’Neil, C. (2016). Weapons of Math Destruction . New York: Penguin Books.

Pasquale, F. (2015). The Black Box Society – The Secret Algorithms That Control Money and Information . Cambridge, MA: Harvard University Press

Poikola, A., Kuikkaniemi, K., & Honko, H. (2018). Mydata – A Nordic Model for human-centered personal data management and processing [White paper]. Helsinki: Open Knowledge Finland. Retrieved from https://www.lvm.fi/documents/20181/859937/MyData-nordic-model/2e9b4eb0-68d7-463b-9460-821493449a63?version=1.0

Pohle, J., Hösl, M., & Kniep, R. (2016). Analysing internet policy as a field of struggle. Internet Policy Review, 5(3). doi: 10.14763/2016.3.412

Powles, J. (2015–2018). Julia Powles [Profile]. The Guardian . Retrieved from https://www.theguardian.com/profile/julia-powles

Richards, N. M., & King J. H. (2014). Big data ethics. Wake Forest Law Review , 49 , 393- 432.

Richardson, J. (2015, December 10). European General Data Protection Regulation draft: the debate. Retrieved from Medium https://medium.com/@janicerichardson/european-general-data-protection-regulation-draft-the-debate-8360e9ef5c1

Schulz, M. (2016, March 3). Technological totalitarianism, politics and democracy [video file]. Retrieved from https://www.youtube.com/watch?v=We5DylG4szM

Solove, D. J. (2008). Understanding Privacy . Cambridge: Harvard University Press.

Spiekermann, S., Hampson, P., Ess, C. M., Hoff, J., Coeckelbergh, M., & Franckis, G. (2017). The Ghost of Transhumanism & the Sentience of Existence. Retrieved from The Privacy Surgeon http://privacysurgeon.org/blog/wp-content/uploads/2017/07/Human-manifesto_26_short-1.pdf

The Guardian. (2013, November 1). NSA Prism Programme Slides. The Guardian . Retrieved from https://www.theguardian.com/world/interactive/2013/nov/01/prism-slides-nsa-document

Vestager, M. (2016, September 9). Making Data Work for Us. Retrieved from European Commission https://ec.europa.eu/commission/commissioners/2014-2019/vestager/announcements/making-data-work-us_en Video available at https://vimeo.com/183481796

de Wachter, M. A. M. (1997). The European Convention on Bioethics. Hastings Center Report, 27(1), 13-23. Retrieved from https://onlinelibrary.wiley.com/doi/full/10.1002/j.1552-146X.1997.tb00015.x

Wagner, B. (2018). Ethics as an escape from regulation: from ethics-washing to ethics-shopping? In M. Hildebrandt (Ed.), Being Profiling. Cogitas Ergo Sum . Amsterdam: Amsterdam University Press. Retrieved from https://www.privacylab.at/wp-content/uploads/2018/07/Ben_Wagner_Ethics-as-an-Escape-from-Regulation_2018_BW9.pdf

Warman, M. (2012, February 8). EU Privacy regulations subject to ‘unprecedented lobbying’. The Telegraph . Retrieved from https://www.telegraph.co.uk/technology/news/9070019/EU-Privacy-regulations-subject-to-unprecedented-lobbying.html

Williams, R. (1993). Culture is ordinary. In A. Gray, & J. McGuigan (Eds.), Studying Culture: An Introductory Reader (pp. 5-14). London: Edward Arnold. (Original work published 1958)

Winner, L. (1980). Do artifacts have politics? Daedalus, 109 (1), 121-136. Retrieved from https://www.jstor.org/stable/20024652

Wong, S. (2009) Tales from the frontline: The experiences of early childhood practitioners working with an ‘embedded’ research team. Evaluation and Program Planning , 32 (2), 99–108. doi: 10.1016/j.evalprogplan.2008.10.003

Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. London; New York: Profile Books; Public Affairs.

Zuboff, S. (2016, March 5). The secrets of surveillance capitalism. Frankfurter Allgemeine. Retrieved from http://www.faz.net/aktuell/feuilleton/debatten/the-digital-debate/shoshana-zuboff-secrets-of-surveillance-capitalism-14103616.html

Zuboff, S. (2014, September 9). A digital declaration. Frankfurter Allgemeine. Retrieved from http://www.faz.net/aktuell/feuilleton/debatten/the-digital-debate/shoshan-zuboff-on-big-data-as-surveillance-capitalism-13152525.html

Policy documents and reports

AI Task Force & Agency for Digital Italy. (2018). Artificial Intelligence at the service of the citizen [White paper]. Retrieved from: https://libro-bianco-ia.readthedocs.io/en/latest/

Council of Europe. (1997). Convention for the Protection of Human Rights and Dignity of the Human Being with Regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine. (The “Oviedo Convention”) Treaty No.164 . Retrieved from https://www.coe.int/en/web/conventions/full-list/-/conventions/treaty/164

Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of such Data . Retrieved from https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A31995L0046

EC High-Level Expert Group. (2018). Draft Ethics Guidelines for Trustworthy AI . Working document, 18 December 2018 (final document was not published when this article was written). Retrieved from https://ec.europa.eu/digital-single-market/en/news/draft-ethics-guidelines-trustworthy-ai

European Commission. (2012, January 25). Proposal for a Regulation of the European Parliament and of the Council on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of such Data (General Data Protection Regulation). Retrieved from http://www.europarl.europa.eu/registre/docs_autres_institutions/commission_europeenne/com/2012/0011/COM_COM(2012)0011_EN.pdf

European Commission. (2018). Communication from the Commission to the European Parliament, the European Council, the Council, the European Economic and Social Committee and the Committee of the Regions - Coordinated Plan on Artificial Intelligence (COM(2018) 795 final). Retrieved from https://ec.europa.eu/digital-single-market/en/news/coordinated-plan-artificial-intelligence

European Parliament. (2019, February 12). European Parliament Resolution of 12 February 2019 on a Comprehensive European Industrial Policy on Artificial Intelligence and Robotics (2018/2088(INI)). Retrieved from http://www.europarl.europa.eu/sides/getDoc.do?pubRef=-//EP//NONSGML+TA+P8-TA-2019-0081+0+DOC+PDF+V0//EN

European Union Regulation 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation). Retrieved from https://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1528874672298&uri=CELEX%3A32016R0679

Gov.uk. (2018, January 25). Digital Charter. Retrieved from https://www.gov.uk/government/publications/digital-charter/digital-charter

European Data Protection Supervisor (EDPS). (2015). Towards a New Digital Ethics: Data, Dignity and Technology. Retrieved from https://edps.europa.eu/sites/edp/files/publication/15-09-11_data_ethics_en.pdf

European Data Protection Supervisor (EDPS). Ethics Advisory Group. (2018). Towards a Digital Ethics. Retrieved from https://edps.europa.eu/sites/edp/files/publication/18-01-25_eag_report_en.pdf

1. By “European” I am not focusing only on the European Union (EU), but on a constantly negotiated cultural context; thus, for example, I do not exclude organisations like the Council of Europe or instruments such as the European Convention on Human Rights.

2. Interviews informing the article (anonymous, all audio recorded except for one based on written notes, four directly quoted in the article): two policy advisors; four European institution officers; one data protection commissioner; one representative of a European country to the Committee of Ministers of the Council of Europe; one European parliamentarian.

3. I am the vice chair of the IEEE P7006 standard on personal data AI agents.

4. I was one of the 12 appointed members of this committee (2018).

5. I was one of the 52 appointed members of this group (2018-2020).



data ethics thesis

Mobile Menu Overlay

The White House 1600 Pennsylvania Ave NW Washington, DC 20500

FACT SHEET: Biden- ⁠ Harris Administration Announces New AI Actions and Receives Additional Major Voluntary Commitment on   AI

Nine months ago, President Biden issued a landmark Executive Order to ensure that America leads the way in seizing the promise and managing the risks of artificial intelligence (AI). This Executive Order built on the voluntary commitments he and Vice President Harris received from 15 leading U.S. AI companies last year. Today, the administration announced that Apple has signed onto the voluntary commitments, further cementing these commitments as cornerstones of responsible AI innovation. In addition, federal agencies reported that they completed all of the 270-day actions in the Executive Order on schedule, following their on-time completion of every other task required to date . Agencies also progressed on other work directed for longer timeframes. Following the Executive Order and a series of calls to action made by Vice President Harris as part of her major policy speech before the Global Summit on AI Safety, agencies all across government have acted boldly. They have taken steps to mitigate AI’s safety and security risks, protect Americans’ privacy, advance equity and civil rights, stand up for consumers and workers, promote innovation and competition, advance American leadership around the world, and more. Actions that agencies reported today as complete include the following: Managing Risks to Safety and Security: Over 270 days, the Executive Order directed agencies to take sweeping action to address AI’s safety and security risks, including by releasing vital safety guidance and building capacity to test and evaluate AI. To protect safety and security, agencies have:

  • Released for public comment new technical guidelines from the AI Safety Institute (AISI) for leading AI developers in managing the evaluation of misuse of dual-use foundation models. AISI’s guidelines detail how leading AI developers can help prevent increasingly capable AI systems from being misused to harm individuals, public safety, and national security, as well as how developers can increase transparency about their products.
  • Published final frameworks on managing generative AI risks and securely developing generative AI systems and dual-use foundation models. These documents by the National Institute of Standards and Technology (NIST) will provide additional guidance that builds on NIST’s AI Risk Management Framework, which offered individuals, organizations, and society a framework to manage AI risks and has been widely adopted both in the U.S. and globally. NIST also submitted a report to the White House outlining tools and techniques to reduce the risks from synthetic content.
  • Developed and expanded AI testbeds and model evaluation tools at the Department of Energy (DOE). DOE, in coordination with interagency partners, is using its testbeds to evaluate AI model safety and security, especially for risks that AI models might pose to critical infrastructure, energy security, and national security. DOE’s testbeds are also being used to explore novel AI hardware and software systems, including privacy-enhancing technologies that improve AI trustworthiness. The National Science Foundation (NSF) also launched an initiative to help fund researchers outside the federal government design and plan AI-ready testbeds.
  • Reported results of piloting AI to protect vital government software.  The Department of Defense (DoD) and Department of Homeland Security (DHS) reported findings from their AI pilots to address vulnerabilities in government networks used, respectively, for national security purposes and for civilian government. These steps build on previous work to advance such pilots within 180 days of the Executive Order.
  • Issued a call to action from the Gender Policy Council and Office of Science and Technology Policy to combat image-based sexual abuse, including synthetic content generated by AI. Image-based sexual abuse has emerged as one of the fastest growing harmful uses of AI to-date, and the call to action invites technology companies and other industry stakeholders to curb it. This call flowed from Vice President Harris’s remarks in London before the AI Safety Summit, which underscored that deepfake image-based sexual abuse is an urgent threat that demands global action.

Bringing AI Talent into Government Last year, the Executive Order launched a government-wide AI Talent Surge that is bringing hundreds of AI and AI-enabling professionals into government. Hired individuals are working on critical AI missions, such as informing efforts to use AI for permitting, advising on AI investments across the federal government, and writing policy for the use of AI in government.

  • To increase AI capacity across the federal government for both national security and non-national security missions, the AI Talent Surge has made over 200 hires to-date, including through the Presidential Innovation Fellows AI cohort and the DHS AI Corps .
  • Building on the AI Talent Surge 6-month report , the White House Office of Science and Technology Policy announced new commitments from across the technology ecosystem, including nearly $100 million in funding, to bolster the broader public interest technology ecosystem and build infrastructure for bringing technologists into government service.

Advancing Responsible AI Innovation President Biden’s Executive Order directed further actions to seize AI’s promise and deepen the U.S. lead in AI innovation while ensuring AI’s responsible development and use across our economy and society. Within 270 days, agencies have:

  • Prepared and will soon release a report on the potential benefits, risks, and implications of dual-use foundation models for which the model weights are widely available, including related policy recommendations. The Department of Commerce’s report draws on extensive outreach to experts and stakeholders, including hundreds of public comments submitted on this topic.
  • Awarded over 80 research teams’ access to computational and other AI resources through the National AI Research Resource (NAIRR) pilot —a national infrastructure led by NSF, in partnership with DOE, NIH, and other governmental and nongovernmental partners, that makes available resources to support the nation’s AI research and education community. Supported projects will tackle deepfake detection, advance AI safety, enable next-generation medical diagnoses and further other critical AI priorities.
  • Released a guide for designing safe, secure, and trustworthy AI tools for use in education. The Department of Education’s guide discusses how developers of educational technologies can design AI that benefits students and teachers while advancing equity, civil rights, trust, and transparency. This work builds on the Department’s 2023 report outlining recommendations for the use of AI in teaching and learning.
  • Published guidance on evaluating the eligibility of patent claims involving inventions related to AI technology,  as well as other emerging technologies. The guidance by the U.S. Patent and Trademark Office will guide those inventing in the AI space to protect their AI inventions and assist patent examiners reviewing applications for patents on AI inventions.
  • Issued a report on federal research and development (R&D) to advance trustworthy AI over the past four years. The report by the National Science and Technology Council examines an annual federal AI R&D budget of nearly $3 billion.
  • Launched a $23 million initiative to promote the use of privacy-enhancing technologies to solve real-world problems, including related to AI.  Working with industry and agency partners, NSF will invest through its new Privacy-preserving Data Sharing in Practice program in efforts to apply, mature, and scale privacy-enhancing technologies for specific use cases and establish testbeds to accelerate their adoption.
  • Announced millions of dollars in further investments to advance responsible AI development and use throughout our society. These include $30 million invested through NSF’s Experiential Learning in Emerging and Novel Technologies program—which supports inclusive experiential learning in fields like AI—and $10 million through NSF’s ExpandAI program, which helps build capacity in AI research at minority-serving institutions while fostering the development of a diverse, AI-ready workforce.

Advancing U.S. Leadership Abroad President Biden’s Executive Order emphasized that the United States lead global efforts to unlock AI’s potential and meet its challenges. To advance U.S. leadership on AI, agencies have:

  • Issued a comprehensive plan for U.S. engagement on global AI standards.  The plan, developed by the NIST, incorporates broad public and private-sector input, identifies objectives and priority areas for AI standards work, and lays out actions for U.S. stakeholders including U.S. agencies. NIST and others agencies will report on priority actions in 180 days. 
  • Developed guidance for managing risks to human rights posed by AI. The Department of State’s “Risk Management Profile for AI and Human Rights”—developed in close coordination with NIST and the U.S. Agency for International Development—recommends actions based on the NIST AI Risk Management Framework to governments, the private sector, and civil society worldwide, to identify and manage risks to human rights arising from the design, development, deployment, and use of AI. 
  • Launched a global network of AI Safety Institutes and other government-backed scientific offices to advance AI safety at a technical level. This network will accelerate critical information exchange and drive toward common or compatible safety evaluations and policies.
  • Launched a landmark United Nations General Assembly resolution . The unanimously adopted resolution, with more than 100 co-sponsors, lays out a common vision for countries around the world to promote the safe and secure use of AI to address global challenges.
  • Expanded global support for the U.S.-led Political Declaration on the Responsible Military Use of Artificial Intelligence and Autonomy.   Fifty-five nations now endorse the political declaration, which outlines a set of norms for the responsible development, deployment, and use of military AI capabilities.

The Table below summarizes many of the activities that federal agencies have completed in response to the Executive Order:

data ethics thesis

Stay Connected

We'll be in touch with the latest information on how President Biden and his administration are working for the American people, as well as ways you can get involved and help our country build back better.

Opt in to send and receive text messages from President Biden.

medRxiv

An Automated Pipeline for the Identification of Liver Tissue in Ultrasound Video

  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Eloise S Ockenden
  • ORCID record for Simon Mpooya
  • ORCID record for J. Alison Noble
  • ORCID record for Goylette F Chami
  • For correspondence: [email protected]
  • Info/History
  • Preview PDF

Liver diseases are a leading cause of death worldwide, with an estimated 2 million deaths each year. Causes of liver disease are diffi- cult to ascertain, especially in sub-Saharan Africa where there is a high prevalence of infectious diseases such as hepatitis B and schistosomi- asis, along with alcohol use. Point-of-care ultrasound often is used in low-resource settings for diagnosis of liver disease due to its portabil- ity and low cost. For classification models that can automatically stage liver disease from ultrasound video, the region of interest is liver tissue. A fully-automated pipeline for liver tissue identification in ultrasound video is presented. Ultrasound video data was collected using a low-cost, portable ultrasound machine in rural areas of Uganda. The pipeline first detects the diaphragm in each ultrasound video frame, then segments the diaphragm to ultimately use this segmentation to infer the position of liver tissue in each frame. This pipeline outperforms directly segmenting liver tissue with an intersection over union of 0.83 compared to 0.62. This pipeline also shows improved results with respect to the ease of clinical interpretation and anticipated clinical utility.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

ESO receives a DPhil studentship (2593890) associated with the EPSRC CDT in Health Data Science (EP/S02428X/1). GFC receives funding from the Wellcome Trust Institutional Strategic Support Fund (204826/Z/16/Z) and John Fell Fund as part of the SchistoTrack Project, Robertson Foundation Fellowship, and UKRI EPSRC Award (EP/X021793/1). AN acknowledges EPSRC grants EP/X040186/1, ERC grant PULSE (ERC-2015-AdG-694581) and the NIHR Oxford Biomedical Research Centre Imaging Theme. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Data collection and use were reviewed and approved by Oxford Tropical Research Ethics Committee (OxTREC 509-21), Vector Control Division Research Ethics Committee of the Uganda Ministry of Health (VCDREC146), and Uganda National Council of Science and Technology (UNCST HS 1664ES).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

Data is not publicly available due to data protection and ethics restrictions related to the ongoing nature of the SchistoTrack Cohort and easily identifiable nature of the participants. Code is available from the authors upon request.



Elena Kagan Endorses High Court Ethics Enforcement Mechanism (1)

By Suzanne Monyak and Lydia Wheeler

Justice Elena Kagan proposed that Chief Justice John Roberts appoint a panel of judges to enforce the US Supreme Court’s code of conduct.

Speaking Thursday at a judicial conference in Sacramento, California, Kagan said she trusts Roberts, and that if he created “some sort of committee of highly respected judges with a great deal of experience and a reputation for fairness,” that would seem like a good solution.

The court has been dogged by controversy following reports of lavish vacations, private jet flights, and other gifts received by Justice Clarence Thomas. The justices adopted a code of conduct for the first time in November in response to growing demands for transparency and accountability, but it’s been criticized for lacking an enforcement mechanism.

Kagan, in response to a moderator’s question at the US Court of Appeals for the Ninth Circuit’s annual judicial conference, acknowledged there are difficulties in deciding who should enforce an ethics code for the justices.

“But I feel as though we, however hard it is, that we could and should try to figure out some mechanism for doing this,” she said.


Separate Writings

During a discussion with lawyer Roger Townsend and US Bankruptcy Judge Madeleine Wanslee, Kagan also criticized her colleagues for writing multiple opinions in a single case, saying it complicates matters for lower courts.

“It prevents us, I think, from giving the kind of guidance that lower courts have a right to expect, that the public has a right to expect,” she said.

While there are times when separate writings make sense, Kagan said justices shouldn’t be writing separately just because they would have written the majority decision differently. The court should have a “higher threshold” than that, she said.

Kagan cited the court’s fractured decision in the United States v. Rahimi gun case. The court upheld a federal law that bars people subject to domestic violence restraining orders from possessing a gun in an 8-1 decision in which seven of the nine justices wrote their own opinions, despite there being only one dissent.

The opinions signaled divisions among the justices on how to use history and tradition to analyze the constitutionality of firearm restrictions in the wake of the court’s 2022 decision in New York State Rifle & Pistol Association Inc. v. Bruen. In that case, the justices told lower courts to look at history and tradition when deciding what gun regulations are permissible.

Kagan’s comments follow a momentous Supreme Court term. The justices issued a string of controversial decisions that limited the power of federal regulators, eliminated a federal ban on a gun accessory used in America’s deadliest mass shooting, and gave former President Donald Trump immunity from criminal prosecution for some official acts while in office.

Townsend is a trial attorney with Breskin Johnson & Townsend in Seattle.


IMAGES

  1. The ethics of secondary data analysis
  2. 4. Ethical uses of data
  3. An Introduction to Data Ethics
  4. (PDF) Data ethics
  5. Research Report
  6. Data ethics is more important than ever

VIDEO

  1. Ethics in Research Publication

  2. Types of Research | Objectives of Research | PhD Thesis

  3. Everything about Plagiarism with PYQ #Ph.D. Coursework, #Research and Publication Ethics

  4. Ethical Aspects of Metadata Analysis with SQL (NSA Part 2)

  5. Ethics in Educational Research (Hindi)

  6. Data Ethics and Governance

COMMENTS

  1. PDF Data ethics in the context of data literacy

    5.1 The beginning phase of the analysis. The main research objects of this inductive analysis are data literacy curricula/frameworks that address training on data ethics for undergraduate and graduate students. As shown in Table 1, nine of the 27 collected sources were used for the inductive analysis.

  2. Ethical Considerations in Data Collection and Analysis: a Review

    In an era where data profoundly influences decision-making across various sectors, this comprehensive review critically examines the evolving landscape of data science ethics, particularly ...

  3. PDF Making sense of data ethics. The powers behind the data ethics debate

    A central thesis of this article is that we should perceive data ethics ... data ethics expert committee (2018) and am part of the European Commission's Artificial Intelligence High Level Expert group (2018-2020). I am also the founder of the non-profit ...

  4. PDF Data Ethics and Non-compliance Challenges in Devops

    ... constantly in tune with Data Ethics best practices for different clients in the global marketplace. 1.2 Scope of the thesis: The scope of this thesis is limited to ethics related to the usage of client data entrusted to the care of the DevOps team. More emphasis will be on the use of personal data and the basis of ethical ...

  5. The Ethics of Publicly Available Data Research: A Situated Ethics

    In a systematic review of Reddit studies, Proferes et al. (2021) investigated ethical approaches concerning issues of informed consent, de-identifying data, and dataset sharing. They found that most papers (86%) about Reddit do not explicitly discuss ethical considerations, and if they do, it is often only to state an exemption from ethics review, sought formally or self-decided.

  6. Own Data? Ethical Reflections on Data Ownership

    In discourses on digitization and the data economy, it is often claimed that data subjects shall be owners of their data. In this paper, we provide a problem diagnosis for such calls for data ownership: a large variety of demands are discussed under this heading. It thus becomes challenging to specify what—if anything—unites them. We identify four conceptual dimensions of calls for data ...

  7. What is data ethics?

    In achieving this task, data ethics can build on the foundation provided by computer and information ethics, which has focused for the past 30 years on the main challenges posed by digital technologies [1-3]. This rich legacy is most valuable. It also fruitfully grafts data ethics onto the great tradition of ethics more generally.

  8. Ethics as Methods: Doing Ethics in the Era of Big Data Research

    This is an introduction to the special issue of "Ethics as Methods: Doing Ethics in the Era of Big Data Research." Building on a variety of theoretical paradigms (i.e., critical theory, [new] materialism, feminist ethics, theory of cultural techniques) and frameworks (i.e., contextual integrity, deflationary perspective, ethics of care), the Special Issue contributes specific cases and ...

  9. (PDF) What is data ethics?

    ... ethics that studies and evaluates moral problems related to data (including generation, recording, curation, processing, dissemination, sharing and use), algorithms (including artificial ...

  10. PDF An Introduction to Data Ethics MODULE AUTHOR: Shannon Vallor, Ph.D

    An Introduction to Data Ethics. Module author: Shannon Vallor, Ph.D., William J. Rewak, S.J. Professor of Philosophy, Santa Clara University. Table of contents: Introduction; Part One: What ethically significant harms and benefits can data present? (Case Study 1); Part Two: Common ethical challenges for data practitioners and users (Case Study 2).

  11. An Introduction to Data Ethics: What is the Ethical Use of Data?

    What is Data Ethics? In short, data ethics refers to the principles behind how organizations gather, protect, and use data. It's a field of ethics that focuses on the moral obligations that entities have (or should have) when collecting and disseminating information about us. In a world where data is more valuable and ubiquitous than ever ...

  12. The dilemma and countermeasures of educational data ethics in the age

    However, educational data ethics is an important factor that hinders the application of educational data. ...

  13. The digital traveller: implications for data ethics and data governance

    This paper aims to suggest a framework for ethical data management in tourism and hospitality designed to facilitate and promote effective data governance practices. The paper adopts an organisational and stakeholder perspective through a scoping review of the literature to provide an overview of an under-researched topic and to guide further ...

  14. 5 Principles of Data Ethics for Business

    5 Principles of Data Ethics for Business Professionals. 1. Ownership. The first principle of data ethics is that an individual has ownership over their personal information. Just as it's considered stealing to take an item that doesn't belong to you, it's unlawful and unethical to collect someone's personal data without their consent.

  15. Data ethics: What it means and what it takes

    In this article, we define data ethics as data-related practices that seek to preserve the trust of users, patients, consumers, clients, employees, and partners. Most of the business leaders we spoke to agreed broadly with that definition, but some have tailored it to the needs of their own sectors or organizations (see sidebar, "What is data ...

  16. Data Rights and Responsibilities: A Human Rights Perspective on Data

    Human rights are a shared, internationally recognized framework. The Universal Declaration of Human Rights (UDHR), adopted unanimously by the United Nations in 1948, arose from World War II as a global statement of the dignity of all people and the limitations of governments. Many of the principles articulated in the UDHR and later codified in the core international human rights treaties are ...

  17. Ethics, Privacy and Data Collection: A Complex Intersection

    The technology around us enables incredible abilities such as high-resolution video calls and the ability to stay connected with everyone we care about through social media. This technology also comes with a hidden cost in the form of data collection. This work explores what privacy means and how users understand what data social media companies collect and monetize. This thesis also proposes ...

  18. AI & Data Ethics

    The AI and Data Ethics initiative aims to develop a robust ethics ecosystem for responsible development and use of autonomous, computational, and data-driven systems through foundational research, translational research focused on policy and practice, education and training programs, and public scholarship.

  19. PDF Ethical Choices in Research: Managing Data, Writing Reports, and

    ... debriefing subjects once their data have been collected. You will need to describe how you will ensure the confidentiality of the data after the subjects have gone. Suppose you and some colleagues decide to do research on parents' involvement in their children's lives. You decide to go to a youth soccer ...

  20. PDF Federal Data Strategy Data Ethics Framework

    ... working with data in the public sector should have a foundational understanding of the Data Ethics Tenets. Federal leaders should also foster a data ethics-driven culture and lead by example. The Data Ethics Tenets are: 1 - Uphold applicable statutes, regulations, professional practices, and ethical standards. Existing laws reflect and reinforce ...

  21. Ethical Considerations in Research

    Revised on May 9, 2024. Ethical considerations in research are a set of principles that guide your research designs and practices. Scientists and researchers must always adhere to a certain code of conduct when collecting data from people. The goals of human research often include understanding real-life phenomena, studying effective treatments ...

  22. Research ethics and data protection

    Research ethics and data protection. When planning for the management of data collected from research participants, it is essential to consider issues of research ethics and data protection from the outset, because how you handle the information and consent processes may affect your ability to share data later on.

  23. Thesis Defense: Emilia Emerson

    Thesis Defense. Emilia Emerson: "The Effect of Hydraulic Retention Time on Recoverable Ammonia, Virus-Particle Association, and Bacterial, Fungal, and Viral Populations in a Bench Scale Activated Sludge Municipal Wastewater Treatment System." Thursday, August 8, 2024, 9:00 AM - 10:00 AM EST. Farrall Hall, room 208.

  24. Exploring the role of nicotine and smoking in sleep behaviours: A

    Research has shown bidirectional relationships between smoking and adverse sleep behaviours, including late chronotype and insomnia, but the underlying mechanisms are not understood. One potential driver is nicotine, but its role in sleep is unclear. For this study, we estimated the direct effect of nicotine on six sleep behaviours measured in UK Biobank (chronotype, ease of getting up in the ...

  25. Swimming in Data: How Kate Douglass' statistical studies have fueled

    Douglass co-authored an academic paper, "Swimming in Data," which detailed how these techniques helped her refine her breaststroke, resulting in breaking a 12-year-old American record in the 200-meter breaststroke. "Data analytics has kind of started to become big around swimming," Douglass said.

  26. What Is Qualitative Research? An Overview and Guidelines

    Concluding with a discussion on ethical considerations, from participant recruitment to data stewardship, this guide serves as an essential resource that offers insightful, actionable guidance for conducting effective and impactful qualitative research.

  27. Making sense of data ethics. The powers behind the data ethics debate

    Data ethics has gained traction in policy-making. The article presents an analytical investigation of the different dimensions and actors shaping data ethics in European policy-making. ... A central thesis of this article is that we should perceive data ethics policy initiatives as open-ended spaces of negotiation embedded in complex socio ...

  28. FACT SHEET: Biden-Harris Administration Announces New AI Actions and

    Nine months ago, President Biden issued a landmark Executive Order to ensure that America leads the way in seizing the promise and managing the risks of ...

  29. An Automated Pipeline for the Identification of Liver Tissue in

    Liver diseases are a leading cause of death worldwide, with an estimated 2 million deaths each year. Causes of liver disease are difficult to ascertain, especially in sub-Saharan Africa where there is a high prevalence of infectious diseases such as hepatitis B and schistosomiasis, along with alcohol use. Point-of-care ultrasound is often used in low-resource settings for diagnosis of ...

  30. Elena Kagan Endorses High Court Ethics Enforcement Mechanism (1)

    Justice Elena Kagan proposed Chief Justice John Roberts appoint a panel of judges to enforce the US Supreme Court's code of conduct. While speaking Thursday at a judicial conference in Sacramento, California, Kagan said she trusts Roberts and if he creates "some sort of committee of highly respected judges with a great deal of experience and a reputation for fairness," that seems like a ...