“Don’t Reinvent the Wheel, Just Realign It.”[1] How Lessons from the Belmont Report Can Help Govern the Use of AI in Research
Steven Hammerton
Background
Artificial intelligence (AI) is becoming increasingly integrated into many areas of life, including research. However, legislation and regulation lag behind. Years into the widespread adoption of AI, the United States is still without meaningful guardrails to address the ethical quandaries that stem from the use of AI. Until there is comprehensive legislation, the burden of ensuring the ethical training, development, and use of AI will fall on its developers, deployers, and users, including researchers and research participants. This article explores three ethical issues associated with AI and how principles from the Belmont Report can guide researchers and other users of AI in their pursuit of ethical AI.
The Trouble with AI Training and Development
The Black Box
The way in which AI algorithms are constructed poses one easily identifiable problem for AI developers, deployers, and users: a lack of explainability.
The lack of explainability is a direct consequence of the use of black box algorithms. A black box algorithm can be described as a process or set of rules followed by a computer to perform a specific task, where the logic used to complete that task is hidden or opaque.[2] Reasons for maintaining a black box vary, but they include the proprietary nature of the algorithm and the complexity of the decision-making process.[3] Black box AI models are similar in the sense that they are built through the algorithmic process of machine learning, yet the decision-making process that transforms a user-provided input into a model-generated output is not transparent.[4] Because the black box is not transparent, problems may arise when such models are trained under conditions, or deployed in situations, affected by bias.[5]
For example, Amazon designed an AI model to vet job applicants.[6] The model was trained using the resumes of individuals working at Amazon at the time, which were used to identify characteristics that would be present in a successful candidate. The result? A sexist AI model. Because Amazon’s workforce was predominantly male, the model presumably made a negative association between references to femininity–even the use of the word woman–and successful candidacy.[7] While software engineers at Amazon caught this biased behavior before the system was adopted beyond the testing phase, the episode highlights the real-world danger of the black box.[8]
A New (Digital) Colonial Era
The conversation surrounding AI and labor often revolves around job eradication. However, the current reality is that the AI boom has created many jobs, but those jobs generally lack fair wages and worker rights.[9] The exploitative practices used to procure the talent needed to train AI models represent a new age of colonialism, characterized by US-based AI outsourcing firms targeting economic hardship in developing nations in the Global South.[10] For example, OpenAI, the developer of ChatGPT, paid Kenyan workers less than $2 per hour to view and label extremely graphic content to help make ChatGPT more user friendly.[11] OpenAI is not the only company to use international labor. The AI outsourcing firm Scale has likewise been accused of underpaying, and even withholding payments from, workers in the Philippines.[12] Filipino workers have reported being paid less than promised and being removed from Scale’s remote work platform without payment, and without any explanation from supervisors, despite completing their tasks. In other instances, companies have taken advantage of economic catastrophes to form a cheap labor force to refine the world’s most advanced generative AI models.[13] Venezuela, currently experiencing the worst peacetime economic collapse in the last forty-five years,[14] is one of those sources of labor.[15] The economic conditions in Venezuela have created a significant imbalance in bargaining power, leaving workers subject to the wills of outsourcing companies and leading to the exploitation of well-educated professionals and unskilled workers alike.[16] Meanwhile, the AI “gold rush” has allowed companies like OpenAI to increase their value by billions of dollars.[17]
Privacy
The black box and the way AI models are trained pose ethical dilemmas, and so do the methods used to obtain the data needed to train AI. In many cases, AI models are trained using data available on the internet, which may include personal data. To acquire the necessary training data for AI models, web scrapers are deployed to harvest vast datasets from all corners of the internet.[18] This process is conducted indiscriminately, often involving personal information and content, without the consent of the individuals who created the data.[19]
Amongst the most widespread examples of privacy invasions committed in the pursuit of AI is the development and deployment of Clearview AI’s facial recognition model.[20] The model was trained using billions of images of faces, all of which were collected from social media sites using a web scraper.[21] Clearview AI’s facial recognition model is now sold to law enforcement and private companies, compounding the privacy intrusion by moving from unauthorized data collection to active surveillance.
The failures of AI models, like the one developed by Clearview, can lead to invasions of both privacy and personal autonomy, especially for historically marginalized groups. For example, the historic and ongoing underrepresentation and exclusion of minorities has resulted in AI performing poorly on facial recognition tasks when provided with images of minorities.[22] Specifically, AI systems exhibit the highest rates of misclassification when attempting to identify darker-skinned females.[23] These misclassifications have had serious ramifications for misidentified individuals. For instance, the Detroit Police Department used facial recognition to falsely arrest a Black woman, Porcha Woodruff, for carjacking and robbery.[24] As a result of the misidentification, Woodruff, eight months pregnant at the time, spent eleven hours in jail before being released on $100,000 bail.[25] While it was a traumatic event for Woodruff, it is far from the only example of AI exhibiting and contributing to systemic biases perpetrated against marginalized communities.[26]
Guidance from Belmont
Given the global and exploitative pitfalls of AI development, users and institutions are left asking a critical question: how can we engage with this technology ethically? While the digital landscape of AI may seem novel, we can find powerful guidance in the Belmont Report, a foundational document in human research ethics.[27]
The Belmont Report was commissioned by the U.S. government in the 1970s in response to atrocities committed by Nazi scientists during World War Two and revelations about the Tuskegee Syphilis Study.[28] After four years of deliberation, the authors established three clear and enduring principles: (1) respect for persons; (2) beneficence; and (3) justice.[29]
Although AI might seem removed from the biomedical and behavioral research contexts to which Belmont was meant to apply, Belmont’s principles are directly applicable to the challenges of AI. The creation and training of AI models is, in effect, an experiment in which all of humanity participates, often unknowingly. As generative AI’s role in research and society expands, the Belmont Principles offer an essential framework to steer its development in a more ethical direction, ensuring that technological advancement does not come at the cost of human dignity and rights.
Beneficence
Beneficence means treating persons “in an ethical manner not only by respecting their decisions and protecting them from harm, but also by making efforts to secure their well-being.”[30] In short, beneficence promotes the weighing of benefits and harms and the reduction of harm. In the context of the highlighted pitfalls of AI, this principle can be applied to the issue of the black box.
A researcher using an AI model with a black box algorithm will be unable to view the logical processes the model may use to draw discriminatory classifications about research participants or broader populations. Furthermore, fairwashing techniques, which dress a biased model's outputs in misleadingly fair-seeming explanations, may be employed to further justify and obfuscate discriminatory conclusions drawn by AI models like the one deployed by Amazon and described above. Therefore, researchers and those who review research projects should pay particular attention to their ability to ascertain and understand the processes an AI model uses to turn queries and data into the outputs used to answer research questions. Without the ability to see how an AI model arrives at a conclusion, a researcher will be unable to perform an accurate risk-benefit analysis, which runs counter to Belmont’s instruction that researchers and their institutions are “obliged to give forethought to the maximization of benefits and the reduction of risk that might occur from the research investigation.”[31]
Practically speaking, to work toward incorporating beneficence into research using AI, researchers can conduct bias auditing exercises, strive for inclusive data, and consider using explainable AI models.
To conduct a bias audit, researchers should critically evaluate any conclusions drawn by AI, as well as conclusions drawn based upon AI outputs. Additionally, reviewers may evaluate the AI model itself to determine whether it can be implemented in a way that minimizes harm to research participants or the public at large. By auditing the AI platforms they employ, researchers can surface hidden biases that may jeopardize the quality of their research or unseen partialities that could later be used to justify real-world discrimination.
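One common form of bias audit is to compare a model's error rates across demographic groups. The following is a minimal sketch of that idea, assuming the researcher already has each record's ground-truth label, the model's prediction, and a demographic attribute; the column names and toy data are purely illustrative, not drawn from any particular AI platform.

```python
# Minimal bias-audit sketch: compare misclassification rates across groups.
# Column names ("group", "label", "prediction") and the toy data are
# illustrative assumptions.
import pandas as pd

def error_rates_by_group(df: pd.DataFrame, group_col: str,
                         label_col: str, pred_col: str) -> pd.DataFrame:
    """Return each group's misclassification rate and sample size."""
    errors = (df[label_col] != df[pred_col]).astype(int)
    summary = df.assign(error=errors).groupby(group_col)["error"].agg(
        error_rate="mean", n="count")
    return summary.sort_values("error_rate", ascending=False)

# Toy example; a real audit would use the study's own records.
records = pd.DataFrame({
    "group":      ["A", "A", "A", "B", "B", "B"],
    "label":      [1, 0, 1, 1, 1, 0],
    "prediction": [1, 0, 1, 0, 0, 0],
})
print(error_rates_by_group(records, "group", "label", "prediction"))
```

A large gap in error rates between groups does not by itself prove discrimination, but it flags where a closer qualitative review of the model and its training data is warranted.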
Striving to collect inclusive data may also help researchers ensure that the principle of beneficence is upheld. As the case of Amazon’s AI recruitment tool shows, underinclusive data can lead AI algorithms to draw erroneous conclusions and create misleading impressions. To mitigate this risk, researchers should keep in mind that the usefulness and accuracy of an AI model are only as good as the data that model is provided. Additionally, researchers should strive to produce data that is representative of the entire population so that future harm arising from underinclusive AI training data sets is mitigated.
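A simple first check on inclusivity is to compare the demographic composition of a dataset against reference population shares. The sketch below assumes such reference shares are available; the group labels and figures are placeholders rather than real census values.

```python
# Minimal representativeness check: each group's share of the dataset minus
# its share of a reference population. Negative gaps flag underrepresentation.
# The group labels and reference shares below are hypothetical placeholders.
from collections import Counter

def representation_gaps(sample_groups, population_shares):
    counts = Counter(sample_groups)
    total = sum(counts.values())
    return {group: counts.get(group, 0) / total - share
            for group, share in population_shares.items()}

sample = ["A"] * 70 + ["B"] * 25 + ["C"] * 5     # hypothetical dataset labels
reference = {"A": 0.50, "B": 0.30, "C": 0.20}    # hypothetical population shares
print(representation_gaps(sample, reference))     # gaps near {'A': +0.20, 'B': -0.05, 'C': -0.15}
```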
Lastly, explainable AI models do exist,[32] and some regulations even require AI to be explainable.[33] Researchers should consider prioritizing the use of explainable AI models to aid in their ability to conduct bias audits and to verify the inclusivity of AI model outputs. Using explainable models may even bolster the credibility of research findings, because explainable outputs are easier to reproduce and review.
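To illustrate what an explainable alternative can look like, the sketch below trains a shallow decision tree, a model whose full decision logic can be printed and reviewed, on one of scikit-learn's bundled datasets. The dataset and hyperparameters are chosen only for illustration and are not drawn from the article.

```python
# Minimal sketch of an explainable ("glass box") model: a shallow decision tree
# whose decision rules can be printed and audited, unlike a black box model.
# The bundled breast cancer dataset is used purely for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(data.data, data.target)

# Every rule the model applies is visible, which supports bias audits,
# peer review, and reproduction of results.
print(export_text(model, feature_names=list(data.feature_names)))
```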
In sum, researchers should avoid black box AI models whenever possible. In cases where the benefit derived from the use of a black box model is substantial, researchers should evaluate the potential harm that would arise from its use by auditing AI-drawn conclusions for bias and inquiring about the inclusivity of the data set used to train that model.
Justice
Belmont frames justice as the fair distribution of the benefits and burdens of research. In the context of AI, this principle could be honored by scrutinizing the sources used to construct the AI models employed in research and by ensuring that the often-marginalized data labelers who underpin the success of those models receive fair treatment.
Indeed, Belmont’s principle of justice directly cautions against selecting research subjects based on their “easy availability, their compromised position, or their manipulability, rather than for reasons directly related to the problem being studied.” Contrary to Belmont’s warning, companies, as noted above, systematically recruit workers in struggling economies to perform AI-related tasks. These workers are not selected for their relatedness to AI research and development but instead for their compromised positions. This practice runs counter to Belmont’s ethical mandate, as it leverages the compromised positions of entire populations for reasons entirely unrelated to the advancement of AI.
To incorporate justice into AI, researchers could consider the conditions under which an AI model was trained before using it in research. For example, a researcher or institutional review board could implement a review process to determine whether an AI model was trained under unfair conditions, whether use of that model in research would continue to exacerbate those conditions, and whether the benefit derived from the use of that specific model substantially outweighs the harm, or potential harm, associated with its development. Whenever possible, researchers and institutions should strive to ensure that the AI models they use were produced in an ethical manner.
In sum, there is a human cost underlying many generative AI models, and that cost should be weighed when assessing the ethics of using AI in research.
Respect for Persons
Respect for persons is built on two core ethical convictions: first, “that individuals should be treated as autonomous agents,”[34] and second, “that persons with diminished autonomy are entitled to protection.”[35] An autonomous person is “an individual capable of deliberation about personal goals and of acting under the direction of such deliberation.” This principle is challenged at every stage of the AI life cycle, from data collection to deployment.
AI models are buttressed by immense amounts of human-generated data and therefore can be viewed as a global human subjects research project. In this light, available AI models would likely fail Belmont’s standard that “subjects enter into the research voluntarily and with adequate information.”[36] For instance, automated web scrapers collect textual and visual data without any element of consent from the data’s creators. This practice treats individuals not as autonomous agents with agency over their data and likenesses, but as freely available data points. Without the opportunity to make a considered decision about participating in the advancement of AI, persons cannot be respected in the pursuit of stronger AI models.
Furthermore, the deployment of AI technology can actively undermine individual autonomy, particularly when the systems exhibit bias. The wrongful arrest of Porcha Woodruff, a case of mistaken identity by a biased facial recognition system discussed above, exemplifies how flawed AI can deprive individuals of their freedom through unjust imprisonment and surveillance. As Belmont states, “[t]o respect autonomy is to give weight to autonomous persons’ considered opinions and choices while refraining from obstructing their actions unless they are clearly detrimental to others.”[37] Further, “[t]o show lack of respect for an autonomous agent is to repudiate that person’s considered judgments, to deny an individual the freedom to act on those considered judgments, or to withhold information necessary to make a considered judgment, when there are no compelling reasons to do so.”[38] In cases of AI-assisted misidentification, an individual’s autonomy is severely obstructed, leaving them with little to no agency.
To counteract these ethical breaches, AI researchers and developers can take practical steps to uphold the principle of respect for persons. First, they can commit to using AI models trained exclusively on consensually collected data, upholding individual autonomy. Second, to combat the kind of algorithmic bias that affected Porcha Woodruff, researchers must strive to create datasets that are truly representative of the entire population, not just a majority group. Doing so would uphold collective autonomy. By actively working to build more inclusive technology, the AI community can begin to unwind historical exclusivity in research and build systems that respect the autonomy of all persons.
Conclusion
The issues facing researchers and their use of AI are complex and difficult to navigate. At times it seems that entirely new systems of governance may be needed to steer emerging technologies like AI. To be clear, the pursuit of innovation is fundamental to the progress of humanity, and that innovation often solves global challenges and improves lives. However, history and recent trends caution that progress without clear, principled guardrails leads to exploitation. Researchers share in the collective responsibility to govern their use of AI and promote fairness in AI systems through the production of data reflective of reality. Crafting effective AI governance is complex, but the ethical principles of beneficence, respect for persons, and justice offer a clear starting point. These values should not be read as limitations to innovation, but instead as keys to a future where technology is sustainable and equitable. Innovation in the absence of human dignity is exploitation.
[1] Anthony J. D’Angelo.
[2] Tom Cassauwers, Opening the ‘Black Box’ of Artificial Intelligence, Horizon (Dec. 1, 2020), https://projects.research-and-innovation.ec.europa.eu/en/horizon-magazine/opening-black-box-artificial-intelligence.
[3] Cynthia Rudin & Joanna Radin, Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson From an Explainable AI Competition, Harv. Data Sci. Rev. (Nov. 22, 2019), https://hdsr.mitpress.mit.edu/pub/f9kuryi8/release/8.
[4] Ellen Glover & Abel Rodriguez, What Is Black Box AI?, Built In (Jun. 30, 2025), https://builtin.com/articles/black-box-ai.
[5] Lou Blouin, Can We Make Artificial Intelligence More Ethical?, Univ. Mich. Dearborn (Jun. 14, 2021), https://umdearborn.edu/news/can-we-make-artificial-intelligence-more-ethical.
[6] Jeffrey Dastin, Insight – Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women, Reuters (Oct. 10, 2018, 8:50 PM), https://www.reuters.com/article/world/insight-amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK0AG/.
[7] Id.
[8] Id.
[9] See Jonas Valente & Mark Graham, Fairwork Cloudwork Ratings 2023: Work in the Planetary Labour Market, Oxford Internet Inst. (July 17, 2023), https://fair.work/wp-content/uploads/sites/17/2023/07/Fairwork-Cloudwork-Ratings-2023-Red.pdf (discussing studies from the University of Oxford that found digital work platforms, including the ones used to curate talent for training AI, created unfair working conditions for their workers).
[10] Rebecca Tan & Regine Cabato, Behind the AI Boom, an Army of Overseas Workers in ‘Digital Sweatshops’, Wash. Post (Aug. 28, 2023), https://www.washingtonpost.com/world/2023/08/28/scale-ai-remotasks-philippines-artificial-intelligence/. The “Global South” is a term used to “denote regions outside Europe and North America, mostly (though not all) low-income and often politically or culturally marginalized.” Nour Dados & Raewyn Connell, The Global South, Jargon (Feb. 14, 2012), https://journals.sagepub.com/doi/pdf/10.1177/1536504212436479/.
[11] Billy Perrigo, OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic, Time (Jan. 18, 2023), https://time.com/6247678/openai-chatgpt-kenya-workers/.
[12] Tan, supra note 10.
[13] Karen Hao & Andrea Hernández, How the AI Industry Profits From Catastrophe, MIT Tech. Rev. (Apr. 20, 2022), https://www.technologyreview.com/2022/04/20/1050392/ai-industry-appen-scale-data-labels/.
[14] Vanessa Buschschlüter, Venezuela Crisis in Brief, BBC (Aug. 5, 2024), https://www.bbc.com/news/world-latin-america-48121148.
[15] Hao, supra note 13.
[16] Id.
[17] Phoebe Liu, The Billionaires Getting Rich From AI 2024, Forbes (Apr. 2, 2024), https://www.forbes.com/sites/phoebeliu/2024/04/02/the-billionaires-getting-rich-from-ai-2024.
[18] See Müge Fazlioglu, Training AI on Personal Data Scraped From the Web, IAPP (Nov. 8, 2023), https://iapp.org/news/a/training-ai-on-personal-data-scraped-from-the-web.
[19] Kashmir Hill, Clearview AI Used Your Face. Now You May Get a Stake in the Company, N.Y. Times (June 13, 2024), https://www.nytimes.com/2024/06/13/business/clearview-ai-facial-recognition-settlement.html.
[20] Id.
[21] See id.
[22] Joy Buolamwini & Timnit Gebru, Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, Proceedings of Machine Learning Rsch. (2018), https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf.
[23] Id.
[24] Joey Cappelletti, Pregnant Woman’s Arrest in Carjacking Case Spurs Call to End Detroit Police Facial Recognition, AP News, https://apnews.com/article/detroit-police-facial-recognition-lawsuit-cab0ae44c1671fc30617d301b21b2d13 (last updated Aug. 7, 2023).
[25] Christina Swarns, When Artificial Intelligence Gets It Wrong, Innocence Project (Sep. 19, 2023), https://innocenceproject.org/when-artificial-intelligence-gets-it-wrong/.
[26] See Thaddeus L. Johnson & Natasha N. Johnson, Police Facial Recognition Technology Can’t Tell Black People Apart, Sci. Am. (May 18, 2023), https://www.scientificamerican.com/article/police-facial-recognition-technology-cant-tell-black-people-apart/.
[27] See The Belmont Report, 44 Fed. Reg. 23,192 (Apr. 18, 1979), available at https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/read-the-belmont-report/index.html [hereinafter Belmont].
[28] Kailee K. Muscente, Ethics and the IRB: The History of the Belmont Report, Teachers College, Columbia Univ. (Aug. 3, 2020), https://www.tc.columbia.edu/institutional-review-board/irb-blog/2020/the-history-of-the-belmont-report/.
[29] Hiroyuki Nagai et al., The Creation of the Belmont Report and Its Effect on Ethical Principles: A Historical Study, Monash Bioeth. Rev. (Nov. 10, 2022), https://pmc.ncbi.nlm.nih.gov/articles/PMC9700634/.
[30] Belmont, supra note 27.
[31] Id.
[32] What Is Explainable AI?, IBM, https://www.ibm.com/topics/explainable-ai (last visited Oct. 13, 2025).
[33] See Artificial Intelligence Act No. 2024/1689 of 13 June 2024, art. 50, 2024 O.J. (206 COD), https://artificialintelligenceact.eu/article/50/.
[34] Belmont, supra note 27.
[35] Id.
[36] Id.
[37] Id.
[38] Id.
