  • A theory of appropriateness with applications to generative artificial intelligence

    Item Type Preprint
    Author Joel Z. Leibo
    Author Alexander Sasha Vezhnevets
    Author Manfred Diaz
    Author John P. Agapiou
    Author William A. Cunningham
    Author Peter Sunehag
    Author Julia Haas
    Author Raphael Koster
    Author Edgar A. Duéñez-Guzmán
    Author William S. Isaac
    Author Georgios Piliouras
    Author Stanley M. Bileschi
    Author Iyad Rahwan
    Author Simon Osindero
    Abstract What is appropriateness? Humans navigate a multi-scale mosaic of interlocking notions of what is appropriate for different situations. We act one way with our friends, another with our family, and yet another in the office. Likewise for AI, appropriate behavior for a comedy-writing assistant is not the same as appropriate behavior for a customer-service representative. What determines which actions are appropriate in which contexts? And what causes these standards to change over time? Since all judgments of AI appropriateness are ultimately made by humans, we need to understand how appropriateness guides human decision making in order to properly evaluate AI decision making and improve it. This paper presents a theory of appropriateness: how it functions in human society, how it may be implemented in the brain, and what it means for responsible deployment of generative AI technology.
    Date 2024-12-26
    Library Catalog arXiv.org
    URL http://arxiv.org/abs/2412.19010
    Accessed 1/3/2025, 9:48:40 AM
    Extra arXiv:2412.19010 [cs]
    DOI 10.48550/arXiv.2412.19010
    Repository arXiv
    Archive ID arXiv:2412.19010
    Date Added 1/3/2025, 9:48:40 AM
    Modified 1/3/2025, 9:48:40 AM

    Tags:

    • Computer Science - Artificial Intelligence

    Notes:

    • Comment: 115 pages, 2 figures

    Attachments

    • Preprint PDF
    • Snapshot
  • Desire-Fulfilment and Consciousness - Andreas Mogensen

    Item Type Blog Post
    Author Christian Panzer
    Abstract I show that there are good reasons to think that some individuals without any capacity for consciousness should be counted as welfare subjects, assuming that desire-fulfilment is a welfare good and that any individuals who can accrue welfare goods are welfare subjects. While other philosophers have argued for similar conclusions, I show that they have done so by relying on a simplistic understanding of the desire-fulfilment theory. My argument is intended to be sensitive to the complexities and nuances of contemporary developments of the theory, while avoiding highly counter-intuitive implications of previous arguments for the same conclusion.
    Date 2024-10-28T14:33:10+00:00
    Language en-GB
    URL https://globalprioritiesinstitute.org/desire-fulfilment-and-consciousness-andreas-mogensen/
    Accessed 1/26/2025, 10:48:08 AM
    Blog Title Global Priorities Institute
    Date Added 1/26/2025, 10:48:08 AM
    Modified 1/26/2025, 10:48:08 AM

    Attachments

    • PDF
    • Snapshot
  • How to tell if a rule was broken: The role of codification, norms, morality, and legitimacy

    Item Type Preprint
    Author Jordan Wylie
    Author Dries H. Bostyn
    Author Ana P. Gantman
    Abstract Rules are essential for the successful coordination of large-scale societies, with official, codified rules (e.g., laws) proscribing behaviors for everyone in their jurisdiction. These rules ostensibly provide a clear signal about what is permitted or prohibited, making it straightforward to identify when they have been broken. However, signals from descriptive norms, moral prohibition, and (lack of) legitimacy of enforcement can sometimes provide conflicting accounts of what behaviors really violate rules, possibly shaping whether someone thinks a rule has been broken at all. Across three experiments (N = 2,262), we explored how each of these signals affects rule concept judgments. In Study 1, we used a variety of real rules in the US and found that all four signals (descriptive norms, codification, moral wrongness, and legitimacy of punishment) are associated with judgments of whether a rule was broken, but to varying degrees. Study 2 replicated these findings in a preregistered study. Study 3 experimentally manipulated these four signals in a novel context using a conjoint design. We found that codification and moral wrongness most strongly influence rule concepts. Together, these findings suggest that judgments about what constitutes a rule are shaped by the integration of multiple distinct signals, including but not limited to the literal codification of the rule itself.
    Date 2024-12-27
    Language en-US
    Short Title How to tell if a rule was broken
    Library Catalog OSF Preprints
    URL https://osf.io/mvkwy
    Accessed 1/3/2025, 9:47:47 AM
    DOI 10.31234/osf.io/mvkwy
    Repository OSF
    Date Added 1/3/2025, 9:47:47 AM
    Modified 1/3/2025, 9:47:53 AM

    Attachments

    • OSF Preprint
  • Imperfect Recall and AI Delegation - Eric Olav Chen, Alexis Ghersengorin and Sami Petersen

    Item Type Blog Post
    Author Christian Panzer
    Abstract A principal wants to deploy an artificial intelligence (AI) system to perform some task. But the AI may be misaligned and aim to pursue a conflicting objective. The principal cannot restrict its options or deliver punishments. Instead, the principal is endowed with the ability to impose imperfect recall on the agent. The principal can then simulate the task and obscure whether it is real or part of a test. This allows the principal to screen misaligned AIs during testing and discipline their behaviour in deployment. By increasing the number of tests, the principal can screen arbitrarily well and may even discipline perfectly in finite time. We show that, in equilibrium, screening can only be achieved with imperfect recall. The perfect screening result is robust to the agent observing any amount of noisy information revealing the nature of the task.
    Date 2024-11-28T08:49:43+00:00
    Language en-GB
    URL https://globalprioritiesinstitute.org/imperfect-recall-and-ai-delegation-chen-ghersengorin-and-petersen/
    Accessed 1/26/2025, 10:48:32 AM
    Blog Title Global Priorities Institute
    Date Added 1/26/2025, 10:48:32 AM
    Modified 1/26/2025, 10:48:32 AM

    Attachments

    • PDF
    • Snapshot
  • Who Does the Giant Number Pile Like Best: Analyzing Fairness in Hiring Contexts

    Item Type Preprint
    Author Preethi Seshadri
    Author Seraphina Goldfarb-Tarrant
    Abstract Large language models (LLMs) are increasingly being deployed in high-stakes applications like hiring, yet their potential for unfair decision-making and outcomes remains understudied, particularly in generative settings. In this work, we examine the fairness of LLM-based hiring systems through two real-world tasks: resume summarization and retrieval. By constructing a synthetic resume dataset and curating job postings, we investigate whether model behavior differs across demographic groups and is sensitive to demographic perturbations. Our findings reveal that race-based differences appear in approximately 10% of generated summaries, while gender-based differences occur in only 1%. In the retrieval setting, all evaluated models display non-uniform selection patterns across demographic groups and exhibit high sensitivity to both gender and race-based perturbations. Surprisingly, retrieval models demonstrate comparable sensitivity to non-demographic changes, suggesting that fairness issues may stem, in part, from general brittleness issues. Overall, our results indicate that LLM-based hiring systems, especially at the retrieval stage, can exhibit notable biases that lead to discriminatory outcomes in real-world contexts.
    Date 2025-01-08
    Short Title Who Does the Giant Number Pile Like Best
    Library Catalog arXiv.org
    URL http://arxiv.org/abs/2501.04316
    Accessed 1/26/2025, 10:47:59 AM
    Extra arXiv:2501.04316 [cs]
    DOI 10.48550/arXiv.2501.04316
    Repository arXiv
    Archive ID arXiv:2501.04316
    Date Added 1/26/2025, 10:47:59 AM
    Modified 1/26/2025, 10:47:59 AM

    Tags:

    • Computer Science - Computation and Language

    Attachments

    • Preprint PDF
    • Snapshot