AI deception: A survey of examples, risks, and potential solutions
Journal article
Park, Peter S., Goldstein, Simon, O'Gara, Aidan, Chen, Michael and Hendrycks, Dan. (2024). AI deception: A survey of examples, risks, and potential solutions. Patterns. 5(5), pp. 1-16. https://doi.org/10.1016/j.patter.2024.100988
Authors | Park, Peter S., Goldstein, Simon, O'Gara, Aidan, Chen, Michael and Hendrycks, Dan |
---|---|
Abstract | AI systems are already capable of deceiving humans. Deception is the systematic inducement of false beliefs in others to accomplish some outcome other than the truth. Large language models and other AI systems have already learned, from their training, the ability to deceive via techniques such as manipulation, sycophancy, and cheating the safety test. AI’s increasing capabilities at deception pose serious risks, ranging from short-term risks, such as fraud and election tampering, to long-term risks, such as losing control of AI systems. Proactive solutions are needed, such as regulatory frameworks to assess AI deception risks, laws requiring transparency about AI interactions, and further research into detecting and preventing AI deception. Proactively addressing the problem of AI deception is crucial to ensure that AI acts as a beneficial technology that augments rather than destabilizes human knowledge, discourse, and institutions. |
Year | 01 Jan 2024 |
Journal | Patterns |
Journal citation | 5 (5), pp. 1-16 |
Publisher | Cell Press |
ISSN | 2666-3899 |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.patter.2024.100988 |
Web address (URL) | https://www.sciencedirect.com/science/article/pii/S266638992400103X?via%3Dihub |
Open access | Published as ‘gold’ (paid) open access |
Research or scholarly | Research |
Page range | 1-16 |
Publisher's version | License: CC BY-NC-ND 4.0. File access level: Open |
Output status | Published |
Publication dates | |
Online | 10 May 2024 |
Publication process dates | |
Deposited | 10 Oct 2024 |
Additional information | © 2024 The Authors. Published by Elsevier Inc. |
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). | |
Supplementary material available at: https://www.sciencedirect.com/science/article/pii/S266638992400103X?... | |
Place of publication | United States |
https://acuresearchbank.acu.edu.au/item/9100x/ai-deception-a-survey-of-examples-risks-and-potential-solutions
Download files
Publisher's version
OA_Goldstein_2024_AI_deception_A_survey_of_examples.pdf | |
License: CC BY-NC-ND 4.0 | |
File access level: Open |