🤖 AI Summary
It remains unclear whether laypeople can reliably distinguish legal advice generated by large language models (LLMs) from advice provided by licensed lawyers, and how source attribution affects their evaluation and adoption of such advice.
Method: We conducted three randomized experiments (total N = 288) in which the source of the legal advice (LLM or human lawyer) was either disclosed or withheld, manipulating participants' awareness of the advice's origin.
Contribution/Results: When the source of the advice was withheld, participants were significantly more willing to act on LLM-generated advice than on advice written by a human lawyer (p < 0.001). Yet when asked to identify the source, participants distinguished LLM- from lawyer-generated advice with 68% accuracy, significantly above the 50% chance level (p < 0.01). This reveals a "high discrimination, low source trust" paradox in legal contexts: people can identify LLM output, yet become more reluctant to act on it once its origin is known. These findings challenge the intuitive assumption that LLM advice is *under*-valued because of source ambiguity; instead, they suggest that disclosing the source can reduce the advice's perceived credibility even though people discriminate between sources reasonably well. The results provide behavioral evidence for assessing the trustworthiness of AI legal services and for designing human-AI collaborative governance frameworks.
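For intuition about the above-chance claim, the sketch below runs a one-sided binomial test against the 50% guess rate. This is an illustration, not the paper's analysis: the trial count `n_trials` is a hypothetical placeholder, chosen only so that 68% accuracy corresponds to a whole number of correct responses.

```python
# Illustrative sketch of an above-chance discrimination test (not the
# paper's actual analysis). The trial count is a hypothetical placeholder;
# only the 68% accuracy and the 50% chance level come from the summary.
from scipy.stats import binomtest

n_trials = 100    # hypothetical number of LLM-vs-lawyer judgments
n_correct = 68    # 68% accuracy, as reported above
chance = 0.5      # guess rate for a two-alternative source judgment

result = binomtest(n_correct, n_trials, p=chance, alternative="greater")
print(f"accuracy = {n_correct / n_trials:.0%}, one-sided p = {result.pvalue:.2g}")
```

With these placeholder numbers the test rejects chance-level guessing well below the reported p < 0.01 threshold; the actual significance in the paper depends on its real trial counts and analysis.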
📝 Abstract
Large Language Models (LLMs) are seemingly infiltrating every domain, and the legal context is no exception. In this paper, we present the results of three experiments (total N = 288) that investigated lay people's willingness to act upon, and their ability to discriminate between, LLM- and lawyer-generated legal advice. In Experiment 1, participants judged their willingness to act on legal advice when the source of the advice was either known or unknown. When the advice source was unknown, participants indicated that they were significantly more willing to act on the LLM-generated advice. The result of the source-unknown condition was replicated in Experiment 2. Intriguingly, despite indicating higher willingness to act on LLM-generated advice in Experiments 1 and 2, participants in Experiment 3 discriminated between the LLM- and lawyer-generated texts significantly above chance level. Finally, we discuss potential explanations for and risks of our findings, as well as limitations and future work.