Artificial intelligence (AI) refers to the application of computers to tasks for which human intelligence was traditionally required. The definition, scope, and usage of the term have changed and evolved since it was first coined.
Computers have some distinct advantages over humans in their capacity to process large amounts of data, but can they be relied on to synthesise data rationally and make ethical decisions or human-like judgements?
OpenAI’s ChatGPT, Google’s Bard, and Meta’s LLaMA are examples of an AI-based technology called ‘large language models’, or LLMs. LLMs ‘learn’ the relationships between words by processing huge amounts of text.
An LLM can parse and formulate linguistic constructs such as nouns and verbs, sentence formation, and how related phrases span paragraphs. This learning is ultimately based on complex statistics about word relationships and occurrence frequencies.
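As a toy illustration of word-frequency statistics at work (real LLMs use neural networks trained on vastly larger corpora, not simple counts), a program can predict a likely next word purely from how often words follow one another:

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of 'word', if any."""
    if word in follows:
        return follows[word].most_common(1)[0][0]
    return None

corpus = "the court held that the contract was void and the court ordered costs"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # 'court' follows 'the' most often in this corpus
```

The sketch captures the essential point: the prediction reflects nothing but patterns in the training text, with no understanding of what a court or a contract is.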
The edge of tomorrow
This learning is achieved through a number of mechanisms: first, through the equivalent of book learning, by giving ChatGPT 300 billion words of documents, books, and websites to read. Subsequently, human feedback is used to reinforce preferred responses from ChatGPT by surveying people (significantly, not subject-matter experts) on the best answer from a selection of alternatives. The system then leans towards the preferred style of response. Understanding this highlights the potential for bias in the system.
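A greatly simplified sketch of that feedback loop is below. Real systems train a reward model and fine-tune the LLM with reinforcement learning; the style labels and functions here are purely illustrative:

```python
from collections import Counter

def update_preferences(weights, survey_votes):
    """Increase the weight of each response style in proportion to rater votes."""
    new = dict(weights)
    for style, votes in Counter(survey_votes).items():
        new[style] = new.get(style, 0) + votes
    return new

def preferred_style(weights):
    """The style the system now favours when generating answers."""
    return max(weights, key=weights.get)

# All styles start equally weighted; raters then vote on sample answers
weights = {"formal": 1, "conversational": 1, "terse": 1}
weights = update_preferences(weights, ["conversational", "conversational", "formal"])
print(preferred_style(weights))  # 'conversational'
```

Even in this toy form, the source of potential bias is visible: whatever the particular group of raters happens to prefer is what gets reinforced.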
Media coverage of ChatGPT has led to hype about its capabilities, such as writing a legal contract or composing a sonnet about climate change. Most lawyers are familiar with PowerPoint’s capability to automatically change a presentation’s style format. For some time, email and search engines have been able to suggest what you might want to type and to ‘auto-complete’ a phrase. When an internet search, autocomplete, and style are integrated, it is impressive – but it’s not intelligent. As Meta’s chief AI scientist Yann LeCun commented: “It’s nothing revolutionary, although that’s the way it’s perceived by the public. It’s just that, you know, it’s well put together, it’s nicely done” (ZDNet.com, 23 January 2023).
ChatGPT uses grammar rules to structure a sonnet. It has a vast library of information to draw from but, just as a computer that can beat a chess grandmaster cannot think for itself, it cannot extrapolate ideas beyond the tasks it has been trained to complete.
Chess and sonnets are examples of highly structured activities with clear rules. With more subtle tasks or more obscure topics, filling in the gaps is a risk. LLMs like ChatGPT are prone to what is called ‘hallucination’, resulting in assertions that can be wildly incorrect – for example, that infant digestion of breast milk is improved by adding porcelain chips, as reported in a recent Financial Times article.
The gods themselves
Legal professionals are trusted advisors whose work product is relied upon in situations where there may be significant associated risk to clients and society. This is why the profession is regulated.
ChatGPT is a commercial product, based in another jurisdiction, and the output cannot be relied upon with respect to compliance with the GDPR. It does not provide any guarantees about reliability of facts or error-free work product. Whereas with a traditional search via Google or LexisNexis you can control which of the presented search results to review or rely upon, ChatGPT provides a synthesised single result.
You are not presented with any context through which to interpret the material. It is just that – material that is moderated by an algorithm based on statistics and probability of the frequency of words in the training data, which is further optimised and tweaked through small group surveys asking people to rate their preferred ChatGPT responses to given questions. This presents risks that the system could, in the future, be targeted to make it learn and provide false or biased material.
Lawyers rely on the relevant law, background and matter context, and their own experience when assessing a new case or matter. While ChatGPT may have access to knowledge, it cannot synthesise it through the lens of the current matter or apply experience and expertise. Topic summaries provided by ChatGPT, even when they are basically correct, have been characterised as having “lacked depth and insight” (Stem Cell Reports editorial, 10 January 2023).
So, how might ChatGPT and similar technologies be applied by practitioners?
Reducing repetition and human error: ChatGPT’s dialogue-based interaction provides a natural and iterative style of querying, compared with a traditional search engine. A question can have follow-up questions or requests to refine, adapt, or augment your query. ChatGPT ‘remembers’ the context across the conversation.
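That conversational ‘memory’ is typically achieved by resending the whole exchange with each new question, so every follow-up is interpreted in the light of earlier turns. The sketch below illustrates the pattern; the echo-style toy model is a stand-in for the real service:

```python
def ask(history, question, model):
    """Append the question, send the full history to the model, record the reply."""
    history = history + [("user", question)]
    answer = model(history)  # the model sees every prior turn, not just this one
    return history + [("assistant", answer)], answer

def toy_model(history):
    """Stand-in model: reports how many user questions of context it received."""
    turns = sum(1 for role, _ in history if role == "user")
    return f"answer (with {turns} question(s) of context)"

history = []
history, a1 = ask(history, "Summarise this clause.", toy_model)
history, a2 = ask(history, "Now rephrase it in plain English.", toy_model)
print(a2)  # the follow-up request carries the earlier question as context
```

This also explains why refining a query works: each refinement is answered against the accumulated conversation, not in isolation.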
ChatGPT’s command of the English language and capability to rephrase, or translate into a plain-English style, is a valuable tool for practitioners. This has great potential for providing alternative jargon-free versions of legal text – although it must still be subject to human review and approval for correctness.
Care needs to be exercised in the reverse direction. While ChatGPT can comply with a request to rephrase something in the style of a judge, it cannot be relied upon to correctly apply terms of art in more formal language.
It also has the potential to reduce human error through acting as a prompt – for example, by providing reminders or examples of standard clauses for a contract and factors to consider. The potential to streamline and speed up repetitive boilerplate drafting processes (for example, letters or memos) is clear.
However, even if you opt out of your prompts being used for training, asking ChatGPT to reword a contract for you and uploading the draft would, most likely, breach your client-confidentiality obligations.
Summarisation, legal research, and discovery: ChatGPT provides a rapid summarisation capability for documents and can potentially help with reading into a topic, as long as the reader is aware that the summary provided may have omitted a piece of information that is essential to their case. For research, summaries provided by ChatGPT cannot be relied upon as comprehensive or accurate. Without the capacity to identify its sources, it can only act as a starting prompt for research.
ChatGPT has potential for discovery and many legal research activities, including information retrieval, document review, language translation, document labelling and classification, scheduling, and even automated redaction. For all of these, it needs access to in-house files.
ChatGPT achieves this through a service called ‘plugins’ that are still in the very early stages of development. They will allow integration with in-house knowledge-management and IT systems so that responses can integrate private documents or carry out tasks on your behalf, such as booking flights. However, the potential uses and associated risks remain unclear, as LLMs lack the understanding of potentially tacit concepts relating to privileged information or ethical walls. What is clear is the potential for data leakage, which alone is concerning.
The caves of steel
Over-reliance on a technology to deliver work products to a client presents an existential risk to the profession’s position as trusted advisors. If content produced by ChatGPT were used in the provision of legal advice, would the solicitor be able to explain their thought process in applying their mind to the problem presented to them?
If a client became aware that ChatGPT contributed to the provision of legal advice, would it devalue the work? Reliance on ChatGPT or other LLM-based systems could also reduce opportunities for upskilling in knowledge acquisition and critical thinking.
There are also implications with regard to professional obligations, professional negligence, and insurance: “Solicitors are under a duty to exercise their professional skill and judgement when acting on instructions. A solicitor must then advise on whether what is required is proper and legal” (Solicitor’s Guide to Professional Conduct, 2022).
Solicitors will be familiar with the definition of the standard of skill required of a professional as being that of an “ordinary skilled man exercising and professing to have that special skill” (Bolam v Friern Hospital Management Committee as approved in Ireland by Ward v McMaster).
In a negligence action, if a solicitor has relied on ChatGPT in providing advice, they may find it difficult to argue that the service they provided was of a reasonable standard, due to a failure to exercise their skill.
Even if the use of ChatGPT becomes an accepted practice within the profession, this will not necessarily provide a full defence to a negligence claim (see Roche v Peilow, ACC Bank Plc v Johnston, Kelleher v O’Connor). As to whether insurance companies will seek to limit indemnity in cases where ChatGPT has been used, this remains to be seen.
Prelude to foundation
Given that the provenance of information is of critical importance in law, the use of ChatGPT is problematic. The origin and factual reliability, along with the GDPR and copyright status of material produced by ChatGPT, is undisclosed and potentially unknown to the creators.
According to OpenAI, ChatGPT’s data comes from many sources, including publicly available data, scraped webpages (for example, Wikipedia), and data licensed from third-party providers. Therefore, while a vast amount of data is available, its accuracy is questionable. Indeed, ChatGPT itself provides a conflicting position on sources when asked about which third-party data it uses, asserting that it “does not have access to any licensed third-party data providers”.
There are significant copyright questions arising from the underlying source material used to train ChatGPT. While fair-dealing and fair-use exceptions vary between jurisdictions, the relevant law in Ireland would not permit the use of the copyrighted material in this instance.
It is common practice for law firms to retain copyright over their work product in their terms and conditions of business. The question arises whether the output of ChatGPT can be copyrighted. OpenAI specifically assigns the IP of the output to the user. However, it does not guarantee that the output of ChatGPT is a unique response, nor can it guarantee that copyright has not been infringed in the creation. The ability to assert copyright over work that encompasses ChatGPT outputs is unclear.
The positronic man
In scraping social-media websites for use in training data, it seems highly likely that ChatGPT has breached the GDPR in respect of personal data, and this data may find its way into ChatGPT responses. The wider GDPR issues surrounding data collection (including obligations regarding data minimisation), security, fairness and transparency, accuracy and reliability, and accountability also arise.
A response from ChatGPT tries to reassure users, saying: “It’s worth noting that the developers of ChatGPT have taken steps to ensure that the training data does not include any personally identifiable information or sensitive data that may infringe on user privacy or violate any data-protection regulations.”
Every user must register with a telephone number and log in to use the service, meaning that every interaction with ChatGPT is tied to an individual user across every device or computer they use.
From a privacy perspective, as the Cambridge Analytica and Facebook case showed, it is naive to blindly trust a company not to use such data for personalisation, targeted advertising, or other unexpected or unwanted purposes. Italy’s data-protection authority is already investigating a suspected breach of data-collection rules, as ChatGPT may have failed to verify that users were over 13 years of age.
The end of eternity
Only six months after launch, ChatGPT has already been rapidly and widely adopted. It has the potential to change the way we search and access information, how we write everything from emails to contracts, and even how software code is written.
As a profession that deals in words, lawyers are faced with a technology that has implications for practices as large as the transitions from typewriters to word processors, and from paper to electronic records. The benefits and appeal of ChatGPT are apparent from an efficiency and cost perspective, but need to be tempered by an awareness of the significant underlying risks of undermining the value and integrity of the legal services provided.
Labhaoise Ní Fhaoláin is a member of the Law Society’s Technology Committee. She is completing a PhD in the governance of artificial intelligence, funded by Science Foundation Ireland at the School of Computer Science, University College Dublin. Dr Andrew Hines is an assistant professor in the School of Computer Science, University College Dublin. He is an investigator at the SFI Insight Centre for Data Analytics, a senior member of the IEEE, and a member of the RIA Engineering and Computer Sciences Committee.
Look it up
- Europol (Tech Watch Flash Report, 27 March 2023), ChatGPT – The Impact of Large Language Models on Law Enforcement
- Floridi, L, and M Chiriatti (2020), ‘GPT-3: its nature, scope, limits, and consequences’, Minds and Machines, 30: 681-694 (Springer)
- Gal, Uri (2023), ‘ChatGPT is a data privacy nightmare. If you’ve ever posted online, you ought to be concerned’, The Conversation (8 February 2023)
- Hilton, Jacob, and Leo Gao (2023), ‘Measuring Goodhart’s law’ (OpenAI.com, 13 April 2023)
- Kasneci, E, et al (2023), ‘ChatGPT for good? On opportunities and challenges of large language models for education’, Learning and Individual Differences, 103: 102274
- Online-chatgpt.com (10 February 2023), ‘Online ChatGPT: Optimising language models for dialogue’
- Salomon, G (1993), ‘On the nature of pedagogic computer tools: The case of the Writing Partner’, Computers as Cognitive Tools, pp 179-196
- Terwiesch, Christian (2023), ‘Let’s cast a critical eye over business ideas from ChatGPT’ (Financial Times, 12 March 2023)
- Vincent, James (15 March 2023), ‘OpenAI co-founder on company’s past approach to openly sharing research: “We were wrong”’ (theverge.com)