From Input to Evidence: Will Your GenAI Prompts Be Discoverable?

EDiscovery professionals are no strangers to having to master new data types—from the advent of email to the latest ephemeral messaging apps, they’ve been doing it for decades. 

When it comes to generative AI and eDiscovery, however, much of the buzz has been aimed at how the technology might revolutionize eDiscovery workflows, whether by replacing TAR 2.0 or offering whole new ways of reviewing documents with techniques like plain language or semantic search.  

Much less focus has been placed on the notion that how legal professionals are using GenAI might itself be discoverable. 

Debates persist over how GenAI can and should be used in law, but one point is clear: legal professionals are using it. Not always in the right way, but they’re definitely using it.  

And if they’re using it for legal work, it’s not a huge leap for the data involved in that usage to soon become the subject of discovery requests. 

 

A New Form of Potentially Relevant Data 

 

By now, we all know that good prompting is key to producing the most useful GenAI outputs. For sure, there’s a degree of trial and error involved in achieving even the best prompts, but the more you know about prompting and the more experience you have getting to your optimal result faster, the better. 

Stop for a minute, though, and think about every GenAI-powered tool your team uses. Whether it's an eDiscovery-specific tool, an enterprise solution like Copilot, or any of the numerous other AI-enabled legal tech tools or AI legal assistants on the market today, you create new data every time you use GenAI. 

And with GenAI use at an all-time high, we’ve started hearing reports that even the less-tech-savvy lawyers who historically avoided getting their hands dirty with technology, particularly eDiscovery tools, are now jumping in and playing around.  

“Playing” means prompting, which creates two types of new data to consider: 

 

  1. The prompts being input to generate new content; and 

  2. The generated content itself. 

 

For many, it’s easier to think of the latter data as potentially discoverable for a number of reasons. First, the AI output is often quite lengthy and “feels” more like traditional work product. Second, the output, generally speaking, is the purpose of using the AI in the first place. It’s not surprising, therefore, that for most litigators, the prompts that generated the output are easily forgotten once the new content—a brief, a timeline, an outline, etc.—is in hand and improving case efficiency. 

Nonetheless, prompts can reveal a lot of insights into case strategy, which facts are considered important, and more. It might be tempting to write off prompts as a few words in a vacuum that hold little meaning, but, in reality, good prompts can be paragraphs or even pages long. The more well-crafted and useful a prompt is, the more information it’s likely to contain about the nature of the legal work being done. 

Now think about a standard discovery request—with its notorious “any and all” language—seeking “documents”—today, typically defined broadly to include any kind of data—pertaining to a laundry list of topics and issues related to a case. 

If you have a prompt that touches on and is intended to produce outputs regarding an issue involved in that case (even if it was not yet a matter in litigation), there’s a strong argument that the GenAI data, both the prompt and the output, is responsive to the discovery request. 

 

A Matter of Privilege 

 

Of course, content created by lawyers tends to find itself the subject of privilege arguments. In the case of GenAI prompts and outputs created in the course of legal work, work product protection would be the strongest potential argument on the privilege front. 

This, of course, requires the work—here, the prompting of the AI tools and the outputs generated—to be performed in anticipation of litigation. And in many instances, that may well be true, shielding the GenAI data from production. 

As GenAI becomes more ubiquitous in the practice of law, however, its use has expanded, and will continue to expand, far beyond the realm of litigation. For example, one of the first areas to see a major GenAI push was contracting, covering everything from drafting to redlining and review to negotiating and more. While deals often become the subject of litigation, the work done by legal professionals in getting them done falls firmly outside the scope of work product protection. The more GenAI is ingrained in the standard course of everyday legal work, the more it will fall within the universe of potentially discoverable data. 

Attorney-client privilege, on the other hand, will likely not come into play with GenAI data. Even if, contrary to currently prevailing opinion, GenAI use were to be considered a “communication,” that communication would be between a human and non-human, not an attorney and a client. In fact, some exchanges that may have once taken place between lawyers and their clients are likely slowly being replaced by GenAI interactions, whether in the name of self-service, a desire to reduce legal fees, or something else. This arguably results in even more data that’s potentially discoverable than was created in the course of the traditional, pre-GenAI legal representation dynamic. 

Of course, we’re in the very early days of GenAI use in terms of the courts weighing in on eDiscovery or other legal issues. Eventually we’ll likely have clearer guidance on these discoverability and privilege issues. For now, though, the adage of “better safe than sorry” should rule the day. 

 

GenAI and Legal Hold Considerations 

 

Although the status of GenAI data in terms of its discoverability is still undecided, many legal professionals are starting to at least contemplate the possibility that it will eventually be discoverable. With this knowledge in hand, ignoring the preservation of GenAI prompts and outputs is the wrong path to take. 

Particularly once a legal hold is put in place, GenAI data should be subject to the same retention and preservation procedures as all other documents and data pertaining to the matter in issue. This doesn’t mean everything preserved will definitely be produced—that’s never the case with litigation holds.  

The flip side, however, is what matters. If you’re on notice that you have GenAI data that would fall under a litigation hold if it were any other kind of data, and you then fail to preserve that GenAI data, you could be looking at spoliation sanctions down the road. 

 

The Impact of Prompt Libraries 

 

When the term prompt engineering started making the rounds in early 2023, some wanted to dismiss the concept as a passing fad. If anything, though, the opposite has become true. 

Good prompts are considered so important they’re rising to the level of precedent at many firms. Firms that have committed to successful, responsible GenAI deployment and use are building centralized libraries of best-in-class prompts with the help of dedicated internal teams who test and refine existing prompts before making them available to lawyers and practice groups. It’s similar to the concept of clause banks for internally preferred contract language, except the repository is full of those paragraphs- or pages-long prompts that have been proven to generate ideal outputs. 

Likewise, many vendors are now also offering built-in prompt libraries as part of their AI legal assistant tools. This includes both substantive, pre-vetted prompts for specific tasks or workflows, as well as the ability for users to create their own prompt libraries within the platform. 

In both instances, many of the prompts in the prompt library function almost like templates, with placeholders for case- or matter-specific information to be added by users and the ability to tweak aspects of the stored prompts as needed in practice. For firms with prompt libraries, these repositories exist in addition to whatever prompt repositories might be created as part of a GenAI data preservation process. 
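To make the template idea concrete, here is a minimal sketch of a stored prompt with placeholders that a user fills in before use. It assumes nothing about any particular vendor's library: the prompt wording, the placeholder names (`matter_name`, `start_date`, `end_date`), and the helper functions are all hypothetical, and Python's standard `string.Template` simply stands in for whatever templating a real tool uses.

```python
from string import Template

# Hypothetical library prompt; wording and placeholder names are illustrative only.
TIMELINE_PROMPT = Template(
    "You are assisting with $matter_name. "
    "Using only the documents provided, draft a chronology of events "
    "between $start_date and $end_date, citing a source for each entry."
)


def placeholders(template: Template) -> list[str]:
    """Return the placeholder names in a template, in order of first appearance."""
    names: list[str] = []
    for match in template.pattern.finditer(template.template):
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names


def tailor(template: Template, **values: str) -> str:
    """Fill in a stored prompt, refusing to run if any placeholder is left empty."""
    missing = [name for name in placeholders(template) if name not in values]
    if missing:
        raise ValueError(f"Missing values for placeholders: {missing}")
    return template.substitute(**values)


if __name__ == "__main__":
    print(tailor(
        TIMELINE_PROMPT,
        matter_name="Matter A",
        start_date="2023-01-01",
        end_date="2023-06-30",
    ))
```

Note the distinction the sketch makes visible: the stored template carries no case facts at all, while the tailored prompt returned by `tailor` does. That difference matters for the retention and responsiveness questions discussed in the next section.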

This means prompt libraries offer a dual benefit. First is the driver for creating prompt libraries in the first place: better, more consistent, and more efficient GenAI outcomes. A second potential benefit, however, is a reduction in rogue or trial-and-error prompting by those who have yet to master the art of it, which is more likely to result in data you’d rather not have your opponent see if it’s ever ordered to be produced. Using precedent prompts helps control the amount and nature of sensitive, case-specific data being input into GenAI tools because it significantly reduces the need to reinvent the wheel.  

 

Prompts and Data Retention Policies 

 

In theory, data retention policies (whether related to legal holds or not) should require organizations to preserve both the template precedent prompts and the prompts tailored for specific uses. Given the lack of established guidance so far on what prompts, if any, might be discoverable, it’s impossible to say for certain if both types of prompts would be deemed responsive to discovery requests. There’s a much stronger argument, however, that the template prompts would not be responsive if they contained only placeholders to be tailored with fact-specific information before use. 

As for how long you should retain matter-specific prompts, there’s no right answer, but the considerations are similar to those that eDiscovery professionals by now know well from learning how to handle data from newer technologies like Slack or ephemeral messaging apps. The most important thing is to have a clear data retention policy for all types of electronic data and to apply it consistently. If your internal guidelines don’t currently contemplate GenAI data, you should consider amending them sooner rather than later. 

Additionally, some tools that are already automatically preserving GenAI data may have default retention periods (sometimes forever). You should speak with your internal team and your vendors to determine what those are, what your options are, and what retention periods most align with your internal policies.  

If you receive a litigation hold and can point to a standard retention policy when questioned about deleted data, you’ll be in a much better position than if you have set no standards or let individual users delete or retain data at will. 
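As a rough illustration of what applying a policy consistently can look like in practice, the sketch below encodes one simple rule: data under a legal hold is never deleted, and everything else is deleted only after a fixed retention period. The function name, its parameters, and the 365-day period in the usage example are assumptions made for illustration, not recommended values or any tool's actual behavior.

```python
from datetime import date, timedelta


def may_delete(created: date, today: date, retention_days: int, on_legal_hold: bool) -> bool:
    """Decide whether a GenAI record may be deleted under a uniform policy.

    Assumed rule: a legal hold always overrides the retention schedule;
    otherwise a record may be deleted only once it is older than the
    retention period.
    """
    if on_legal_hold:
        return False
    return today - created > timedelta(days=retention_days)


if __name__ == "__main__":
    today = date(2025, 1, 1)
    # Hypothetical 365-day retention period.
    print(may_delete(date(2023, 6, 1), today, 365, on_legal_hold=False))
    print(may_delete(date(2023, 6, 1), today, 365, on_legal_hold=True))
```

The point of centralizing the decision in one rule is that no individual user gets to choose what is kept or deleted, which is the situation the paragraph above warns against.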

Even if there are no clear answers yet, the time to start having these conversations is now, not when you’re facing litigation and in the depths of discovery. 

 

The Way Forward 

 

With potential discoverability on the horizon, some eDiscovery providers are already implementing mechanisms for collecting and preserving GenAI prompts and outputs. Because this field is so rapidly evolving, legal teams should consult with their individual eDiscovery vendors to understand the extent of their prompt preservation capabilities. This includes whether GenAI data is collected at all and, if so, whether it’s preserved with all necessary context (for example, preserving related prompts and outputs together). 
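What "preserved with all necessary context" might mean at the data level can be sketched as a single record that keeps a prompt and its output together with basic metadata. This is only an illustration under stated assumptions: the `GenAIRecord` class and its fields (`matter_id`, `custodian`, `tool`, `created_at`) are hypothetical, and real eDiscovery platforms will have their own schemas. The SHA-256 digest is included as one common way to show later that a preserved record has not changed.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenAIRecord:
    """Hypothetical preservation record pairing a prompt with its output."""

    matter_id: str
    custodian: str
    tool: str
    created_at: str  # ISO 8601 timestamp supplied by the collecting system
    prompt: str
    output: str

    def to_json(self) -> str:
        # Sorted keys give a stable serialization for hashing and export.
        return json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)

    def digest(self) -> str:
        # SHA-256 over the serialized record; any change to prompt, output,
        # or metadata produces a different digest.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


if __name__ == "__main__":
    record = GenAIRecord(
        matter_id="MATTER-001",
        custodian="Example Custodian",
        tool="Example AI Assistant",
        created_at="2025-01-01T09:00:00Z",
        prompt="Draft a chronology of the events described in the attached documents.",
        output="1. ...",
    )
    print(record.digest())
```

Whether your vendor stores something like this, and where, is exactly the kind of question to raise in those conversations.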

Once you know GenAI data preservation processes are in place, it’s also important to know where that data is being stored and who can access it. Sometimes it might be in a separate repository, while other times it might be within a specific tool like Copilot. Either way, the preserved data will likely not be easily visible to or accessible by the average user and will instead require cooperation with IT departments or service providers if it needs to be reviewed and produced. 

GenAI data may seem like a daunting new frontier of collection and preservation issues, but eDiscovery professionals are equipped to handle it. New paradigms needed to be created for emerging data types like social media, Slack, and ephemeral messages. While GenAI data is not the same, legal teams should draw from the lessons learned in tackling those modern data types. Doing so will significantly help with proactively addressing potential challenges and mitigating eDiscovery risks down the road. This includes anticipating GenAI use and monitoring the evolving features of any GenAI tools being used. 

Finally, education goes a long way. All organization stakeholders should understand that the data they produce while using AI-powered tools might eventually be just as discoverable as all other tech-enabled work they do. 

GenAI experimentation should not be stifled, but “think before you prompt” is a great framework for approaching it. 
