23 February 2016
A recent interlocutory judgment in Pyrrho Investments Limited v MWB Property Limited [2016] EWHC 256 (Ch) endorses, for the first time, the use of predictive coding when conducting disclosure in English civil proceedings. The decision will be of considerable interest to any party involved in a dispute with a large volume of documents.
Disclosure in any large case can be challenging. The extent of the "reasonable search" to be conducted by parties pursuant to Part 31 of the Civil Procedure Rules and its practice directions can be a particularly vexing question, given the sharp increase in the amount of electronic data being created in all walks of life. The number of documents captured by the initial search frequently runs into several million. The associated costs can be vast, as can the time required to review the documents.
In the case at hand the bulk of documents to be reviewed for the purposes of disclosure were held by the second claimant because it controlled back-up tapes on which all data from its servers (including emails) was stored during the relevant period. The restoration of data from a selection of those back-up tapes yielded more than 17.6 million documents. This was reduced to approximately 3.1 million by a process of electronic de-duplication; but reviewing this number of documents remained a large and costly exercise which, in broad terms, the parties agreed would be disproportionate in this case. As a result, the parties sought to consider ways in which the second claimant's disclosure review process could be improved by the use of technology.
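For readers unfamiliar with electronic de-duplication, one common approach is to compute a hash of each document's content and keep a single copy per hash value. The sketch below is illustrative only: the judgment does not describe the method used in this case, and the function name and the whitespace and case normalisation are assumptions made for the example.

```python
import hashlib


def deduplicate(documents):
    """Keep the first copy of each document, comparing by content hash.

    `documents` is a list of (doc_id, text) pairs. This is a hypothetical
    helper; real e-disclosure platforms typically also compare metadata
    and attachments.
    """
    seen = set()
    unique = []
    for doc_id, text in documents:
        # Assumed normalisation: collapse whitespace and ignore case so
        # trivially different copies produce the same hash.
        normalised = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((doc_id, text))
    return unique
```

Applied to a restored back-up set, a pass of this kind removes exact copies only; near-duplicates such as forwarded email chains would need a separate technique.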
Parties ordered to give standard disclosure in English proceedings must, in summary, make a reasonable search for documents (including electronic documents) which are helpful or unhelpful to their case or the case of another party.(1) Factors relevant in deciding the extent of the reasonable search include matters such as:
These factors are expanded specifically in relation to the reasonable search for electronic documents to include a requirement that parties "should bear in mind that the overriding objective includes dealing with the case in ways which are proportionate".(3)
However, the question of how the search for and review of electronic documents should be conducted is not dealt with in any detail. Practice Direction 31B to Part 31 of the Civil Procedure Rules comments on the use of "Keyword Searches or other automated methods of searching if a full review of each and every document would be unreasonable". The judges of the Technology and Construction Court support an eDisclosure Protocol, produced by practitioners and available on the website of the Technology and Construction Solicitors' Association. This contemplates the use of computer software in appropriate cases. However, it is only a protocol and has no normative force.
Historically, this lack of guidance has been less of an issue because fewer electronic documents needed to be searched for and reviewed, and the number of paper documents tended to be more manageable in most cases. As a result, litigators have for decades tended to employ a manual 'linear review' process for documents collected by a reasonable search, in which a lawyer or team of lawyers reviews each document to determine whether it is disclosable. In cases with large numbers of documents, this can lead to a substantial army of lawyers and paralegals reviewing several hundred thousand documents over a period of many months. However, as the amount of data – particularly electronic documents – continues to increase, such linear reviews often become ever more unworkable from both a time and proportionality perspective. Step in predictive coding.
Predictive coding goes by many names, including 'technology-assisted review' and 'computer-aided review'. It involves the review of documents by proprietary computer software, rather than human beings. A number of potential variables and processes are involved; but in essence, the computer software is 'trained' by lawyers who are familiar with the issues in the case. The lawyers review various subsets of the global dataset available and the computer then categorises all other available documents as relevant or not relevant, essentially by applying complex algorithms and looking for common concepts and language used in documents.
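The train-then-categorise idea can be illustrated with a toy example. Commercial predictive coding tools are proprietary and the judgment does not disclose their algorithms, so the sketch below substitutes a simple naive Bayes word-count classifier as a stand-in. The function names, the sample documents and the choice of classifier are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labelled_docs):
    """Count words per label from lawyer-reviewed (text, is_relevant) pairs.

    Assumes the seed set contains at least one relevant and one
    not-relevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labelled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_score(model, text):
    """Log-odds that a document is relevant; above zero leans relevant."""
    word_counts, doc_counts, vocabulary = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing avoids log(0) for words seen under one label only.
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocabulary))
        p_not = (word_counts[False][word] + 1) / (totals[False] + len(vocabulary))
        score += math.log(p_rel / p_not)
    return score


# Hypothetical seed set coded by lawyers familiar with the issues.
seed_set = [
    ("lease renewal payment dispute", True),
    ("payment under the lease overdue", True),
    ("office party friday lunch", False),
    ("lunch menu for friday", False),
]
model = train(seed_set)

# Rank the unreviewed documents so the likeliest relevant ones come first.
unreviewed = ["friday lunch plans", "overdue lease payment"]
ranked = sorted(unreviewed, key=lambda t: relevance_score(model, t), reverse=True)
```

In practice the training step is usually repeated over several rounds, with lawyers correcting the software's categorisations until its results stabilise.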
There follows a further level of manual review, after which the documents to be disclosed can be finalised. The extent of this tends to lie somewhere between, on the one hand, a full manual review of all documents considered by the computer to be relevant and, on the other, a less extensive review only of further subsets of documents for the purposes of quality assurance. Various mechanisms can also be built into the process to seek to identify material which may be privileged for full manual review.
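As a rough illustration of the quality-assurance end of that spectrum, the sketch below draws a random sample from the documents the software categorised as not relevant and measures how many of them a lawyer would have coded as relevant. The function names, the fixed random seed and the idea of reporting a single miss rate are assumptions for the example; the judgment does not prescribe any particular sampling method.

```python
import random


def sample_for_quality_check(doc_ids, sample_size, seed=0):
    """Pick a reproducible random sample from a list of document IDs."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, min(sample_size, len(doc_ids)))


def estimated_miss_rate(sample_decisions):
    """Share of sampled 'not relevant' documents that a lawyer found relevant.

    `sample_decisions` is a list of booleans, one per sampled document,
    where True means the manual reviewer considered it relevant.
    """
    if not sample_decisions:
        raise ValueError("sample_decisions must not be empty")
    return sum(sample_decisions) / len(sample_decisions)
```

A high miss rate in the sample would suggest that the software needs further training before its categorisations are relied upon.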
At the least, predictive coding helps to ensure that documents most likely to be relevant are reviewed earlier in the process and documents least likely to be relevant are not subject to manual review. However, if employed to a greater extent in appropriate cases, predictive coding can allow only a relatively small proportion of the overall pool of documents to be subject to manual review, with a far greater number ultimately selected by the computer for disclosure without necessarily having been manually reviewed at all. The latter approach – although perhaps not suitable in all cases – was contemplated (and ultimately agreed) in this case, for various reasons.
In light of the large number of documents involved in the second claimant's disclosure review, and the projected costs and timescales of conducting a full linear review of them, the parties engaged in extensive correspondence to seek to agree a sensible and proportionate process to be used by the second claimant.
At the case management conference, the court ordered that a further hearing be held to deal with issues pertaining to e-disclosure. Leading up to that hearing, the parties sought to agree various methods to make the second claimant's disclosure review process more targeted, including by identifying particular data custodians, applying date ranges and using keyword searches in the usual way. In addition, the parties agreed in principle to use predictive coding to significantly reduce the amount of manual review to be undertaken. However, as the method of predictive coding contemplated would mean that not all documents disclosed by the second claimant would have been reviewed by its legal team prior to disclosure, and given that no prior English authority endorsed the use of predictive coding to any extent, the parties considered it appropriate to seek the court's endorsement of the proposed approach.
In its helpful judgment on these issues, the court explained the matters summarised above in further detail. It referred to previous comments on electronic disclosure made by the English Court in Goodale v Ministry of Justice,(4) which contemplated the use of computer software to aid a disclosure review (but went no further than that). It also referred to judgments in US federal court(5) and the Irish High Court,(6) which both gave helpful commentary on the use of computer-assisted review in those jurisdictions. Drawing all of this together, the court then cited 10 factors in favour of approving the use of predictive coding in this case, including the following:
The court was also of the view that no factors pointed against the use of predictive coding in this case.
As a result, the court concluded that this "was a suitable case in which to use [predictive coding], and that it would promote the overriding objective set out in Part 1 of the CPR".
The court's approval of predictive coding is welcome and a significant development. With the ever-increasing amounts of data often being handled in litigation and automated search techniques becoming ever more sophisticated, perhaps the only surprise is that it has taken until now for the court to formally endorse its use. In any event, the potential benefits of predictive coding in appropriate cases are obvious. At the least, it presents a viable alternative to traditional linear reviews for consideration.
For those involved in complex litigation with vast numbers of documents, this judgment is likely to provide the comfort needed to allow serious consideration to be given to the use of predictive coding, which previously had perhaps been seen as a riskier and less defensible alternative to linear reviews. As a result, the use of predictive coding in such cases may well increase notably following this judgment (as, perhaps, will the body of English judicial authority supporting its use).
For further information on this topic please contact Daniel Wyatt or Simon Hart at RPC by telephone (+44 20 3060 6000) or fax (+44 20 3060 7000). The RPC website can be accessed at www.rpc.co.uk.