The second in a series of Technology Assisted Review (TAR) related blogs

As my esteemed colleague, Chris Yoshida, effectively argued in his November 8, 2012 introductory TAR blog, the use of TAR technology platforms does appear to be burgeoning and for all of the right reasons. 

Absent any strategic guidance regarding the what, when, why and how of using it, however, TAR risks being viewed either as a natural sequel to The Avengers (attempting to save our entire Free Enterprise System from the gloom and doom of otherwise exorbitant electronic document review costs) or as a fancy paper shredder (where it merely slices and dices our e-review data generally faster and better).  In either case, it still requires what can amount to a significant additional upfront project expenditure.

As it turns out, in my opinion, the additional upfront project expenditure is largely warranted in light of the deep and powerful TAR technology platforms required to make the TAR process actually work.  I also agree that there are significant bases for comparison across the leading best-of-breed TAR technology provider tools, foremost among them the solutions offered by OrcaTec, Epiq Systems, Recommind, and Equivio.  Comparative assessment should indeed be made regarding the pricing models for each, various front-end software features, GUI stylings, analytics and audit-trail reporting functionality, and perhaps more importantly, the varying manner in which the actual “go find more documents like this one” math and language algorithms are designed, run and perform.

I still believe, however, that the selection of the actual TAR technology platform, while important, is not the most salient factor required for making optimal use of TAR.  Rather, as posited above, identifying and seeking collaborative agreement as to the ultimate strategic use and objectives sought from TAR, and really understanding the what, when, why and how to use it, is the most salient factor required to inform your pragmatic decision regarding whether or not to employ a TAR methodology.

Following that acknowledgement, a fundamental question does seem to be whether or not to use TAR in the first place.  Indeed, as those of us working in a wide cross-section of legal, technology, and information management disciplines know well, a little ECA sampling, mixed with a killer project management team, added to a finely crafted keyword search terms list, topped off with a prioritized review strategy for dessert, may in fact be more than sufficient for the overall case valuation, risk assessment, time and budget constraints mandated in many litigation matters.

Costs Associated With Use of Only Traditional EDD/ECA Methodologies, Without TAR

The first and most significant cost I see is that in order to achieve at least a comparable (or preferably better) e-review performance result, measured mainly in terms of accuracy and exhaustiveness (i.e., precision and recall) over time, you and your team must plan and commit to working much harder.  Indeed, after you assemble a dream-team consisting of your client’s in-house counsel and information technology/information security members, your outside counsel litigation legal team, and your collection of either in-house or out-sourced e-discovery provider partner(s), you must work significantly harder and effectively together not only at the onset of the review project start-up phase (as advocated in the TAR model), but consistently for the long haul over the duration of the review project.  A snapshot of this project work will include maintaining robust and statistically sound QC sampling, performing the QC itself, keeping up with the often real-time, dynamically evolving review protocol decision log updates, and providing daily reviewer QC feedback in a variety of ways.  Having personally performed this role, first as reviewer and later as project manager, I can attest that it is only accomplished successfully (automation or no automation, as we did back in the glory days of paper) with a lot of blood, sweat and tears.
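
To put a number on what “statistically sound QC sampling” can mean in practice, the following is a minimal Python sketch, not a prescribed method.  It assumes a simple random sample and uses the textbook normal-approximation formula for estimating a proportion, with a finite population correction.  The function name `qc_sample_size`, the 95% confidence level (z = 1.96), the 5% margin of error, and the 100,000-document batch are all illustrative assumptions of mine rather than values drawn from any particular review protocol.

```python
import math


def qc_sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Estimate how many documents to draw for a QC sample.

    Assumptions (illustrative, not from any specific protocol):
      - documents are drawn by simple random sampling
      - z=1.96 corresponds to roughly 95% confidence
      - p=0.5 is the most conservative guess at the error rate
      - margin is the acceptable +/- error on the estimated rate
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")

    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)

    # Finite population correction: smaller batches need fewer documents.
    n = n0 / (1 + (n0 - 1) / population)

    # Never ask for more documents than the batch contains.
    return min(population, math.ceil(n))


if __name__ == "__main__":
    # Hypothetical batch of 100,000 coded documents.
    print(qc_sample_size(100_000))
```

Under these assumptions the required sample grows very slowly with the size of the population, which is one reason sampling-based QC can stay affordable even on large review sets.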

A second cost, of course, is that the science is simply not there to support your final relevancy versus privilege versus non-responsiveness determinations among your review set data/document population.  That is, absent the use of TAR, no matter how knowledgeable and savvy you may think you are (and trust me, I also place myself in that camp, right or wrong) you’re still left sitting in the troubling position of asserting your word over the computer, which as we have seen in virtually every study beginning with the foundational Blair & Maron study from 1985, has proven to be a sadly misguided and erroneous assumption more often than not.

Benefits Associated With Use of Only Traditional EDD/ECA Methodologies, Without TAR:  Doing It Right

On the other hand, what if over the course of your traditionally constructed electronic document review model, your admittedly somewhat mechanical series of multiple, iterative query results and random sampling tests are yielding, say, a greater than 90% precision and recall rate, as ours historically yield?  Because such results are markedly better than those generally found in the seminal TREC 2008 Study, most pundits would be forced to agree that they are pretty darn good, and hence this approach may in fact be both justified and sufficient in many cases and projects.  I would argue that this method, especially when combined with use of clustering search logic bucketing, may be sufficient for the majority of your small volume electronic review set data populations, and perhaps at least 50% of your mid- to large-volume electronic review set data populations.  Importantly, this further assumes that in addition to all of the above prerequisites, you also utilize a sufficiently robust e-review tool serving as the backbone of your successful electronic review (and production) project (e.g., FTI Ringtail, kCura Relativity, LexisNexis Concordance, Kroll Ontrack’s Inview/AdvanceView, etc.).
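
For anyone who wants to check a “greater than 90% precision and recall” claim against their own sampling results, here is a minimal Python sketch of the standard definitions.  Precision is the share of documents coded responsive that truly are responsive; recall is the share of truly responsive documents that the review actually found.  The function names, the 90% threshold parameter, and the sample counts below are hypothetical values chosen purely for illustration; they are not figures from our projects or from any study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from validated sample counts.

    true_pos:  documents coded responsive that really are responsive
    false_pos: documents coded responsive that are not responsive
    false_neg: responsive documents the review missed
    """
    retrieved = true_pos + false_pos   # everything the review flagged
    relevant = true_pos + false_neg    # everything it should have flagged
    precision = true_pos / retrieved if retrieved else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


def meets_target(precision, recall, target=0.90):
    """True only when both measures exceed the (assumed) target rate."""
    return precision > target and recall > target


if __name__ == "__main__":
    # Hypothetical QC sample: 460 correct hits, 40 false hits, 45 misses.
    p, r = precision_recall(true_pos=460, false_pos=40, false_neg=45)
    print(f"precision={p:.3f} recall={r:.3f} ok={meets_target(p, r)}")
```

Note that both numbers are only estimates when they come from a sample, so the size and randomness of that sample matter as much as the arithmetic.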

You may be asking yourself why I’ve now essentially made the case against TAR.  I do so by design in an attempt to provide a balanced, real-world comparison model between non-TAR and TAR methodologies.  Further, long prior to the issuance of Da Silva Moore, many of our preeminent e-discovery thought leaders unequivocally made the case as to TAR’s efficacy (e.g., Baron, Schieneman, Grossman, Losey, Socha, Ball):  Simply put, when used properly, the science and the process clearly work.

Understanding the why is frankly the easy part.  Understanding the more nebulous when, what and how is a little trickier.  The following reflects a strategic assessment framework for use when first approaching the when, what and how to use TAR, and hopefully derive the greatest possible value from it.

Leveraging the Best & Brightest (We Hope) but Refreshingly Affordable:  TAR as Law Clerk

  1. Consider using TAR as an objectively accurate and strategically targeted method for robust QC,  likely achieved best by selection of a TAR technology platform that relies on Iterative Random Sampling back-end architecture versus propagation of a foundational Seed Set Collection of Documents;
  2. Consider using TAR for coding of both small- to mid-sized volume and subjectively complex / obscure / highly technical documents, where realistically few if any outside counsel or in-house counsel legal team members have sufficient knowledge to accurately, exhaustively and efficiently code the documents without retaining an expensive third-party subject matter expert to do so.  This approach is likely also achieved best by selection of a TAR technology platform that relies on Iterative Random Sampling back-end architecture versus propagation of a foundational Seed Set Collection of Documents (Seeding/Control Set Sample).

Leveraging the Successful Veteran Performer, the Former Golden Child Turned Closer Who Will Yield a Unanimous Verdict…But at a Price:  TAR as First Chair Trial Attorney

  1. Consider using TAR for coding of objectively simple, large volume first level review documents (e.g., new-school batch coding of sorts) where client, case, key players and nature of documents are both relatively simple and well known up-front, likely achieved best by selection of a TAR technology platform whose back-end architecture relies on establishment of “expert training documents” upfront via the Seeding/Control Set Sample architecture.  Here TAR becomes the dominating force in the room:  after it has tried the case successfully all the way to that precedent-setting jury trial, all eyes are on TAR, watching it demolish any obstacles in its path and relish ultimate victory.
  2. Consider using TAR for the coding of small- to mid-sized populations of subjectively complex and highly disputed discovery evidence among parties, where ultimately no amount of attempted Sedona Conference Cooperation Proclamation of 2008 negotiation techniques, nor Fred Flintstone-esque shaking of fists, has resulted in a joint e-discovery protocol, much less any discovery agreement, much less agreement over proper parties and venues (you all know the cases *twitch* to which I refer…).  Here TAR becomes the silent arbitrator, allowing the objective evidentiary facts to speak for themselves, and in essence, properly and objectively inform one’s coding decisions, and all in a documented and defensible manner.  This approach is likely also achieved best by selection of a TAR technology platform that relies on Iterative Random Sampling back-end architecture versus propagation of a foundational Seed Set Collection of Documents (Seeding/Control Set Sample).

Leveraging the Non-Player:  TAR as Pouting Second String Benched Team Member (I know, this one wasn’t in the script, but it’s important…)

  1. Experts, studies and even our own limited real-world TAR experience have shown that use of TAR for purposes of privilege review, and to a lesser extent, subjective legal issue coding, even with all the stars aligned, is generally not able to successfully “go find more documents like this one.”  This is understandable, given that even the best algorithm would inherently find it difficult to accurately make what are ultimately subjective contextual distinctions between, for example, occurrences of the name of a practicing attorney within a corporation who wears multiple hats, sometimes speaking as in-house counsel, other times speaking as an operations representative and not offering any legal advice.  Yes, the technology can find documents containing that name with similar subject matter and linguistic patterns and relationships, but it cannot read and interpret the language and correctly decide, “Yes, you are privileged” in Document 1 but “No, you are not privileged” in Document 2, noting each document’s unique contextual background.  As a result, we find better results using our human eyes, brains and little fingers to feverishly render their fate and determine privilege.  Further, my own personal two cents’ worth suggests that all clawback provisions and rules aside, this premise still feels too risky to me.

If this were election night, I suppose I would be that fidgety second or third news anchor calling the race – definitely not the first to boldly call it, but finally calling it nonetheless.  Regardless of your particular vantage point seeing TAR as potential friend or foe, when used properly, I do believe the TAR movement seeks to normalize and help return the future of civil litigation “back to the future” toward a more pure and less-costly form of advocacy on behalf of the parties, allowing litigators on both sides of the bar a greater ability to try cases based on their merits, rather than merely battling over the oft-bemoaned but genuinely painful and potentially crippling e-discovery costs.  TAR has indeed lived up to its predicted game-changer, rock-star status hinted at several years ago now.

As Mr. Spock once surmised, “Curious how often you humans manage to obtain that which you do not want.”  Fascinating….

