
Moving TARget: The Hunt for a More Cost-Effective and Efficient Review

The first in a series of Technology Assisted Review (TAR) related blogs

With hockey-stick-style trends in the use of email, mobile computing devices, cloud data storage and social media, it’s no wonder that the costs associated with the review of documents have ballooned in recent years. 

A 2010 survey conducted by Duke University found that the average outside litigation cost in 2008 was a staggering $115 million per respondent.  That figure is up sharply from $66 million per respondent in 2000, an increase of roughly 73% in eight years.[1]

Couple this growth in litigation cost with the general consensus that legal review is the most expensive phase of e-discovery, and it becomes clear why so much talent and budget have been focused on developing methodologies that use technology to make large volumes of electronically stored information more manageable.  Most of these methodologies are ones with which we have all become quite familiar: keyword and Boolean searches, de-duplication, near de-duplication, filtering, email clustering, concept searching, and so on.

Emerging from that haze of methodologies is a concept called Technology Assisted Review (TAR).  It’s referred to by many names, but the concept is the same: Use analytics in conjunction with human oversight to increase the accuracy and reduce the costs generally associated with the attorney review phase of e-discovery. 

How TAR differs from traditional search methodology

Unlike traditional search methodology, TAR employs analytics, or, more plainly, algorithms that can identify correlations and relationships in word patterns.  This gives TAR the ability to group similar documents in contextual ways that traditional search methodology cannot.  As I understand it, some of the algorithms deployed in TAR do this by mathematically mapping term-frequency "vectors," which give the document population a latent semantic dimension.  An Expectation-Maximization algorithm is then applied to "fill in the gaps," i.e., to assign probability estimates to documents where the data are lacking or insufficient.

In theory, these algorithms solve two key problems inherent in keyword searching.  First, they allow the computer to mathematically make sense of words with multiple meanings (polysemy).  Second, they enable the computer to recognize contextual similarities, such as synonyms, between documents.[2]
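To make the "latent semantic dimension" idea concrete, here is a toy sketch in Python.  It uses plain SVD-based latent semantic analysis rather than the EM-fitted probabilistic variant described in the Hofmann paper cited below, and it is not any vendor's actual implementation; the corpus, vocabulary, and dimension count are all invented for illustration.  The point it demonstrates: two documents that never share the keywords "car" and "automobile" still land next to each other in concept space, because they share context words.

```python
import numpy as np

# Toy corpus: docs 0 and 1 use different keywords ("car" vs. "automobile")
# but share context words; doc 2 is about an unrelated topic.
vocab = ["car", "automobile", "engine", "repair", "banana", "smoothie"]
docs = [
    "car engine repair",          # doc 0
    "automobile engine repair",   # doc 1
    "banana smoothie",            # doc 2
]

# Term-frequency "vectors": rows = documents, columns = vocabulary terms.
tf = np.array([[d.split().count(t) for t in vocab] for d in docs], dtype=float)

# Latent semantic analysis via truncated SVD: project the documents into a
# low-rank "concept" space instead of comparing raw keyword counts.
U, S, Vt = np.linalg.svd(tf, full_matrices=False)
k = 2                               # number of latent dimensions to keep
doc_concepts = U[:, :k] * S[:k]     # document coordinates in concept space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# In concept space, docs 0 and 1 are near-identical despite the
# car/automobile vocabulary mismatch; doc 2 stays far away.
sim_01 = cosine(doc_concepts[0], doc_concepts[1])
sim_02 = cosine(doc_concepts[0], doc_concepts[2])
print(sim_01 > 0.9, abs(sim_02) < 0.1)   # → True True
```

A pure keyword search for "car" would miss document 1 entirely; the concept-space comparison does not, which is exactly the synonym problem described above.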

Arguably, if TAR tools can accomplish these two things, then the need to use a manual, linear review model to find relevant and/or responsive documents in large universes of data would be greatly reduced.  Judging by the mounting evidence from information-retrieval studies (such as NIST's TREC 2010 Legal Track), TAR advocates are gaining the upper hand in this debate.

Real World Application of TAR

There are numerous ways to employ these algorithms and countless more vendors who would like to tell us how and when to deploy them.  However, if you are looking for real-world cases that exhibit a defensible, repeatable roadmap for utilizing TAR in litigated matters, two stand out: Da Silva Moore v. Publicis Groupe, No. 11 Civ. 1279 (ALC) (AJP), 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24, 2012) and In Re: Actos (Pioglitazone) Products Liability Litigation, No. 6:11-md-2299 (W.D. La. July 27, 2012).

First, Da Silva Moore.  If you are unfamiliar with the case, it is notable because Magistrate Judge Andrew Peck approved the parties' use of TAR technologies and methodologies.  One of the fascinating aspects of this case is that it may well provide a defensible road map for utilizing TAR in future cases.

Another case in that same defensible-road-map vein is In Re: Actos, a products liability case out of the Western District of Louisiana.  In this case, there is a case management order that reads like a TAR "how-to" pamphlet, describing in detail the how, what, and when of deploying TAR tools (in this case, Epiq's deployment of Equivio's software) throughout the discovery phase of the case.

These two cases, in particular, have the potential to greatly help the legal community dissect the implementation of TAR in real-world scenarios.  As different deployments of this technology in different matters are catalogued, standards for how to use it will begin to emerge.


Whether sifting through opposing counsel's latest "document dump" or assessing privileged documents nestled on your company's or your client company's servers, TAR tools can be deployed in myriad ways.  However, be advised: not all TAR tools are created equal.  What is important is understanding how this technology works so you can properly leverage it to reduce the high costs associated with the current state of e-discovery.

Being armed with this understanding will likely be an invaluable asset to your firm’s or your company’s future success in complex litigation.  Hopefully this post has provided some guidance and understanding to those who wish to begin hunting “moving TARgets” in the wilds of the information retrieval jungle! 

I'd be very interested in hearing your thoughts on this topic.  Please email me with any comments or questions.

[1] Litigation Cost Survey of Major Companies, 2010 Conference on Civil Litigation, Duke Law School (May 10-11, 2010)

[2] Thomas Hofmann, Unsupervised Learning by Probabilistic Latent Semantic Analysis, Machine Learning, Vol. 42, pp. 177-196 (January 2001)
