Blog: Array UK

How clustering can improve all areas of document review

Admin | 18 May 2022

Clustering identifies documents which are conceptually similar to each other and divides them into smaller sub-sets of documents called clusters.

Another way to think of clustering is to imagine it as an act of sorting documents into boxes. Documents which discuss the same type of subjects are boxed together. These boxes are the clusters.
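To make the box analogy concrete, here is a minimal sketch in Python of one simple way to group documents by shared vocabulary. It is not the algorithm used by any particular review platform: the `cluster_documents` function, the 0.3 similarity threshold and the sample texts are all assumptions made for illustration, and real tools rely on much richer conceptual models than raw word counts.

```python
import math
import re
from collections import Counter


def vectorise(text):
    """Turn a document into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_documents(docs, threshold=0.3):
    """Greedy single-pass clustering (illustrative only): put each document
    in the first box whose first document is similar enough, otherwise
    start a new box. Returns a list of lists of document indices."""
    vectors = [vectorise(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical sample documents
docs = [
    "invoice payment overdue invoice reminder",
    "lunch menu for the office party",
    "payment received for overdue invoice",
    "office party lunch booking confirmed",
]
clusters = cluster_documents(docs)
print(clusters)  # [[0, 2], [1, 3]]
```

The two invoice documents end up in one box and the two office party documents in another, even though no one told the code what the topics were.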

Clustering can be used to improve all stages of your review.

How Clustering Improves ECA

Early Case Assessment (ECA) is the initial assessment of your data, aimed at pulling out any immediately identifiable relevant files.

When working with an unfamiliar data set, clustering is a quick and efficient way of gaining a high-level overview of the themes discussed within your collection of documents. A cluster is named by the top 10 terms that best represent it, so you can quickly identify the key topics being discussed within your dataset. This can help identify both the clusters that contain key documents and the clusters likely to contain irrelevant material. This information can then assist with generating a list of search terms to target those key documents.
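As a rough illustration of how a cluster might get its name, the sketch below labels a cluster with its most frequent terms after dropping a handful of common words. The stop-word list, the `name_cluster` helper and the sample documents are hypothetical, and review platforms are likely to weight terms more carefully than a raw count.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer
STOP_WORDS = {"the", "a", "an", "and", "for", "of", "to", "is", "in", "on"}


def name_cluster(docs, top_n=10):
    """Return the top_n most frequent non-stop-word terms in a cluster."""
    counts = Counter()
    for text in docs:
        terms = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in terms if t not in STOP_WORDS)
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


# Hypothetical cluster of two documents
cluster = [
    "Invoice overdue: payment reminder for the invoice",
    "Payment received for overdue invoice",
]
label = name_cluster(cluster, top_n=3)
print(label)  # ['invoice', 'overdue', 'payment']
```

Terms that surface in a label like this are natural candidates for a search term list.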

Clustering can improve keyword filtering by helping you identify synonyms of your predefined keywords that may not have been considered. It also surfaces documents which, although not responsive to the explicit keywords, are conceptually similar and as such may hold relevant information. These related documents might otherwise have been missed with traditional keyword searching alone.
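One way to picture this is to take the clusters that contain at least one keyword hit and list the documents in them that the keyword search itself did not return. The sketch below assumes the clusters already exist; the `related_non_hits` function, the cluster names and the sample texts are invented for illustration.

```python
import re


def tokens(text):
    """Lower-cased set of words in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))


def related_non_hits(clusters, keywords):
    """clusters: dict mapping cluster name -> list of (doc_id, text).
    Returns IDs of documents that miss every keyword but share a
    cluster with at least one document that hits."""
    keywords = {k.lower() for k in keywords}
    related = []
    for docs in clusters.values():
        hits = {doc_id for doc_id, text in docs if tokens(text) & keywords}
        if not hits:
            continue  # no keyword hit anywhere in this cluster
        related.extend(doc_id for doc_id, _ in docs if doc_id not in hits)
    return related


# Hypothetical clusters
clusters = {
    "payments": [
        ("D1", "Invoice overdue, please pay"),
        ("D2", "Remittance advice attached for last month's bill"),
    ],
    "social": [("D3", "Office party on Friday")],
}
missed = related_non_hits(clusters, ["invoice"])
print(missed)  # ['D2']
```

Here D2 never mentions "invoice", yet it sits in the same cluster as a document that does, which also suggests "remittance" and "bill" as additional search terms.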

How Clustering Improves Review

A single cluster of documents can be assigned to the same reviewer. Reviewing similar documents together can improve coding consistency and provide valuable context which can lead to more accurate coding.

Review can be prioritised to focus on the clusters which contain the most relevant concepts, helping to identify the key documents early on.

It is worth pointing out that this approach is likely to break the chronology of the review queue.
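The assignment and prioritisation described in this section could be sketched as follows. The relevance scores are assumed to come from elsewhere (for example, from how many keyword hits a cluster contains), and the `build_review_queue` function, reviewer names and cluster data are all hypothetical.

```python
def build_review_queue(clusters, reviewers):
    """clusters: list of dicts with 'name', 'score' and 'doc_ids'.
    Returns (reviewer, cluster name, doc_ids) tuples, most relevant first.
    Each cluster goes to a single reviewer so similar documents stay together."""
    ordered = sorted(clusters, key=lambda c: c["score"], reverse=True)
    queue = []
    for i, cluster in enumerate(ordered):
        reviewer = reviewers[i % len(reviewers)]  # round-robin by cluster
        queue.append((reviewer, cluster["name"], cluster["doc_ids"]))
    return queue


# Hypothetical clusters with assumed relevance scores
clusters = [
    {"name": "social", "score": 0.1, "doc_ids": ["D3"]},
    {"name": "payments", "score": 0.9, "doc_ids": ["D1", "D2"]},
    {"name": "contracts", "score": 0.6, "doc_ids": ["D4", "D5"]},
]
queue = build_review_queue(clusters, ["Alice", "Bob"])
for reviewer, name, doc_ids in queue:
    print(reviewer, name, doc_ids)
```

Because the queue is ordered by cluster score rather than by document date, it shows why the chronology of the review is not preserved.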

How Clustering Improves QC

It is reasonable to expect similar documents to be coded in the same way. Filtering by cluster lets you quality-check the coding decisions applied to the documents within it. This allows you to check that coding has been applied consistently and that critical documents haven’t been missed or coded incorrectly.
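A simple version of this check is to find, within each cluster, the documents whose coding differs from the majority decision for that cluster. The sketch below is one possible approach rather than a feature of any specific platform; the `coding_outliers` function and the sample coding values are assumptions.

```python
from collections import Counter


def coding_outliers(coded_docs):
    """coded_docs: list of (doc_id, cluster, coding) tuples.
    Returns {cluster: [doc_ids coded differently from the cluster majority]}."""
    by_cluster = {}
    for doc_id, cluster, coding in coded_docs:
        by_cluster.setdefault(cluster, []).append((doc_id, coding))

    outliers = {}
    for cluster, docs in by_cluster.items():
        # Most common coding decision in this cluster
        majority, _ = Counter(coding for _, coding in docs).most_common(1)[0]
        odd = [doc_id for doc_id, coding in docs if coding != majority]
        if odd:
            outliers[cluster] = odd
    return outliers


# Hypothetical coding decisions
coded = [
    ("D1", "payments", "relevant"),
    ("D2", "payments", "relevant"),
    ("D3", "payments", "not relevant"),
    ("D4", "social", "not relevant"),
    ("D5", "social", "not relevant"),
]
to_check = coding_outliers(coded)
print(to_check)  # {'payments': ['D3']}
```

A flagged document is not necessarily miscoded; it is simply a sensible candidate for a second look.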

Additional Benefit of Clustering

A further benefit of clustering is that it requires no human input: the algorithm scans the documents and clusters them without anyone having to read their contents. This is particularly useful if you are dealing with extremely sensitive data and want to limit exposure to it. By making use of concept clustering, you can limit review to the clusters and subclusters most likely to contain relevant documents.
