Everything you need to know about Data Retention
In light of the closing of another tax season, I felt it would be a good time to touch on data retention. Is it perhaps time for you to do some spring cleaning?
If your inbox were a fridge, would you say that some of those old emails are starting to resemble a science experiment?
Data is duplicated. It’s at the core of how networks, well… work. In the simplest of examples, you send an email, a copy is stored in your sent items, and another copy is stored in the recipient’s inbox. If you’re both working at the same company, it doesn’t take long for this to compound and bloat your servers with unnecessary copies of the same file. From an IT perspective, this data is one massive store that needs to be “managed”, whether that means backup for disaster recovery (another copy), archiving (a new copy), a legal hold (copy, copy, copy), and so on.
For a geeky breakdown of how data is duplicated across wide area networks check out http://www.snia.org/sites/default/education/tutorials/2009/spring/data-management/JacobFarmer_Crash_Course_Wide_Area_Replication.pdf from the folks at SNIA.org.
For a great backup solution that offers inline deduplication, check out ExaGrid.com. In a nutshell: [Begin Shameless Plug] Backup windows cut by 43% on average, file restores 64% faster than the previous backup solution on average, and data backed up has doubled on average (108%) in two years. [End Shameless Plug]
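To make the duplication problem a little more concrete, here is a minimal sketch of the general idea behind deduplication: fingerprint each file's contents and store identical content only once. This is not how any particular vendor does it. The mailbox paths, file contents, and function names below are all made up for illustration, and real products typically work on blocks or chunks rather than whole files.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that identifies the content."""
    return hashlib.sha256(content).hexdigest()


def group_by_content(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Group file paths by fingerprint, so identical copies land together."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path, content in files.items():
        groups[fingerprint(content)].append(path)
    return dict(groups)


def redundant_bytes(files: dict[str, bytes], groups: dict[str, list[str]]) -> int:
    """Count the bytes held by every copy beyond the first in each group."""
    total = 0
    for paths in groups.values():
        size = len(files[paths[0]])
        total += size * (len(paths) - 1)
    return total


# Hypothetical mail store: one attachment sent once, then backed up.
mail_store = {
    "alice/sent/q1-report.pdf": b"quarterly report contents",
    "bob/inbox/q1-report.pdf": b"quarterly report contents",
    "backup/alice/sent/q1-report.pdf": b"quarterly report contents",
    "bob/inbox/lunch-menu.txt": b"tacos on tuesday",
}

groups = group_by_content(mail_store)
saved = redundant_bytes(mail_store, groups)
print(f"{len(mail_store)} files, {len(groups)} unique, {saved} redundant bytes")
```

Three of the four files in this toy store are the same attachment, which is exactly the sent items / inbox / backup pattern described above.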
From a personal standpoint, if you’re like me, you’ve got a box or a drawer somewhere labeled Important Documents. Off the top of my head, I couldn’t tell you what’s still in there. I think there’s a copy of the certificate of participation I received for playing in an intramural (aka beer) dodge-ball league back in 2004. Obviously not as important as the Certified Litigation Support Professional doc sitting right next to it, but a certificate nonetheless, so it found its way into the box.
For super oversimplified pointers on what personal (hard copy) documents you can shred, and what you should keep, check out LifeHacker’s post at http://lifehacker.com/5977082/what-documents-should-i-shred-and-what-should-i-keep.
If you’re really interested in the granular details, our friends over at GetYourShitTogether.org can provide you with the questions that you didn’t even know you needed answers to… until it was too late. Check out the GYST checklist at http://getyourshittogether.org/forms/GYST_Checklist.pdf.
From a corporate compliance or litigation risk standpoint, the ramifications can be pretty serious. (Boring bits start now). There’s a mountain of data that sits on your corporate servers that can likely be shredded. Some of this information is often synonymous with skeletons or buried bodies. The problem facing most is the identification and defensible (aka documented) deletion of this information.
See the sidebar “Morgan Stanley Has a Very Bad Day” in Christine Taylor’s “When eDiscovery Goes Wrong” post at http://christineltaylor.com/wp-content/uploads/When_eDiscovery_Goes_Wrong.pdf
- [Begin Plagiarism] Judge Elizabeth Maass told the investment bank to pay a full $604.3 million claim made against it by billionaire financier Ronald Perelman, plus $850 million in punitive damages. The nature of the damages? Morgan Stanley repeatedly failed to produce emails that were vital to Perelman’s suit. [End Plagiarism]
Unfortunately, there is not (and probably never will be) any one-size-fits-all rule that can be applied across industry verticals. Additional obstacles abound, due to the number of hands that need to be in the cookie jar when it comes to making organizational decisions about what can go, and when. Not to mention the constantly evolving rules that are trying to keep up with the technology that supports the way we all do business.
In maintaining the information overload nature of this post, and in an effort to truly hook you up to the fire-hose, I’ve expertly curated the information that will likely matter to you most, from those in the know:
- [Begin Plagiarism] We’re not psychiatrists, never played one on TV, heck we’ve never even stayed at a Holiday Inn Express. That said, we’ve been around a few ediscovery blocks so we’re willing to offer up a dime store diagnosis of whether your company may be guilty of e-hoarding. [End Plagiarism]
(2) If you’ve actually read this post the day it went live, you’ve got time to register for a free webinar entitled “Clean Up the eDiscovery Leftovers”.
- [Begin Plagiarism] The case has closed. The appellate process is over. Time to release the legal hold right? Well for some yes, but for the majority of organizations the answer is unfortunately no. Organizations fail to develop consistent processes around the release of legal holds, resulting in the over-preservation of electronically stored information (ESI). Not only does retaining this ESI result in increased storage costs but more importantly for legal teams, it exposes organizations to increased legal risk down the road as ESI that could have been safely deleted can be tied to future new cases. [End Plagiarism]
(3) For a high-level, and current, overview of the rules, ramifications, and relevant readings – redirect your browser to:
“Data Retention and eDiscovery: New Rules Mean New Approaches Are Required” by Rory Welch (Twitter | LinkedIn | Wired)
- [Begin Plagiarism] The proposed amendments to the FRCP would mandate that parties to a legal action come together in conference to create a discovery plan, which state the parties’ views relating to matters of the discovery, disclosure, and preservation of ESI that may be required to be submitted at trial. [End Plagiarism]
- [Begin Plagiarism] Lots of data can mean only one thing when it comes to litigation: crippling discovery costs. One of the leading document review tools had a 100 percent increase in the number of documents hosted from 2010 to 2011, from fewer than 5 billion to nearly 10 billion. When faced with this explosion of electronic data, your first and best line of defense in limiting discovery costs is having a well-drafted document retention policy and following it to the letter. [End Plagiarism]
What you probably didn’t know…
Many corporate clients that find themselves as serial litigants will rely on outside counsel to manage their discovery data. These firms, in turn, rely on technology and professional service providers that specialize in managing electronically stored information in a number of different capacities. Larger firms have an army of outside vendors that they call upon for connected, but discrete tasks. Think forensic acquisition, data culling, document hosting, and attorney review.
In the last decade it has become a widely accepted practice to have this data sit outside the firm firewall due to volume and turn-around constraints. It’s hard enough for you to keep track of your own company’s information. Are you keeping an eye on your outside counsel’s retention policies? Are they creating skeletons that you didn’t even know to look for? Take it a step further, and the subcontractors employed during the discovery process create a spiderweb of exposure that can be lurking right under the surface waiting to bite you in years to come. Are your subcontractors using subcontractors?
Due to the competitive nature of the industry, many eDiscovery providers (worth their salt) typically offer a clear and concise picture up front of how your data is going to be handled, and will often provide full transparency into any subcontractors they have vetted. However, at the onset of a high-stakes matter, the fine print is often overlooked due to higher-priority concerns.
If/when you are ever in doubt, refer to the list of data types below to understand what is created during the eDiscovery process, and ask how it is handled… not just of your representative counsel, but of their preferred providers as well.
- PROJECT TRACKING DATA
- All project-related material that corresponds with specific matters and/or work orders. This includes email correspondence, contracts, Chain of Custody forms, etc. and any data that is tied to any project request handled by a subcontractor employee. Primarily, identified as external communication or exchange of specific detail regarding potential, ongoing, or completed work.
- SOURCE DATA
- Raw data originating from an outside entity, and submitted to subcontractor for the purpose of processing via media, ftp/sftp and/or email correspondence. This includes data forensically extracted and/or original data that is provided/created by a client, and accompanied by executed Chain of Custody (COC) documentation.
- CASE OR PRODUCTION DATA
- Any data or database derived from, or resulting from, any internal application or network processes. Specifically, any exported, replicated, hosted or outputted data that stems from any service application used or managed by subcontractor.
- WORK-PRODUCT (WORK SPACE)
- Any data derived from, or resulting from, any internal application or network processes, but used primarily for the purpose of prepping, experimentation, cataloging, preliminary/in-depth analysis, and/or investigation.
- DELIVERABLE DATA
- The copy of data provided to an outside party in the form of digital media or upload.
- WORKPLACE DATA
- Data identified as internal and external documents related to general business operations. External documents are typified by invoices, payment invoices, tax returns, etc. and are in some format presented to outside entities as needed. Internal documents such as pricing, payment forecasts, sales statistics, employee files, bank statements, etc. are used inside the company’s daily functions.
- OTHER DATA
- Any other data which is not classified by one of the above categories. This is usually created by an analyst, user, or employee as a by-product of a given task.
The purpose of classifying the data types is not only to delineate and organize on an internal network, but also to apply different retention policies to each. Most providers will offer, at minimum, three to four different options to their clients and associate a price-point with each:
- ON-LINE STORAGE
- Subcontractor will retain matter-specific data (Source, Case/Production, Deliverable) online housed on a fast storage network server for requested period, and will guarantee immediate access at any given time. There is an associated cost with this option that will be submitted for client approval prior to charges being incurred.
- NEAR-LINE STORAGE
- Subcontractor will retain matter-specific data (Source, Case/Production, Deliverable) online on a slower storage network server for requested period, and will guarantee immediate access at any given time. There is an associated cost with this option that will be submitted for client approval prior to charges being incurred.
- OFF-LINE STORAGE
- Subcontractor will back up the matter-specific data (Source, Case/Production, Deliverable) on an off-line storage medium such as Tape or External HDD. Restoration within a time-frame of 24 to 72 hours would be required in order to access the archived data. There is an associated cost with this option that will be submitted for client approval prior to charges being incurred.
- DELETION
- Subcontractor will completely remove the matter-specific data (Source, Case/Production, Deliverable) from network server upon prior notification. Certificate of destruction is available upon request.
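If it helps to see the idea of "different retention policies for different data types" written down, here is a rough sketch of how such a mapping could be expressed. Everything specific in it is an assumption made for illustration: the day counts, the short schedule for work-product, the default for unlisted types, and the rule that a legal hold blocks deletion are hypothetical, not taken from any provider's actual policy.

```python
from enum import Enum


class DataType(Enum):
    PROJECT_TRACKING = "project tracking"
    SOURCE = "source"
    CASE_OR_PRODUCTION = "case or production"
    WORK_PRODUCT = "work-product"
    DELIVERABLE = "deliverable"
    WORKPLACE = "workplace"
    OTHER = "other"


class Disposition(Enum):
    ONLINE = "on-line storage"
    NEARLINE = "near-line storage"
    OFFLINE = "off-line storage"
    DELETE = "deletion"


# Hypothetical schedule for matter-specific data: each pair is
# (days since the matter closed, disposition that applies from then on).
MATTER_SCHEDULE = [
    (0, Disposition.ONLINE),
    (90, Disposition.NEARLINE),
    (365, Disposition.OFFLINE),
    (1095, Disposition.DELETE),
]

SCHEDULES = {
    DataType.SOURCE: MATTER_SCHEDULE,
    DataType.CASE_OR_PRODUCTION: MATTER_SCHEDULE,
    DataType.DELIVERABLE: MATTER_SCHEDULE,
    # Assumption: scratch work is cleaned up quickly.
    DataType.WORK_PRODUCT: [(0, Disposition.ONLINE), (30, Disposition.DELETE)],
}

# Assumption: anything without an explicit schedule is archived, not deleted.
DEFAULT_SCHEDULE = [(0, Disposition.OFFLINE)]


def disposition(data_type: DataType, days_since_close: int,
                on_legal_hold: bool = False) -> Disposition:
    """Return where data of this type should live at the given age."""
    if days_since_close < 0:
        raise ValueError("days_since_close must be zero or greater")
    schedule = SCHEDULES.get(data_type, DEFAULT_SCHEDULE)
    current = schedule[0][1]
    for threshold, step in schedule:
        if days_since_close >= threshold:
            current = step
    # Assumption: data under legal hold is archived rather than destroyed.
    if on_legal_hold and current is Disposition.DELETE:
        return Disposition.OFFLINE
    return current


print(disposition(DataType.SOURCE, 100).value)
print(disposition(DataType.SOURCE, 2000, on_legal_hold=True).value)
```

The useful part is less the code than the questions it forces: which category does each pile of data fall into, how long does it sit in each tier, and what overrides the schedule. Those are the same questions to put to outside counsel and their providers.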
Key Takeaways
Take the blinders off. Make sure you know where your data is housed, and have a full grasp of your exposure.
Be proactive. Don’t wait for your first eDiscovery nightmare to implement a retention plan. Start by asking your outside providers to supply a copy of their retention plan, and use that to help you create a template of your own.
Revisit often. The rules governing what information needs to be stored and/or preserved, and what information is reasonably within your control, not only vary by industry but are constantly evolving in today’s fast-paced age of information.
Ask an expert. They’re out there, they’ve been there before. Don’t be afraid to ask for directions. Personally, I would recommend the belt-and-suspenders approach: consult a technology expert and eDiscovery counsel before making decisions in a vacuum.