Discovery Management Glossary


Active Data: Information residing on the direct access storage media of computer systems, which is readily visible to the operating system and/or application software with which it was created and immediately accessible to users without un-deletion, modification or reconstruction.

Affidavit: A formal sworn statement of fact, signed by the declarant (who is called the affiant or deponent) and witnessed (as to the veracity of the affiant's signature) by a taker of oaths, such as a notary public.

Alternative Dispute Resolution (ADR): Extrajudicial processes such as arbitration, collaborative law, and mediation used to resolve conflict and potential conflict between and among individuals, business entities, governmental agencies, and (in the public international law context) states.

Analysis: The process of determining relevancy of paper and electronic discovery materials through evaluation based on the variables of the case.

Analytics: A unique technology that was designed to address the challenges of unstructured information, to make computers search and process this information in a more human-like manner.

Application Programming Interface (API): A set of routines, data structures, object classes and/or protocols provided by libraries and/or operating system services in order to support the building of applications.

Archive: A copy of data on a computer drive, or on a portion of a drive, maintained for historical reference.

Archival Data: Information that is not directly accessible to the user of a computer system but that the organization maintains for long-term storage and record keeping purposes. Archival data may be written to removable media such as a CD, magneto-optical media, tape or other electronic storage device, or may be maintained on system hard drives in compressed formats.

ASCII (American Standard Code for Information Interchange): A coding standard that can be used for interchanging information, if the information is expressed mainly by the written form of English words. It is implemented as a character-encoding scheme based on the ordering of the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that work with text. Most modern character-encoding schemes—which support many more characters than did the original—have a historical basis in ASCII.
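As a small illustration of the character-to-code mapping described above, the following Python sketch converts a sample word to its ASCII codes and back. The sample word is arbitrary; the 0 to 127 code range is a property of the original ASCII standard rather than something stated in this glossary.

```python
# Map a few characters to their ASCII codes and back again.
text = "Bates"
codes = [ord(ch) for ch in text]             # character -> integer code
restored = "".join(chr(c) for c in codes)    # integer code -> character

# Original ASCII defines codes 0 through 127, so each code fits in 7 bits.
all_seven_bit = all(c < 128 for c in codes)

# Encoding as ASCII yields one byte per character.
encoded = text.encode("ascii")
```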

Association of Corporate Counsel ("ACC"): Formerly the American Corporate Counsel Association ("ACCA"), ACC is an association of in-house counsel, that is, attorneys who work for corporations. The association publishes the magazine ACC Docket and arranges one of the United States’ largest annual meetings for in-house attorneys. ACC was founded in 1982.

Association for Information and Image Management or AIIM: (pronounced aim) The community that provides education, research, and best practices to help organizations find, control, and optimize their information. There are many terms that are used somewhat interchangeably to describe the information management space covered by AIIM, including document management and enterprise content management. In the legal profession, the term that is often used to describe this area is electronically stored information.

AIIM is a non-profit organization focused on helping users to understand the challenges associated with managing documents, content, records, and business processes. AIIM was founded in 1943 as the National Microfilm Association and later became the Association for Information and Image Management. AIIM is also known as the enterprise content management (ECM) association.

Today, AIIM is international in scope, independent, and implementation-focused. As the industry's intermediary, AIIM represents the entire industry - including users, suppliers, and the channel.

Association of Records Managers and Administrators (ARMA International): A not-for-profit professional association for records and information managers and related industry practitioners and vendors. The association provides educational opportunities and educational publications covering the principles of records management. It also is known worldwide for its standards and guidelines.

The Association was founded in 1955. In 1975, the Association of Records Executives and Administrators (AREA) and the American Records Management Association merged to form the present ARMA International. The headquarters for ARMA International are in Lenexa, Kansas.

Attachment: A memorandum, letter, spreadsheet, or any other electronic document appended to another document or email.

Attorney-client Privilege: A legal concept that protects communications between a client and his or her attorney and keeps those communications confidential.

Attribute: A characteristic of a piece of data that identifies it, such as its type, length or location.

Audit Log / Audit Trail: A chronological sequence of audit records, each of which contains evidence directly pertaining to and resulting from the execution of a business process or system function.

Author: A person or position who originated a document.


Backup: A copy of inactive data, intended for use in the restoration of data lost to catastrophic failure of system memory. Most users back up some of their files, and many computer networks utilize automatic backup software to make regular copies of some or all of the data on the network. Some backup systems use digital audio tape (DAT) as a storage medium.

Backup Data: Information that is not presently in use by an organization and is routinely stored separately upon portable media, to free up space and permit data recovery in the event of disaster.

Backup Site or Disaster Recovery Center (DR center): In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backup is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote disk mirroring, which keeps the DR data as up-to-date as possible.

Backup Tape: A magnetic tape that has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange.

Backup Tape Recycling: The process whereby an organization’s backup tapes are overwritten with new backup data, usually on a fixed schedule (e.g., the use of nightly backup tapes for each day of the week with the daily backup tape for a particular day being overwritten on the same day the following week; weekly and monthly backups being stored offsite for a specified period of time before being placed back in the rotation).
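The weekly rotation in the example above can be sketched in Python. This is only an illustration of a seven-tape daily cycle; the tape labels and function names are hypothetical, and real schedules vary by organization.

```python
from datetime import date, timedelta

# Hypothetical labels: one daily tape per day of the week.
DAILY_TAPES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def daily_tape(day):
    """Return the label of the daily tape used on the given date."""
    return "daily-" + DAILY_TAPES[day.weekday()]

def overwritten_on(day):
    """Return the date on which the backup written on `day` is overwritten."""
    return day + timedelta(days=7)

monday = date(2024, 1, 1)            # 1 January 2024 was a Monday
tape = daily_tape(monday)
reuse_date = overwritten_on(monday)
```

Under this assumed schedule, data that exists only on a daily tape is available for at most one week before the tape is reused.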

Bandwidth: The amount of information or data that can be sent over a network connection in a given period of time. Bandwidth is usually stated in bits per second (bps), kilobits per second (kbps), or megabits per second (Mbps).
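Because bandwidth is stated in bits while file sizes are usually stated in bytes, estimating a transfer time requires a factor of eight. The following Python sketch shows the arithmetic; the function name is illustrative, and the result ignores protocol overhead and network congestion.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Idealized transfer time: convert bytes to bits, then divide by the rate."""
    return size_bytes * 8 / bits_per_second

# A 1,000,000-byte file over an 8,000,000 bps link.
seconds = transfer_seconds(1_000_000, 8_000_000)
```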

Batch file: Instructions defined within a file used to instruct a computer program to perform a function or series of functions.

Bates Number: Sequential numbering used to track documents and images in production data sets, where each page is identified by a unique production number. Often used in conjunction with a suffix or prefix to identify the producing party, the litigation, or other relevant information.

Bates Numbering: A process that is commonly used as an organizational method to label and identify legal documents. During the discovery phase of litigation, a large number of documents may necessitate the use of unique identifiers for each page of each document for reference and retrieval. Such "numbering" may be solely numeric or may contain a combination of letters and numbers (alphanumeric). There is no standard method for numbering documents.
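Since there is no standard numbering method, the following Python sketch shows just one possible scheme: a party prefix followed by a zero-padded sequential page number. The prefix "ABC" and the six-digit width are hypothetical choices made for the example.

```python
def bates_numbers(prefix, start, page_count, width=6):
    """Return one sequential, zero-padded identifier per page."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]

# Label a three-page document, starting at page number 1.
labels = bates_numbers("ABC", 1, 3)
```

The next document in the production would start at the number after the last one issued, so every page in the set keeps a unique identifier.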

Binary: Mathematical base 2, or numbers composed of a series of zeros and ones. Since zeros and ones can be easily represented by two voltage levels on an electronic device, the binary number system is widely used in digital computing.

Bit: A binary digit, taking a value of either 0 or 1. Binary digits are a basic unit of information storage and communication in digital computing.

Blu-ray Disc (also known as Blu-ray or BD): An optical disc storage medium. Its main uses are high-definition video and data storage. The disc has the same physical dimensions as standard DVDs and CDs.

Boolean Search: The term "Boolean" refers to a system of logic developed by the 19th-century mathematician George Boole. In Boolean searching, an "and" operator between two words results in a search for documents containing both of the words. An "or" operator between two words creates a search for documents containing either of the target words. A "not" operator between two words creates a search result containing the first word but excluding the second.
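The three operators map naturally onto set operations. The Python sketch below runs them over a tiny, made-up document collection; the documents and the helper name `containing` are assumptions for illustration, and real search engines use indexes rather than scanning every document.

```python
# Hypothetical three-document collection keyed by document ID.
docs = {
    1: "contract signed by smith",
    2: "smith sent the invoice",
    3: "invoice for the contract",
}

def containing(word):
    """Return the IDs of documents that contain the word."""
    return {doc_id for doc_id, text in docs.items() if word in text.lower().split()}

both = containing("contract") & containing("smith")        # contract AND smith
either = containing("contract") | containing("smith")      # contract OR smith
first_only = containing("contract") - containing("smith")  # contract NOT smith
```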

Burn: Slang for making (burning) a CD, DVD or Blu-ray Disc copy of data, whether it is music, software, or other data.

Business Risk Management: A structured approach to managing uncertainty related to a threat, through a sequence of human activities including risk assessment, development of strategies to manage the risk, and mitigation of the risk using managerial resources.

Byte: A basic unit of measurement of information storage in computer science. In many computer architectures it is a unit of memory addressing. There is no standard but a byte most often consists of eight bits.
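The Binary, Bit and Byte entries can be tied together with a short Python sketch that writes a number as eight binary digits and reads it back. It assumes the common eight-bit byte mentioned above.

```python
value = 65                       # a decimal number
bits = format(value, "08b")      # the same number as eight binary digits (one byte)
back = int(bits, 2)              # parse the digit string as base 2

# Each bit has two possible values, so eight bits give 2**8 combinations.
byte_values = 2 ** 8
```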


CAB File: (.cab) The Microsoft Windows native compressed archive format.

Cache: A form of high-speed memory used to temporarily store frequently accessed information; once the information is stored, it can be retrieved quickly from memory rather than from the hard drive.

Case De-Duplication: Retains only single copies of documents per case. For example, if an identical document resides with Mr. A, Mr. B and Mr. C, only the first occurrence of the file will be saved (Mr. A's). Contrast with custodian de-duplication and production de-duplication.

CD-ROM: Data storage medium that uses compact discs to store roughly 650 to 700 MB of data, the equivalent of several hundred 1.44 MB floppy disks.

Chain of Custody: Refers to the chronological documentation, and/or paper trail, showing the seizure, custody, control, transfer, analysis, and disposition of evidence, physical or electronic. Because evidence can be used in court to convict persons of crimes, it must be handled in a scrupulously careful manner to avoid later allegations of tampering or misconduct which can compromise the case.

Chain of Custody Procedure: Procedure that specifies how evidence is to be moved from location to location to preserve its integrity and prove to the court that the evidence has not been altered.

Civil Law: The branch of law, as opposed to criminal law, dealing with disputes between individuals and/or organizations, in which compensation may be awarded to the victim. For instance, if a car crash victim claims damages against the driver for loss or injury sustained in an accident, this will be a civil law case.

Claw-back Agreement: An agreement that sets forth procedures to protect against waiver of privilege due to inadvertent production of documents or data.

Client: An application or system that accesses a remote service on another computer system, known as a server, by way of a network.

Client Server: A client-server application is a distributed system comprising both client and server software. A client software process may initiate a communication session, while the server waits for requests from any client.

Cloud Computing: Internet ("cloud") based development and use of computer technology ("computing").

Cluster: In operating systems that use a file allocation table (FAT) architecture, the smallest unit of storage space required for data written to a drive. Also called an allocation unit.

Coding (Document Coding or Indexing): The manual extraction of key data/information from a document collection used for discovery, such as the author, recipients, copyees, document type, document date, and document characteristics. Coding for standard "bibliographic" fields is now commonly outsourced to firms where labor costs are lower than in the countries that generate the litigation in the first place. Coding of paper documents, however, will not go away until the pen is completely replaced by the computer.

Compression: A technology that reduces the size of a file. Compression programs are valuable to network users because they help save both time and bandwidth.

Computer Forensics: The use of specialized techniques for recovery, authentication, and analysis of electronic data when a case involves issues relating to reconstruction of computer usage, examination of residual data, and authentication of data by technical analysis or explanation of technical features of data and computer usage. Computer forensics requires specialized expertise that goes beyond normal data collection and preservation techniques available to end-users or system support personnel.

Cookie: Small data files written to a user's hard drive by a web server. These files contain specific information that identifies users (e.g., passwords and lists of pages visited).

Compound document: A file that combines more than one document into one by embedding objects or linked data. Data may be from different applications. This document type is typically produced using word processing software and is a regular text document intermingled with non-text elements such as spreadsheets, pictures, digital videos, digital audio, and other multimedia features. It can also be used to collect several documents into one.

Compression: A technology for storing data in fewer bits. It makes data smaller, so less disk space is needed to represent the same information. Compression programs like WinZip and UNIX compress are valuable to network users because they save both time and bandwidth. Data compression is also widely used in backup utilities, spreadsheet applications, and database management systems.
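A minimal Python sketch of lossless compression, using the standard library's `zlib` module rather than the programs named above. The sample text is made up and deliberately repetitive, which is why it compresses well.

```python
import zlib

# Repetitive data compresses well; this sample text is hypothetical.
original = b"privileged and confidential " * 100

compressed = zlib.compress(original)      # fewer bytes, same information
restored = zlib.decompress(compressed)    # exact original bytes recovered

saved_bytes = len(original) - len(compressed)
```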

Computer: A machine that manipulates data according to a list of instructions. This includes, but is not limited to, network servers, desktops, laptops, notebook computers, employees’ home computers, mainframes, personal digital assistants (PDAs, such as Palm Pilot, Blackberry and other such handheld computing devices), digital cell phones, smart phones and pagers.

Computer Forensics: A branch of forensic science pertaining to legal evidence found in computers and digital storage mediums. Computer forensics is also known as digital forensics.

Computer Security: A branch of technology known as information security as applied to computers. The objective of computer security can include protection of information from theft or corruption, or the preservation of availability, as defined in the security policy.

Concept Search: Analyzing conceptual groups of words in a document to understand the true meaning, rather than searching only for a word (keyword).

Confidentiality: Has been defined by the International Organization for Standardization (ISO) as "ensuring that information is accessible only to those authorized to have access" and is one of the cornerstones of information security.

Container file: One file that contains multiple documents and document types. Requires decompression or ripping to process.

Contextual search: Searching surrounding text to analyze the context in which a word is used.

Corporate Investigations: Criminal, regulatory, securities and/or other investigations pertaining to the activities and/or electronically stored information of one or more corporations.

Cost Sharing/Shifting: Shifting the cost or a portion of the cost of production of inaccessible electronically stored documents to the requesting party.

Criminal Law: Criminal law, or penal law, is the body of rules with the potential for severe impositions as punishment for failure to comply. Criminal punishment, depending on the offense and jurisdiction, may include execution, loss of liberty, government supervision (parole or probation), or fines. There are some archetypal crimes, like murder, but the acts that are forbidden are not wholly consistent between different criminal codes, and even within a particular code lines may be blurred as civil infractions may give rise also to criminal consequences. Criminal law typically is enforced by the government, unlike the civil law, which may be enforced by private parties.

Culling: Removing a document prior to production or review; generally reduces the volume of data that is produced or reviewed.

Custodian: Person having administrative control of a document; for example, the data custodian of an email is the owner of the mailbox which contains the email.

Custodian De-Duplication: Culls a document if multiple copies of that document reside within the same custodian's data set. For example, if Mr. A and Mr. B each have a copy of a specific document, and Mr. C has two copies, the system will maintain one copy each for Mr. A, Mr. B, and Mr. C. Contrast with case de-duplication and production de-duplication.
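The difference between case and custodian de-duplication can be sketched in Python using the Mr. A, Mr. B and Mr. C example above. The record layout (custodian, content) and the function names are assumptions made for the illustration.

```python
# Mr. A and Mr. B each hold one copy of the memo; Mr. C holds two.
records = [("A", "memo"), ("B", "memo"), ("C", "memo"), ("C", "memo")]

def case_dedupe(items):
    """Keep only the first occurrence of each document across the whole case."""
    seen, kept = set(), []
    for custodian, content in items:
        if content not in seen:
            seen.add(content)
            kept.append((custodian, content))
    return kept

def custodian_dedupe(items):
    """Keep one copy of each document per custodian."""
    seen, kept = set(), []
    for custodian, content in items:
        if (custodian, content) not in seen:
            seen.add((custodian, content))
            kept.append((custodian, content))
    return kept

case_result = case_dedupe(records)
custodian_result = custodian_dedupe(records)
```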

Customer-Added Metadata: Data or work product created by a user while reviewing a document. For example: annotation text of a document or subjective coding information. Contrast with vendor-added metadata.


DAT: Digital Audio Tape. Used as a storage medium in some backup systems.

Data: Any information stored on a computer.

Database: A structured collection of records or data that is stored in a computer system.

Database Management System (DBMS): Computer software that manages databases. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way.

Data Collection: A term used to describe a process of preparing and collecting data.

Data Custodian: Person having administrative control of a document; for example, the data custodian of an email is the owner of the mailbox which contains the email.

Data Formats: The organization of information for display, storage, or printing. Data is maintained in certain common formats so that it can be used by various programs, which may only work with data in a particular format. This term is commonly used in the industry when asking another person about the state in which particular information exists. For example, "What format is it in, PDF or HTML?"

Data Hosting: A service provided for the storage and access of electronic data, images and metadata.

Data Mapping: The process of creating data element mappings between two distinct data models.

Data Migration: The process of transferring data between storage types, formats, or computer systems.

Data Mining: The process of extracting hidden patterns from data.

Data Set (or Dataset): A collection of data.

Deleted File: A file that has been removed or erased from a computer's file system.

De-Duplication: The process of identifying (or, for some vendors, actually removing) additional copies of identical documents in a document collection. There are three types of de-duplication: case, custodian, and production.

Digital Certificate: A means of providing heightened security for the access of a website or a specific document. Digital certificates are electronic records that contain keys used to decrypt information, especially information sent over a public network like the internet. Digital certificates must be applied for and granted by a Certificate Authority (CA).

Disaster Recovery: The process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.

Document: A bounded physical or digital representation of a body of information designed with the capacity (and usually intent) to communicate.

Document Metadata: Data stored within a document about the document. Often this data is not immediately viewable in the software application used to create/edit the document, but it can often be accessed via a "Properties" view. Contrast with file system metadata and email metadata. Most programs that create documents, including Microsoft SharePoint, Microsoft Word and other Microsoft Office products, save metadata with the document files. These metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made on the file. Other saved material, such as deleted text (saved in case of an undelete command), document comments and the like, is also commonly referred to as "metadata", and the inadvertent inclusion of this material in distributed files has sometimes led to undesirable disclosures.

De-Duplication: “De-Duping” is the process of comparing electronic records based on their characteristics and removing duplicate records from the data set. The process is based on a HASH* value computed for each record; records with identical hash values are treated as duplicates. *See HASH.
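A minimal Python sketch of hash-based de-duplication. This glossary does not name a particular hash algorithm, so SHA-256 is an assumption here, as are the file names and the function names.

```python
import hashlib

def content_hash(data):
    """Return a hex digest that is the same for byte-identical content."""
    return hashlib.sha256(data).hexdigest()   # SHA-256 is an assumed choice

def dedupe(files):
    """Keep the first file name seen for each distinct content hash."""
    first_seen = {}
    for name, data in files:
        first_seen.setdefault(content_hash(data), name)
    return list(first_seen.values())

# Hypothetical collection: b.txt is a byte-for-byte copy of a.txt.
unique = dedupe([("a.txt", b"x"), ("b.txt", b"x"), ("c.txt", b"y")])
```

Hashing only the file content means that two copies with different names or dates still count as duplicates; a tool that also hashes selected metadata fields would behave differently.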

Defendant: Any party who is required to answer the complaint of a plaintiff or pursuer in a civil lawsuit before a court, or any party who has been formally charged or accused of violating a criminal statute.

Deleted Data: Data that, in the past, existed on the computer as live data and which have been deleted by the computer system or end-user activity. Deleted data remains on storage media in whole or in part until it is overwritten by ongoing usage or “wiped” with a software program specifically designed to remove deleted data. Even after the data itself has been wiped, directory entries, pointers, or other metadata relating to the deleted data may remain on the computer.

Deleted file: A file with disk space that has been designated as available for reuse. The deleted file remains intact until it has been overwritten with a new file.

Deletion: The process whereby data is removed from active files and other data storage structures on computers and rendered inaccessible except using special data recovery tools designed to recover deleted data. Deletion occurs in several levels on modern computer systems:

    (a) File level deletion: Deletion on the file level renders the file inaccessible to the operating system and normal application programs and marks the space occupied by the file’s directory entry and contents as free space, available to reuse for data storage.

    (b) Record level deletion: Deletion on the record level occurs when a data structure, like a database table, contains multiple records; deletion at this level renders the record inaccessible to the database management system (DBMS) and usually marks the space occupied by the record as available for reuse by the DBMS, although in some cases the space is never reused until the database is compacted. Record level deletion is also characteristic of many e-mail systems.

    (c) Byte level deletion: Deletion at the byte level occurs when text or other information is deleted from the file content (such as the deletion of text from a word processing file); such deletion may render the deleted data inaccessible to the application intended to be used in processing the file, but may not actually remove the data from the file’s content until a process such as compaction or rewriting of the file causes the deleted data to be overwritten.
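File level deletion, item (a) above, can be illustrated with a toy Python model. The `ToyDisk` class is entirely hypothetical and far simpler than a real file system; it only shows that deleting a file frees its space without erasing the contents, which remain until the space is reused.

```python
class ToyDisk:
    """Hypothetical storage model: a directory that points at data blocks."""

    def __init__(self):
        self.blocks = {}       # block number -> stored bytes
        self.directory = {}    # file name -> block number
        self.free = set()      # block numbers available for reuse

    def write(self, name, data):
        # Reuse a freed block if one exists, otherwise take a new one.
        block = self.free.pop() if self.free else len(self.blocks)
        self.blocks[block] = data
        self.directory[name] = block

    def delete(self, name):
        # Remove the directory entry and mark the block free;
        # the bytes in the block are left untouched.
        self.free.add(self.directory.pop(name))

disk = ToyDisk()
disk.write("memo.doc", b"secret")
disk.delete("memo.doc")
still_on_disk = disk.blocks[0]       # contents survive the deletion
disk.write("new.doc", b"new")        # reuses block 0 and overwrites it
after_reuse = disk.blocks[0]
```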

Deposition: Witness testimony given under oath and recorded for use in court at a later date.

Desktop: Usually refers to an individual PC -- a user's desktop computer.

Digital: Storing information as a string of digits – namely “1”s and “0”s.

Digital Image: A representation of a two-dimensional image using ones and zeros (binary). Depending on whether or not the image resolution is fixed, it may be of vector or raster type. Without qualifications, the term "digital image" usually refers to raster images.

Directory, folder, catalog, or drawer: A virtual container within a digital file system, in which groups of files and other directories can be kept and organized.

Disaster Recovery Tape: Portable media used to store data that is not presently in use by an organization to free up space but still allow for disaster recovery. May also be called “Backup Tapes.”

Disc (disk): May be a floppy disk, or a hard disk. Either way, it is a magnetic storage medium on which data is digitally stored. May also refer to a CD-ROM or DVD.

Disc Mirroring: A method of protecting data from a catastrophic hard disk failure. As each file is stored on the hard disk, a "mirror" copy is made on a second hard disk or on a different part of the same disk.

Discovery: The pre-trial phase in a lawsuit in which each party through the law of civil procedure can request documents and other evidence from other parties or can compel the production of evidence by using a subpoena or through other discovery devices, such as requests for production of documents, and depositions. In other words, discovery includes (1) interrogatories; (2) motions or requests for production of documents; (3) requests for admissions; and (4) depositions.

Discovery Compliance: Complying with the federal, state and local regulations around discovery (e.g. Federal Rules of Civil Procedure).

Discovery Cost Distribution or Allocation: The distribution or allocation of the discovery costs incurred among multiple parties compelled to produce Hard Copy Documents and Electronically Stored Information.

Discovery Response: This is a response to a discovery request.

Discovery Response Plan: A reactive or proactive plan developed to guide the activities to be taken in response to a discovery request in addition to mitigating the cost and risk.

Discovery Response Strategy: A strategic plan developed to guide the response to a request for discovery in addition to mitigating the cost and risk.

Discovery Response Team: A team of individuals assembled to coordinate and execute a Discovery Response Plan. A discovery response team may include members from legal, IT, business management and other resources from within an organization, as well as legal consulting vendors and outside counsel.

Distributed Data: The information belonging to an organization which resides on portable media and non-local devices such as home computers, laptop computers, floppy disks, CD-ROMs, personal digital assistants (“PDAs”), wireless communication devices (e.g., Blackberry), zip drives, Internet repositories such as e-mail hosted by Internet service providers or portals, web pages, and the like. Distributed data also includes data held by third parties such as application service providers and business partners.

Document Imaging (Scanning): An information technology category for systems capable of replicating documents commonly used in business. Document Imaging Systems can take many forms including microfilm, on demand printers, facsimile machines, copiers, multifunction printers, document scanners, Computer Output Microfilm (COM) and archive writers. In the last 15 years Document Imaging has been used to describe software-based computer systems that capture, store and reprint images.

Due Diligence: A term used for a number of concepts involving either the performance of an investigation of a business or person, or the performance of an act with a certain standard of care. It can be a legal obligation, but the term will more commonly apply to voluntary investigations. A common example of due diligence in various industries is the process through which a potential acquirer evaluates a target company or its assets for acquisition.

DVD: A popular optical disc storage media format. Its main uses are video and data storage. Most DVDs are of the same dimensions as compact discs (CDs) but store more than six times as much data.


ECA: Early Case Assessment: Refers to estimating risk (cost of time and money) to prosecute or defend a legal case.

EDRM Metrics: The EDRM Metrics project is designed to provide a standard approach and generally accepted language for measuring the full range of electronic discovery activities. The Metrics project follows the electronic discovery process described in the Electronic Discovery Reference Model: identification, preservation, collection, processing, review, analysis and production. For each stage of the process, the Metrics project will offer guidelines for how to measure associated costs, time and volumes.

EDRM XML: The EDRM XML project is designed to provide a standard, generally accepted XML schema to facilitate the movement of electronically stored information (ESI) from one step of the electronic discovery process to the next, from one software program to the next and from one organization to the next.

Electronic Discovery or e-discovery: Refers to discovery in civil litigation which deals with information in electronic format, also referred to as Electronically Stored Information ("ESI"). In the legal context, electronic form is the representation of information as binary numbers. Electronic information is different from paper information because of its intangible form, volume, transience, and persistence. Also, electronic information is usually accompanied by metadata, which is never present in paper information unless manually coded.

Electronic Document: Any electronic media content (other than computer programs or system files) that is intended to be used in either an electronic form or as printed output.

Electronic Mail: Most commonly abbreviated email or e-mail, is a method of exchanging digital messages. E-mail systems are based on a store-and-forward model in which e-mail server computer systems accept, forward, deliver and store messages on behalf of users, who only need to connect to the e-mail infrastructure, typically an e-mail server, with a network-enabled device for the duration of message submission or retrieval. Originally, e-mail was always transmitted directly from one user's device to another's; nowadays this is rarely the case.

An electronic mail message consists of two components, the message header, and the message body, which is the email's content. The message header contains control information, including, minimally, an originator's email address and one or more recipient addresses. Usually additional information is added, such as a subject header field.
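The header and body split can be seen by parsing a raw message with Python's standard `email` package. The message text and the recipient address below are made up for the example.

```python
from email import message_from_string

# Hypothetical raw message: header fields, a blank line, then the body.
raw = (
    "From: jsmith@example.com\n"
    "To: counsel@example.com\n"
    "Subject: Draft agreement\n"
    "\n"
    "Please review the attached draft.\n"
)

msg = message_from_string(raw)

sender = msg["From"]                 # originator address from the header
recipient = msg["To"]                # recipient address from the header
subject = msg["Subject"]             # optional subject header field
body = msg.get_payload().strip()     # message body, i.e. the content
```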

Electronically Stored Information (ESI): Any type of information that can be stored electronically, including all current types of computer-based information as well as any that might occur as a result of future changes and technological developments. Examples of ESI include e-mail messages, word processing files, voice mail messages, databases, websites and wikis. For the purposes of the Federal Rules of Civil Procedure (FRCP), ESI is information created, manipulated, communicated, stored, and best utilized in digital form, requiring the use of computer hardware and software.

Email Address: Identifies a location to which e-mail messages can be delivered. An e-mail address on the modern Internet looks like, for example, jsmith@example.com, and is usually read as "jsmith at example dot com".

Email Archiving: A stand-alone IT application that works with an email server to help manage an organization’s email messages. It captures and preserves all email traffic flowing into and out of the email server so it can be accessed quickly at a later date from a centrally managed location.

Email Attachment: A computer file which is sent along with an e-mail message.

Email Metadata: Data stored in the email about the email. Often this data is not even viewable in the email client application used to create the email. The amount of email metadata available for a particular email varies greatly depending on the email system. Contrast with file system metadata and document metadata.

Email Spam or Junk Email: A subset of spam that involves nearly identical messages sent to numerous recipients by e-mail.

Embedded Metadata: Text, numbers, content, data or information that is directly or indirectly input into a Native File by a user and which is not typically visible to the user viewing the output or display of the Native File on screen or as a print-out.

Embedded Object/File: An electronic file contained within another electronic file.

Encryption: Technology that renders the contents of a file unintelligible to anyone not authorized to read it. Encryption is used to protect information as it moves from one computer to another, and is an increasingly common way of sending credit card numbers over the Internet when conducting e-commerce transactions.

Enterprise Content Management (ECM): The technologies, strategies, methods and tools used to capture, manage, store, preserve, and deliver content and documents related to an organization and its processes. ECM tools allow the management of an enterprise-level organization's information.

ESI Processing: Capturing an electronic data image or a representation of the image, generally in native format, entering it into a computer system and processing and/or manipulating it so that it can be exported into a review application.

Ethernet: A common way of networking PCs to create a LAN.

European Union Data Protection Directive 95/46/EC: Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data is a European Union directive legislating protection of data pertaining to individuals. It is an important component of EU privacy and human rights law. The directive was adopted in 1995 by the European Parliament and the Council of the European Union.

Extranet: A private network that uses Internet protocols, network connectivity, and possibly the public telecommunication system to securely share part of an organization's information or operations with suppliers, vendors, partners, customers or other businesses.


Federal Rules of Civil Procedure (FRCP): Rules governing civil procedure in United States district (federal) courts, that is, court procedures for civil suits.

Federal Rules of Evidence (FRE): Govern the admission of facts by which parties in the federal courts of the United States may prove their cases.

File: An element of data storage in a file system. A collection of data or information that has a name, called the filename. Almost all information stored in a computer must be in a file. There are many different types of files: data files, text files, program files, directory files, and so on.

File Extension: A tag of three or four letters, preceded by a period, which identifies a data file's format or the application used to create the file. File extensions can streamline the process of locating data. For example, if one is looking for incriminating pictures stored on a computer, one might begin with the .gif and .jpg files.

File Format: A particular way that information is encoded for storage in a computer file. Since a disk drive, or indeed any computer storage, can store only bits, the computer must have some way of converting information to 0s and 1s and vice-versa. There are different kinds of formats for different kinds of information. Within any format type, e.g., word processor documents, there will typically be several different formats.

File Sharing: One of the key benefits of a network is the ability to share files stored on the server among several users.

File Server: A computer attached to a network that has the primary purpose of providing a location for the shared storage of computer files (such as documents, sound files, photographs, movies, images, databases, etc.) that can be accessed by the workstations that are attached to the computer network.

File System: The system that an operating system or program uses to organize and keep track of files. For example, a hierarchical file system is one that uses directories to organize files into a tree structure. Types of file systems include file allocation table (FAT) and Windows® NT file system (NTFS).

File System Metadata: Data that can be obtained or extracted about a file from the file system storing the file. Contrast with document metadata and email metadata.

Filename: A special kind of string used to uniquely identify a file stored on the file system of a computer.

Filename Extension: In DOS and some other operating systems, one or several letters at the end of a filename. A suffix to the name of a computer file applied to indicate the encoding convention (file format) of its contents. Filename extensions usually follow a period (dot) and indicate the type of information stored in the file. For example, in the filename LETTER.DOC, the extension is DOC, which indicates that the file is a word processing file.
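
The LETTER.DOC example can be sketched in a few lines of Python. The helper name extension_of is hypothetical and is introduced only for illustration.

```python
from pathlib import PurePath

def extension_of(filename: str) -> str:
    """Return the extension without the leading dot, or '' if there is none."""
    return PurePath(filename).suffix.lstrip(".")

ext = extension_of("LETTER.DOC")        # the example from the definition
no_ext = extension_of("README")         # a filename without an extension
last_only = extension_of("backup.tar.gz")  # only the final suffix is returned
```

Note that an extension is only a naming convention: a file can be renamed with a different extension without changing its contents.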

Filtering: Electronic filtering of emails and files for privilege or by keyword, file type, or name. Filtering removes files that do not fit the search criteria and reduces the volume of data that requires further investigation.
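
A minimal sketch of filtering by keyword and file type follows. The Doc class, the filter_docs function and the sample documents are all hypothetical; real filtering tools apply far richer criteria.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    name: str
    text: str

def filter_docs(docs, keywords, extensions):
    """Keep documents whose extension is allowed and whose text contains any keyword."""
    keywords = [k.lower() for k in keywords]
    extensions = {e.lower() for e in extensions}
    kept = []
    for doc in docs:
        # Take the text after the last dot as the extension, if there is one.
        ext = doc.name.rsplit(".", 1)[-1].lower() if "." in doc.name else ""
        if ext in extensions and any(k in doc.text.lower() for k in keywords):
            kept.append(doc)
    return kept

docs = [
    Doc("memo.doc", "Merger terms attached."),
    Doc("photo.jpg", ""),
    Doc("notes.txt", "Lunch on Friday?"),
]
hits = filter_docs(docs, keywords=["merger"], extensions=["doc", "txt"])
```

Here only memo.doc survives: photo.jpg fails the file-type test and notes.txt contains no keyword.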

Firewall: An integrated collection of security measures designed to prevent unauthorized electronic access to a networked computer system. It is also a device or set of devices configured to permit, deny, encrypt, decrypt, or proxy all computer traffic between different security domains based upon a set of rules and other criteria.

Flash Drive: A portable USB storage device that can hold varying amounts of ESI.

Floppy: An increasingly rare storage medium consisting of a thin magnetic film disk housed in a protective sleeve.

Forensic Copy/Image: An exact bit-by-bit copy of the entire physical hard drive of a computer system, including slack and unallocated space.

Forensic Identification: The application of forensic science and technology to identify specific objects from the trace evidence often left on computer storage media, such as a hard drive.

Fragmented Data: Live data that has been broken up and stored in various locations on a single hard drive or disk.

FTP (File Transfer Protocol): An Internet protocol that enables you to transfer files between computers on the Internet.


GIF (Graphics Interchange Format): A compressed computer file format for pictures.

Gigabyte (GB): A measure of computer data storage capacity equal to one billion (1,000,000,000) bytes.

GUI (Graphical User Interface): A type of user interface that allows people to interact with electronic devices, such as computers, hand-held devices (MP3 players, portable media players or gaming devices), household appliances and office equipment, through graphical elements rather than typed commands. Common contemporary operating systems that provide a GUI include Microsoft Windows, Mac OS, Linux, BSD and Solaris.


Hard disk: A data storage device that may be installed inside a desktop or laptop computer, as an internal hard drive, or attached to a desktop or laptop as a transportable external unit.

Hard drive: The primary storage unit on PCs, consisting of one or more magnetic media platters on which digital data can be written and erased magnetically.

Hart-Scott-Rodino Act: The Act provides that before certain mergers, tender offers or other acquisition transactions can close, both parties must file a "Notification and Report Form" with the Federal Trade Commission and the Assistant Attorney General in charge of the Antitrust Division of the Department of Justice.

Hash Algorithms: Any well-defined procedure or mathematical function which converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index into an array. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes. Common hash algorithms include MD5 (Message-Digest algorithm 5) and SHA (Secure Hash Algorithm).

    MD5: A widely used cryptographic hash function with a 128-bit hash value. MD5 has been employed in a wide variety of security applications, and is also commonly used to check the integrity of files.

    SHA-1: A cryptographic hash function designed by the National Security Agency (NSA) and published by the NIST as a U.S. Federal Information Processing Standard.

    SHA-2: A set of cryptographic hash functions (SHA-224, SHA-256, SHA-384, SHA-512) designed by the National Security Agency (NSA) and published in 2001 by the NIST as a U.S. Federal Information Processing Standard. SHA stands for Secure Hash Algorithm. SHA-2 includes a significant number of changes from its predecessor, SHA-1. SHA-2 consists of a set of four hash functions with digest sizes of 224, 256, 384 and 512 bits respectively.

    SHA-3: A hash function chosen through a public NIST hash function competition, which ended in 2012 with the selection of the Keccak algorithm; NIST published it as a U.S. Federal Information Processing Standard in 2015. SHA-3 is not derived from SHA-2.
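
As a brief sketch, the hash values and digest sizes described above can be computed with Python's standard hashlib module. The input bytes are a made-up stand-in for a file's contents.

```python
import hashlib

data = b"example evidence file contents"  # hypothetical file contents

# Compute a hex digest with each algorithm named in this entry.
digests = {
    name: hashlib.new(name, data).hexdigest()
    for name in ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")
}

# Digest length in bits: each hexadecimal character encodes 4 bits.
bits = {name: len(hexdigest) * 4 for name, hexdigest in digests.items()}

# The same input always yields the same hash; a changed input yields a different one.
unchanged = hashlib.sha256(data).hexdigest()
altered = hashlib.sha256(data + b"!").hexdigest()
```

Comparing a stored hash value with a freshly computed one is the usual way such functions are used to check that a file has not been altered.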

HTML (Hypertext Markup Language): A language that uses tags to structure text into headings, paragraphs, lists and links. It tells a Web browser how to display text and images.


Image: In data recovery parlance, to image a hard drive is to make an identical copy of the hard drive, including empty sectors. Akin to data cloning. Also known as creating a “mirror image” or “mirroring” the drive.

Image Copy: A "mirror image" bit-by-bit copy of a hard drive, i.e., a complete replication of the physical drive regardless of how the drive is organized or whether the image created contains meaningful data in whole or in part. From an imaged copy of a hard drive it is possible to reconstruct the entire contents and organization of the source drive from which it was taken.

Information Governance: The organizational structures and processes that ensure an accountability framework for use by IT that also support an organization’s legal objectives and strategies.

Information Management: Information management is the collection and management of information from one or more sources and the distribution of that information to one or more audiences. It is largely limited to files, file maintenance, and the life cycle management of paper and electronically based files, other media and records.

Input Device: Any object which allows a user to communicate with a computer by entering information or issuing commands (e.g. keyboard, mouse or joystick).

Instant Messaging (IM): A form of real-time communication between two or more people based on typed text. The text is conveyed via devices connected over a network such as the Internet.

Internet: A global public network formed by interconnecting smaller networks. The best-known internetwork is the Internet, the worldwide network of networks that uses the TCP/IP protocol suite to facilitate information exchange.

Intranet: A private computer network that uses Internet technologies to securely share any part of an organization's information or operational systems with its employees.

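IP address: A numeric label used to represent a computer on the Internet. In the widely used IPv4 form it is a string of four numbers, each from 0 to 255, separated by periods; the newer IPv6 form is longer and is written in hexadecimal.

IS / IT (Information Systems or Information Technology): Usually refers to the people who make computers and computer systems run.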
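
A short sketch using Python's standard ipaddress module shows the four-number form. The address below comes from a range reserved for documentation and is used purely as an example.

```python
import ipaddress

addr = ipaddress.ip_address("192.0.2.10")  # documentation-range example address

version = addr.version                                  # IP version of the address
octets = [int(part) for part in str(addr).split(".")]   # the four numbers
as_integer = int(addr)                                  # same address as one 32-bit number
```

The four numbers are simply a readable way of writing a single 32-bit value.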

ISO - International Organization for Standardization: An international-standard-setting body composed of representatives from various national standards organizations.

ISP (Internet Service Provider): A company that offers its customers access to the Internet.


JPEG (Joint Photographic Experts Group): An image compression standard for photographs.


Keyword Search: A search for documents containing one or more words that are specified by a user.

Kilobyte (KB): One thousand bytes of data, often written as 1K.


LAN (Local Area Network): Usually refers to a network of computers in a single building or other discrete location.

Latent Semantic Analysis (LSA): A technique in natural language processing, in particular in vectorial semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.

Latent Semantic Indexing (LSI): An indexing and retrieval method that uses a mathematical technique called Singular Value Decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is based on the principle that words that are used in the same contexts tend to have similar meanings. A key feature of LSI is its ability to extract the conceptual content of a body of text by establishing associations between those terms that occur in similar contexts.

Legacy Data: Information in the development of which an organization may have invested significant resources and which has retained its importance, but which has been created or stored by the use of software and/or hardware that has been rendered outmoded or obsolete.

Legal Brief: The research filed by an attorney or by a party to a court. The research points out the legal issue that is being raised, what the law or rule of law says about the issue, how the law should be applied, and the conclusion of the information provided. A legal brief is a written statement submitted in a trial or appellate proceeding that explains one party's side.

Legal Compliance: The process or procedure to ensure that an organization follows relevant laws, regulations and business rules. The definition of legal compliance, especially in the context of corporate legal departments, has recently been expanded to include understanding and adhering to ethical codes within entire professions as well. There are two requirements for an enterprise to be compliant with the law: first, its policies need to be consistent with the law; second, its policies need to be complete with respect to the law. The role of legal compliance has also been expanded to include self-monitoring of non-governed behavior within industries and corporations that could lead to workplace indiscretions. Within the LGRC realm, it is important to keep in mind that unless a strong legal governance component is in place, risk cannot be accurately assessed, nor can the monitoring of legal compliance be carried out efficiently. It is also important to realize that within the LGRC framework, legal teams work closely with executive teams and other business departments to align their goals and ensure proper communication.

Legal Consistency: Property that declares enterprise policies to be free of contradictions with the law. Legal Consistency has been defined as not having multiple verdicts for the same case.

Legal Inconsistency: Defined as having two rules that contradict each other. Other common definitions of consistency refer to “treating similar cases alike”. In the enterprise context, legal consistency refers to “obedience to the law”. In the context of legal requirements validation, legal consistency is defined as, "Enterprise requirements are legally consistent if they adhere to the legal requirements and include no contradictions."

Legal Completeness: A property that declares enterprise policies to cover all scenarios included or suggested by the law. Completeness suggests that there are no scenarios covered by the law that cannot be implemented in the enterprise. In addition, it implies that all scenarios not allowed by the law are not allowed by the enterprise.

Legal Document Management: The policies, procedures, planning and other activities around the storage and processing of documents that may be needed for legal matters.

Legal Governance: The establishment, execution and interpretation of processes and rules put in place by corporate legal departments in order to ensure a smoothly-run legal department and corporation.

Legal Governance, Risk Management, and Compliance or "LGRC": The complex set of processes, rules, tools and systems used by corporate legal departments to adopt, implement and monitor an integrated approach to business problems. While Governance, Risk Management, and Compliance refers to a generalized set of tools for managing a corporation or company, Legal GRC, or LGRC, refers to a specialized – but similar – set of tools utilized by attorneys, corporate legal departments, general counsel and law firms to govern themselves and their corporations, especially but not exclusively in relation to the law.

Legal Hold: A process which an organization uses to preserve all forms of relevant information when litigation is reasonably anticipated.

Legal Professional Privilege: Protects all communications between a professional legal adviser (a solicitor, barrister or attorney) and his or her clients from being disclosed without the permission of the client. The privilege is that of the client and not that of the lawyer.

Legal Risk Management: The process of evaluating alternative regulatory and non-regulatory responses to risk and selecting among them. Even within the legal realm, this process requires knowledge of the legal, economic and social factors, as well as knowledge of the business world in which legal teams operate. In an organizational setting, risk management refers to the process by which an organization sets the risk tolerance, identifies potential risks and prioritizes the tolerance for risk based on the organization’s business objectives, and manages and mitigates risks throughout the organization.

Litigation Management: The business activities around preparing for and/or responding to litigation.

Litigation Preparation: The strategic planning and/or activities around preparing for litigation.

Litigation Readiness Consulting: Consultative services to help guide an organization in preparation for litigation.

Litigation Response Consulting: Consultative services to help guide an organization in its response to litigation.

Litigation Support: Personnel or resources that help one or more organizations prepare for and respond to litigation or investigation.

Litigation Support Services: Services to support the preparation for and response to litigation or investigation.

LFP File: An ASCII delimited text file required for cross-reference of images to data.

Load File: A file that relates to a set of scanned images and indicates where individual pages belong together as documents. A load file may also contain data relevant to the individual documents, such as metadata, coded data and the like. Load files must be obtained and provided in prearranged formats to ensure transfer of accurate and usable images and data.

Login: (logging or signing in) The process by which individual access to a computer system is controlled by identification of the user using credentials provided by the user. A user can log in to a system to obtain access, and then log out when the access is no longer needed.

Lotus Notes: A client-server, collaborative application developed and sold by IBM Software Group. IBM defines the software as an "integrated desktop client option for accessing business e-mail, calendars and applications on [an] IBM Lotus Domino server."


Magnetic/Optical Storage Media: Includes, but is not limited to, hard drives (also known as "hard disks"), backup tapes, CD-ROMs, DVD-ROMs, Jaz and Zip drives, and floppy disks, all used singly or in combination in, or in conjunction with, your computers and any and all backup and archive systems for the same.

Magnetic Tape: See “Storage Media”

Mailbox: An area in memory or on a storage device where email is placed. In email systems, each user has a private mailbox. When the user receives email, the mail system automatically puts it in the mailbox. The mail system allows you to scan mail that is in your mailbox, copy it to a file, delete it, print it, or forward it to another user. The mailbox format used by Microsoft Exchange® email systems is PST, while Lotus Notes® uses NSF files.

Mail Server: A computer that acts as a Mail Transfer Agent (MTA) by running the appropriate software.

Mail Transfer Agent (MTA): A computer program or software agent that transfers electronic mail messages from one computer to another.

Meet and Confer - FRCP Rule 26(f): A conference required by FRCP Rule 26(f) at which the opposing sides of a lawsuit meet to plan discovery. The parties discuss the nature and basis of their claims and defenses, the possibility of promptly settling or resolving the case, the preservation of discoverable information (including ESI), and a proposed discovery plan. Each party, the plaintiff and the defendant, is usually represented at the conference by its own counsel or attorney.

Megabyte (MB): A million bytes of data is a megabyte, or simply a meg.

Memory: Internal storage areas in the computer. The term memory identifies data storage that comes in the form of chips, and the word storage is used for memory that exists on tapes or disks. Moreover, the term memory is usually used as a short-hand for physical memory, which refers to the actual chips capable of holding data. Some computers also use virtual memory, which expands physical memory onto a hard disk. See the definitions for two types of physical memory: RAM and ROM.

Metadata: (meta data, or sometimes meta-information) is "data about other data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, for example a database schema. In data processing, metadata is definitional data that provides information about or documentation of other data managed within an application or environment. The term should be used with caution as all data is about something, and is therefore metadata.

For example, metadata would document data about data elements or attributes, (name, size, data type, etc) and data about records or data structures (length, fields, columns, etc) and data about data (where it is located, how it is associated, ownership, etc.). Metadata may include descriptive information about the context, quality and condition, or characteristics of the data. It may be recorded with high or low granularity.

Microsoft Exchange Server: A messaging and collaborative software product developed by Microsoft. It is part of the Microsoft Servers line of server products and is widely used by enterprises using Microsoft infrastructure solutions. Exchange's major features consist of electronic mail, calendaring, contacts and tasks; support for mobile and web-based access to information; and support for data storage.

Microsoft SQL Server: A relational database management system (RDBMS) produced by Microsoft. Its primary query languages are MS-SQL and T-SQL.

Mirroring: The duplication of data for purposes of backup or to distribute network traffic among several computers with identical data.

MIS: Management information systems.

Modem: Hardware that lets a computer talk to another computer over a phone line.

Multipurpose Internet Mail Extensions (MIME): An Internet standard that extends the format of e-mail to support: Text in character sets other than ASCII, Non-text attachments, Message bodies with multiple parts, and Header information in non-ASCII character sets. MIME's use, however, has grown beyond describing the content of e-mail to describing content type in general, including for the web.
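
To illustrate, a multipart MIME message with non-ASCII text and a non-text attachment can be assembled with Python's standard email package. This is only a sketch; the addresses, subject, filename and attachment bytes are invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "jsmith@example.com"   # hypothetical addresses
msg["To"] = "adoe@example.com"
msg["Subject"] = "Résumé attached"   # header text outside ASCII

# Text body in a character set other than ASCII.
msg.set_content("Bonjour, voici le résumé demandé.")

# Non-text attachment; adding it turns the message into a multipart one.
msg.add_attachment(
    b"%PDF-1.4 placeholder bytes",
    maintype="application",
    subtype="pdf",
    filename="resume.pdf",
)

content_type = msg.get_content_type()
part_types = [part.get_content_type() for part in msg.iter_parts()]
attachment_names = [part.get_filename() for part in msg.iter_attachments()]
```

Each part carries its own content type, which is how a mail client knows to display the text and offer the PDF as a download.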


Network: A group of computers or devices that are connected together for the exchange of data and sharing of resources.

Node: Any device connected to a network. PCs, servers, and printers are all nodes on the network.

NSF: A file format created by Lotus for storing Lotus Notes e-mail messages and their attachments.

Native File: The source document, as collected from the source computer or server, before any conversion or processing of the document.

Native File Review: Reviewing ESI using the software used to create it originally. For example: using Microsoft Word in the review process to open/review a .DOC (MS Word Document format) file.

Near-line Storage: Typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually involved in moving media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.

Network: A group of interconnected computers. Networks may be classified according to a wide variety of characteristics. (e.g. Local Area Network (LAN), Wide Area Network (WAN), Metropolitan Area Network (MAN), Storage Area Network (SAN), peer-to-peer network, client-server network).

Network Operating System: Software which directs the overall activity of networked computers.

Near de-duplication: The elimination of electronic files with "near duplicate" similarities, e.g. a document that was sent to multiple custodians.

NIST-National Institute of Standards and Technology: A measurement standards laboratory which is a non-regulatory agency of the United States Department of Commerce. The institute's mission is to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve quality of life.


Object Linking and Embedding (OLE): A technology developed by Microsoft that allows embedding and linking to documents and other objects.

OCR - Optical Character Recognition: The mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text.

Offline: Not connected (to a network).

Off-line Storage: Requires some direct human action in order to make access to the storage media physically possible. This action is typically inserting a tape into a tape drive or plugging in a cable that allows a device to be accessed. Because the data is not accessible via any computer except during limited periods in which it is written or read back, it is largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media is on-site or off-site.

Off-site Data Protection: To protect against a disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault can be as simple as a system administrator’s home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker that has facilities for backup media storage. Importantly, a data replica can be off-site but also on-line (e.g., an off-site RAID mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.

Ongoing Preservation Obligation: Once an organization is served with a litigation notice, all future relevant electronic communication is also subject to the legal hold.

Online / Offline: In general, "online" indicates a state of connectivity, while "offline" indicates a disconnected state.

On-line Backup Storage: Typically the most accessible type of data storage, which can begin a restore within milliseconds. A good example would be an internal hard disk or a disk array (possibly connected to a SAN). This type of storage is very convenient and speedy, but is relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, either by accident, by intentional malevolent action, or in the wake of a data-deleting virus payload.

Onsite Discovery Management: Discovery management services performed at a client’s site(s). Examples: Consulting, Scanning, Coding, Electronic discovery and review services.

Operating System (OS or O/S): An interface between hardware and applications; it is responsible for the management and coordination of activities and the sharing of the limited resources of the computer.

Overwrite: To copy new data over existing data. Overwritten data cannot be retrieved.


Parent-child Relationships: Parent-child relationships is a term used in e-discovery to describe a chain of documents that stems from a single e-mail or storage folder. These types of relationships are primarily encountered when a party is faced with a discovery request for e-mail. A “child” (i.e., an attachment) is connected to or embedded in the “parent” (i.e., an e-mail or Zip file) directly above it.

Password: A secret word or string of characters that is used for authentication, to prove identity or gain access to a resource (Example: An access code is a type of password).

Personal Computer (PC): Any general-purpose computer whose original sales price, size, and capabilities make it useful for individuals, and which is intended to be operated directly by an end user, with no intervening computer operator.

PDA (Personal Digital Assistant): Also known as a palmtop computer, a mobile device which functions as a personal information manager and has the ability to connect to the Internet. The PDA has an electronic visual display enabling it to include a web browser, and some newer models also have audio capabilities, enabling them to be used as mobile phones or portable media players. Many PDAs can access the Internet, intranets or extranets via Wi-Fi or Wireless Wide Area Networks (WWANs).

PDF (Portable Document Format): A file format developed by Adobe Systems. PDF captures formatting information from a variety of desktop publishing applications, making it possible to send formatted documents and have them appear on the recipient's monitor or printer as they were intended. To view a file in PDF format, you need a PDF viewer such as Adobe Acrobat Reader, a free application distributed by Adobe Systems.

Petabyte (PB): A unit of information or computer storage equal to one quadrillion (10^15) bytes, or 1,000 terabytes. In binary usage the term sometimes denotes 1,024 terabytes (2^50 bytes).
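
The two figures in this definition belong to two different conventions, decimal and binary. A short worked calculation in Python, offered only as an illustration (the variable names are arbitrary), shows the gap between them.

```python
# Decimal (SI) convention: each step up is a factor of 1,000.
KB, MB, GB, TB, PB = (10 ** (3 * n) for n in range(1, 6))

# Binary convention: each step up is a factor of 1,024.
KIB, MIB, GIB, TIB, PIB = (2 ** (10 * n) for n in range(1, 6))

decimal_tb_per_pb = PB // TB    # terabytes per petabyte, decimal convention
binary_tb_per_pb = PIB // TIB   # terabytes per petabyte, binary convention

# How much larger the binary petabyte is than the decimal one, in percent.
difference_pct = (PIB - PB) / PB * 100
```

The gap grows with each unit step, so it matters more when estimating petabytes than kilobytes.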

Phishing: The criminally fraudulent process of attempting to acquire sensitive information such as usernames, passwords and credit card details by masquerading as a trustworthy entity in an electronic communication.

Plain Text: The least formatted and therefore most portable form of text for computerized documents.

Plaintiff: Also known as a claimant or complainant, is the party who initiates a lawsuit (also known as an action) before a court. By doing so, the plaintiff seeks a legal remedy, and if successful, the court will issue judgment in favor of the plaintiff and make the appropriate court order (e.g., an order for damages).

Pointer: A pointer is an index entry in the directory of a disk (or other storage medium) that identifies the space on the disk in which an electronic document or piece of electronic data resides, thereby preventing that space from being overwritten by other data. In most cases, when an electronic document is “deleted,” the pointer is deleted, which allows the document to be overwritten, but the document is not actually erased.

Precedent: A principle or rule established in an earlier case that a court or other judicial body adopts when deciding subsequent cases with similar issues or facts.

Preservation: The process of retaining and protecting all relevant evidence from destruction or deletion.

Privacy Law: The area of law concerned with the protection and preservation of the privacy rights of individuals. Increasingly, governments and other public as well as private organizations collect vast amounts of personal information about individuals for a variety of purposes. The law of privacy regulates the type of information which may be collected and how this information may be used.

Private Area Network: A network that may be connected to the Internet but is isolated from it, for example by a firewall.

Privilege: A special entitlement or immunity granted by a government or other authority to a restricted group, either by birth or on a conditional basis. Example: Attorney-client Privilege or Legal Professional Privilege.

Privilege Data Set: A set of documents that are deemed responsive or relevant but are withheld on the grounds of privilege (work product or attorney-client).

Production: The electronic delivery of ESI to a variety of recipients or for use in other systems.

Production De-Duplication: Culling of a document if multiple copies of that document reside within the same production set. For example, if two identical documents are both marked responsive, non-privileged, production de-duplication ensures that only one of those documents is produced. Contrast with case de-duplication and custodian de-duplication.
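
Production de-duplication can be sketched in a few lines. The example below is only an illustration: it assumes that two documents count as identical when their SHA-256 digests match, and the function name `production_dedupe` and the document IDs are hypothetical. Real processing tools may define "identical" differently.

```python
import hashlib


def production_dedupe(documents):
    """Keep the first copy of each distinct document in one production set.

    `documents` is a list of (doc_id, content_bytes) pairs. Identity is
    decided by SHA-256 digest, which is an assumption of this sketch.
    """
    seen_digests = set()
    produced = []
    for doc_id, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen_digests:
            continue  # duplicate within this production set: cull it
        seen_digests.add(digest)
        produced.append(doc_id)
    return produced


# Hypothetical production set: all three documents are responsive and non-privileged.
production_set = [
    ("DOC-001", b"Quarterly report, final version"),
    ("DOC-002", b"Meeting notes"),
    ("DOC-003", b"Quarterly report, final version"),  # same bytes as DOC-001
]
print(production_dedupe(production_set))  # ['DOC-001', 'DOC-002']
```

Because the set of digests is scoped to one call, the same document may still appear in a different production set, which is what distinguishes this from case-level de-duplication.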

Project Management: The discipline of planning, organizing and managing resources to bring about the successful completion of specific project goals and objectives.

Proximity Search: A search that looks for documents in which two or more separately matching term occurrences are within a specified distance of each other, where distance is the number of intermediate words or characters.
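
A minimal sketch of a word-based proximity test follows. It assumes distance is counted in intermediate words, that matching is case-insensitive on whole words, and that term order does not matter; the function name `within_distance` is hypothetical, and search products typically offer richer operators.

```python
import re


def within_distance(text, term_a, term_b, max_words):
    """Return True if term_a and term_b occur with at most `max_words`
    intermediate words between them, in either order."""
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(
        a != b and abs(a - b) - 1 <= max_words  # words strictly between the two hits
        for a in positions_a
        for b in positions_b
    )


sentence = "The contract was signed after the merger agreement closed"
print(within_distance(sentence, "contract", "merger", 4))  # True
print(within_distance(sentence, "contract", "merger", 3))  # False
```

Here "contract" and "merger" are separated by four intermediate words ("was signed after the"), so the search matches at a distance of 4 but not 3.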

PST File: A file format created by Microsoft to store Exchange and Outlook electronic mail.

Public Network: A network that is part of the public Internet.


Query Languages: Computer languages used to make queries into databases and information systems.

Quality Control: The process ensuring that products or services are designed and produced to meet or exceed customer requirements. These systems are often developed in conjunction with other business and engineering disciplines using a cross-functional approach.

Quick Peek: An arrangement in which ESI is made available to the opposing party before being reviewed for privilege, confidentiality or privacy. Strict guidelines are required to prevent waiver.


RAM (Random Access Memory): The working memory of the computer into which application programs can be loaded and executed.

Raw Data: Data collected at the source that has not been subjected to processing or any other manipulation; also known as primary data.

Record Retention Policy: Policy for setting procedures around managing the lifecycle of records, from creation to maintenance to disposition.

Record Retention Schedule: A formalized plan for the management of records, identifying how long records should be kept, when they should be archived and when they can be destroyed.
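
As a rough illustration, a retention schedule can be expressed as a lookup from record type to archive and destruction periods. Everything in the sketch below is hypothetical: the record types, the periods in `SCHEDULE`, and the function names. It also ignores any preservation obligations that would suspend destruction.

```python
from datetime import date

# Hypothetical schedule: record type -> (years until archive, years until destruction).
SCHEDULE = {
    "invoice": (2, 7),
    "email": (1, 3),
}


def add_years(start, years):
    """Return `start` shifted by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def disposition(record_type, created, today):
    """Return 'retain', 'archive' or 'destroy' for a record on a given day."""
    archive_after, destroy_after = SCHEDULE[record_type]
    if today >= add_years(created, destroy_after):
        return "destroy"
    if today >= add_years(created, archive_after):
        return "archive"
    return "retain"


print(disposition("invoice", date(2015, 1, 1), date(2018, 1, 1)))  # archive
print(disposition("email", date(2020, 6, 1), date(2020, 12, 1)))   # retain
```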

Records Management: Also abbreviated RM, the practice of identifying, classifying, archiving, preserving, and destroying records.

Relational Database Management System (RDBMS): A database management system (DBMS) in which data is stored in the form of tables and the relationship among the data is also stored in the form of tables.
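
The idea of storing both data and relationships in tables can be illustrated with SQLite through Python's standard `sqlite3` module. The table names, columns and rows below are hypothetical; the `custodian_id` column is what records the relationship between the two tables, and the SQL statement is an example of a query language in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE custodians (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        custodian_id INTEGER REFERENCES custodians(id)
    );
    INSERT INTO custodians VALUES (1, 'A. Smith'), (2, 'B. Jones');
    INSERT INTO documents VALUES
        (10, 'Budget memo', 1),
        (11, 'Merger email', 2),
        (12, 'Board minutes', 1);
""")

# Join the two tables to list every document held by one custodian.
rows = conn.execute(
    """
    SELECT d.title, c.name
    FROM documents AS d
    JOIN custodians AS c ON c.id = d.custodian_id
    WHERE c.name = ?
    ORDER BY d.id
    """,
    ("A. Smith",),
).fetchall()
conn.close()

print(rows)  # [('Budget memo', 'A. Smith'), ('Board minutes', 'A. Smith')]
```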

Repository: A library in which collections are stored in digital formats (as opposed to print, microform, or other media) and accessible by computers.[1] The digital content may be stored locally, or accessed remotely via computer networks. A digital library is a type of information retrieval system.

Repository Hosting: A device accessed through the Internet or an intranet on which electronic data, images and record metadata are stored.

Reprography: The reproduction of graphics through mechanical or electrical means, such as photography or xerography.

Residual Data: Sometimes referred to as “ambient data,” data that is not active on a computer system. Residual data includes (1) data found on media free space; (2) data found in file slack space; and (3) data within files that has functionally been deleted in that it is not visible using the application with which the file was created, without use of undelete or special data recovery techniques.

Review: Examination of potentially relevant data sets, paper or ESI, for relevancy, privilege and confidentiality in advance of production.

ROM (Read Only Memory): The hardware in a computer that can be read but not written to. ROM contains the programming that allows a computer to boot up each time the user turns it on, and it contains essential system programs that neither the user nor the computer can erase.

Router: A piece of hardware that routes data between networks, for example from a local area network (LAN) to an external connection such as a phone line.

Rule 16: Pretrial conference - Rule 16 may provide a party with an opportunity to discuss settlement without giving the appearance of having initiated the conversation.

Rule 26: In American law, discovery is the pre-trial phase in a lawsuit in which each party through the law of civil procedure can request documents and other evidence from other parties and can compel the production of evidence by using a subpoena or through other discovery devices, such as requests for production of documents, and depositions.

    This is the most substantial rule, which guides the discovery process.

    Subdivision (a) provides for automatic disclosure, which was first added in 1993. Disclosure requires parties to share their own supporting evidence without being requested to by the other party. Failure to do so can preclude that evidence from being used at trial. This applies only to evidence that supports their own case, not anything that would harm their case. For example, a plaintiff brings a case alleging a negligent accident where the defendant damaged the plaintiff's boat. The plaintiff would then be required to automatically disclose repair bills for his damaged property, since this would only support his case (26(a)(1)(c)).

    Subdivision (b) is the heart of the discovery rule, and defines what is discoverable and what is limited. Anything that is relevant is available for the other party to request, as long as it is not privileged or otherwise protected. Under §1, relevance is defined as anything more or less likely to prove a fact that affects the outcome of the claim. It does not have to be admissible in court as long as it could reasonably lead to admissible evidence.

    However, there are limits to discovery. §2 allows the court to alter the limits of discovery on the number of depositions, interrogatories, and document requests if it determines that the discovery sought is overly burdensome, redundant, unnecessary, or disproportionately difficult to produce with respect to the importance of the case or specific issue. Enshrined in §3, the work-product doctrine protects tangible (and some intangible) items created in anticipation of the litigation (e.g., a memorandum from an attorney outlining his strategy in the case). Protecting work product is considered in the interest of justice because discovery of such work product would expose an attorney's complete legal strategy before trial. §4 allows discovery of experts whose opinions may be presented at trial, but limits discovery of experts not likely to testify during trial. §5 generally prohibits the discovery of any material legally privileged (attorney-client, doctor-patient, etc.), and requires the production of a "privilege log" which describes the privileged information or material in a way that allows others to assess whether it is privileged, but does not divulge the privileged material.

    Subdivision (c) provides for protective orders.

    Subdivision (d) specifies the times at which parties may employ the various methods of discovery.

    Subdivision (e) provides for supplementation, which requires a person to correct any submitted information as necessary. For example, if you submit your medical records, and then your doctor calls you to say a crucial medical test just came in, you may be required to send that new report to the other party without being specifically requested to do so.

    Subdivision (f) provides a special meeting between the parties to organize their discovery process; this is a required step.

    Subdivision (g) is the good faith rule which provides sanctions to any party that makes a discovery request or response designed to thwart justice, cause undue delay, or harass the other party.

Rule 34: Producing Documents, Electronically Stored Information, and Tangible Things, or Entering onto Land, for Inspection and Other Purposes.

Rule 37 (FRCP): FRCP 37(e), formerly 37(f), provides a safe harbor when data is lost or overwritten in the normal course of business.

Rule 502 (FRE): Federal Rule of Evidence 502, enacted in 2008, is intended to reduce the risk of forfeiting the attorney-client privilege or work product protection.


Sampling: Usually (but not always) refers to the process of statistically testing a data set for the likelihood of relevant information. It can be a useful technique in addressing a number of issues relating to litigation, including decisions as to which repositories of data should be preserved and reviewed in a particular litigation, and determinations of the validity and effectiveness of searches or other data extraction procedures. Sampling can be useful in providing information to the court about the relative cost burden versus benefit of requiring a party to review certain electronic records.
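
A minimal sketch of statistical sampling is shown below. It assumes a simple random sample and a normal-approximation 95% confidence interval, which is one common choice rather than a prescribed method; `estimate_relevance_rate` and the toy population are hypothetical.

```python
import math
import random


def estimate_relevance_rate(population, is_relevant, sample_size, seed=0):
    """Estimate the share of relevant items from a simple random sample.

    Returns (estimate, lower, upper), where lower/upper bound an approximate
    95% confidence interval based on the normal approximation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    hits = sum(1 for item in sample if is_relevant(item))
    p = hits / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Toy population of 10,000 document IDs in which every tenth one is "relevant".
documents = list(range(10_000))
estimate, low, high = estimate_relevance_rate(
    documents, lambda doc_id: doc_id % 10 == 0, sample_size=400
)
print(f"estimated relevance rate: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Reviewing 400 sampled documents instead of all 10,000 is the kind of cost-versus-benefit information the definition describes; a larger sample narrows the interval.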

Sanctions: Penalties or other means of enforcement used to provide incentives for obedience to the law, or to rules and regulations. Criminal sanctions can take the form of serious punishment, such as corporal or capital punishment, incarceration, or severe fines. Within the civil law context, sanctions are usually monetary fines, levied against a party to a lawsuit or his/her attorney, for violating rules of procedure, or for abusing the judicial process. The most severe sanction in a civil lawsuit is the involuntary dismissal, with prejudice, of a complaining party's cause of action, or of the responding party's answer. This has the effect of deciding the entire action against the sanctioned party without recourse, except to the degree that an appeal or trial de novo may be allowed because of reversible error. In the United States federal court system, certain types of conduct are sanctionable under Rule 11 of the Federal Rules of Civil Procedure.

Sandbox: A network or series of networks that are not connected to other networks.

SAS 70 (Statement on Auditing Standards No. 70): An auditing statement on service organizations issued by the Auditing Standards Board of the American Institute of Certified Public Accountants (AICPA), officially titled “Reports on the Processing of Transactions by Service Organizations”.

Second Request: A discovery procedure by which the Federal Trade Commission and the Antitrust Division of the Justice Department investigate mergers and acquisitions which may have anticompetitive consequences. Under the Hart-Scott-Rodino Antitrust Improvements Act, before certain mergers, tender offers or other acquisition transactions can close, both parties to the deal must file a "Notification and Report Form" with the Federal Trade Commission (FTC) and the Assistant Attorney General in charge of the Antitrust Division.

If either the FTC or the Antitrust Division has reason to believe the merger will impede competition in a relevant market, they may request more information by way of "Request for Additional Information and Documentary Materials", more commonly referred to as a "Second Request".

Secure Data Hosting: A service provided for the secure storage and access of electronic data, images and metadata.

Secure Sockets Layer (SSL): Cryptographic protocols that provide security and data integrity for communications over TCP/IP networks such as the Internet.

Server: Any computer on a network that contains data or applications shared by users of the network on their client PCs.

Service Level Agreement (SLA): A part of a service contract where the level of service is formally defined. In practice, the term SLA is sometimes used to refer to the contracted delivery time (of the service) or performance.

Settlement: An agreement reached by the parties to a dispute (whether the dispute is being litigated before the courts or court action has not been started) that resolves the case, which is said to 'settle' the claim.

Simple Mail Transfer Protocol (SMTP): An Internet standard for electronic mail (e-mail) transmission across Internet Protocol (IP) networks.

Slack: The unused space, in bytes, that results when files are allocated space in whole clusters: the allocated space minus the actual size of the files. Also described as the data fragments stored randomly on a hard drive during the normal operation of a computer, or the residual data left on the hard drive after new data has overwritten some of the previously stored data.
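
The arithmetic behind file slack can be shown with a short worked example. The 4,096-byte cluster size is an assumption for illustration (actual cluster sizes depend on the file system and volume), and `slack_bytes` is a hypothetical helper.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Return the unused bytes in the last cluster allocated to a file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters = -(-file_size // cluster_size)  # ceiling division
    allocated = clusters * cluster_size
    return allocated - file_size


# A 10,000-byte file needs 3 clusters (12,288 bytes), leaving 2,288 bytes of slack.
print(slack_bytes(10_000))  # 2288
print(slack_bytes(4_096))   # 0
```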

Smartphone: A mobile phone offering advanced capabilities, often with PC-like functionality (PC-mobile handset convergence). There is no industry standard definition of a smartphone. For some, a smartphone is a phone that runs complete operating system software providing a standardized interface and platform for application developers. For others, a smartphone is simply a phone with features considered advanced at the time of its release - e.g., in the early 2000s this was features such as e-mail and Internet, but now these are commonplace on non-smartphones. Other definitions might include features such as e-book reader capabilities, Wi-Fi, and/or a built-in full keyboard or external USB keyboard and VGA connector. In other words, it is a miniature computer that has phone capability.

Smartphones and Personal Digital Assistants: Many corporate employees have company-issued PDAs or smartphones that may contain discoverable material. This can raise issues for collection, as the tools for acquiring data from such devices are difficult to find.

Software: A general term used to describe a collection of computer programs, procedures and documentation that perform some tasks on a computer system.

Software Application: Any tool that functions and is operated by means of a computer, with the purpose of supporting or improving the software user's work. In other words, it is the subclass of computer software that employs the capabilities of a computer directly and thoroughly to a task that the user wishes to perform.

Spoliation: The intentional or negligent withholding, hiding, alteration or destruction of evidence relevant to a legal proceeding. It is a criminal act in the United States under Federal and most State law.

Stand Alone Computer: A personal computer that is not connected to any other computer or network, except possibly through a modem.

State Court: A court that has jurisdiction over disputes with some connection to a U.S. state, as opposed to the federal government. State courts handle the vast majority of civil and criminal cases in the United States with minimal federal court supervision.

Storage Device: Any device that a computer uses to store information.

Storage Media: Any removable device that stores data, such as magnetic or optical storage media. Examples are listed below:

    Magnetic Tape: Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity/price ratio when compared to hard disk, but recently the ratios for tape and hard disk have become a lot closer.[5] There are myriad formats, many of which are proprietary or specific to certain markets like mainframes or a particular brand of personal computer. Tape is a sequential access medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast. Some new tape drives are even faster than modern hard disks. A principal advantage of tape is that it has been used for this purpose for decades (much longer than any alternative) and its characteristics are well understood.

    Hard Disk: The capacity/price ratio of hard disk has been rapidly improving for many years. This is making it more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.[6] External disks can be connected via local interfaces like SCSI, USB, FireWire, or eSATA, or via longer distance technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, such as Virtual Tape Libraries, support data deduplication which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data. The main disadvantages of hard disk backups are that they are easily damaged, especially while being transported (e.g., for off-site backups), and that their stability over periods of years is a relative unknown.

    Optical Disc: A recordable CD can be used as a backup device. One advantage of CDs is that they can in theory be restored on any machine with a CD-ROM drive. (In practice, writable CD-ROMs are not always universally readable.) In addition, recordable CDs are relatively cheap. Another common format is recordable DVD. Many optical disc formats are WORM type, which makes them useful for archival purposes since the data can't be changed. Rewritable formats such as CD-RW or DVD-RAM can also be used. The newer HD-DVDs and Blu-ray Discs dramatically increase the amount of data possible on a single optical storage disc, though, as yet, the media may be cost prohibitive for many people. Additionally, the physical lifetime of the optical disc has become a concern, as it is possible for some optical discs to degrade and lose data within a couple of years.

    Floppy Disk: During the 1980s and early 1990s, many personal/home computer users associated backup mostly with copying floppy disks. The low data capacity of a floppy disk makes it an unpopular and obsolete choice today.

    Solid State Storage: Also known as flash memory, thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, etc., these devices are relatively costly for their low capacity, but offer excellent portability and ease-of-use.

    Remote Backup Service: As broadband internet access becomes more widespread, remote backup services are gaining in popularity. Backing up via the internet to a remote location can protect against some worst-case scenarios such as fires, floods, or earthquakes which would destroy any backups in the immediate vicinity along with everything else. There are, however, a number of drawbacks to remote backup services. First, internet connections (particularly domestic broadband connections) are generally substantially slower than the speed of local data storage devices, which can be a problem for people who generate or modify large amounts of data. Secondly, users need to trust a third party service provider with both privacy and integrity of backed up data. The risk associated with putting control of personal or sensitive data in the hands of a third party can be managed by encrypting sensitive data so that its contents cannot be viewed without access to the secret key. Ultimately the backup service must itself be using one of the above methods, so this could be seen as a more complex way of doing traditional backups.

Subpoena: Commonly defined as a written command to a person to testify before a court or be punished.

Structured Data: Data that has a structured format, such as a database.

Structured Storage: Also known as COM structured storage or OLE structured storage, a technology developed by Microsoft as part of its Windows operating system for storing hierarchical data within a single file.

System Administrator: Also called a sys admin or sysop, a person employed to maintain and operate a computer system and/or network. System administrators may be members of an information technology department.


Tape Drive: A hardware device used to store data on a magnetic tape. Tape drives are usually used to back up large quantities of data due to their large capacity and low cost relative to other data storage options.

Terabyte (TB): A measurement term for data storage capacity. The value of a terabyte based upon a decimal radix (base 10) is defined as one trillion (short scale) bytes, or 1000 gigabytes.

Text Messaging: The exchange of brief written messages between mobile and portable devices over cellular networks. While the original term was derived from referring to messages sent using the Short Message Service (SMS), it has since been extended to include messages containing image, video, and sound content (known as MMS messages). The sender of a text message is known as a texter, while the service itself has different colloquialisms depending on the region: it may simply be referred to as a text or a texto in North America, an SMS in the United Kingdom and most of Europe, and a TMS in the Middle East, Asia and Australia.

TIFF (Tagged Image File Format): One of the most widely supported file formats for storing bit-mapped images. Files in TIFF format often end with a .tif or .tiff extension.

Transport Layer Security (TLS): Cryptographic protocols that provide security and data integrity for communications over TCP/IP networks such as the Internet. TLS’ predecessor is Secure Sockets Layer (SSL).

Transmission Control Protocol/Internet Protocol (TCP/IP): A collection of protocols that define the basic workings of the features of the Internet.


Unicode: A computing industry standard allowing computers to consistently represent and manipulate text expressed in most of the world's writing systems. Developed in tandem with the Universal Character Set standard and published in book form as The Unicode Standard, Unicode consists of a repertoire of more than 100,000 characters, a set of code charts for visual reference, an encoding methodology and set of standard character encodings, an enumeration of character properties such as upper and lower case, a set of reference data computer files, and a number of related items, such as rules for normalization, decomposition, collation, rendering and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as Arabic or Hebrew, and left-to-right scripts).
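
Several of these pieces (character properties, standard encodings, normalization and bidirectional classes) can be inspected with Python's standard `unicodedata` module. The sample characters are chosen only for illustration.

```python
import unicodedata

e_acute = "\u00e9"  # precomposed "e with acute accent"

# Character properties: official name and general category (Ll = lowercase letter).
print(unicodedata.name(e_acute))      # LATIN SMALL LETTER E WITH ACUTE
print(unicodedata.category(e_acute))  # Ll

# One standard encoding: UTF-8 stores this character as two bytes.
print(e_acute.encode("utf-8"))  # b'\xc3\xa9'

# Decomposition and normalization: NFD splits the character into a base
# letter plus a combining accent, and NFC recombines them.
decomposed = unicodedata.normalize("NFD", e_acute)
print([f"U+{ord(ch):04X}" for ch in decomposed])          # ['U+0065', 'U+0301']
print(unicodedata.normalize("NFC", decomposed) == e_acute)  # True

# Bidirectional class: Hebrew alef is right-to-left, Latin "A" is left-to-right.
print(unicodedata.bidirectional("\u05d0"))  # R
print(unicodedata.bidirectional("A"))       # L
```

Normalization matters when searching or de-duplicating text, because the precomposed and decomposed forms look identical on screen but compare as different strings unless both are normalized first.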

United States Department of Justice: Often referred to as the Justice Department or DOJ, the United States federal executive department responsible for the enforcement of the law and administration of justice, equivalent to the justice or interior ministries of other countries. The Department is led by the Attorney General, who is nominated by the President and confirmed by the Senate and is a member of the Cabinet.

United States District Courts: The general trial courts of the United States federal court system. Both civil and criminal cases are filed in the district court, which is a court of law, equity, and admiralty.

Unitization: The assembly of individual pages into documents:

    Physical unitization utilizes actual objects such as staples, paper clips and folders to determine pages that belong together as documents for archival and retrieval purposes.

    Logical unitization is the process of human review of each individual page in a collection using logical cues to determine pages that belong together as documents. Such cues can be consecutive page numbering, report titles, similar headers and footers and other logical cues.
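
Logical unitization is described above as a human review process, but the cues it relies on can be illustrated with a simple heuristic. The sketch below assumes each page has already been reduced to a (header text, printed page number) pair and starts a new document whenever the header changes or the numbering stops being consecutive; the function name `unitize` and the sample pages are hypothetical.

```python
def unitize(pages):
    """Group page indexes into documents using header and page-number cues.

    `pages` is a list of (header, printed_page_number) pairs in scan order.
    Returns a list of documents, each a list of page indexes.
    """
    documents = []
    for index, (header, number) in enumerate(pages):
        if documents:
            prev_header, prev_number = pages[index - 1]
            # Same header and consecutive numbering: the page continues the document.
            if header == prev_header and number == prev_number + 1:
                documents[-1].append(index)
                continue
        documents.append([index])  # otherwise a new document starts here
    return documents


scanned_pages = [
    ("Annual Report", 1),
    ("Annual Report", 2),
    ("Annual Report", 3),
    ("Memo", 1),
    ("Memo", 2),
    ("Invoice", 1),
]
print(unitize(scanned_pages))  # [[0, 1, 2], [3, 4], [5]]
```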

US-EU Safe Harbor: A streamlined process for US companies to comply with the EU Directive 95/46/EC on the protection of personal data. Intended for organizations within the EU or US that store customer data, the Safe Harbor Principles are designed to prevent accidental information disclosure or loss. US companies can opt into the program as long as they adhere to the 7 principles outlined in the Directive.

User: A person who uses a computer or Internet service. A user may have a user account that identifies the user by a username (also user name) or screen name.

Unstructured Data: Also called unstructured information; refers to (usually) computerized information that either does not have a data model or has one that is not easily usable by a computer program.


Vendor-Added Metadata: Data created and maintained by the electronic discovery vendor as a result of processing the document. While some vendor-added metadata has direct value to customers, much of it is used for process reporting, chain of custody, and data accountability. Contrast with customer-added metadata.

VPN (Virtual Private Network): A computer network in which some of the links between nodes are carried by open connections or virtual circuits in some larger network (e.g., the Internet) as opposed to running across a single private network.


Web Browser: A software application which enables a user to display and interact with text, images, videos, music, games and other information typically located on a Web page at a Web site on the World Wide Web or a local area network.

Web Page or Webpage: A resource of information that is suitable for the World Wide Web and can be accessed through a web browser. This information is usually in HTML or XHTML format, and may provide navigation to other web pages via hypertext links.

Wireless Communication: The transfer of information over a distance without the use of electrical conductors or "wires".

World Wide Web (WWW): Commonly abbreviated as "the Web", a system of interlinked hypertext documents accessed via the Internet.

World Wide Web Base Repository: A device accessed through the Internet on which electronic data, images and record metadata are stored.


XML (Extensible Markup Language): A general-purpose specification for creating custom markup languages. It is classified as an extensible language, because it allows the user to define the mark-up elements.
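
To illustrate user-defined elements, the sketch below parses a small XML fragment with Python's standard `xml.etree.ElementTree` module. The element names (`production`, `document`, `custodian`, `privileged`) are invented for this example and do not follow any particular e-discovery load file format.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element name here was made up by the "user".
xml_text = """
<production>
  <document id="DOC-001">
    <custodian>A. Smith</custodian>
    <privileged>false</privileged>
  </document>
  <document id="DOC-002">
    <custodian>B. Jones</custodian>
    <privileged>true</privileged>
  </document>
</production>
""".strip()

root = ET.fromstring(xml_text)

# Collect the IDs of documents that are not marked privileged.
producible = [
    doc.get("id")
    for doc in root.iter("document")
    if doc.findtext("privileged") == "false"
]
print(producible)  # ['DOC-001']
```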


Zubulake: Five landmark decisions on e-discovery addressing when to shift the cost of electronic discovery to the requesting party; when a company needs to begin preserving electronic evidence and what electronic evidence must be preserved; and what steps must be taken to preserve electronic evidence and the consequences of failing to preserve it adequately.