Ten Things You Should Know About Document Archiving
Archiving is different from backing up. The objective of backing up data is to help recover from data-loss disasters. A backup is secondary data: a copy of the primary data in active use. An archive contains primary data that have typically been retired from active use and are kept for historical, compliance or other purposes.
1. It has been estimated that up to 80% of data stored on expensive, fast-access, on-line storage media is accessed infrequently and has no mission-critical significance.
2. Data that are infrequently used can be archived to secondary media. They are still primary data that might have to be accessed occasionally, but they are stored on near-line storage devices to reduce costs. Such archiving can also improve system performance, because too much on-line data can slow systems down.
3. Archiving of little-used data can be automated. For example, the system can detect data and programs that have not been used for a certain period and move them to secondary storage, such as a near-line robotic storage device. The device can retrieve selected data and programs and put them back on-line in a matter of seconds when needed.
4. Files are often concatenated and compressed when data is archived. Such measures reduce storage-space requirements and costs.
5. Archived data can also be encrypted so that only those who hold the decryption key can access it. This safeguards the confidentiality of the data and is particularly important where the archives are stored on third-party devices, such as Web servers.
6. Error-detection algorithms, such as checksums and hashes, are typically used to verify that the archived data is the same as the original data. Without such checks, corruption of the archived data can go unnoticed and its integrity cannot be relied on.
7. Production data is archived primarily to save storage costs and improve system performance. Non-production data is archived for other purposes, e.g. to comply with regulatory requirements, to be available for electronic discovery and to preserve company history.
8. Archiving faces the problem of data becoming unreadable owing to progress in technology. Storage media are getting more and more compact and might use file formats that are not compatible with earlier formats, making data stored under those formats unreadable. The software that created particular data might also go through version changes, and data created with older versions might become unreadable.
9. To cope with the readability problem, data might be periodically copied to new media, or emulation programs might be developed to enable current systems to simulate older environments. Data can also be stored in XML files that contain language definitions, making it easier to decode the contents.
10. Archived data can also be tagged with a retire-by date and a retention policy. Document management programs can then be set up to remove documents that have expired. Unless data is being kept as a historical record, it can be disposed of after a specified number of years.
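Points 3, 4 and 6 can be illustrated together with a minimal Python sketch. It is only one possible arrangement, not a description of any particular archiving product. The function names (`find_stale_files`, `archive_files`, `verify_archive`) are hypothetical, the last-modified time is assumed to be an acceptable stand-in for "last used", gzip-compressed tar is assumed as the archive format, and SHA-256 is assumed as the hash.

```python
import hashlib
import tarfile
import time
from pathlib import Path
from typing import Dict, List, Optional


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def find_stale_files(root: Path, max_idle_days: float,
                     now: Optional[float] = None) -> List[Path]:
    """Point 3: list files under root not modified within max_idle_days.

    Assumption: modification time approximates "last used"; access times
    are often disabled or unreliable on real file systems.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.stat().st_mtime < cutoff)


def archive_files(files: List[Path], root: Path,
                  archive_path: Path) -> Dict[str, str]:
    """Point 4: concatenate files into one gzip-compressed tar archive.

    Returns a mapping of archived name to the SHA-256 of the original
    content, to be stored alongside the archive for later verification.
    """
    digests: Dict[str, str] = {}
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            name = path.relative_to(root).as_posix()
            digests[name] = sha256_of(path.read_bytes())
            tar.add(path, arcname=name)
    return digests


def verify_archive(archive_path: Path, digests: Dict[str, str]) -> bool:
    """Point 6: check that every archived file still matches its hash."""
    with tarfile.open(archive_path, "r:gz") as tar:
        for name, expected in digests.items():
            try:
                member = tar.extractfile(name)
            except KeyError:  # the name is not in the archive at all
                return False
            if member is None or sha256_of(member.read()) != expected:
                return False
    return True
```

In such a setup, a scheduled job might call `find_stale_files` with a one-year threshold, pass the result to `archive_files`, and delete the originals only after `verify_archive` returns `True`. The archive itself should be written outside the directory being scanned.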
Archiving documents is a practice under which inactive documents are moved to near-line or off-line storage media. It serves a number of purposes, such as reducing storage costs, improving system performance, complying with regulations and preserving a historical record. To serve the intended purpose, special attention must be paid to readability issues, as electronic documents can become unreadable when the original programs that created them or the file formats they are stored in are no longer supported.
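The retire-by date and retention policy mentioned in point 10 can be sketched in the same spirit. The `ArchivedDocument` record and the `retire_by` and `expired_documents` helpers below are hypothetical, and the policy is assumed to be a whole number of years counted from the archiving date, with `None` meaning the document is kept indefinitely as a historical record.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ArchivedDocument:
    name: str
    archived_on: date
    # Assumed policy shape: years to retain; None = keep as historical record.
    retention_years: Optional[int]


def retire_by(doc: ArchivedDocument) -> Optional[date]:
    """Return the date on which the document may be disposed of, if any."""
    if doc.retention_years is None:
        return None
    target_year = doc.archived_on.year + doc.retention_years
    try:
        return doc.archived_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year,
        # so fall back to 28 February.
        return doc.archived_on.replace(year=target_year, day=28)


def is_expired(doc: ArchivedDocument, today: date) -> bool:
    """True if the document has a retire-by date that has been reached."""
    deadline = retire_by(doc)
    return deadline is not None and today >= deadline


def expired_documents(docs: List[ArchivedDocument],
                      today: date) -> List[ArchivedDocument]:
    """Select the documents a cleanup job would be allowed to remove."""
    return [doc for doc in docs if is_expired(doc, today)]
```

A document management program could run a selection like `expired_documents` periodically and remove only what it returns, so documents without a retention period are never touched.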