==Manipulation of data and dataset optimization==
It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can improve backup speed, restore speed, data security, and media usage, and can reduce bandwidth requirements.

===Automated data grooming===
Out-of-date data can be automatically deleted. In enterprise client-server backup applications, this automated data "grooming" can be customized; in personal backup applications, however, the deletion<ref group=note>Some backup applications (notably [[Rsync#History|rsync]] and [[CrashPlan]]) term removing backup data "pruning" instead of "grooming".</ref><ref>{{Cite web|url=https://linux.die.net/man/1/rsync|title=rsync(1) - Linux man page|website=linux.die.net|first1= Andrew |last1=Tridgell |first2= Paul |last2=Mackerras|first3=Wayne |last3=Davison}}</ref><ref>{{Cite web|url=https://support.code42.com/hc/en-us/articles/14827690989463-Archive-maintenance#archive-maintenance-0-0|title= Archive maintenance|date=2023|website=Code42 Support}}</ref> can at most<ref name="PondiniFAQ12">{{cite web |last1=Pond |first1=James |title=12. Should I delete old backups? If so, How? |url=https://www.baligu.com/pondini/TM/12.html |website=Time Machine |publisher=baligu.com |access-date=21 June 2019 |at=Green box, Gray box |date=2 June 2012}}</ref> be globally delayed or disabled.<ref name="WirecutterBestOnlineCloudBackupService">{{cite web |last1=Kissell |first1=Joe |title=The Best Online Cloud Backup Service |url= https://thewirecutter.com/reviews/best-online-backup-service/ |website=wirecutter |publisher=The New York Times|access-date=21 June 2019 |at=Next, there’s file retention. |date=12 March 2019}}</ref>

===Compression===
Various schemes can be employed to [[Data compression|shrink]] the size of the source data to be stored so that it uses less storage space.
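As a minimal illustration of shrinking source data before it is stored, the following Python sketch uses the standard-library <code>zlib</code> module. The helper names and the sample data are hypothetical, and actual backup software may use entirely different algorithms and settings.

```python
import zlib


def compress_for_backup(data: bytes, level: int = 6) -> bytes:
    """Return a zlib-compressed copy of the source data (hypothetical helper)."""
    return zlib.compress(data, level)


def restore_from_backup(blob: bytes) -> bytes:
    """Reverse compress_for_backup and return the original bytes."""
    return zlib.decompress(blob)


# Highly repetitive sample data, chosen only because it compresses well.
source = b"backup " * 1000
stored = compress_for_backup(source)
restored = restore_from_backup(stored)
```

How much space is saved depends heavily on the data: repetitive text shrinks considerably, while data that is already compressed or encrypted may not shrink at all.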
Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers |author=D. Cherry |publisher=Syngress |pages=306–308 |year=2015 |isbn=978-0-12-801375-5 |access-date=8 May 2018}}</ref>

===Deduplication===
When similarly configured workstations are backed up, much of the data is redundant; this redundancy can be reduced by storing just one copy of the duplicated data. This technique can be applied at the file or raw block level. This potentially large reduction<ref name="CherrySecuring15" /> is called [[Data deduplication|deduplication]]. It can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces the bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.

===Duplication===
Sometimes backups are [[Replication (computer science)|duplicated]] to a second set of storage media. This can be done to rearrange the archive files to optimize restore speed, or to have a second copy at a different location or on a different storage medium, as in the disk-to-disk-to-tape capability of enterprise client-server backup.

===Encryption===
High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004).
Retrieved 10 March 2007</ref> [[Encrypting]] the data on these media can mitigate this problem; however, encryption is a CPU-intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.<ref name="CherrySecuring15" />

===Multiplexing===
When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |access-date=8 May 2018}}</ref> However, cramming backups into the scheduled [[Glossary of backup terms#Terms and definitions|backup window]] via "multiplexed backup" is used only for tape destinations.<ref name="PrestonBackup07-02" />

===Refactoring===
The process of rearranging the sets of backups in an archive file is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape, creating a "synthetic full backup". This is especially useful for backup systems that perform "incrementals forever" style backups.

===Staging===
Sometimes backups are copied to a [[Disk staging|staging]] disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for [[Disk-to-disk-to-tape]]. It can be useful if there is a problem matching the speed of the final destination device with the source device, as is frequently faced in network-based backup systems.
It can also serve as a centralized location for applying other data manipulation techniques.

===Objectives===
*[[IT disaster recovery|Recovery point objective]] (RPO): The point in time that the restarted infrastructure will reflect, expressed as "the maximum targeted period in which data (transactions) might be lost from an IT service due to a major incident". Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref name="RiskyThinkingDefRPO">{{cite web |title=Recovery Point Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_point_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |access-date=4 August 2019 |date=2007}}</ref>
*Recovery time objective (RTO): The amount of time elapsed between disaster and restoration of business functions.<ref name="RiskyThinkingDefRTO">{{cite web |title=Recovery Time Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_time_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |access-date=4 August 2019 |date=2007}}</ref>
*[[Data security]]: In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |chapter-url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B.
|publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |access-date=8 May 2018}}</ref>
*[[Data retention]] period: Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but no longer. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />
*[[Checksum]] or [[hash function]] validation: Applications that back up to tape archive files need this option to verify that the data was accurately copied.<ref name="BackupExecVerify&WriteChecksumsToMedia">{{cite web |title=How do the "verify" and "write checksums to media" processes work and why are they necessary? |url=https://www.veritas.com/support/en_US/article.100030833.html |website=Veritas Support |publisher=Veritas.com |access-date=16 September 2019 |date=15 October 2015 |at=Write checksums to media}}</ref>
*[[Business process management#Monitoring|Backup process monitoring]]: Enterprise client-server backup applications need a user interface that allows administrators to monitor the backup process and that demonstrates compliance to regulatory bodies outside the organization; for example, an insurance company in the USA might be required under [[HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref>
*[[Enterprise client-server backup#User-initiated backups and restores|User-initiated backups and restores]]: To avoid or recover from ''minor'' disasters, such as inadvertently deleting or overwriting the "good" versions of one or more files, the computer user, rather than an administrator, may initiate backups and restores (not necessarily from the most recent backup) of files or folders.
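
The checksum or hash function validation objective listed above can be sketched as follows. This is a minimal illustration in Python using the standard-library <code>hashlib</code> and <code>shutil</code> modules, not the method of any particular backup application; the function names, the choice of SHA-256, and the chunk size are all hypothetical.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, destination):
    """Copy source to destination, then report whether both digests match."""
    shutil.copyfile(source, destination)
    return file_digest(source) == file_digest(destination)
```

Reading in chunks keeps memory use bounded for large files. A mismatch between the two digests would indicate that the copy does not match the source and should not be relied on for a restore.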