== Algorithm ==
{{more citations needed section|date=March 2015}}

=== Determining which files to send ===
By default, rsync determines which files differ between the sending and receiving systems by checking the modification time and size of each file. If the time or size differs between the systems, it transfers the file from the sending to the receiving system. As this only requires reading file directory information, it is quick, but it will miss unusual modifications which change neither.<ref name="man page" />

Rsync performs a slower but comprehensive check if invoked with <code>--checksum</code>. This forces a full checksum comparison on every file present on both systems. Barring rare [[hash collision|checksum collisions]], this avoids the risk of missing changed files at the cost of reading every file present on both systems.

=== Determining which parts of a file have changed ===
The rsync utility uses an [[algorithm]] invented by Australian computer programmer [[Andrew Tridgell]] for efficiently transmitting a structure (such as a file) across a communications link when the receiving computer already has a similar, but not identical, version of the same structure.<ref>{{Cite web |title=RSync – Overview |url=http://tutorials.jenkov.com/rsync/overview.html |url-status=live |archive-url=https://web.archive.org/web/20170410213853/http://tutorials.jenkov.com/rsync/overview.html |archive-date=10 April 2017 |access-date=9 April 2017}}</ref>

The recipient splits its copy of the file into chunks and computes two [[checksum]]s for each chunk: the [[MD5]] [[hash function|hash]], and a weaker but easier to compute '[[Rolling hash|rolling checksum]]'.<ref>{{cite web |url=http://rsync.samba.org/ftp/rsync/src/rsync-3.0.0-NEWS |title=News for rsync 3.0.0 |date=1 March 2008 |archive-url=https://web.archive.org/web/20080320001756/http://rsync.samba.org/ftp/rsync/src/rsync-3.0.0-NEWS |archive-date=20 March 2008 |url-status=dead}}</ref> It sends these checksums to the sender.

The sender computes the rolling checksum for each section of its version of the file that has the same size as the chunks used by the recipient. While the recipient calculates the checksum only for chunks starting at full multiples of the chunk size, the sender calculates the checksum for all sections starting at any address. If any such rolling checksum calculated by the sender matches a checksum calculated by the recipient, then this section is a candidate for not transmitting the content of the section, but only the location in the recipient's file instead. In this case, the sender uses the more computationally expensive MD5 hash to verify that the sender's section and recipient's chunk are equal. Note that the section in the sender need not be at the same start address as the chunk at the recipient. This allows efficient transmission of files which differ by insertions and deletions.<ref>{{cite web |author=Norman Ramsey |url=https://www.cs.tufts.edu/~nr/rsync.html |title=The Rsync Algorithm}}</ref>

The sender then sends the recipient those parts of its file that did not match, along with information on where to merge existing blocks into the recipient's version. This makes the copies identical.

The [[rolling hash|rolling checksum]] used in rsync is based on Mark Adler's [[Adler-32]] checksum, which is used in [[zlib]], and is itself based on [[Fletcher's checksum]].

If the sender's and recipient's versions of the file have many sections in common, the utility needs to transfer relatively little data to synchronize the files. If typical [[data compression]] algorithms are used, files that are similar when uncompressed may be very different when compressed, and thus the entire file will need to be transferred. Some compression programs, such as [[gzip]], provide a special "rsyncable" mode which allows these files to be efficiently rsynced, by ensuring that local changes in the uncompressed file yield only local changes in the compressed file.

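The block-matching procedure described in this section can be illustrated with a minimal Python sketch. It is not rsync's actual implementation or wire format: the function names (<code>weak_checksum</code>, <code>roll</code>, <code>signatures</code>, <code>delta</code>, <code>patch</code>), the tuple-based instruction list, and the decision to treat a trailing partial chunk as literal data are all assumptions made for illustration. The weak checksum is an Adler-32-style pair of 16-bit sums chosen so that it can be updated in constant time as the window slides by one byte.

```python
import hashlib

M = 1 << 16  # each half of the weak checksum is kept modulo 2**16


def weak_checksum(block):
    """Adler-32-style weak checksum of a block, as an (a, b) pair."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - size * out_byte + a) % M
    return a, b


def signatures(old, size):
    """Recipient side: checksums for chunks at multiples of the chunk size."""
    table = {}
    for index in range(len(old) // size):  # trailing partial chunk is skipped
        chunk = old[index * size:(index + 1) * size]
        strong = hashlib.md5(chunk).digest()
        table.setdefault(weak_checksum(chunk), []).append((strong, index))
    return table


def delta(new, table, size):
    """Sender side: emit ("copy", chunk_index) or ("data", literal_bytes)."""
    ops, literal, pos, state = [], bytearray(), 0, None
    while pos + size <= len(new):
        window = new[pos:pos + size]
        if state is None:
            state = weak_checksum(window)
        match = None
        for strong, index in table.get(state, ()):
            # Weak match is only a candidate; confirm with the strong hash.
            if hashlib.md5(window).digest() == strong:
                match = index
                break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", match))
            pos += size
            state = None  # recompute from scratch at the new position
        else:
            literal.append(new[pos])
            if pos + size < len(new):
                state = roll(*state, new[pos], new[pos + size], size)
            pos += 1
    literal.extend(new[pos:])  # whatever is left is shorter than one chunk
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops, size):
    """Recipient side: rebuild the sender's file from its own chunks."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * size:(value + 1) * size]
        else:
            out += value
    return bytes(out)
```

In this sketch, calling <code>patch(old, delta(new, signatures(old, size), size), size)</code> reproduces <code>new</code>. Because the sender tests a window at every byte offset, a few bytes inserted at the front of a file still leave the later chunks matchable, so only the inserted bytes travel as literal data.
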
Rsync also supports other features that aid data transfer and backup. These include block-by-block compression and decompression of data using [[zstd|Zstandard]], [[LZ4 (compression algorithm)|LZ4]], or [[zlib]], and support for protocols such as [[Secure Shell|ssh]] and [[stunnel]].