
How does RSYNC compare files to detect changes?


Rsync, commonly expanded as “remote sync,” is an open-source utility for Unix-like systems that transfers and synchronizes files between two locations, either on the same machine or between a local and a remote host (typically over SSH). Rsync is favored for backups and mirroring because it can update whole directory trees while minimizing data transfer: it skips files that have not changed and, for files that have, sends only the parts that differ.

Rsync achieves this efficiency with the rsync algorithm, a delta-transfer algorithm that computes checksums over blocks of the copy the receiver already has and uses them to find which parts of the sender's version are already present on the other side. Two values are computed for each block: a short, cheap “weak” checksum and a longer, more CPU-intensive “strong” checksum.

The receiver starts by dividing its existing copy of the file into equal-sized, non-overlapping blocks (except for the last one, which may be shorter) and computing a ‘rolling’ checksum and a ‘strong’ checksum for each block. The rolling checksum is a weak 32-bit checksum built from sums of bytes, similar in spirit to Adler-32; its key property is that the value for a window shifted one byte along the file can be derived cheaply from the previous value. The strong checksum is a hash of the block's contents: MD4 in the original paper, MD5 in later rsync versions.
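The rolling property can be illustrated with a minimal Python sketch. It follows the weak-checksum definition given in the Tridgell and Mackerras paper (two 16-bit sums, `a` and `b`); the function names `weak_checksum` and `roll` are illustrative and are not taken from the rsync source code.

```python
M = 1 << 16  # both halves of the weak checksum are kept modulo 2^16


def weak_checksum(block):
    """Compute the weak checksum of a block from scratch as a pair (a, b)."""
    n = len(block)
    a = sum(block) % M
    # Byte i is weighted by its distance from the end of the block.
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte: drop out_byte on the left, add in_byte on the right."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


data = b"the quick brown fox jumps over the lazy dog"
n = 8
a, b = weak_checksum(data[:n])
for k in range(1, len(data) - n + 1):
    a, b = roll(a, b, data[k - 1], data[k + n - 1], n)
    # The rolled value always equals a full recomputation at the new offset.
    assert (a, b) == weak_checksum(data[k:k + n])
```

Each call to `roll` costs a handful of arithmetic operations regardless of the block size, which is what makes checking every byte offset of a file affordable.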

Once the receiver has computed the block checksums of its copy, it sends only those checksums (not the file data) to the sender. The sender does not checksum its own file at fixed block boundaries. Instead, it slides a block-sized window across its file one byte at a time, computes the rolling checksum at each offset, and looks it up among the checksums it received. When a weak checksum matches, the sender confirms the match with the strong checksum.

The sender then sends the receiver a sequence of instructions: references to blocks the receiver already has, plus the literal data for those portions of the source file that matched no block. The receiver rebuilds the new file from its old copy and that literal data. Because the sender searches at every byte offset, matches are still found when an insertion or deletion has shifted the rest of the file.
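The exchange can be sketched end to end in Python. This is a simplified illustration under stated assumptions, not rsync's actual implementation or wire format: the names `make_signature`, `compute_delta`, and `apply_delta` are hypothetical, MD5 stands in for the strong checksum, only full-size blocks are indexed (a short final block is simply sent as literal data), and the delta is a plain list of `("copy", block_index)` and `("data", bytes)` tuples.

```python
import hashlib

M = 1 << 16


def weak(block):
    """Weak checksum of a block as a pair (a, b), computed from scratch."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def make_signature(basis, block_size):
    """Receiver side: index every full block of the old file by its weak checksum."""
    sig = {}
    for index, start in enumerate(range(0, len(basis) - block_size + 1, block_size)):
        block = basis[start:start + block_size]
        sig.setdefault(weak(block), []).append((hashlib.md5(block).digest(), index))
    return sig


def compute_delta(source, sig, block_size):
    """Sender side: scan the new file at every offset and emit copy/data operations."""
    delta, literal = [], bytearray()
    pos, window = 0, None
    while pos + block_size <= len(source):
        block = source[pos:pos + block_size]
        if window is None:
            window = weak(block)
        match = None
        if window in sig:
            # Weak hit: confirm with the strong checksum before trusting it.
            strong = hashlib.md5(block).digest()
            match = next((i for s, i in sig[window] if s == strong), None)
        if match is not None:
            if literal:
                delta.append(("data", bytes(literal)))
                literal.clear()
            delta.append(("copy", match))
            pos += block_size
            window = None
        else:
            literal.append(source[pos])
            nxt = pos + block_size
            if nxt < len(source):
                # Roll the window forward by one byte.
                a, b = window
                a = (a - source[pos] + source[nxt]) % M
                b = (b - block_size * source[pos] + a) % M
                window = (a, b)
            pos += 1
    literal.extend(source[pos:])
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(basis, delta, block_size):
    """Receiver side: rebuild the new file from old blocks plus literal data."""
    out = bytearray()
    for op, arg in delta:
        if op == "copy":
            out += basis[arg * block_size:(arg + 1) * block_size]
        else:
            out += arg
    return bytes(out)


old = b"The quick brown fox jumps over the lazy dog. " * 4
new = b"XY" + old  # two bytes inserted at the front shift everything else
delta = compute_delta(new, make_signature(old, 8), 8)
assert apply_delta(old, delta, 8) == new
```

In this example the two inserted bytes misalign every block boundary, yet the sender still finds the old blocks because it tests every byte offset; only the inserted bytes and the short tail travel as literal data.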

If a file has the same size and modification time on both sides, rsync's default “quick check” treats it as unchanged and skips it without computing any checksums; the --checksum option replaces this test with a comparison of whole-file checksums. If a file has only partially changed, the delta-transfer algorithm ensures that only the changed portions are sent. This is why rsync is often used as a differential backup tool: it copies only the data that has changed.
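Before any block checksums are involved, rsync in its default mode decides whether a file needs attention at all by comparing size and modification time. A rough Python sketch of that kind of test follows. The function name `needs_transfer` is hypothetical, and comparing timestamps at whole-second granularity is an assumption of this sketch rather than a statement about rsync's exact behavior.

```python
import os


def needs_transfer(src_path, dst_path):
    """Return True if dst is missing or differs from src in size or modification time."""
    try:
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return True  # nothing on the destination side yet
    src = os.stat(src_path)
    if src.st_size != dst.st_size:
        return True
    # Assumption: compare modification times in whole seconds.
    return int(src.st_mtime) != int(dst.st_mtime)
```

A test like this is cheap because it reads only file metadata, but it can miss a change that leaves both size and timestamp intact, which is the case a whole-file checksum comparison is meant to catch.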

The checksum method of change detection allows rsync to minimize data transfers and work efficiently over networks. This makes rsync particularly effective for maintaining backups and mirrors whose data changes little between runs.

Let’s consider the example of a daily backup operation using rsync. Say you change only a few files in a directory of several gigabytes. Rsync skips the unchanged files and sends only the changed portions of the modified ones over the network, whereas a conventional copy would re-transfer every file. This saves both bandwidth and time.

The sources used for this are:

1. Tridgell, A. S., and P. Mackerras. The rsync algorithm. 1996.
2. https://www.computerhope.com/unix/rsync.htm
3. Wikipedia (https://en.wikipedia.org/wiki/Rsync)
4. https://www.techrepublic.com/article/how-the-rsync-command-compares-to-scp-for-moving-files-in-linux/
5. rsync.samba.org (https://rsync.samba.org/how-rsync-works.html)
6. https://www.tecmint.com/rsync-local-remote-file-synchronization-commands/

