==Limitations==
The original tar format was created in the early days of Unix, and despite current widespread use, many of its design features are considered dated.<ref>{{Cite web|url=http://duplicity.nongnu.org/new_format.html|title=duplicity: New file format|website=duplicity.nongnu.org}}</ref>

Other formats have been created to address the shortcomings of tar.

=== File names ===
Due to the [[Tar_(computing)#Header|field size]], the original TAR format was unable to store file paths and names in excess of 100 characters. To overcome this problem while maintaining [[backwards compatibility|readability by existing TAR utilities]], GNU tar stores file paths and names longer than 100 characters in <code>@LongLink</code> entries, which TAR utilities unaware of this feature see as ordinary files.<ref>[https://github.com/gitGNU/gnu_tar/blob/master/src/create.c#L546 gnu_tar/src/create.c at master · gitGNU/gnu_tar · GitHub], line 546</ref> Similarly, the [[Tar (computing)#POSIX.1-2001/pax|PAX]] format uses <code>PaxHeaders</code> entries.<ref>[https://github.com/openbsd/src/blob/8df76133309eacd4092b091ee0504adb842322a5/bin/pax/tar.c#L1066 src/bin/pax/tar.c at 8df76133309eacd4092b091ee0504adb842322a5 · openbsd/src · GitHub], line 1066</ref>

=== Attributes ===
Many older tar implementations neither record nor restore extended attributes (xattrs) or [[access-control list]]s (ACLs). In 2001, Star introduced support for ACLs and extended attributes, through its own tags for POSIX.1-2001 pax. bsdtar uses the star extensions to support ACLs.<ref name=bsd>{{Man|5|tar|FreeBSD}}</ref> More recent versions of GNU tar support Linux extended attributes, reimplementing star extensions.<ref name = "Les bons comptes, 2014">{{ Cite web | url = http://www.lesbonscomptes.com/pages/extattrs.html | title = Extended attributes: the good, the not so good, the bad. 
| access-date = 2019-09-03 | date = 2014-07-15 | website = Les bons comptes | quote = The extended attributes can be very valuable for storing file metadata (e.g. <nowiki>author="John Smith"</nowiki>, <nowiki>subject="country landscape"</nowiki>), in the many cases where you do not want or can't store this data in the file internal properties. | archive-url = https://web.archive.org/web/20141214001530/http://www.lesbonscomptes.com/pages/extattrs.html |archive-date=2014-12-14| df = dmy-all }}</ref> A number of extensions are reviewed in the filetype manual for BSD tar, tar(5).<ref name=bsd/>

===Tarbomb===
{{redirect-distinguish|Tarbomb|zip bomb}}
A '''tarbomb''', in [[Jargon File|hacker slang]], is a tarball containing a large number of items that, when untarred, are written to the current directory or some other existing directory instead of to a directory created by the tarball specifically for the extracted output.<ref>{{cite web |title=Tarbomb Definition |language=en |website=[[The Linux Info Project]] |url=https://www.linfo.org/tarbomb.html |access-date=2024-12-12 }}</ref> It is at best an inconvenience to the user, who is obliged to identify and delete a number of files interspersed with the directory's other contents. Such behavior is considered bad etiquette on the part of the archive's creator.

A related problem is the use of [[Path (computing)|absolute path]]s or [[Directory (computing)|parent directory]] references when creating tar files. Files extracted from such archives will often be created in unusual locations outside the working directory and, like a tarbomb, have the potential to overwrite existing files. However, modern versions of FreeBSD and GNU tar do not create or extract absolute paths and parent-directory references by default, unless this is explicitly allowed with the flag {{code|-P}} or the option {{code|--absolute-names}}. 
The bsdtar program, which is also available on many operating systems and is the default tar implementation on [[macOS|Mac OS X]] v10.6, also does not follow parent-directory references or symbolic links.<ref>{{Cite web|url=https://man.freebsd.org/cgi/man.cgi?query=bsdtar&sektion=1&format=html|title=bsdtar(1)|website=man.freebsd.org}}</ref> <!-- https://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man1/bsdtar.1.html Man page for "bsdtar", as provided by Apple. -->{{failed verification|date=July 2022}}

If a user has only a very old tar available, which does not feature those security measures, these problems can be mitigated by first examining the tar file with the command <code>tar tf archive.tar</code>, which lists the contents and allows problematic files to be excluded afterwards. This command does not extract any files, but displays the names of all files in the archive. If any are problematic, the user can create a new empty directory and extract the archive into it, or avoid the tar file entirely. Most graphical tools can display the contents of an archive before extracting them. [[Vim (text editor)|Vim]] can open tar archives and display their contents. [[GNU Emacs]] is also able to open a tar archive and display its contents in a [[dired]] buffer.

===Random access===
Because the tar format was designed for streaming to tape backup devices, it has no centralized index or table of contents for files and their properties. Instead, the metadata for each file (such as name, size, time stamps) is stored in a header before that file's contents. The archive must be read sequentially to list or extract files. For large tar archives, this causes a performance penalty, making tar archives unsuitable for situations that often require random access to individual files.
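The sequential layout described above can be illustrated with a minimal Python sketch. It assumes the classic header layout (512-byte blocks, a NUL-terminated name in bytes 0–99 and an octal size in bytes 124–135) and ignores long-name extensions, the ustar prefix field and base-256 sizes; the function names <code>list_members</code> and <code>make_example_archive</code> are hypothetical and exist only for this illustration.

```python
import io
import tarfile

BLOCK = 512  # tar archives are organized in 512-byte blocks


def list_members(data: bytes):
    """Walk the headers in order, using each size field to skip the contents."""
    entries = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end of the archive
            break
        # Assumption: short names only (no @LongLink, pax or prefix handling).
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        # Assumption: size is plain octal text (no base-256 encoding).
        size = int(header[124:136].split(b"\0", 1)[0].strip() or b"0", 8)
        entries.append((name, size))
        # Contents are padded to a whole number of blocks; jump to the next header.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return entries


def make_example_archive() -> bytes:
    """Build a small in-memory archive so the sketch is self-contained."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in [("a.txt", b"hello"), ("dir/b.bin", b"x" * 1000)]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    for name, size in list_members(make_example_archive()):
        print(name, size)
```

Even when only the last member is wanted, every earlier header has to be visited, because the position of each header is known only after the preceding size field has been read.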
In turn, this design makes TAR archives resilient against damage from missing portions, both as digital files and on physical tape.{{cn|date=February 2025}} A truncated TAR file with parts missing at either end still allows the remaining parts to be recovered, including file paths, file names and metadata, by starting from the first TAR header that is not missing.<ref>Creating TAR with 100 KB missing at the beginning: <code>tail --bytes=+100000 "intact archive.tar" >>"missing beginning.tar"</code>. Next header can be found using a [[hex editor]]. Recover using <code>dd if="missing beginning.tar" of=recovered.tar ibs=''[bytes until next header which starts with file path and name]'' skip=1</code>. Quotation marks are not needed for file names without spaces.</ref>

With a well-formed tar file stored on a seekable medium (i.e. one that allows efficient random reads), the {{code|tar}} program can still locate a file relatively quickly (in time linear in the file count) by skipping over file contents according to the "size" field in the file headers. This is the basis for option {{code|-n}} in GNU tar. When a tar file is compressed whole, the compression format, being usually non-seekable, prevents this optimization from being done.<ref>{{cite web |last1=BillThor |title=What makes a tar archive seekable? |url=https://superuser.com/a/1235409 |website=Super User |access-date=15 December 2023 |language=en |date=July 28, 2017}}</ref> To maintain seekability, tar files must also be concatenated properly, by removing the trailing zero blocks at the end of each tar file.<ref>{{cite web |title=GNU tar 1.35: 4.2.4 Combining Archives with --concatenate |url=https://www.gnu.org/software/tar/manual/html_node/concatenate.html |website=www.gnu.org}}</ref>

===Duplicates===
Another issue with the tar format is that it allows several (possibly different) files in an archive to have identical paths and filenames. 
When such an archive is extracted, the later version of a file usually overwrites the earlier one. This can create a non-obvious tarbomb, which technically does not contain files with absolute paths or parent-directory references, but still causes files outside the current directory to be overwritten. For example, an archive may contain two entries with the same path and filename, the first of which is a [[symbolic link|symlink]] to some location outside the current directory and the second of which is a regular file; extracting such an archive with some tar implementations may cause writing to the location pointed to by the symlink.
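The listing-based inspection described in this section can also be automated. The following is a rough Python sketch, not a complete safety check: it reads only member names with the standard <code>tarfile</code> module and flags absolute paths, parent-directory references, duplicate names and the set of top-level entries (more than one suggests a tarbomb). The helper names <code>audit_names</code> and <code>build_demo_archive</code> and the demo file names are hypothetical.

```python
import io
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def audit_names(names):
    """Classify member names much as a reader would when scanning `tar tf` output."""
    parts = [PurePosixPath(n).parts for n in names]
    return {
        "absolute": [n for n in names if n.startswith("/")],
        "parent_refs": [n for n, p in zip(names, parts) if ".." in p],
        "duplicates": sorted(n for n, c in Counter(names).items() if c > 1),
        "top_level": sorted({p[0] for p in parts if p}),
    }


def build_demo_archive() -> bytes:
    """Create a small in-memory archive with deliberately problematic entries."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        def add_file(name, payload=b"data"):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

        add_file("project/readme.txt")
        link = tarfile.TarInfo("link")   # symlink pointing outside the tree
        link.type = tarfile.SYMTYPE
        link.linkname = "../outside"
        tar.addfile(link)
        add_file("link")                 # regular file reusing the symlink's name
        add_file("../escape.txt")        # parent-directory reference
        add_file("/abs.txt")             # absolute path
    return buf.getvalue()


if __name__ == "__main__":
    with tarfile.open(fileobj=io.BytesIO(build_demo_archive())) as tar:
        report = audit_names(tar.getnames())  # reads headers only, extracts nothing
    for key, value in report.items():
        print(key, value)
```

A name-only audit such as this cannot tell whether a particular tar implementation will follow the symlink during extraction; it only surfaces the entries that deserve a closer look before extracting into a fresh, empty directory.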