
== Attributes ==

=== File names ===

{{Main|Filename}}
A '''file name''', or '''filename''', identifies a file to consuming applications and in some cases users.

A file name is unique so that an application can refer to exactly one file for a particular name. If the file system supports directories, then file name uniqueness is generally enforced within each directory. In other words, a storage device can contain multiple files with the same name, but not in the same directory.

Most file systems restrict the length of a file name.

Some file systems match file names [[Case sensitivity|case-sensitively]] and others case-insensitively. For example, the names <code>MYFILE</code> and <code>myfile</code> refer to the same file in a case-insensitive file system, but to different files in a case-sensitive one.
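
The matching rules above can be sketched with a toy in-memory directory. This is a minimal illustration, not the behavior of any particular file system: the class <code>Directory</code> and its methods are hypothetical, and the use of Python's <code>str.casefold</code> to compare names is an assumption (real file systems define their own case-folding rules).

```python
class Directory:
    """Toy directory that enforces unique names (hypothetical model)."""

    def __init__(self, case_sensitive: bool):
        self.case_sensitive = case_sensitive
        self._entries = {}  # lookup key -> (name as stored, content)

    def _key(self, name: str) -> str:
        # Case-insensitive matching compares a case-folded form of the name.
        return name if self.case_sensitive else name.casefold()

    def create(self, name: str, content: bytes) -> None:
        key = self._key(name)
        if key in self._entries:
            # Uniqueness is enforced within this one directory only.
            raise FileExistsError(name)
        self._entries[key] = (name, content)

    def read(self, name: str) -> bytes:
        return self._entries[self._key(name)][1]


sensitive = Directory(case_sensitive=True)
sensitive.create("MYFILE", b"upper")
sensitive.create("myfile", b"lower")  # accepted: a different file

insensitive = Directory(case_sensitive=False)
insensitive.create("MYFILE", b"upper")
# insensitive.create("myfile", b"lower") would raise FileExistsError
```

In this sketch, reading <code>myfile</code> from the case-insensitive directory returns the content stored under <code>MYFILE</code>, while the case-sensitive directory keeps the two names apart.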

Most modern file systems allow a file name to contain a wide range of characters from the [[Unicode]] character set. Some restrict characters that have a special meaning, such as those that indicate a device, device type, directory prefix, file path separator, or file type.

=== Directories ===

{{Main|Directory (computing)}}

File systems typically support organizing files into '''directories''', also called '''folders''', which segregate files into groups.

This may be implemented by associating the file name with an index in a [[table of contents]] or an [[inode]] in a [[Unix-like]] file system.

Directory structures may be flat (i.e. linear), or allow hierarchies by allowing a directory to contain directories, called subdirectories.
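
A hierarchy of directories can be modeled, as a rough sketch, by nested mappings in which a directory maps names to either file content or further directories. The function <code>resolve</code>, the <code>/</code> separator, and the sample tree below are assumptions made for illustration only.

```python
def resolve(root: dict, path: str):
    """Walk a '/'-separated path through nested directories (dicts)."""
    node = root
    for part in path.strip("/").split("/"):
        if part == "":
            continue  # ignore empty components such as the root itself
        if not isinstance(node, dict):
            # A file was found where a directory was expected.
            raise NotADirectoryError(part)
        node = node[part]  # raises KeyError if the name does not exist
    return node


# Hypothetical tree: directories are dicts, files are bytes.
root = {
    "home": {"alice": {"notes.txt": b"hello"}},
    "etc": {"hosts": b"127.0.0.1 localhost"},
}

content = resolve(root, "/home/alice/notes.txt")
```

A flat structure corresponds to a single mapping with no nested directories, so every lookup takes exactly one step.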

The first file system to support arbitrary hierarchies of directories was used in the [[Multics]] operating system.<ref>{{cite conference|chapter-url=http://www.multicians.org/fjcc4.html|chapter=A General-Purpose File System For Secondary Storage|author=R. C. Daley|author2=P. G. Neumann|title=Proceedings of the November 30--December 1, 1965, fall joint computer conference, Part I on XX - AFIPS '65 (Fall, part I) |year=1965|conference=Fall Joint Computer Conference|publisher=[[AFIPS]]|pages=213–229|doi=10.1145/1463891.1463915|access-date=2011-07-30|doi-access=free}}</ref> The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do [[Apple Inc.|Apple]]'s [[Hierarchical File System (Apple)|Hierarchical File System]] and its successor [[HFS Plus|HFS+]] in [[classic Mac OS]], the [[File Allocation Table|FAT]] file system in [[MS-DOS]] 2.0 and later versions of MS-DOS and in [[Microsoft Windows]], the [[NTFS]] file system in the [[Windows NT]] family of operating systems, and the ODS-2 (On-Disk Structure-2) and higher levels of the [[Files-11]] file system in [[OpenVMS]].

{{Anchor|METADATA}}

=== Metadata ===

In addition to data, the file content, a file system also manages associated [[metadata]] which may include but is not limited to:

* name
* [[file size|size]] which may be stored as the number of blocks allocated or as a [[byte]] count
* [[system time|when]] created, last accessed, and last backed up
* owner [[user ID|user]] and [[group ID|group]]
* [[file system permissions|access permissions]]
* [[file attribute]]s such as whether the file is read-only, [[executable]], etc.
* [[device file|device type]] (e.g. [[Block devices|block]], [[Character devices|character]], [[Internet socket|socket]], [[subdirectory]], etc.)

A file system stores associated metadata separate from the content of the file.

Most file systems store the names of all the files in one directory in one place—the directory table for that directory—which is often stored like any other file.
Many file systems put only some of the metadata for a file in the directory table, and the rest of the metadata for that file in a completely separate structure, such as the [[inode]].
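
The split between a directory table and a separate per-file structure can be sketched as follows. This is a simplified model assuming an inode-style design; the field names in <code>Inode</code>, the inode number 12, and the helper <code>stat</code> are hypothetical and do not reproduce any real on-disk layout.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Per-file metadata kept apart from the directory table (illustrative fields)."""
    size: int
    mode: int
    owner: int
    blocks: list = field(default_factory=list)  # where the content lives


# Separate structure holding most of the metadata, indexed by inode number.
inode_table = {
    12: Inode(size=5, mode=0o644, owner=1000, blocks=[7]),
}

# Directory table: stores only names and the number of the matching inode.
directory = {"notes.txt": 12, "alias.txt": 12}


def stat(name: str) -> Inode:
    """Look the name up in the directory, then fetch the rest of the metadata."""
    return inode_table[directory[name]]
```

Because the directory holds only a name and a reference, two names in this sketch can refer to the same metadata and content.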

Most file systems also store metadata not associated with any one particular file.
Such metadata includes information about unused regions—[[free space bitmap]], [[block availability map]]—and information about [[bad sector]]s.
Often such information about an [[allocation group]] is stored inside the allocation group itself.
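
A free space bitmap can be illustrated with one flag per block. The sketch below is a minimal model under stated assumptions: the class <code>FreeSpaceBitmap</code> is hypothetical, it uses a Python list of booleans rather than packed bits, and it picks the lowest-numbered free block, which is only one of many possible policies.

```python
class FreeSpaceBitmap:
    """One flag per block: True means in use, False means free."""

    def __init__(self, block_count: int):
        self.used = [False] * block_count

    def allocate(self) -> int:
        # Scan for the first free block and mark it as used.
        for block, in_use in enumerate(self.used):
            if not in_use:
                self.used[block] = True
                return block
        raise OSError("no free blocks")

    def free(self, block: int) -> None:
        # Releasing a block only clears its flag; the data is not erased.
        self.used[block] = False

    def free_count(self) -> int:
        return self.used.count(False)


bitmap = FreeSpaceBitmap(4)
first = bitmap.allocate()
second = bitmap.allocate()
bitmap.free(first)
reused = bitmap.allocate()  # the freed block becomes available again
```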

Additional attributes can be associated with files on file systems such as [[NTFS]], [[XFS]], [[ext2]], [[ext3]], some versions of [[Unix File System|UFS]], and [[HFS+]] by using [[extended file attributes]]. Some file systems provide for user-defined attributes such as the author of a document, the character encoding of a document, or the size of an image.

Some file systems allow for different data collections to be associated with one file name. These separate collections may be referred to as ''streams'' or ''forks''. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the file name by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as "filename;4" or "filename(-4)" to access the version four saves ago.

See [[comparison of file systems#Metadata|comparison of file systems § Metadata]] for details on which file systems support which kinds of metadata.

=== Storage space organization ===

A local file system tracks which areas of storage belong to which file and which are not being used.

When a file system creates a file, it allocates space for data. Some file systems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows.

To delete a file, the file system records that the file's space is free and available for use by another file.

[[File:100 000-files 5-bytes each -- 400 megs of slack space.png|frame|An example of slack space, demonstrated with 4,096-[[byte]] NTFS clusters: 100,000 files of five bytes each amount to 500,000 bytes of actual data but require 409,600,000 bytes of disk space to store <!-- The size listing shown in Explorer is oddly doubly-wrong. The example files are 5 bytes each, not 0.1K, and the clusters are a minimum of 4K not 1K.-->]]

A local file system manages storage space to provide a level of reliability and efficiency. Generally, it allocates storage device space in a granular manner, usually in groups of multiple physical units (i.e. [[bytes]]). For example, in [[Apple DOS]] of the early 1980s, 256-byte sectors on a 140-kilobyte floppy disk were managed using a ''track/sector map''.{{Citation needed|date=September 2012}}

The granular nature results in unused space, sometimes called [[slack space]], for each file except those whose size happens to be an exact multiple of the allocation unit.{{Sfn|Carrier|2005|pp=187–188}} For a 512-byte allocation, the average unused space is 256 bytes. For 64&nbsp;KB clusters, the average unused space is 32&nbsp;KB.
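
The arithmetic behind slack space can be worked through in a few lines. The sketch assumes that every file occupies a whole number of allocation units and ignores any special handling of very small files; the helper names <code>allocated_size</code> and <code>slack</code> are hypothetical.

```python
def allocated_size(file_size: int, unit: int) -> int:
    """Space consumed when a file must occupy whole allocation units."""
    units = -(-file_size // unit)  # ceiling division
    return units * unit


def slack(file_size: int, unit: int) -> int:
    """Unused bytes in the last allocation unit of a file."""
    return allocated_size(file_size, unit) - file_size


# 100,000 files of five bytes each, stored in 4,096-byte units.
files = 100_000
total_data = files * 5
total_allocated = files * allocated_size(5, 4096)
total_slack = files * slack(5, 4096)

# Mean slack over every file size from 1 to 512 bytes with 512-byte units.
average_slack = sum(slack(size, 512) for size in range(1, 513)) / 512
```

Under these assumptions <code>total_allocated</code> is 409,600,000 bytes for 500,000 bytes of data, and <code>average_slack</code> is 255.5 bytes, which is roughly half of the 512-byte unit.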

Generally, the allocation unit size is set when the storage is configured.
Choosing a size that is small relative to the files stored results in excessive access overhead.
Choosing a relatively large size results in excessive unused space.
Choosing an allocation size based on the average size of files expected to be in the storage tends to minimize unusable space.

=== Fragmentation ===

[[File:File system fragmentation.svg|thumb|File systems may become [[File system fragmentation|fragmented]]]]

As a file system creates, modifies and deletes files, the underlying storage representation may become [[File system fragmentation|fragmented]]. Files and the unused space between files will occupy allocation blocks that are not contiguous.

A file becomes fragmented if space needed to store its content cannot be allocated in contiguous blocks. Free space becomes fragmented when files are deleted.<ref>{{cite book|url=https://books.google.com/books?id=dSMJAAAAQBAJ&pg=PA524|title=Embedded Microcomputer Systems: Real Time Interfacing|edition=Third|last=Valvano|first=Jonathan W.|publisher=[[Cengage Learning]]|date=2011|access-date=June 30, 2022|page=524|isbn=978-1-111-42625-5}}</ref>
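
The following sketch simulates how deleting files can leave gaps that force a later file into non-contiguous blocks. It assumes a simple policy that always takes the lowest-numbered free blocks; the functions <code>allocate</code>, <code>delete</code> and <code>is_contiguous</code>, and the eight-block disk, are hypothetical.

```python
def allocate(disk: list, name: str, count: int) -> list:
    """Give the file the lowest-numbered free blocks, contiguous or not."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise OSError("disk full")
    chosen = free[:count]
    for i in chosen:
        disk[i] = name
    return chosen


def delete(disk: list, name: str) -> None:
    """Mark every block owned by the file as free."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def is_contiguous(blocks: list) -> bool:
    return all(b - a == 1 for a, b in zip(blocks, blocks[1:]))


disk = [None] * 8                  # eight blocks, all free
a_blocks = allocate(disk, "A", 3)  # blocks 0, 1, 2
b_blocks = allocate(disk, "B", 2)  # blocks 3, 4
c_blocks = allocate(disk, "C", 2)  # blocks 5, 6
delete(disk, "B")                  # leaves a two-block gap
d_blocks = allocate(disk, "D", 3)  # must span the gap and block 7
```

In this run file D receives blocks 3, 4 and 7, so it is fragmented even though each earlier file was stored contiguously.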

This is invisible to the end user and the system still works correctly. However, this can degrade performance on storage hardware that works better with contiguous blocks, such as [[Hard disk drive#Performance characteristics|hard disk drives]]. Other hardware, such as [[Solid-state drive|solid-state drives]], is not affected by fragmentation.

=== Access control ===

<!-- Too many 'see also' links, perhaps these should be moved to the 'See also' sect at the end -->
{{See also|Computer security|Password cracking|Filesystem-level encryption|Encrypting File System}}

A file system often supports access control of data that it manages.

The intent of access control is often to prevent certain users from reading or modifying certain files.

Access control can also restrict access by program in order to ensure that data is modified in a controlled way. Examples include passwords stored in the metadata of the file or elsewhere and [[file permissions]] in the form of permission bits, [[access control list]]s, or [[Capability-based security|capabilities]]. The need for file system utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these controls are effective only against well-behaved users and not against intruders.
<!-- Please don't make this article really big by including all the issues of file security here. Please add it to a file system security article -->
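
Permission bits can be illustrated with a small check written against the constants in Python's standard <code>stat</code> module. The sketch assumes Unix-style semantics in which exactly one class (owner, group, or other) applies to a request, and it ignores superuser privileges and access control lists; the function <code>may_access</code> and its parameters are hypothetical.

```python
import stat


def may_access(mode: int, file_uid: int, file_gid: int,
               uid: int, gids: set, want_write: bool) -> bool:
    """Decide a read or write request from Unix-style permission bits."""
    if uid == file_uid:
        # The owner class applies, even if the group bits are more generous.
        bit = stat.S_IWUSR if want_write else stat.S_IRUSR
    elif file_gid in gids:
        bit = stat.S_IWGRP if want_write else stat.S_IRGRP
    else:
        bit = stat.S_IWOTH if want_write else stat.S_IROTH
    return bool(mode & bit)


# A file with mode 0o640: owner may read and write, group may read.
mode = 0o640
owner_can_write = may_access(mode, 1000, 50, uid=1000, gids={50}, want_write=True)
group_can_write = may_access(mode, 1000, 50, uid=2000, gids={50}, want_write=False)
```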

Methods for encrypting file data are sometimes included in the file system. This is very effective since there is no need for file system utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Additionally, losing the seed means losing the data.

=== Storage quota ===
[[File:Btrfs qgroup screenshot.png|thumb|upright=1.5|Example of qgroup (quota group) of a [[btrfs]] filesystem]]
Some operating systems allow a system administrator to enable [[disk quota]]s to limit a user's use of storage space.
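
A quota check can be sketched as bookkeeping performed before space is granted. The class <code>QuotaTracker</code>, the exception <code>QuotaExceeded</code>, and the per-user byte limits are hypothetical and are not modeled on any specific operating system's quota mechanism.

```python
class QuotaExceeded(Exception):
    """Raised when a request would push a user past the configured limit."""


class QuotaTracker:
    def __init__(self, limits: dict):
        self.limits = dict(limits)                 # user -> maximum bytes
        self.usage = {user: 0 for user in limits}  # user -> bytes in use

    def charge(self, user: str, nbytes: int) -> None:
        # Refuse the allocation before any space is handed out.
        if self.usage[user] + nbytes > self.limits[user]:
            raise QuotaExceeded(user)
        self.usage[user] += nbytes

    def release(self, user: str, nbytes: int) -> None:
        # Deleting data returns space to the user's allowance.
        self.usage[user] = max(0, self.usage[user] - nbytes)


tracker = QuotaTracker({"alice": 100})
tracker.charge("alice", 60)
```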

=== Data integrity ===

A file system typically ensures that stored data remains consistent both in normal operation and in exceptional situations such as:
* an accessing program neglects to inform the file system that it has completed file access (to close a file)
* an accessing program terminates abnormally (crashes)
* media failure
* loss of connection to remote systems
* operating system failure
* system reset ([[warm reboot|soft reboot]])
* power failure ([[hard reboot]])

Recovery from exceptional situations may include updating metadata and directory entries, and handling data that was buffered but not written to storage media.

=== Recording ===

A file system might record events to allow analysis of issues such as:
* file or systemic problems and performance
* nefarious access

=== Data access ===

==== Byte stream access ====

Many file systems access data as a stream of [[bytes]]. Typically, to read file data, a program provides a [[memory buffer]] and the file system retrieves data from the medium and then writes the data to the buffer. A write involves the program providing a buffer of bytes that the file system reads and then stores to the medium.
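
The buffer-based exchange described above can be illustrated from the program's side with Python's standard library. This is only a sketch of the calling pattern: the function <code>write_then_read</code> is hypothetical, and a temporary file stands in for whatever medium the file system manages.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Write a buffer of bytes to a file, then read it back into a buffer."""
    fd, path = tempfile.mkstemp()
    try:
        # Write: the program supplies the bytes to be stored.
        with os.fdopen(fd, "wb") as f:
            f.write(payload)

        # Read: the program supplies a buffer that is filled with file data.
        buffer = bytearray(len(payload))
        with open(path, "rb") as f:
            count = f.readinto(buffer)
        return bytes(buffer[:count])
    finally:
        os.remove(path)


round_trip = write_then_read(b"hello, file system")
```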

==== Record access ====

Some file systems, or layers on top of a file system, allow a program to define a [[Record (computer science)|record]] so that a program can read and write data as a structure rather than as an unorganized sequence of bytes.

If a ''fixed-length'' record definition is used, then the location of the n<sup>th</sup> record can be calculated directly, which is relatively fast compared to parsing the data for record separators.
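
With fixed-length records, the byte offset of a record is its zero-based index multiplied by the record length. The sketch below assumes a hypothetical 20-byte layout (a 4-byte integer followed by a 16-byte name) defined with Python's <code>struct</code> module, and uses an in-memory stream in place of a file.

```python
import io
import struct

# Hypothetical layout: little-endian 4-byte id, then a 16-byte padded name.
RECORD = struct.Struct("<I16s")


def write_records(stream, records) -> None:
    for rec_id, name in records:
        # Short names are padded with zero bytes to the fixed width.
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(stream, n: int):
    """Read record n (zero-based) by seeking straight to its offset."""
    stream.seek(n * RECORD.size)  # offset = index * record length
    rec_id, raw = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode("ascii")


store = io.BytesIO()
write_records(store, [(1, "alpha"), (2, "beta"), (3, "gamma")])
third = read_record(store, 2)
```

No earlier record has to be read or parsed to reach the third one, which is what makes the lookup fast.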

An identification for each record, also known as a key, allows a program to read, write and update records without regard to their location in storage. Such storage requires managing blocks of media, usually separating key blocks and data blocks. Efficient algorithms can be developed with pyramid structures for locating records.<ref>{{cite web|url=https://www.researchgate.net/publication/234789457|title=KSAM: A B + -tree-based keyed sequential-access method|work=ResearchGate|access-date=29 April 2016}}</ref>
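
Keyed access can be sketched by keeping an ordered key index apart from the stored records. The example below is a simplified stand-in: it uses a sorted list searched with Python's <code>bisect</code> module instead of a tree of key blocks, and the class <code>KeyedFile</code> is hypothetical.

```python
import bisect


class KeyedFile:
    """Records located by key rather than by position (illustrative only)."""

    def __init__(self):
        self._keys = []   # sorted keys, standing in for key blocks
        self._slots = []  # slot of each key's record, parallel to _keys
        self._data = []   # records in arrival order, standing in for data blocks

    def put(self, key, record) -> None:
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._data[self._slots[i]] = record  # update the existing record
            return
        self._data.append(record)
        self._keys.insert(i, key)
        self._slots.insert(i, len(self._data) - 1)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)  # binary search of the index
        if i == len(self._keys) or self._keys[i] != key:
            raise KeyError(key)
        return self._data[self._slots[i]]


table = KeyedFile()
table.put("m", "middle")
table.put("a", "first")
table.put("z", "last")
table.put("a", "first, revised")
```

The caller never needs to know where a record is stored; only the index maps a key to a location.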

=== Utilities ===

Typically, a file system can be managed by the user via various utility programs.

Some utilities allow the user to create, configure and remove an instance of a file system. They may also allow extending or truncating the space allocated to the file system.

{{Anchor|DENTRY}}
Directory utilities may be used to create, rename and delete ''directory entries'', which are also known as ''dentries'' (singular: ''dentry''),<ref>{{cite book
 | last1                 = Mohan
 | first1                = I. Chandra
 | title                 = Operating Systems
 | url                   = https://books.google.com/books?id=eei_jHVJi3oC
 | location              = Delhi
 | publisher             = PHI Learning Pvt. Ltd.
 | date      = 2013
 | page                  = 166
 | isbn                  = 9788120347267
 | access-date            = 2014-07-27
 | quote                 = The word dentry is short for 'directory entry'. A dentry is nothing but a specific component in the path from the root. They (directory name or file name) provide for accessing files or directories[.]
}}</ref> and to alter metadata associated with a directory. Directory utilities may also include capabilities to create additional links to a directory ([[hard link]]s in [[Unix]]), to rename parent links (".." in [[Unix-like]] operating systems),{{Clarify|date=July 2014}} and to create bidirectional links to files.

File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the file system, they may provide a mechanism to prepend to or truncate from the beginning of a file, insert entries into the middle of a file, or delete entries from a file. Utilities to free space for deleted files, if the file system provides an undelete function, also belong to this category.

Some file systems defer operations such as reorganization of free space, secure erasing of free space, and rebuilding of hierarchical structures by providing utilities to perform these functions at times of minimal activity. An example is a file system [[defragmentation]] utility.

Some of the most important features of file system utilities are supervisory activities which may involve bypassing ownership or direct access to the underlying device. These include high-performance backup and recovery, data replication, and reorganization of various data structures and allocation tables within the file system.

=== File system API ===

Utilities, libraries and programs use [[file system API]]s to make requests of the file system. These include data transfer, positioning, updating metadata, managing directories, managing access specifications, and removal.
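
As an illustration of these request types, the sketch below exercises several of them through Python's <code>os</code> module, which wraps the operating system's file system calls. The function <code>api_tour</code> and the file and directory names are hypothetical, and the calls shown are only a small sample of what such an API offers.

```python
import os
import tempfile


def api_tour():
    """Exercise a few kinds of file system requests in a scratch directory."""
    base = tempfile.mkdtemp()
    subdir = os.path.join(base, "docs")
    path = os.path.join(subdir, "note.txt")

    os.mkdir(subdir)                        # managing directories
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello world")        # data transfer
        os.lseek(fd, 6, os.SEEK_SET)        # positioning
        tail = os.read(fd, 5)
        size = os.fstat(fd).st_size         # reading metadata
    finally:
        os.close(fd)

    os.chmod(path, 0o644)                   # managing access specifications
    os.remove(path)                         # removal
    os.rmdir(subdir)
    os.rmdir(base)
    return tail, size


tail, size = api_tour()
```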

=== Multiple file systems within a single system ===

Frequently, retail systems are configured with a single file system occupying the entire [[Computer storage device|storage device]].

Another approach is to [[Disk partitioning|partition]] the disk so that several file systems with different attributes can be used. One file system, for use as browser cache or email storage, might be configured with a small allocation size. This keeps the activity of creating and deleting files typical of browser activity in a narrow area of the disk where it will not interfere with other file allocations. Another partition might be created for the storage of audio or video files with a relatively large block size. Yet another may normally be set ''read-only'' and only periodically be set writable. Some file systems, such as [[ZFS]] and [[APFS]], support multiple file systems sharing a common pool of free blocks, supporting several file systems with different attributes without having to reserve a fixed amount of space for each file system.<ref>{{cite web|url=https://docs.freebsd.org/en/books/handbook/zfs/|title=Chapter 22. The Z File System (ZFS)|work=The FreeBSD Handbook|quote=Pooled storage: adding physical storage devices to a pool, and allocating storage space from that shared pool. Space is available to all file systems and volumes, and increases by adding new storage devices to the pool.}}</ref><ref>{{cite web|url=https://daisydiskapp.com/guide/4/en/APFS/|title=About Apple File System (APFS)|work=DaisyDisk User Guide|quote=APFS introduces space sharing between volumes. In APFS, every physical disk is a container that can have multiple volumes inside, which share the same pool of free space.}}</ref>

A third approach, which is mostly used in cloud systems, is to use "[[disk image]]s" to house additional file systems, with the same attributes or not, within another (host) file system as a file. A common example is virtualization: one user can run an experimental Linux distribution (using the [[ext4]] file system) in a virtual machine under their production Windows environment (using [[NTFS]]). The ext4 file system resides in a disk image, which is treated as a file (or multiple files, depending on the [[hypervisor]] and settings) in the NTFS host file system.

Having multiple file systems on a single system has the additional benefit that in the event of a corruption of a single file system, the remaining file systems will frequently still be intact. This includes virus destruction of the ''system'' file system or even a system that will not boot. File system utilities which require dedicated access can be effectively completed piecemeal. In addition, [[defragmentation]] may be more effective. Several system maintenance utilities, such as virus scans and backups, can also be processed in segments. For example, it is not necessary to back up the file system containing videos along with all the other files if none have been added since the last backup. As for the image files, one can easily "spin off" differential images which contain only "new" data written to the master (original) image. Differential images can be used both for safety (a "disposable" system can be quickly restored if destroyed or contaminated by a virus, as the old image can be removed and a new image created in a matter of seconds, even without automated procedures) and for quick virtual machine deployment (since the differential images can be quickly spawned in batches using a script).