File systems are integral to any operating system with long-term storage capacity. A file system has two distinct parts: the mechanism for storing files and the directory structure into which they are organized. In modern operating systems where several users can access the same files simultaneously, it has also become necessary for such features as access control and different forms of file protection to be implemented.
A file is a collection of binary data. A file could represent a program, a document, or, in some cases, part of the file system itself. In modern computing, it is common for several different storage devices to be attached to the same computer. A common data structure such as a file system allows the computer to access many different storage devices in the same way; for example, when you look at the contents of a hard drive or a CD, you view them through the same interface even though they are completely different media with data mapped onto them in completely different ways. Files can have very different data structures within them but can all be accessed by the same methods built into the file system. The arrangement of data within a file is decided by the program that creates it. The file system also stores several attributes for the files within it.
All files have a name by which the user can access them. In most modern file systems, the name consists of a unique identifier, a period, and an extension. For example, the file ‘bob.jpg’ is uniquely identified by the first word ‘bob’; the extension ‘jpg’ indicates a JPEG image file. The file extension allows the operating system to decide what to do with the file if someone tries to open it. The operating system maintains a list of file extension associations. Should a user try to open ‘bob.jpg’, it would most likely be opened in whatever the system’s default image viewer is.
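The extension lookup described above can be sketched in a few lines. The association table and the name default_handler are illustrative assumptions; a real operating system keeps its own, much larger registry.

```python
import os.path

# Hypothetical association table, for illustration only.
ASSOCIATIONS = {
    ".jpg": "image viewer",
    ".txt": "text editor",
}

def default_handler(filename):
    """Return the kind of application associated with the file's extension."""
    _, ext = os.path.splitext(filename)          # "bob.jpg" -> ("bob", ".jpg")
    return ASSOCIATIONS.get(ext.lower(), "unknown")

print(default_handler("bob.jpg"))
```

A file with no extension, or one that is not in the table, falls through to "unknown", which is roughly the point at which a desktop system would ask the user which program to use.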
The system also stores the location of a file. In some file systems, a file can only be stored as one contiguous block. This simplifies storage and access, as the system only needs to know where the file begins on the disk and how large it is. However, it leads to complications if the file is to be extended, as there may not be enough free space next to it to fit the larger version. Most modern file systems overcome this problem by using linked file allocation. This allows the file to be stored in any number of segments. The file system then has to record where every block of the file is and how large it is. This greatly simplifies file space allocation but is slower than contiguous allocation, as the file can be spread out all over the disk. Modern operating systems mitigate this flaw by providing a disk defragmenter. This utility rearranges the files on the disk so that each one occupies contiguous blocks.
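The difference between the two allocation strategies can be illustrated with a toy free-block map. This is a minimal sketch under simplifying assumptions (the disk is a list of booleans, and the function names are invented for the example), not how any particular file system does it.

```python
def allocate_contiguous(free, size):
    """Find the first run of `size` consecutive free blocks; return its start or None."""
    run_start, run_len = None, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == size:
                for j in range(run_start, run_start + size):
                    free[j] = False          # mark the run as used
                return run_start
        else:
            run_len = 0                      # the run was broken by a used block
    return None

def allocate_linked(free, size):
    """Take any `size` free blocks; return the chain of block numbers or None."""
    chosen = [i for i, is_free in enumerate(free) if is_free][:size]
    if len(chosen) < size:
        return None
    for i in chosen:
        free[i] = False
    return chosen

disk = [True, False, True, True, False, True]    # True means the block is free
print(allocate_contiguous(list(disk), 3))        # no run of three free blocks exists
print(allocate_linked(list(disk), 3))            # any three free blocks will do
```

On this fragmented disk the contiguous request for three blocks fails even though four blocks are free, while the linked request succeeds by scattering the file across blocks 0, 2, and 3.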
Information about file protection is also integrated into the file system. Protection can range from the simple systems implemented in the FAT system of early Windows, where files could be marked as read-only or hidden, to the more secure systems implemented in NTFS, where the file system administrator can set up separate read and write access rights for different users or user groups. Although file protection adds great complexity and potential difficulties, it is essential in an environment where many different computers or users can access the same drives via a network or a time-shared system such as Raptor.
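The two ends of that range can be contrasted in a short sketch. The flag names, the access-list layout, and the user and group names are assumptions made for illustration; they are not the actual on-disk formats of FAT or NTFS.

```python
from enum import Flag, auto

class Attr(Flag):
    """Simple per-file attribute flags, in the spirit of read-only and hidden."""
    NONE = 0
    READ_ONLY = auto()
    HIDDEN = auto()

def may_write_flags(attrs):
    """With flags alone, the answer is the same for every user."""
    return Attr.READ_ONLY not in attrs

def may_access(acl, user, groups, right):
    """With an access list, the answer depends on the user and their groups."""
    principals = {user, *groups}
    return any(right in acl.get(p, set()) for p in principals)

# Hypothetical access list: alice may read and write, the students group may only read.
acl = {"alice": {"read", "write"}, "students": {"read"}}
print(may_write_flags(Attr.READ_ONLY | Attr.HIDDEN))
print(may_access(acl, "bob", ["students"], "read"))
```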
Some file systems also store data about which user created a file and when they created it. Although this is not essential to running the file system, it is useful to its users.
For a file system to function properly, it needs several defined operations to create, open, and edit a file. Almost all file systems provide the same basic methods for manipulating files.
A file system must be able to create a file. To do this, there must be enough free space on the drive to fit the file. There must also be no other file with the same name in the directory in which it is to be placed. Once the file is created, the system records all of the attributes described above.
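The two preconditions for creating a file can be sketched as follows. The directory is modelled as a dictionary and free space as a list of block numbers; both are simplifying assumptions, as is the function name.

```python
def create_file(directory, free_blocks, name, blocks_needed):
    """Create an entry for `name`, or raise if a precondition fails."""
    if name in directory:
        raise FileExistsError(name)                 # name already used in this directory
    if blocks_needed > len(free_blocks):
        raise OSError("not enough free space")
    allocated = [free_blocks.pop(0) for _ in range(blocks_needed)]
    directory[name] = {"blocks": allocated}         # record the file's attributes
    return allocated

directory = {}
free_blocks = [0, 1, 2, 3]
create_file(directory, free_blocks, "bob.jpg", 3)
```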
Once a file has been created, we may need to edit it. This may be simply appending some data to the end of it or removing or replacing data already stored within it. When doing this, the system keeps a write pointer marking where the next write operation to the file should take place.
For a file to be useful, it must, of course, be readable. To do this, you need to know the name and path of the file. From this, the file system can ascertain where the file is stored on the drive. While reading a file, the system keeps a read pointer. This stores which part of the drive is to be read next.
In some cases, reading all of a file into memory is impossible. File systems therefore also allow you to reposition the read pointer within a file. To perform this operation, the system must know how far into the file you want the read pointer to jump. An example of where this would be useful is a database system. When a query is made on the database, reading the whole file up to the point where the required data is stored would be inefficient; instead, the application managing the database determines where the necessary piece of data is in the file and jumps to it. This operation is often known as a file seek.
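A seek is easy to demonstrate with fixed-size records. The sketch below assumes 8-byte records and uses an in-memory stream in place of a file on disk; the same seek, read, and tell calls work on a file opened in binary mode.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

def read_record(f, index):
    """Jump straight to record `index` instead of reading everything before it."""
    f.seek(index * RECORD_SIZE)      # reposition the read pointer
    return f.read(RECORD_SIZE)

# An in-memory stream stands in for a file on disk.
data = io.BytesIO(b"record00record01record02record03")
third = read_record(data, 2)
position = data.tell()               # the pointer now sits just past the record read
```

Because every record has the same length, the byte offset of a record is simply its index multiplied by the record size, which is what makes the jump possible without scanning.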
File systems also allow you to delete files. To do this, the file system needs to know the name and path of the file. To delete a file, it removes the file's entry from the directory structure and adds all the space the file previously occupied to the free space list (or whatever other free space management system it uses).
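Deletion as described here touches two structures: the directory and the free space list. A minimal sketch, assuming the directory maps names to lists of block numbers:

```python
def delete_file(directory, free_list, name):
    """Remove `name` from the directory and return its blocks to the free list."""
    if name not in directory:
        raise FileNotFoundError(name)
    blocks = directory.pop(name)     # drop the directory entry
    free_list.extend(blocks)         # the blocks can now be reused
    free_list.sort()

directory = {"bob.jpg": [4, 7, 9], "notes.txt": [1]}
free_list = [0, 2, 3]
delete_file(directory, free_list, "bob.jpg")
```

Note that nothing is erased in this sketch: the blocks are only marked as reusable, and their old contents stay in place until something else is written over them.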
These are the most basic operations a file system requires to function properly. They are present in all modern computer file systems, but the way they work may vary. For example, performing the delete file operation in a current system like NTFS, which has file protection built into it, is more complicated than performing it in an older file system like FAT. Both systems would first check whether the file was in use before continuing; NTFS would then also have to check whether the user deleting the file has permission to do so. Some file systems also allow multiple people to open the same file simultaneously and must decide whether users have permission to write a file back to the disk if other users currently have it open. If two users have read and write permission to a file, should one be allowed to overwrite it while the other still has it open? Or if one user has read-write permission and another only has read permission on a file, should the user with write permission be allowed to overwrite it, given that there is no chance of the other user also trying to do so?
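The extra step that a permission-aware file system performs on delete can be sketched as below. The entry layout and the check_permissions switch are hypothetical, chosen only to put the two behaviours side by side.

```python
def try_delete(entry, user, check_permissions):
    """Return a reason string if deletion is refused, or None if it may proceed."""
    if entry["open_by"]:
        return "file is in use"                  # both kinds of system check this first
    if check_permissions and user not in entry["may_delete"]:
        return "permission denied"               # only the protected system checks this
    return None

entry = {"open_by": [], "may_delete": {"alice"}}
print(try_delete(entry, "bob", check_permissions=False))
print(try_delete(entry, "bob", check_permissions=True))
```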
Different file systems also support different access methods. The simplest way of accessing information in a file is sequential access. This is where the information in a file is accessed from the beginning, one record at a time. To change position, the file can be rewound or forwarded by a number of records, or reset to the beginning. This access method is based on file storage systems for tape drives, but it works as well on sequential access devices (like modern DAT tape drives) as it does on random-access ones (like hard drives). Although this method is straightforward in operation and ideally suited to certain tasks such as playing media, it is inefficient for more complex tasks such as database management.
A more modern approach, which better supports reads that aren’t likely to be sequential, is direct access. Direct access allows records to be read or written in any order the application requires. This method is better suited to modern hard drives, as they allow any part of the drive to be read in any order with little reduction in transfer rate. Direct access is better suited to most applications than sequential access because it is designed around the most common storage medium today instead of one that isn’t used much anymore except for large offline backups. Given how direct access works, it is also possible to build other access methods on top of it, such as sequential access, or an index of all the file’s records that speeds up finding data in the file.
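An index built on top of direct access might look like the following sketch. It assumes a simple text layout of one comma-separated record per line with the key first, and again uses an in-memory stream in place of a real file.

```python
import io

def build_index(f):
    """Scan the file once, recording the byte offset at which each record starts."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.split(b",", 1)[0]     # the key is the text before the first comma
        index[key] = offset
    return index

def lookup(f, index, key):
    """Use direct access to jump to a record without reading the ones before it."""
    f.seek(index[key])
    return f.readline().rstrip(b"\n")

table = io.BytesIO(b"ann,42\nbob,17\ncy,99\n")
index = build_index(table)
record = lookup(table, index, b"bob")
```

Building the index costs one sequential pass, after which each lookup is a single seek followed by a single read.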
On top of storing and managing files on a drive, the file system also maintains a directory system in which the files are referenced. Modern hard drives hold hundreds of gigabytes. The file system helps organize this data by dividing it up into directories. A directory can contain files or more directories. As with files, there are several basic operations that a file system needs to perform on its directory structure to function properly.
It needs to be able to create a file. This was also covered in the overview of file operations, but as well as being created, the file needs to be added to the directory structure.
When a file is deleted, the space taken up by the file needs to be marked as free space. The file itself also needs to be removed from the directory structure.
Files may need to be renamed. This requires altering the directory structure, but the file remains unchanged.
It needs to be able to list a directory. To use the disk properly, the user must know what is in each of its directories. On top of this, the user needs to be able to browse through the directories on the hard drive.
Since the first directory structures were designed, they have undergone several large evolutions. Before directory structures were applied to file systems, all files were stored on the same level. Such a system effectively has one directory in which all the files are kept. The next advancement, which would be considered the first directory structure, is the two-level directory. Here there is a single list of directories, all on the same level, and the files are stored in these directories. This allows different users and applications to store their files separately. After this came the first directory structures as we know them today: directory trees. Tree-structured directories improve on two-level directories by allowing both directories and files to be stored in directories. All modern file systems use tree-structured directories, but many have additional security features built on top of them.
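A tree-structured directory can be modelled with nested dictionaries, where a dictionary is a directory and anything else is a file. This is only a sketch of the idea; the names resolve and list_directory, and the sample tree, are invented for the example.

```python
def resolve(root, path):
    """Walk the tree one path component at a time and return the node reached."""
    node = root
    for part in path.strip("/").split("/"):
        if part == "":
            continue                         # the path "/" names the root itself
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node

def list_directory(root, path):
    """Return the sorted names of the entries in the directory at `path`."""
    node = resolve(root, path)
    if not isinstance(node, dict):
        raise NotADirectoryError(path)
    return sorted(node)

tree = {"home": {"alice": {"bob.jpg": b"...", "docs": {}}}, "tmp": {}}
print(list_directory(tree, "/home/alice"))
```

A two-level structure is the special case in which the dictionaries directly under the root contain only files, and a single-level structure is the case in which the root contains only files.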
Protection can be implemented in many ways. Some file systems allow you to have password-protected directories. In this system, the file system won’t allow you to access a directory until you have given a username and password. Others extend this system by giving different users or groups their own access permissions. The operating system requires users to log in before using the computer and then restricts their access to areas they don’t have permission for. The system used by the computer science department for storage space and coursework submission on Raptor is a good example of this. In a file system like NTFS, all types of storage space, network access, and printers can be controlled this way. Other types of access control can also be implemented outside of the file system. For example, applications such as WinZip allow you to password-protect files.