File systems are an integral part of any operating system with the capacity for long-term storage. A file system has two distinct parts: the mechanism for storing files and the directory structure into which they are organized. In modern operating systems, where several users can access the same files simultaneously, it has also become necessary to implement features such as access control and other forms of file protection.
A file is a collection of binary data. A file could represent a program, a document, or in some cases, part of the file system itself. In modern computing, it is quite common for several different storage devices to be attached to the same computer. A common data structure such as a file system allows the computer to access many different storage devices in the same way; for example, when you look at the contents of a hard drive or a CD, you view them through the same interface even though they are completely different media with data mapped onto them in completely different ways. Files can have very different data structures within them but can all be accessed by the same methods built into the file system. The arrangement of data within a file is decided by the program that creates it. The file system also stores several attributes for each of the files within it.
All files have a name by which the user can access them. In most modern file systems, the name consists of three parts: a unique name, a period, and an extension. For example, the file ‘bob.jpg’ is uniquely identified by the first word ‘bob’; the extension ‘jpg’ indicates a JPEG image file. The file extension allows the operating system to decide what to do with the file if someone tries to open it. The operating system maintains a list of file extension associations. Should a user try to open ‘bob.jpg’, it would most likely be opened in the system’s default image viewer.
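As a rough illustration of extension associations, the following Python sketch splits a file name into its parts and looks up a default application. The `ASSOCIATIONS` table and the function names are hypothetical; a real operating system keeps its own, much larger registry.

```python
import os.path

# Hypothetical association table, for illustration only.
ASSOCIATIONS = {
    ".jpg": "image viewer",
    ".txt": "text editor",
    ".mp3": "media player",
}

def split_name(filename):
    """Split a name such as 'bob.jpg' into its base name and its extension."""
    base, ext = os.path.splitext(filename)
    return base, ext.lower()

def default_application(filename):
    """Return the application registered for the file's extension, if any."""
    _, ext = split_name(filename)
    return ASSOCIATIONS.get(ext, "unknown: ask the user")
```

Calling `default_application("bob.jpg")` looks up the `.jpg` entry; an unregistered extension falls through to the fallback string.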
The system also stores the location of a file. In some file systems, files can only be stored as one contiguous block. This simplifies storage and access, as the system only needs to know where the file begins on the disk and how large it is. However, it leads to complications if the file is to be extended, as there may not be enough contiguous space available to fit the larger version, or removed, as the gap left behind may be too small for later files. Most modern file systems overcome this problem by using linked file allocation. This allows the file to be stored in any number of segments. The file system then has to store where every segment of the file is and how large each one is. This greatly simplifies file space allocation but is slower than contiguous allocation, as the file can be spread out all over the disk. Modern operating systems reduce this cost by providing a disk defragmenter. This utility rearranges the files on the disk so that each one occupies contiguous blocks.
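The difference between the two allocation strategies can be sketched on a toy disk, modelled here as a Python list in which `True` marks a free block. Both function names and the disk model are assumptions made for illustration, not how any particular file system is implemented.

```python
def allocate_contiguous(free, n_blocks):
    """Claim the first run of n_blocks adjacent free blocks.

    Returns the index of the first block, or None if no run is long enough.
    """
    run_start, run_len = None, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n_blocks:
                for j in range(run_start, run_start + n_blocks):
                    free[j] = False
                return run_start
        else:
            run_len = 0
    return None

def allocate_linked(free, n_blocks):
    """Claim any n_blocks free blocks, wherever they are on the disk.

    Returns the list of block indices, or None if too few blocks are free.
    """
    chosen = [i for i, is_free in enumerate(free) if is_free][:n_blocks]
    if len(chosen) < n_blocks:
        return None
    for i in chosen:
        free[i] = False
    return chosen
```

On a fragmented disk such as `[True, False, True, True, False, True]`, a three-block file cannot be placed contiguously even though four blocks are free, while the linked strategy succeeds by using scattered blocks.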
Information about file protection is also integrated into the file system. Protection can range from the simple systems implemented in the FAT system of early Windows, where files could be marked as read-only or hidden, to the more secure systems implemented in NTFS, where the file system administrator can set up separate read and write access rights for different users or user groups. Although file protection adds a great deal of complexity and potential difficulties, it is essential in an environment where many different computers or users can access the same drives via a network or a time-shared system such as raptor.
Some file systems also store data about which user created a file and at what time they created it. Although this is not essential to the running of the file system, it is useful to the system’s users.
For a file system to function properly, it needs several defined operations to create, open, and edit a file. Almost all file systems provide the same basic set of methods for manipulating files.
A file system must be able to create a file. To do this, there must be enough space left on the drive to fit the file. There must also be no other file with the same name in the directory in which it is to be placed. Once the file is created, the system will make a record of all the attributes noted above.
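The checks involved in creating a file can be sketched as follows. This is a minimal in-memory model under stated assumptions: the directory is a Python dict, free space is a list of block numbers, and the attribute names (`blocks`, `owner`, `created`, `read_only`) are hypothetical.

```python
import time

def create_file(directory, free_blocks, name, size_in_blocks, owner):
    """Add a new entry to `directory`, taking blocks from `free_blocks`."""
    # The name must not already be in use in this directory.
    if name in directory:
        raise FileExistsError(name)
    # There must be enough free space for the file.
    if size_in_blocks > len(free_blocks):
        raise OSError("not enough free space")
    blocks = [free_blocks.pop() for _ in range(size_in_blocks)]
    # Record the attributes discussed earlier: location, owner, time, protection.
    directory[name] = {
        "blocks": blocks,
        "owner": owner,
        "created": time.time(),
        "read_only": False,
    }
    return directory[name]
```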
Once a file has been created, we may need to edit it. This may be simply appending some data to the end of it or removing or replacing data already stored within it. When doing this, the system keeps a write pointer marking where the next write operation to the file should take place.
For a file to be useful, it must, of course, be readable. To read a file, all you need to know is its name and path. From this, the file system can ascertain where on the drive the file is stored. While reading a file, the system keeps a read pointer, which records which part of the drive is to be read next.
In some cases, it is not possible to read all of a file into memory. File systems therefore allow you to reposition the read pointer within a file. To perform this operation, the system needs to know how far into the file you want the read pointer to jump. An example of where this is useful is a database system. When a query is made on the database, it is obviously inefficient to read the whole file up to the point where the required data is. Instead, the application managing the database determines where in the file the required piece of data is and jumps to it. This operation is often known as a file seek.
File systems also allow you to delete files. To do this, the system needs to know the name and path of the file. To delete a file, the system simply removes its entry from the directory structure and adds all the space it previously occupied to the free space list (or whatever other free space management system it uses).
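The read pointer, write pointer, and seek operation can be observed directly through Python's built-in file objects. The sketch below assumes a hypothetical file of fixed-size records (`RECORD_SIZE` bytes each), so that the position of record number n is simply n times the record size; the function names are illustrative.

```python
import os
import tempfile

RECORD_SIZE = 8  # assumed fixed record length in bytes

def write_records(path, records):
    """Write each record padded to RECORD_SIZE; return the final write position."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record.ljust(RECORD_SIZE, b" "))
        return f.tell()  # the write pointer after the last write

def read_record(path, index):
    """Seek straight to record `index` instead of reading everything before it."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)  # reposition the read pointer
        return f.read(RECORD_SIZE).rstrip(b" ")

def demo():
    """Write three records to a temporary file and fetch the last one directly."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        end = write_records(path, [b"alice", b"bob", b"carol"])
        return end, read_record(path, 2)
    finally:
        os.remove(path)
```

Here `demo()` writes 24 bytes in total and then jumps to byte 16 to read the third record without touching the first two.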
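Deletion can be sketched with a small in-memory model: the directory is a dict and the free space list is a plain list of block numbers. Both structures and the `blocks` attribute are assumptions for illustration. Note that nothing in the sketch erases the file's data; only the bookkeeping changes.

```python
def delete_file(directory, free_blocks, name):
    """Remove `name` from the directory and return its blocks to the free list."""
    # Removing the directory entry makes the file unreachable by name.
    entry = directory.pop(name)  # raises KeyError if no such file exists
    # The blocks it occupied become available for new files.
    free_blocks.extend(entry["blocks"])
    return entry["blocks"]
```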
These are the most basic operations required by a file system to function properly. They are present in all modern computer file systems, but the way they function may vary. For example, performing the delete operation in a modern file system like NTFS, which has file protection built into it, is more complicated than the same operation in an older file system like FAT. Both systems would first check whether the file was in use before continuing; NTFS would then have to check whether the user deleting the file has permission to do so. Some file systems also allow multiple people to open the same file simultaneously and must decide whether users have permission to write a file back to the disk while other users have it open. If two users have read and write permission to a file, should one be allowed to overwrite it while the other still has it open? Or if one user has read-write permission and another only has read permission on a file, should the user with write permission be allowed to overwrite it, given that there is no chance of the other user also trying to do so?
Different file systems also support different access methods. The simplest method of accessing information in a file is sequential access. This is where the information in a file is accessed from the beginning, one record at a time. To change the file position, the file can be rewound or advanced by several records, or reset to the beginning. This access method is based on file storage systems for tape drives, but it works as well on sequential-access devices (like modern DAT tape drives) as it does on random-access ones (like hard drives). Although this method is straightforward in its operation and ideally suited to certain tasks such as playing media, it is very inefficient for more complex tasks such as database management.
A more modern approach that better facilitates reading tasks that aren’t likely to be sequential is direct access. Direct access allows records to be read or overwritten in any order the application requires. This method is better suited to modern hard drives, as they allow any part of the drive to be read in any order with little reduction in transfer rate. Direct access is better suited to most applications than sequential access, because it is designed around the most common storage medium in use today instead of one that is rarely used anymore except for large offline backups. Given the way direct access works, it is also possible to build other access methods on top of it, such as sequential access, or an index of all the file’s records to speed up finding data in the file.
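An index built on top of direct access can be sketched as follows. The record format (one comma-separated record per line, keyed by its first field) and the function names are assumptions made for illustration. The index is built with one sequential pass; afterwards any record can be fetched with a single seek.

```python
import io

def build_index(f):
    """Scan the file once, recording the byte offset at which each record starts."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.split(b",", 1)[0]  # first field is assumed to be the key
        index[key] = offset
    return index

def lookup(f, index, key):
    """Jump directly to the record for `key` using the index."""
    f.seek(index[key])
    return f.readline().rstrip(b"\n")

def demo():
    """Index a small in-memory file and fetch one record by key."""
    f = io.BytesIO(b"alice,30\nbob,25\ncarol,41\n")
    index = build_index(f)
    return index, lookup(f, index, b"bob")
```

An in-memory `io.BytesIO` object stands in for a file on disk here; the same `seek`, `tell`, and `readline` calls work on a file opened in binary mode.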
On top of storing and managing files on a drive, the file system also maintains a directory system in which the files are referenced. Modern hard drives store hundreds of gigabytes. The file system helps organize this data by dividing it up into directories. A directory can contain files or more directories. As with files, there are several basic operations that a file system needs to perform on its directory structure to function properly.
The file system needs to be able to create a file. This is also covered by the overview of file operations above, but as well as being created, the file needs to be added to the directory structure.
When a file is deleted, the space taken up by the file needs to be marked as free space. The file itself also needs to be removed from the directory structure.
Files may need to be renamed. This requires an alteration to the directory structure but the file itself remains unchanged.
The file system needs to be able to list a directory. To use the disk properly, the user needs to know what is in each of the directories stored on it. On top of this, the user needs to be able to browse through the directories on the hard drive.
Since the first directory structures were designed, they have gone through several large evolutions. Before directory structures were applied to file systems, all files were stored on the same level. This is basically a system with one directory in which all the files are kept. The next advancement, which would be considered the first directory structure, is the two-level directory. In this structure, there is a single list of directories that are all on the same level, and the files are stored in these directories. This allows different users and applications to store their files separately. After this came the first directory structures as we know them today: directory trees. Tree-structured directories improve on two-level directories by allowing both directories and files to be stored in directories. All modern file systems use tree-structured directories, but many have additional security features built on top of them.
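A directory tree can be modelled, as a rough sketch, with nested Python dicts: a dict stands for a directory and any other value stands for a file's contents. This representation and the function names are assumptions for illustration. Resolving a path means walking down the tree one component at a time.

```python
def resolve(root, path):
    """Walk from the root to the node named by a '/'-separated path."""
    node = root
    for part in [p for p in path.split("/") if p]:
        # Only directories (dicts) can contain further entries.
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node

def list_dir(root, path):
    """Return the sorted names of the entries in the directory at `path`."""
    node = resolve(root, path)
    if not isinstance(node, dict):
        raise NotADirectoryError(path)
    return sorted(node)

# A small example tree: directories may hold files or more directories.
EXAMPLE_ROOT = {
    "home": {
        "alice": {"bob.jpg": b"image data"},
        "carol": {},
    },
    "etc": {},
}
```

With this model, a single-level system is a root containing only files, and a two-level system is a root whose directories contain only files; the tree simply removes that depth limit.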
Protection can be implemented in many ways. Some file systems allow you to have password-protected directories. In this system, the file system won’t allow you to access a directory until you have given a username and password for it. Others extend this system by giving different users or groups their own access permissions. The operating system requires the user to log in before using the computer and then restricts their access to areas they don’t have permission for. The system used by the computer science department for storage space and coursework submission on raptor is a good example of this. In a file system like NTFS, all types of storage space, network access, and devices such as printers can be controlled in this way. Other types of access control can also be implemented outside of the file system. For example, applications such as WinZip allow you to password-protect files.
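Per-user and per-group permissions can be sketched with a simple access control list. This is a deliberately simplified model, not the actual NTFS mechanism: the list is assumed to be a dict mapping user or group names to a set of granted rights, and all names are hypothetical.

```python
def can_access(acl, user, groups, wanted):
    """Return True if `user`, or any group the user belongs to, holds `wanted`."""
    rights = set(acl.get(user, set()))
    for group in groups:
        rights |= acl.get(group, set())
    return wanted in rights

# Hypothetical permissions on one coursework directory.
COURSEWORK_ACL = {
    "alice": {"read", "write"},  # the owner may read and write
    "students": {"read"},        # members of this group may only read
}
```

In this sketch, a user who is neither listed by name nor a member of a listed group is granted nothing, which matches the idea of restricting users to the areas they have permission for.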