We mentioned before that directory entries contain information about files that is not a part of the file itself, and that this information is called file metadata. There is also other information in the file system that is not about specific files and thus is not part of the directory entries. For example, where is the first directory located in the file system? We will see later that there will be other structures that will tell us things such as how to find free disk blocks. The details will vary with the particular file system, but there are always these other structures, and they are very important to the integrity of the file system. They are collectively known as file system metadata.
An OS presents an application program with an API that represents the abstraction of a file. The API has to include semantics on how the application tells the OS which portion of the file it wants to access. Different applications need different modes of access.
Initially, computer applications were designed to process information in batches that were sequenced by some key information such as a part number or customer number. Such applications needed to process files sequentially. At one time these files were literally sorted decks of punched cards and later were sorted blocks of data on a magnetic tape. The system might have an input file of transactions such as time cards and a master file such as the payroll records, both of which might be in order by the employee number. The application would start reading at the front of each file and would incrementally read each file, keeping them synchronized by the key field, in this case, the employee number. For decks of cards, the records were a fixed size.
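The synchronized pass over two key-ordered files can be sketched as follows. This is only an illustration of the idea, not any particular payroll system: the function name `apply_time_cards` and the tuple layouts are assumptions made for the example, and in-memory lists stand in for the card decks or tapes.

```python
def apply_time_cards(master, transactions):
    """Update a master file from a transaction file in one forward pass.

    Both inputs are assumed (for this sketch) to be sorted by employee number:
      master:       iterable of (employee_number, hours_to_date)
      transactions: iterable of (employee_number, hours_worked)
    Returns the updated master records and any transactions that had no
    matching master record.
    """
    updated = []
    unmatched = []
    tx_iter = iter(transactions)
    tx = next(tx_iter, None)

    for employee_number, hours_to_date in master:
        # Transactions with a smaller key have no master record.
        while tx is not None and tx[0] < employee_number:
            unmatched.append(tx)
            tx = next(tx_iter, None)
        # Apply every transaction whose key matches this master record.
        while tx is not None and tx[0] == employee_number:
            hours_to_date += tx[1]
            tx = next(tx_iter, None)
        updated.append((employee_number, hours_to_date))

    # Transactions left over after the master file ends are also unmatched.
    while tx is not None:
        unmatched.append(tx)
        tx = next(tx_iter, None)

    return updated, unmatched
```

Neither file is ever read backward, which is why this style of processing suited card decks and tape so well.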
For magnetic tape, records could be any convenient size up to some maximum that the hardware or the OS would dictate. For sequential processing on disk storage, the OS (or a software library) has to have some definition of the record size for each file, and it then has to keep track of the current position (or current record pointer) for each application that has the file open. (Note that different processes accessing the same file would probably have different current record pointers.) For normal sequential processing, the OS will increment the current record pointer for each read or write. There is usually a command in the API to reset the current record pointer to the start of the file. This is analogous to rewinding a tape to the starting position. Since the disk blocks are a fixed size and may not exactly match the record length requirements of the application, it is fairly common for the OS to combine more than one logical record into a physical data block.
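A minimal sketch of sequential access over blocked, fixed-length records might look like the following. The class `SequentialFile` and its method names are hypothetical, chosen for this example rather than taken from any real OS API, and an in-memory `io.BytesIO` stream stands in for the disk. The sketch assumes records never span two blocks, so any space left at the end of a block is simply unused.

```python
import io


class SequentialFile:
    """Sequential access to fixed-length records packed into fixed-size blocks."""

    def __init__(self, device, record_size, block_size):
        if record_size > block_size:
            raise ValueError("a record must fit inside one block in this sketch")
        self.device = device            # any seekable binary stream
        self.record_size = record_size
        self.block_size = block_size
        # Number of whole logical records that fit in one physical block.
        self.records_per_block = block_size // record_size
        # Each open of the file gets its own current record pointer.
        self.current_record = 0

    def _offset(self, record_number):
        """Byte position of a record: which block, then which slot in it."""
        block, slot = divmod(record_number, self.records_per_block)
        return block * self.block_size + slot * self.record_size

    def read_next(self):
        """Return the next record and advance, or None at end of file."""
        self.device.seek(self._offset(self.current_record))
        data = self.device.read(self.record_size)
        if len(data) < self.record_size:
            return None
        self.current_record += 1
        return data

    def write_next(self, record):
        """Write one record at the current position and advance."""
        if len(record) != self.record_size:
            raise ValueError("record has the wrong length")
        self.device.seek(self._offset(self.current_record))
        self.device.write(record)
        self.current_record += 1

    def rewind(self):
        """Reset the current record pointer to the start of the file."""
        self.current_record = 0


# Usage: 30-byte records in 100-byte blocks, so three records fit per block.
disk = io.BytesIO()
payroll = SequentialFile(disk, record_size=30, block_size=100)
for name in (b"alice", b"bob", b"carol", b"dave"):
    payroll.write_next(name.ljust(30))
payroll.rewind()
first = payroll.read_next()
```

Because the pointer lives in the `SequentialFile` object rather than in the stream, two objects opened on the same stream can read at different positions without disturbing each other.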
As disk drives became much cheaper, secondary storage migrated from magnetic tape to disk. Once the data was mostly kept online, it became possible to process each transaction as it occurred rather than accumulating transactions to be processed in sequential batches. Transaction processing is generally preferable to batch processing because it allows management to track the status of an enterprise more nearly in real time. However, this meant that the application had to access the master file data in random order rather than purely sequential order. So the file APIs were extended to include another model: random access.
In this model, the application tells the OS which record in the file it needs, and the OS moves directly to that record and accesses it for reading or writing. Usually, this requires some simple mapping of a key value to the record number. For example, a small company might simply assign employee numbers sequentially and use the employee number as the record number. In some OSs, this addressing is expressed as a record number, and in others it is expressed as a byte offset from the start of the file.
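The two addressing styles are related by simple arithmetic when records have a fixed length. The sketch below assumes a 32-byte record and uses the employee number directly as the record number; the helper names `byte_offset`, `read_record`, and `write_record` are illustrative only, and an in-memory stream stands in for the master file.

```python
import io

RECORD_SIZE = 32  # assumed fixed record length in bytes (hypothetical value)


def byte_offset(record_number, record_size=RECORD_SIZE):
    """Map a record number to a byte offset from the start of the file."""
    if record_number < 0:
        raise ValueError("record number must be non-negative")
    return record_number * record_size


def write_record(f, record_number, payload, record_size=RECORD_SIZE):
    """Overwrite one record in place, padding the payload to the record size."""
    if len(payload) > record_size:
        raise ValueError("payload is longer than the record size")
    f.seek(byte_offset(record_number, record_size))
    f.write(payload.ljust(record_size, b" "))


def read_record(f, record_number, record_size=RECORD_SIZE):
    """Go directly to one record and return it."""
    f.seek(byte_offset(record_number, record_size))
    data = f.read(record_size)
    if len(data) < record_size:
        raise IndexError("no such record")
    return data


# Usage: employee numbers 0..3 double as record numbers.
master = io.BytesIO()
for employee_number, name in enumerate([b"Avery", b"Blake", b"Casey", b"Drew"]):
    write_record(master, employee_number, name)
record = read_record(master, 2)  # reaches employee 2 without reading 0 and 1
```

An API that takes byte offsets exposes `byte_offset` to the application; one that takes record numbers performs the same multiplication internally.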
access that record in the file. If the application does a read next operation, it will get the next record. To start accessing at any point in a random access file, the OS usually provides a seek command, which positions the current record pointer at the first record that has a key value greater than or equal to a given key value. When OSs ran only one process at a time, this command would actually move the disk head to this position in the file (i.e., it would seek the physical location of the data). Now it is a logical positioning only.
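The seek-then-read-next behavior can be sketched with a sorted list of keys and a binary search. This is an assumption-laden illustration: the class `KeyedFile` is hypothetical, and the in-memory key list merely stands in for whatever index structure a real file system would maintain. The positioning is purely logical, since only the current record pointer changes.

```python
from bisect import bisect_left


class KeyedFile:
    """Records kept in key order, with a logical current record pointer."""

    def __init__(self, records):
        # records: iterable of (key, data) pairs; sort them by key only.
        self.records = sorted(records, key=lambda rec: rec[0])
        self.keys = [key for key, _ in self.records]
        self.current_record = 0

    def seek(self, key):
        """Position at the first record whose key is >= the given key."""
        self.current_record = bisect_left(self.keys, key)

    def read_next(self):
        """Return the record at the pointer and advance, or None at the end."""
        if self.current_record >= len(self.records):
            return None
        record = self.records[self.current_record]
        self.current_record += 1
        return record


# Usage: start reading at the first employee numbered 107 or higher.
employees = KeyedFile([(120, "Drew"), (100, "Avery"), (110, "Casey"), (105, "Blake")])
employees.seek(107)
found = employees.read_next()
following = employees.read_next()
```

Seeking to a key that is larger than every key in the file leaves the pointer at the end, so the next read reports end of file rather than an error.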