Before discussing file organization, it helps to review a few basic details about files in general. A file is a collection of related information defined by its creator. It is the basic unit of storage that enables a computer to distinguish one set of information from another. The two most important kinds of files are program files and data files. Program files hold the code needed to access and execute a program that is stored and installed on a computer system, whereas data files contain the data and information a program needs in order to execute successfully. The most important feature of a file is its name.
Having said this, a trade-off ...
In a heap file, records are placed in the file in the same order as they are inserted. A new record is inserted in the last page of the file; if there is insufficient space in the last page, a new page is added to the file.
This makes insertion very efficient. However, because a heap file has no particular ordering with respect to field values, a linear search must be performed to access a record. A linear search involves reading pages from the file until the required record is found. This makes retrievals from heap files that have more than a few pages relatively slow, unless the retrieval involves a large proportion of the records in the file.
Heap files are one of the best organizations for bulk loading data into a table, as records are inserted at the end of the sequence; there is no overhead of calculating what page the record should go on.
Heap files are a good choice in the following situations: when data is being bulk-loaded into the relation; when the relation is only a few pages long, in which case the time to locate any tuple is short even if the entire relation has to be searched serially; and when every tuple in the relation has to be retrieved (in any order) every time the relation is accessed, for example to retrieve the names of all the students. The main drawback of heap storage is that heap files are inappropriate when only selected tuples of a relation are to be accessed.
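The heap file behaviour described above can be illustrated with a minimal Python sketch. It is a toy in-memory model, not a real storage engine: the class name `HeapFile`, the `page_capacity` parameter, and the sample student records are all assumptions made for illustration. It shows that insertion only ever touches the last page, while a search may have to read every page.

```python
class HeapFile:
    """Toy in-memory heap file: a list of pages, each holding a few records."""

    def __init__(self, page_capacity=4):
        # page_capacity is a hypothetical records-per-page limit.
        self.page_capacity = page_capacity
        self.pages = [[]]  # the file starts with a single empty page

    def insert(self, record):
        # A new record always goes into the last page; if that page is
        # full, a new page is added to the end of the file first.
        if len(self.pages[-1]) >= self.page_capacity:
            self.pages.append([])
        self.pages[-1].append(record)

    def search(self, field, value):
        # Linear search: read pages in order until the record is found.
        # Returns the record (or None) and the number of pages read.
        pages_read = 0
        for page in self.pages:
            pages_read += 1
            for record in page:
                if record.get(field) == value:
                    return record, pages_read
        return None, pages_read


heap = HeapFile(page_capacity=2)
for i, name in enumerate(["Ann", "Bob", "Cy", "Dee", "Eve"]):
    heap.insert({"id": i, "name": name})

found, pages_read = heap.search("name", "Eve")
missing, pages_scanned = heap.search("name", "Zed")
```

In this sketch the record inserted last sits in the last page, so finding it requires reading every page, and an unsuccessful search always scans the whole file. That is the cost the text refers to when it calls retrievals from larger heap files relatively slow.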
Hash File Organization
In a hash file, records are not stored sequentially; instead, a hash function is used to calculate the address of the page in which the record is to be stored.
This mechanism applies a hash function to some field of the records. A file is a collection of records that has to be mapped onto the blocks of disk space allocated to it, and this mapping is defined by the hash computation. The output of the hash function determines the disk block where a record may exist. The field on which the hash function is calculated is called the hash field, and if that field is also the key of the relation, it is called the hash key. Because records appear to be randomly distributed in the file, hash files are also called random or direct files. Commonly, some arithmetic function is applied to the hash field so that records are evenly distributed throughout the file.
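As a rough illustration of the mapping from hash field to page, the following Python sketch uses the division-remainder method (key modulo the number of pages) as the arithmetic function. That choice, the class name `HashFile`, the `num_pages` parameter, and the sample records are assumptions for the example; the text does not prescribe a particular hash function. Overflow handling is deliberately ignored, so each page here can hold any number of records.

```python
class HashFile:
    """Toy in-memory hash file with a fixed number of pages (buckets)."""

    def __init__(self, num_pages=4):
        # num_pages is a hypothetical, fixed number of disk pages.
        self.pages = [[] for _ in range(num_pages)]

    def page_address(self, key):
        # Division-remainder hashing on an integer hash field (assumed).
        return key % len(self.pages)

    def insert(self, key, record):
        # The hash of the key decides which page stores the record.
        self.pages[self.page_address(key)].append((key, record))

    def lookup(self, key):
        # Only the one page selected by the hash function is examined.
        for stored_key, record in self.pages[self.page_address(key)]:
            if stored_key == key:
                return record
        return None


hash_file = HashFile(num_pages=4)
for key, name in [(10, "Ann"), (21, "Bob"), (32, "Cy"), (43, "Dee"), (14, "Eve")]:
    hash_file.insert(key, name)

bob = hash_file.lookup(21)
absent = hash_file.lookup(99)
```

Keys 10 and 14 both hash to page 2 in this example, which shows why a real hash file also needs a policy for pages that fill up. A lookup reads a single page regardless of file size, in contrast to the linear search a heap file requires.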
Index Sequential Access Method (ISAM) File Organization
In an ISAM system, data is organized into records that are composed of fixed-length fields. Records are stored sequentially, originally to speed up access on tape systems. A secondary set of tables known as indexes contains "pointers" into the data, allowing individual records to be retrieved without having to search the entire data set. The key improvement in ISAM is that the indexes are small and can be searched quickly, thereby allowing the database to access only the records it needs. Additionally, modifications to the data do not require changes to other data, only to the table and indexes in question. When an ISAM file is created, index nodes are fixed, and their...