COLLECTIONS OF DATA:
Collection of books
HISTORICAL COLLECTIONS:
Library of Alexandria in Ancient Egypt
Glance backward through history to the time of the Library of Alexandria, which existed in Egypt roughly 2,300 years ago. The Library of Alexandria was an early attempt to gather up, or "collect", all the written knowledge in the world. It is one of the earliest known efforts to preserve an understanding of the natural world and its history. The library existed as a physical location, full of shelves storing possibly half a million writings on scrolls. Any time you collect more objects than you can reliably track in your head, you need a system to track and organize those objects for fast search and retrieval.
MODERN-DAY LIBRARIES:
Library Card Catalog
Library's Online Database
THE WORK DBMSs DO:
Work Automated by DBMS
DEWEY DECIMAL SYSTEM:
Dewey Decimal Location Marker
Dewey Decimal Top-Level Classes
Dewey Decimal Number Decoder
However, when the DBMS stores digital objects directly in its database file, there is no need for browsing. Objects are stored at a byte offset from the start of the database file. If the digital records are each 100 bytes long and you want to retrieve the 5th record, skip the first four records and read 100 bytes of data starting 400 bytes in from the start of the database file.
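As a rough illustration of offset-based retrieval, here is a minimal Python sketch. It assumes fixed-length 100-byte records numbered from 1, so record n begins at byte (n - 1) × 100. The names `read_record` and `make_record` and the sample "book" records are made up for this example; a real DBMS file format is far more involved.

```python
import io

RECORD_SIZE = 100  # bytes per record (fixed-length records assumed)


def make_record(text):
    """Pad text with spaces to exactly RECORD_SIZE bytes (hypothetical helper)."""
    raw = text.encode("ascii")
    if len(raw) > RECORD_SIZE:
        raise ValueError("text does not fit in one record")
    return raw.ljust(RECORD_SIZE, b" ")


def read_record(db_file, record_number):
    """Return the raw bytes of a record, counting records from 1."""
    if record_number < 1:
        raise IndexError("record numbers start at 1")
    # Record 1 starts at byte 0, record 2 at byte 100, and so on.
    offset = (record_number - 1) * RECORD_SIZE
    db_file.seek(offset)
    data = db_file.read(RECORD_SIZE)
    if len(data) != RECORD_SIZE:
        raise IndexError(f"record {record_number} is past the end of the file")
    return data


# Build a tiny in-memory stand-in for a database file holding six records.
db = io.BytesIO(b"".join(make_record(f"book {n}") for n in range(1, 7)))

fifth = read_record(db, 5)
print(fifth.rstrip().decode("ascii"))
```

No scanning or browsing happens here: the position of the record is computed with one multiplication, and a single seek jumps straight to it.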
DATABASE FILE STRUCTURE:
Database File Record Offsets
Single index is smaller than full book record it points to on disk
Last_Name index shows record location
Indexes sorted alphabetically in RAM
While the database records are stored on disk, the indexes, due to their smaller size, can be stored in RAM. As each book is added to or removed from the library, the DBMS needs to update that book's entry in every index. Because the indexes are in memory, this process is much faster than if they were stored on disk. The requirement of keeping the indexes in memory is one of the main reasons database servers require lots of RAM. It should be noted that keeping indexes in addition to the records themselves is a duplication of data. Having many different indexes, say by First_Name, Last_Name, Hire_Date, and Office_location, adds to the duplication and to the work, since every time you modify or insert a record you must also update the indexes. It should also be noted that if the database records are never modified, the indexes never need to change. Performance tests are often done to determine whether adding another index will have a positive or negative effect on database performance.
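To make the trade-off concrete, here is a minimal Python sketch of sorted in-memory indexes that point at record offsets on disk. The `SortedIndex` class, the sample employee rows, and the offsets are all invented for illustration; production databases typically use structures such as B-trees rather than a plain sorted list.

```python
import bisect


class SortedIndex:
    """A sorted in-memory list of (key, record_offset) pairs (illustrative only)."""

    def __init__(self):
        self._entries = []

    def add(self, key, offset):
        # insort keeps the list sorted as each entry is inserted.
        bisect.insort(self._entries, (key, offset))

    def remove(self, key, offset):
        i = bisect.bisect_left(self._entries, (key, offset))
        if i < len(self._entries) and self._entries[i] == (key, offset):
            del self._entries[i]

    def find(self, key):
        """Return the offsets of all records whose indexed field equals key."""
        # (key,) sorts just before every (key, offset) pair with the same key.
        i = bisect.bisect_left(self._entries, (key,))
        offsets = []
        while i < len(self._entries) and self._entries[i][0] == key:
            offsets.append(self._entries[i][1])
            i += 1
        return offsets


last_name_index = SortedIndex()
hire_date_index = SortedIndex()

# Made-up sample rows: (byte offset on disk, Last_Name, Hire_Date).
employees = [
    (0, "Nguyen", "2019-04-01"),
    (100, "Adams", "2021-09-15"),
    (200, "Nguyen", "2015-01-20"),
]

# Every insert has to touch every index: this is the extra work described above.
for offset, last_name, hire_date in employees:
    last_name_index.add(last_name, offset)
    hire_date_index.add(hire_date, offset)

print(last_name_index.find("Nguyen"))  # [0, 200]
```

The lookup uses binary search on the sorted list, so it only inspects a handful of entries even when the index is large. The cost shows up in the loop: two indexes mean two updates per inserted record, and each additional index adds one more.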
TAKEAWAYS:
Modern DBMSs store either records full of metadata about entities that exist in the real world, or records that are the actual digital objects themselves. DBMSs store their data records in a database file and give humans the ability to query for a specific set of data records matching some criteria. The DBMS keeps sorted indexes in RAM to allow fast location and retrieval of the requested set of records, called a "recordset". While the term "database" may have been coined in 1962 to refer to the methods of storing and retrieving digital data, the underlying concepts, such as indexing and metadata, have existed for millennia. In future blog posts, DATABASE 102 and 103, we will investigate the database concept further. In later database posts, I will investigate the different types of databases, such as relational, NoSQL, and NewSQL, as well as their use cases.