Data Storage Digest

What Sequential Reading Means for Data Access

Data access can be organized in a variety of ways. Most of us are already very familiar with RAM, or random access memory, which allows data to be read in any order. Sequential access memory, or SAM, by contrast, delivers data in a fixed order, one item after the next. Such behavior is typically seen in magnetic storage devices such as tape drives, and sequential access is also used with CDs, DVDs and other forms of media. What many don’t realize, however, is that sequential reading still has a number of benefits in today’s computers.

For starters, on a spinning disk a sequential read is generally faster than a random read simply because of the way the hardware operates. Most of the access time is spent on the seek operation itself, which involves positioning the disk head over the correct cylinder, so it makes sense to minimize the number of seeks that have to take place. Because random access patterns require more seek operations than sequential ones, sequential access comes out ahead in terms of throughput.
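To make the seek argument concrete, here is a minimal sketch of a cost model in Python. The timing constants and the function names are assumptions chosen for illustration, not measurements from any particular drive; the point is only that the number of head repositionings dominates the estimate.

```python
import random

# Assumed, illustrative figures; real drives vary widely.
SEEK_MS = 8.0       # cost of repositioning the head once
TRANSFER_MS = 0.1   # cost of reading one block once the head is in place


def count_seeks(block_order):
    """Count repositionings: any block that does not directly follow the previous one."""
    seeks = 0
    previous = None
    for block in block_order:
        if previous is None or block != previous + 1:
            seeks += 1
        previous = block
    return seeks


def estimated_read_ms(block_order):
    """Rough read time: one seek cost per repositioning plus one transfer cost per block."""
    return count_seeks(block_order) * SEEK_MS + len(block_order) * TRANSFER_MS


# The same 1,000 blocks, read in order and in a shuffled order.
sequential_order = list(range(1000))
random_order = sequential_order[:]
random.Random(0).shuffle(random_order)

if __name__ == "__main__":
    print("sequential seeks:", count_seeks(sequential_order))
    print("random seeks:    ", count_seeks(random_order))
    print("sequential est. ms:", estimated_read_ms(sequential_order))
    print("random est. ms:    ", estimated_read_ms(random_order))
```

Under these assumptions the sequential pass needs a single initial seek, while the shuffled pass needs a seek for nearly every block, so the estimated time differs by well over an order of magnitude even though the same data is read.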

There are some downsides to sequential data access as well. For example, locating a specific record within a file that holds thousands of records is slow when using sequential access, because every record ahead of it has to be read first. Pulling multiple scattered records from a file, whether for updating, deletion or some other purpose, is slowed down in the same way.
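The record-lookup cost can be sketched in Python as follows. The record layout (a 4-byte id followed by a 16-byte name) and the helper names are hypothetical, and the direct lookup only works under the stated assumption that records are fixed-size and stored in id order.

```python
import io
import struct

# Hypothetical fixed-size record: 4-byte unsigned id + 16-byte padded name.
RECORD = struct.Struct("<I16s")


def build_file(count):
    """Build an in-memory file of `count` records, stored in id order."""
    buf = io.BytesIO()
    for i in range(count):
        buf.write(RECORD.pack(i, f"record-{i}".encode()))
    buf.seek(0)
    return buf


def find_sequential(f, wanted_id):
    """Read from the start until the id matches. Returns (name, records_read)."""
    f.seek(0)
    reads = 0
    while True:
        chunk = f.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return None, reads  # reached end of file without a match
        reads += 1
        rec_id, name = RECORD.unpack(chunk)
        if rec_id == wanted_id:
            return name.rstrip(b"\0").decode(), reads


def find_random(f, wanted_id):
    """Jump straight to the record's byte offset. Returns (name, records_read)."""
    f.seek(wanted_id * RECORD.size)
    chunk = f.read(RECORD.size)
    if len(chunk) < RECORD.size:
        return None, 0  # offset is past the end of the file
    _rec_id, name = RECORD.unpack(chunk)
    return name.rstrip(b"\0").decode(), 1


if __name__ == "__main__":
    data = build_file(5000)
    print(find_sequential(data, 4000))
    print(find_random(data, 4000))
```

Both functions return the same record, but the sequential version reads every record ahead of the target, while the random version reads exactly one. The same `seek` and `read` calls work on a regular file opened in binary mode.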

Furthermore, sequential data access does not prevent file fragmentation. When the pieces of a file are scattered across the disk, an otherwise sequential read has to seek to each piece, much as a random read would, which ultimately reduces data throughput.
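As a rough illustration of the fragmentation effect, the sketch below models a file as a list of extents, each a hypothetical (start block, length) pair given in file order, and counts how many seeks a front-to-back read would need. The extent values are made up for the example.

```python
def seeks_for_extents(extents):
    """Count the seeks needed to read a file front to back.

    `extents` is a list of (start_block, length) pairs in file order.
    A seek is needed whenever an extent does not begin exactly where
    the previous one ended.
    """
    seeks = 0
    expected_next = None
    for start, length in extents:
        if start != expected_next:
            seeks += 1
        expected_next = start + length
    return seeks


# The same 64-block file, stored contiguously and in scattered pieces.
contiguous = [(100, 64)]
fragmented = [(100, 16), (900, 16), (40, 16), (56, 16)]

if __name__ == "__main__":
    print("contiguous:", seeks_for_extents(contiguous))
    print("fragmented:", seeks_for_extents(fragmented))
```

In this example the last two extents happen to sit next to each other on disk, so they are read with a single seek. The fragmented layout still needs three seeks where the contiguous layout needs one.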

When only specific records are needed rather than every record in order, however, random access is typically faster. Most programming languages and operating systems today support both sequential and random data access.

To maximize efficiency, some IT experts combine sequential and random access when working with data. Projects such as Kudu, a columnar storage engine for Hadoop, support both sequential and random access.

Baoqiu Cui, chief architect for the smartphone maker Xiaomi, partnered with a number of companies, including Cloudera, to pioneer the Kudu platform. He was quoted as saying: “Our infrastructure team has been working with Cloudera to develop Kudu, taking advantage of its unique ability to support columnar scans and fast inserts and updates to continue to expand our Hadoop ecosystem footprint. Using Kudu, alongside interactive SQL tools like Impala, has allowed us to build a next-generation data analytics platform for real-time analytics and online reporting.”

As you can see, sequential data access has advantages as well as disadvantages. While your current hardware may constrain you to one method or the other, systems that can use both sequential and random access typically see better performance and efficiency.
