If your brand new 3.6GHz CPU and $500 video card are still waiting 60 seconds for a Doom 3 level to load, you already know that the hard drive is probably the most significant bottleneck in your PC.
It's all about finding the right balance.
IDE and SATA hard drives have not evolved much in terms of speed, particularly within the past 18 months. Sure, fast 10,000 rpm drives have been introduced, and manufacturers are beefing up the cache, but hard drives have not increased in speed the way CPUs and video cards have. Building an extremely fast PC takes more than just dropping in the fastest processor you can find. A CPU is one part of the equation, but you'll need a fast video card to draw the images as quickly as the CPU can feed it. A fast subsystem keeps data moving efficiently, while a slower bus will bottleneck the data flow.
One problem that exists for almost all current drives is the way they access data. Unlike RAM, hard drives are mechanical devices that are limited by the speed of their internal components when accessing data. Like a record player, the "needle" needs to move from one spot to another to retrieve information. Rotational and seek latencies are the big hurdles here, which is why many SCSI or 10,000 rpm SATA drives seem so much faster than "standard" IDE or SATA drives. Faster spinning hard drives alleviate the problem somewhat by increasing the speed of the motors, but as a quick price check will tell you, these drives are very expensive, and in order to keep them attractive price-wise, they often feature lower capacities. The larger cache mentioned earlier can also speed up data access, but it doesn't completely solve the problem, since cached data is useless if it isn't the data the program needs.
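To put the rotational part of that latency in numbers: on average, the platter has to turn half a revolution before the requested data passes under the head. Here is a rough sketch of that arithmetic. The function name is ours, and seek time is deliberately ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

A 7,200 rpm drive works out to roughly 4.2 ms on average and a 10,000 rpm drive to 3 ms, which is why the faster spindle helps, but only up to a point.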
How a Drive Accesses Data
All hard drives work the same basic way: the CPU makes a data request, the drive moves to where the data is located, retrieves it, and sends it back to the CPU. The data is stored on tracks, and unlike recordable CDs, hard drive tracks are written starting at the outer edge of the platter and work their way toward the inner edge. In a typical hard drive, data reads and writes begin on the bottom platter, referred to as Disc 0, and the first read/write head, which is head 0. After one cycle of data is complete (track 0 on head 0), the drive moves to the other side of the disc (track 0 on head 1). Once that is done, the drive moves to the next head on the second disc (track 0 on head 2). Once the last head on the last side of the last disc finishes, the cycle repeats with track 1 and head 0.
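The ordering described above amounts to two nested loops: every head takes its turn on a track before the arm moves inward to the next track. A minimal sketch, with a made-up geometry of two discs (four heads) and two tracks:

```python
from typing import Iterator, Tuple


def write_order(num_tracks: int, num_heads: int) -> Iterator[Tuple[int, int]]:
    """Yield (track, head) pairs in the order the drive fills them."""
    for track in range(num_tracks):      # track 0 is the outer edge
        for head in range(num_heads):    # head 0, head 1, ... on the same track
            yield (track, head)


# Hypothetical geometry, for illustration only
print(list(write_order(num_tracks=2, num_heads=4)))
```

Real drives have thousands of tracks per surface, but the pattern is the same.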
The set of tracks at the same position across all heads is collectively known as a cylinder. As outlined earlier, the data is written sequentially across the cylinders until the inner diameter of the disc is reached. Ideally, program data is written in order, and reads follow the order in which the data was written. This of course is rarely the case, and sometimes the data isn't where the program expects it to be, forcing the drive to look elsewhere (which makes a strong case for keeping your discs defragmented).
One issue that plagues Parallel ATA drives is that although you can speed up the mechanics, the drive still needs to be efficient at retrieving data. Ideally, a drive will know where to pick up data "A", and know where data "B" is located. It should know it needs "E" before "D", and so on. The best way to do this is through queuing, which organizes the data requests at the system bus level. Legacy Command Queuing (LCQ) has some limitations though, one of which is that the bus stays occupied until the drive completes the reordering and retrieval of data. Given the mechanical nature of hard drives, if requests are being made faster than the drive can fulfill them, we still get bottlenecked, even with more cache and faster motors.
Native Command Queuing
Native Command Queuing (NCQ) was developed to address the problems of LCQ. Introduced with the Serial ATA II spec, this is a feature that can only be found in native SATA hard drives. Unlike LCQ, NCQ allows a drive to accept multiple outstanding commands at the same time. The drive can reschedule or reorder these commands as it sees fit, and the host can issue new requests while the drive is still retrieving data for a previous one.
NCQ is tied in closely with Hyper-Threading, and combined with HT capable hardware and software, the performance differences should be quite substantial when compared to non-NCQ drives. Tagged command queuing is supported, meaning commands are reordered based on seek and rotational optimization. We mentioned that those two items are a big part of a drive's performance (or lack thereof), and by reordering the commands based on the linear and angular position of the data, the process becomes much more efficient.
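To make the idea of reordering by linear and angular position concrete, here is a toy scheduler. It is only a sketch: real drive firmware is proprietary and far more sophisticated, and the timing constants, the `Request` type, and the greedy "cheapest next" strategy are all our own assumptions.

```python
from typing import List, NamedTuple


class Request(NamedTuple):
    name: str
    track: int    # linear position: the track the arm must reach
    angle: float  # angular position of the data on the platter, in degrees


SEEK_MS_PER_TRACK = 0.01  # assumed arm travel time per track
MS_PER_REV = 8.0          # assumed time per revolution (about 7,500 rpm)


def service_time(track: int, now: float, req: Request) -> float:
    """Seek time plus the rotational wait once the head is on the track."""
    seek = abs(req.track - track) * SEEK_MS_PER_TRACK
    angle_on_arrival = ((now + seek) / MS_PER_REV * 360.0) % 360.0
    wait = ((req.angle - angle_on_arrival) % 360.0) / 360.0 * MS_PER_REV
    return seek + wait


def total_time(order: List[Request]) -> float:
    """Time to service the requests in the given order, starting at track 0."""
    track, now = 0, 0.0
    for req in order:
        now += service_time(track, now, req)
        track = req.track
    return now


def reorder(pending: List[Request]) -> List[Request]:
    """Greedy reordering: always service the currently cheapest request next."""
    pending = list(pending)
    track, now, order = 0, 0.0, []
    while pending:
        best = min(pending, key=lambda r: service_time(track, now, r))
        now += service_time(track, now, best)
        track = best.track
        pending.remove(best)
        order.append(best)
    return order


# Made-up workload, listed in arrival order
requests = [
    Request("A", 100, 90.0),
    Request("B", 900, 180.0),
    Request("C", 120, 270.0),
    Request("D", 880, 0.0),
]
print("arrival order:", total_time(requests), "ms")
print("reordered    :", total_time(reorder(requests)), "ms")
```

In this made-up workload, the reordered schedule avoids bouncing the arm back and forth across the platter and finishes noticeably sooner than servicing the requests in arrival order.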
In addition to these algorithms, NCQ is capable of communicating the status of the commands being performed at any time. This is referred to as the Race-Free Status Return Mechanism, and in essence, the drive is able to report the completion of several commands at the same time, without needing to wait for the host to check on the status.
Interrupt Aggregation is another feature, where the completion of multiple commands can be aggregated into one interrupt. Normally, the host would be interrupted once for each command, but with Interrupt Aggregation, a single interrupt can cover several completed commands.
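A simplified way to picture the saving is shown below. We assume that any command finishing while the host is still busy responding to an earlier interrupt gets folded into that same interrupt. The `host_latency_ms` parameter and the completion times are invented for illustration, and real controllers behave in more nuanced ways.

```python
from typing import List


def count_interrupts(completion_times_ms: List[float], host_latency_ms: float) -> int:
    """Count interrupts when completions inside the host's response window
    are folded into the interrupt that is already pending."""
    interrupts = 0
    covered_until = float("-inf")
    for t in sorted(completion_times_ms):
        if t > covered_until:
            interrupts += 1                       # a new interrupt is raised
            covered_until = t + host_latency_ms   # later completions ride along
    return interrupts


completions = [1.0, 1.2, 1.3, 5.0, 5.1, 9.0]  # hypothetical completion times
print(count_interrupts(completions, host_latency_ms=0.0))  # 6: one per command
print(count_interrupts(completions, host_latency_ms=0.5))  # 3: bursts are merged
```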
Finally, NCQ has the ability to set up the direct memory access (DMA) operation for a data transfer without host software intervention. First Party DMA (FPDMA) allows the drive to process a number of commands without any intervention from the CPU and/or software.
How NCQ Works
When a command is given to the hard drive, the device needs to determine whether the command is to be queued or processed right away. In order for NCQ to work efficiently, two commands were added to the SATA II specification: Read FPDMA Queued and Write FPDMA Queued. We mentioned Hyper-Threading earlier. Normally, an application requests one piece of data at a time; with Hyper-Threading, several applications can request data at once. While this can happen without Hyper-Threading, the technology allows queues to be built more efficiently.
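The queue-or-execute decision at the start of this section can be sketched as follows. The two FPDMA opcodes (60h and 61h), the plain Read DMA opcode (C8h), and the 32-entry tag limit come from the ATA command set rather than from this article. The `Drive` class and its return strings are purely hypothetical, and real drives also handle error cases that we skip here.

```python
READ_FPDMA_QUEUED = 0x60
WRITE_FPDMA_QUEUED = 0x61
READ_DMA = 0xC8  # an ordinary, non-queued read
QUEUED_OPCODES = {READ_FPDMA_QUEUED, WRITE_FPDMA_QUEUED}
MAX_TAGS = 32    # NCQ commands carry a tag from 0 to 31


class Drive:
    """Hypothetical model of a drive deciding what to do with a command."""

    def __init__(self) -> None:
        self.queue = {}  # tag -> (opcode, lba)

    def submit(self, opcode: int, lba: int, tag: int = -1) -> str:
        if opcode not in QUEUED_OPCODES:
            return "execute now"          # legacy command: no queuing
        if not 0 <= tag < MAX_TAGS or tag in self.queue:
            return "rejected"             # bad tag, or tag still in use
        self.queue[tag] = (opcode, lba)   # held until the drive schedules it
        return "queued"


drive = Drive()
print(drive.submit(READ_DMA, lba=100))
print(drive.submit(READ_FPDMA_QUEUED, lba=5000, tag=0))
print(drive.submit(WRITE_FPDMA_QUEUED, lba=42, tag=0))
```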
To simplify how NCQ works, a good example would be an elevator. Say there is a delivery person in the elevator with three packages, where each package represents a data request for an application (which would be the company the package is intended for). The packages need to go to floors 2, 3, and 4, but they are stacked in a random order on the elevator floor. The delivery person ends up dropping off the packages as they are currently stacked, which would be on floors 4, 2, and 3. Naturally, this is inefficient, and it would be better to deliver the boxes in sequential order.
To represent queuing, the delivery person sorts the packages so that they are dropped off in sequential order. Hyper-Threading can be represented by having a second delivery person sorting out the boxes while the other drops them off. In any case, this was a simplified example, but that's the general idea.
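For those who like to count, the elevator example can be worked through in a few lines. We assume the elevator starts on floor 1, which the example above doesn't actually specify.

```python
from typing import List


def floors_travelled(stops: List[int], start: int = 1) -> int:
    """Total number of floors the elevator moves to visit the stops in order."""
    total, current = 0, start
    for floor in stops:
        total += abs(floor - current)
        current = floor
    return total


stacked = [4, 2, 3]                        # order the packages happen to be in
print(floors_travelled(stacked))           # 6 floors without sorting
print(floors_travelled(sorted(stacked)))   # 3 floors once sorted
```

Sorting the packages first halves the distance travelled, and that is the kind of saving queuing aims for with the drive's heads.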
Who and What Supports NCQ?
As we've already pointed out, NCQ is only present in native SATA drives. The majority of initial SATA drives (Seagate was the exception) were not native drives at their debut, but rather IDE drives with SATA interfaces. Only drives that qualify under the Serial ATA II specification can offer NCQ. At the time of this writing, only Seagate and Maxtor offer these drives. The Seagate Barracuda 7200.7, which we'll be looking at today, is readily available, and coming soon is their Barracuda 7200.8, which increases the capacity and doubles the buffer to 16MB. Maxtor's offering is the DiamondMax 10, which should be available now, but we were unable to find any of these drives locally.
NCQ drives don't mean much without a controller, and like the drives, controller support is a little slim right now. Intel's ICH6R, which can be found with their 915 and 925X chipsets, fully supports NCQ from the get-go. Silicon Image recently demoed their SiI 3124 controller at IDF 2004, and should have their part out soon. As for Promise, Siig, NVIDIA, ATI and VIA, none of them currently have controllers that support NCQ. No doubt these controllers will come, but at the moment (as in, going to the store and buying something today), only Intel boards based on Grantsdale and Alderwood will support NCQ.
As for software support, desktop users will see minimal performance gains from "regular" applications. NCQ thrives on multithreaded and multitasking situations, and the truth is, not many desktop applications and games tap into this. Workstation and server level systems, on the other hand, may see a boost, but again, that will depend on the application.
The important thing to remember is that NCQ drives will work on all SATA controllers, but on a controller without NCQ support, the NCQ functionality will be disabled.
Now that we've discussed NCQ, let's have a look at a couple of Seagate Barracuda 7200.7s.