What is a storage bottleneck? And how can you avoid it?
Thomas Pavel, EMEA Storage Sales Director at Avago Technologies, told us about the strains caused by the data deluge and how your organisation can avoid them.
TechRadar Pro: What are the biggest challenges of the data deluge?
Thomas Pavel: The volumes of published information and data continue to grow unabated, fuelled by demanding applications like business analytics, social media, video streaming and grid computing. Many organisations, regardless of their area of business, want insight from new and unstructured sources such as news reporting, web usage trends and social media chatter.
The ability to access and retrieve data quickly is also a major factor contributing to business success and customer satisfaction. But there's a lot of data to handle. Just keeping up with this relentless growth is challenge enough, but organisations must also work out how to deal with such vast volumes of data cost-effectively and, perhaps most importantly, how to maintain or even improve storage performance.
TRP: What is a storage bottleneck? Where and when do bottlenecks tend to occur?
TP: As the volume of data increases, so too can the time it takes to access it. The point in the system that limits access in this way is known as a 'bottleneck'. There are many potential locations for 'pain points' or bottlenecks in an enterprise system, so locating the bottleneck is not always simple. Addressing the bottleneck and maintaining performance is the rationale behind continual advances in storage technologies today.
When designing storage systems for performance, it is essential to understand where the bottlenecks can occur. This is especially true given that the bottlenecks change with each new generation of technology along the data storage path.
The three most critical elements that affect storage performance are the server's Peripheral Component Interconnect Express (PCIe®) bus, the SAS solution as implemented in host bus adapters (HBAs) and expanders, and the disk drives themselves, which can have either a SAS or a SATA interface.
Storage bottlenecks migrate among the successive generations of the various technologies involved end-to-end. With the advent of third generation PCIe, for example, second generation SAS became the new storage bottleneck. Third generation SAS is now able to take full advantage of third generation PCIe's performance, making PCIe the new bottleneck in systems using 12Gb/s SAS.
TRP: What guidelines can we use to maximise storage system performance?
TP: When designing a storage system for high performance, it is necessary to understand the throughput limitations of each element. Critical applications must also scale easily over time while remaining both highly protected and easily manageable.
SAS is now in its third generation, and performance has doubled with each new generation, from the original 3Gb/s to 6Gb/s and now 12Gb/s. SAS, like PCIe, uses lanes, and high performance storage systems normally aggregate multiple SAS lanes to support high data rates.
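The lane arithmetic can be sketched in a few lines of Python. This is a rough illustration rather than a sizing tool: the per-lane figures assume 8b/10b line encoding for SAS and PCIe 2.0 and 128b/130b for PCIe 3.0, protocol overhead is ignored, and the function and table names are hypothetical.

```python
# Usable bandwidth per lane in MB/s after line encoding only.
# Assumption: protocol overhead is ignored, so real-world figures are lower.

def lane_mb_per_s(gigabits_per_s, payload_bits, total_bits):
    """Convert a raw line rate into usable MB/s for a single lane."""
    return gigabits_per_s * 1000 / 8 * payload_bits / total_bits

# SAS generations named in the interview (8b/10b encoding assumed).
SAS_LANE = {rate: lane_mb_per_s(rate, 8, 10) for rate in (3, 6, 12)}

# PCIe 2.0 runs at 5 GT/s with 8b/10b; PCIe 3.0 at 8 GT/s with 128b/130b.
PCIE_LANE = {2: lane_mb_per_s(5, 8, 10), 3: lane_mb_per_s(8, 128, 130)}

def find_bottleneck(sas_gbps, sas_lanes, pcie_gen, pcie_lanes):
    """Return the slower of the two links and its aggregate MB/s."""
    sas_total = SAS_LANE[sas_gbps] * sas_lanes
    pcie_total = PCIE_LANE[pcie_gen] * pcie_lanes
    if sas_total < pcie_total:
        return "SAS", sas_total
    return "PCIe", pcie_total

if __name__ == "__main__":
    # Eight lanes of 6Gb/s SAS behind a PCIe 3.0 x8 slot
    print(find_bottleneck(6, 8, 3, 8))
    # Eight lanes of 12Gb/s SAS behind the same slot
    print(find_bottleneck(12, 8, 3, 8))
```

Under these assumptions, eight 6Gb/s SAS lanes are slower than a PCIe 3.0 x8 slot, while eight 12Gb/s lanes are faster than it, which is the migration of the bottleneck from SAS to PCIe described above.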
TRP: Does the storage bottleneck change with different system configurations?
TP: This table provides a summary of some sample configurations showing where the bottleneck exists when configured with a "full complement" of disks (the slowest element in the system). As shown, the need to support more disks (for capacity) requires the use of later generations of SAS and/or PCIe, and/or more SAS lanes.
Looking at it another way, in systems with a small number of disks, their relatively low aggregate throughput becomes the bottleneck, so there is no need to "over-design" the configuration with later generation technologies and/or more SAS lanes. The disks referenced in the table example all have a 6Gb/s interface with a throughput of 230MB/s and 550MB/s for the 15K RPM HDDs and SSDs, respectively.
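To make the "full complement" idea concrete, here is a small sketch that estimates how many drives a SAS link can carry before the link, rather than the drives, becomes the bottleneck. The drive figures (230MB/s and 550MB/s) come from the interview; the per-lane figures assume 8b/10b encoding with no protocol overhead, and the function name is hypothetical. The results are illustrative and are not the values from the table.

```python
# Assumed usable bandwidth per SAS lane in MB/s (8b/10b encoding,
# protocol overhead ignored).
SAS_6G_LANE_MB_S = 600
SAS_12G_LANE_MB_S = 1200

# Per-drive throughput quoted in the interview.
HDD_15K_MB_S = 230
SSD_MB_S = 550

def max_full_speed_disks(lane_mb_s, lanes, disk_mb_s):
    """Largest number of drives the link can serve at full drive speed."""
    return (lane_mb_s * lanes) // disk_mb_s

if __name__ == "__main__":
    for label, lane in (("6Gb/s", SAS_6G_LANE_MB_S), ("12Gb/s", SAS_12G_LANE_MB_S)):
        for lanes in (4, 8):
            hdds = max_full_speed_disks(lane, lanes, HDD_15K_MB_S)
            ssds = max_full_speed_disks(lane, lanes, SSD_MB_S)
            print(f"{lanes} x {label} SAS: {hdds} HDDs or {ssds} SSDs")
```

Any configuration with fewer drives than these counts is limited by the drives themselves, so a faster link or more lanes would add cost without adding throughput.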
Note that the table assumes all drives are operating at their maximum throughput simultaneously, and this does not always occur. It is also important to note that IOPS (input/output operations per second) is often more critical than throughput in applications today. For these reasons, each configuration is normally able to support many more disks than indicated.
TRP: So how can SAS third generation improve performance for businesses?
TP: Being able to move at 12Gb/s means that measurements of over one million IOPS can be achieved. 12Gb/s SAS is an evolutionary change and a big step forward for the market. For the first time IT managers will be able to exploit the full potential of PCIe 3.0. This in turn will benefit businesses that rely on mission-critical data in a variety of environments, including transactional databases, data mining, video streaming and editing.
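The link between IOPS and throughput is simple arithmetic: throughput is the IOPS figure multiplied by the size of each operation. The sketch below assumes a 4KiB block size and 1,200MB/s of usable bandwidth per 12Gb/s lane; both are illustrative assumptions, not figures from the interview, and the function names are hypothetical.

```python
import math

def iops_to_mb_per_s(iops, block_bytes):
    """Throughput in MB/s implied by an IOPS figure at a given block size."""
    return iops * block_bytes / 1_000_000

def lanes_needed(throughput_mb_s, lane_mb_s):
    """Smallest whole number of lanes that can carry the throughput."""
    return math.ceil(throughput_mb_s / lane_mb_s)

if __name__ == "__main__":
    # Assumption: every I/O transfers one 4 KiB block.
    throughput = iops_to_mb_per_s(1_000_000, 4096)
    print(f"{throughput:.0f} MB/s")
    # Assumption: 1200 MB/s usable per 12Gb/s SAS lane.
    print(lanes_needed(throughput, 1200), "lanes")
```

With smaller blocks the same IOPS figure needs far less bandwidth, which is one reason IOPS and throughput have to be considered separately.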
TRP: What are the issues in migrating to SAS third generation?
TP: The primary issue in the migration to third generation SAS is a familiar one: investment protection. Most organisations have made a significant investment in SAS disks, and want to preserve that investment when migrating to 12Gb/s SAS technology. The problem is that the third generation SAS standard maintains backwards compatibility by throttling down to the slowest SAS data rate in the system.
In small-scale point-to-point configurations, this is not always an issue because the migration would require upgrading both an Initiator and its Target. But in most organisations, such point-to-point configurations are rare.
The system-level "slowest data rate" performance limitation, therefore, means that organisations without point-to-point configurations would not be able to achieve the 12Gb/s performance boost until all disks support the new standard.
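The effect of this limitation can be shown with a deliberately simplified model. It follows the behaviour as described in the interview (the system runs at the slowest data rate present) and is not a model of how the SAS standard negotiates link rates in detail; the function name is hypothetical.

```python
def system_rate_gbps(initiator_gbps, disk_rates_gbps):
    """Simplified model: the system runs at the slowest rate present."""
    return min([initiator_gbps, *disk_rates_gbps])

if __name__ == "__main__":
    # New 12Gb/s initiator, four existing 6Gb/s disks
    print(system_rate_gbps(12, [6, 6, 6, 6]))
    # Three of the four disks upgraded: still held back by the last one
    print(system_rate_gbps(12, [12, 12, 12, 6]))
    # Only when every disk is upgraded does the rate rise
    print(system_rate_gbps(12, [12, 12, 12, 12]))
```

In this model a single remaining 6Gb/s disk holds the whole system at 6Gb/s, which is why existing disk investments become the sticking point in a migration.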
TRP: How can this issue be overcome?
TP: Fortunately there is a way to overcome this limitation, and that requires understanding a little about how SAS expanders function. A SAS expander makes it possible for one or more Initiators to communicate with multiple Targets concurrently. Expanders help make SAS remarkably scalable, and because each is capable of supporting multiple disks, expanders also make it possible to aggregate the throughput of those disks.