SNP: A new breed of storage solution
Ashu Joshi
Friday, October 1, 2004
Storage comes in many flavors, from the simple direct-attached disk drive in a typical desktop PC to the very large and complex storage systems used in enterprise mission-critical environments. But regardless of the environment, all storage systems must effectively handle three basic elements: connectivity/protocol, data processing, and control plane functions.

Within a specific end-user environment, the complexity, and therefore the cost, of these three elements is driven by the perceived need for availability, throughput, and ease of management.

In a simple environment, such as a stand-alone PC running a word processor, the three elements contribute very little to the complexity and cost of the overall solution. Connectivity is solved through a direct-connect protocol such as ATA or parallel SCSI. What little data processing and control plane functionality is needed is handled through software running on the operating system (e.g., drivers) and/or through a combination of firmware and hardware on the host bus adapters (HBAs) and the storage devices.

At the other end of the spectrum, we have businesses where even slight delays in data availability are very costly and any significant delay can be catastrophic. In this high-end environment, companies see data availability as a strategic investment. This willingness to invest in availability, throughput, and ease-of-management solutions provides a breeding ground for new technologies in connectivity/protocols, storage data processing, and control plane functionality.

A number of technologies that have been available for some time (at a significant price premium) in the high-end market are now moving into the small to medium-size enterprise. Some examples are: network attachment in the connectivity/protocol element; RAID and virtualization in the storage data processing element; and discovery, provisioning, and QoS in the control plane functionality element.

The challenge for storage system providers is to deliver this new functionality at an affordable price point. The current approach, running complex software-based solutions on the most powerful general-purpose microprocessors available, has inherent limitations that make it difficult, if not impossible, to achieve the desired price/performance goals. In each new generation the storage provider faces the challenge of adding functionality and increasing performance while lowering cost. Time-to-market delays caused by the complexity of the code, combined with ever-increasing power and real estate consumption, are rapidly becoming a formidable barrier.

Storage Network Processors (SNPs)
Storage Network Processors are a new breed of processors designed to provide a cost-effective way to address the current and future challenges of ever-increasing functional complexity, availability, and performance within a Storage Area Network (SAN) or Network Attached Storage (NAS) environment. SNPs deliver integrated data processing through high-performance hardware offload engines and firmware-based applications for control plane functionality within a networked storage environment.

Connectivity/Protocol
SNPs utilize a “Terminate and Reissue” architecture, which means that the basic core of the SNP is protocol agnostic. Common network connection protocols on the host side are Fibre Channel and iSCSI; common protocols on the device side are Fibre Channel (FC), Serial Attached SCSI (SAS), Serial and Parallel ATA, and parallel SCSI. The implication is that SNPs will typically have the protocol-specific MACs external to them, although certain derivatives of the silicon could integrate MACs for specific protocol solutions. When dealing with iSCSI, the TCP Offload Engine should be part of the SNP design. In addition, the SNP may or may not support RDMA, iSER, and IPsec.
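To make the terminate-and-reissue idea concrete, the sketch below shows one plausible shape for a transport-neutral command descriptor; the structure and field names are illustrative assumptions, not iVivity's actual design. The front end terminates the FC or iSCSI exchange, decodes the SCSI command into this neutral form, and the core later reissues it on whatever device-side protocol applies.

```c
/* Hypothetical transport-neutral command descriptor (names are illustrative).
 * The host-side MAC terminates the exchange and fills this in; the core then
 * reissues the command on the device-side protocol without caring which
 * transport it arrived on. */
#include <stdint.h>

typedef enum { XPORT_FC, XPORT_ISCSI, XPORT_SAS, XPORT_SATA, XPORT_PSCSI } transport_t;

typedef struct {
    transport_t  ingress;        /* host-side protocol the command arrived on   */
    transport_t  egress;         /* device-side protocol it will be reissued on */
    uint8_t      cdb[16];        /* SCSI command descriptor block, unchanged    */
    uint64_t     lba;            /* decoded logical block address               */
    uint32_t     blocks;         /* decoded transfer length in blocks           */
    uint16_t     initiator_id;   /* host port / session the command came from   */
    uint16_t     target_id;      /* back-end device chosen by virtualization    */
    void        *sgl;            /* scatter-gather list for the data payload    */
} snp_cmd_t;
```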

Port Density (and Internal Architecture)
Typical arrays consist of multiple host and disk drive ports. Increasingly, storage vendors are designing very dense systems, which in turn implies that the silicon has to increase the number of ports it can handle.
For example, a typical high-end system today ships with 64 2Gb Fibre Channel ports. To achieve the performance and availability required by the end user, the internal design of such a storage system would be divided into processing sub-systems, each consisting of a general-purpose processor, memory, and associated circuitry. In this example, at four ports per sub-system, the storage system would end up using 16 sub-systems connected through a backplane.

Now imagine a processing sub-system designed around an SNP that is capable of supporting 8 to 16 2Gb FC ports. This not only provides a significant reduction in components, power, and real estate, but also greatly reduces the complexity of the software needed to support advanced storage functionality.

The important thing to keep in mind when comparing a 10Gbps-or-higher SNP with a conventional design is the number of ports the system needs to handle, and in what size or form factor. It is not about a single 10Gbps port (although sooner or later storage systems will have point-to-point 10Gbps ports).

High Availability (HA)
SNPs address high availability requirements through clustering and hot swap support.
Clustering is done through a dedicated interface that allows two SNPs to be connected to each other. Clustering two SNP sub-systems through a low-latency interface provides the means for copying context and/or data between the two sub-systems in support of Active-Passive or Active-Active failover. Unlike networking systems with failover capabilities, the amount of copying needed to implement failover is high, because the information to be preserved is at the data level, not the packet level.
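A minimal sketch of what that data-level mirroring might look like follows; cluster_send() and the context structure are hypothetical stand-ins for the dedicated SNP-to-SNP interface, not a documented API. The point is that both the per-I/O context and the cached data must reach the partner before the host sees a completion.

```c
/* Hedged sketch: mirror per-I/O state to the partner sub-system over a
 * hypothetical low-latency cluster link so it can take over on failure. */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t lba;       /* where the cached write belongs            */
    uint32_t blocks;    /* how much data is outstanding              */
    uint8_t  state;     /* e.g. RECEIVED, MIRRORED, COMMITTED        */
} io_context_t;

/* Stand-in for the dedicated SNP-to-SNP clustering interface. */
extern int cluster_send(const void *buf, size_t len);

int mirror_context(const io_context_t *ctx, const void *data, size_t data_len)
{
    /* Unlike packet-level failover, both the context AND the data must be
     * copied to the partner before the write is acknowledged to the host. */
    if (cluster_send(ctx, sizeof(*ctx)) != 0)
        return -1;
    return cluster_send(data, data_len);
}
```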

SNPs support hot swap for the companion chips and protocol controllers connected to them. In the event of a failure, the SNP allows for hot removal and replacement (‘hot’ implies the power is still on!). This allows ports to be swapped without bringing down the entire system.

Reliability
A storage system is all about information and data. Information is the lifeblood of the enterprise. Securing the data and making it reliable is essential. How do you prevent data loss in the event of corruption or disk failure? The answer is to implement reliability features such as RAID (redundant array of independent disks), ECC (error correction codes), and multipathing.

RAID 1, also known as mirroring, produces multiple copies of the same data on different disk drives. Under this scheme, if one drive fails, the system simply retrieves the data from one of the mirrored copies. Since SNPs terminate and reissue, it is a simple matter for the SNP to multiply the data stream into as many redundant data streams as desired, with no performance overhead.
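The sketch below illustrates why mirroring falls out of the terminate-and-reissue model almost for free: the terminated host write is simply reissued once per mirror member. issue_write() is a hypothetical helper, not a real SNP interface.

```c
/* Sketch of RAID 1 under terminate-and-reissue (issue_write() is hypothetical). */
#include <stddef.h>
#include <stdint.h>

extern int issue_write(int disk_id, uint64_t lba, const void *buf, size_t len);

int raid1_write(const int *mirrors, int n_mirrors,
                uint64_t lba, const void *buf, size_t len)
{
    int failed = 0;
    for (int i = 0; i < n_mirrors; i++) {
        /* The same terminated host write is reissued to each mirror member. */
        if (issue_write(mirrors[i], lba, buf, len) != 0)
            failed++;            /* a failed member would be marked degraded */
    }
    return failed;               /* 0 means every copy was written           */
}
```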

RAID 5 and RAID 6 are schemes for recovering from disk drive failures by creating checksums that allow the missing data from a failed drive to be recreated. RAID 5 provides recovery algorithms for a single drive failure, while RAID 6 provides algorithms for a multi-drive failure. RAID 5/6 make much more efficient use of the physical disk space than mirroring, but generate additional processing overhead in creating the recovery checksums.

SNPs mitigate most, if not all, of this performance penalty by incorporating XOR engines in hardware, specifically designed to generate the RAID 5/6 checksums. In addition, some SNPs provide additional processing cores for the rebuild process after a drive failure.
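The work those XOR engines offload is, at its core, the byte-wise XOR shown in this sketch: RAID 5 parity is the XOR of the data strips, and rebuilding a lost strip is the same operation run over the parity and the surviving strips. (RAID 6 adds a second, differently computed checksum, which is omitted here.)

```c
/* RAID 5 parity in software, for illustration of what the hardware XOR
 * engines compute: P = D0 ^ D1 ^ ... ^ Dn-1 for each byte of the stripe. */
#include <stddef.h>
#include <stdint.h>

void raid5_parity(uint8_t *parity, uint8_t *const *strips,
                  int n_strips, size_t strip_len)
{
    for (size_t i = 0; i < strip_len; i++) {
        uint8_t p = 0;
        for (int d = 0; d < n_strips; d++)
            p ^= strips[d][i];
        parity[i] = p;
    }
}

/* Rebuilding a failed strip uses the same routine: XOR the parity strip
 * together with the surviving data strips. */
```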

ECC (error correction codes) is used to recover from faulty memory. SNPs incorporate ECC to ensure the integrity of data while it is being processed by the internal storage system, from the time it arrives from the transport until it reaches its final destination.

Multipathing is a method of linking end nodes with multiple physical paths to prevent failure when a physical link goes down. The I/O routing needs to make use of the redundant paths when a failure is detected; this protection against link failures relies on support provided by the SNP.
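A simple way to picture that routing support is the failover loop below: keep a table of physical paths to each end node, skip paths already known to be down, and mark a path down when a submission on it fails. path_is_up() and submit_on_path() are hypothetical firmware hooks used only to keep the sketch self-contained.

```c
/* Hedged sketch of multipath I/O routing with failover to redundant paths. */
#include <stdbool.h>

#define MAX_PATHS 4

typedef struct { int port; int target; bool up; } path_t;

extern bool path_is_up(const path_t *p);                 /* hypothetical hook */
extern int  submit_on_path(const path_t *p, const void *io);

int multipath_submit(path_t paths[MAX_PATHS], const void *io)
{
    for (int i = 0; i < MAX_PATHS; i++) {
        if (!path_is_up(&paths[i]))
            continue;                          /* skip links already down    */
        if (submit_on_path(&paths[i], io) == 0)
            return 0;                          /* delivered on this path     */
        paths[i].up = false;                   /* mark path down, fail over  */
    }
    return -1;                                 /* no usable path remains     */
}
```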

Performance
One common method for improving performance in a disk subsystem is distributing the data as stripes on multiple disks. The idea is that data blocks are spread across multiple disk spindles and accessed in parallel to complete one read, thereby effectively multiplying the bandwidth by the number of drives. As mentioned above, since SNPs terminate and reissue, this function can be performed as part of the storage data processing function.
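The address arithmetic behind striping is straightforward, as the sketch below shows: a logical block address is mapped to a (disk, physical LBA) pair so that consecutive stripe units land on different spindles. The names and stripe layout are illustrative assumptions.

```c
/* Sketch of striping address math: rotate successive stripe units across
 * n_disks so parallel spindles can service one large read together. */
#include <stdint.h>

typedef struct { uint32_t disk; uint64_t lba; } stripe_loc_t;

stripe_loc_t stripe_map(uint64_t logical_lba,
                        uint32_t n_disks, uint32_t stripe_blocks)
{
    uint64_t stripe_unit = logical_lba / stripe_blocks;  /* which stripe unit  */
    uint64_t offset      = logical_lba % stripe_blocks;  /* offset inside unit */
    stripe_loc_t loc;
    loc.disk = (uint32_t)(stripe_unit % n_disks);        /* rotate across disks */
    loc.lba  = (stripe_unit / n_disks) * stripe_blocks + offset;
    return loc;
}
```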

Another advantage of SNPs is their inherent ability to avoid the huge performance hit associated with the data copying and data movement that is unavoidable in general-purpose processor architectures.

SNPs avoid this high-overhead activity by using hardware to chunk, or break up, the original Scatter Gather List (SGL) or Linked Lists (LL) and match them with what is needed on the destination side. And when data movement is necessary, SNPs avoid the inefficiencies of the general-purpose processor’s Load/Store model by using internal DMA engines.
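A sketch of that SGL handling follows: the source list is walked and re-chunked into segments sized for the destination, and each segment is handed to a DMA engine rather than copied by the CPU. dma_copy() and the SGL layout are assumptions made for illustration.

```c
/* Hedged sketch: walk a scatter-gather list, re-chunk it to the destination's
 * segment size, and let a DMA engine (dma_copy() is a stand-in) move the
 * payload so the CPU never performs load/store copies. */
#include <stddef.h>
#include <stdint.h>

typedef struct sg_entry {
    uint64_t         addr;     /* physical address of this fragment */
    uint32_t         len;      /* fragment length in bytes          */
    struct sg_entry *next;     /* linked-list style SGL             */
} sg_entry_t;

extern void dma_copy(uint64_t dst, uint64_t src, uint32_t len);

void sgl_dma_out(const sg_entry_t *sgl, uint64_t dst, uint32_t max_seg)
{
    for (const sg_entry_t *e = sgl; e != NULL; e = e->next) {
        uint64_t src  = e->addr;
        uint32_t left = e->len;
        while (left > 0) {
            uint32_t chunk = (left < max_seg) ? left : max_seg;
            dma_copy(dst, src, chunk);   /* offloaded, not a CPU copy */
            dst  += chunk;
            src  += chunk;
            left -= chunk;
        }
    }
}
```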

Advanced Information Management
In addition to the block-level management discussed thus far, SNPs also provide ‘file level’ processing, where they go beyond block mapping to help information management by implementing header processing at the file system level. This involves RPC and request processing for network file systems such as CIFS and NFS, inode/file-node-to-block mapping, and distributed lock management. This feature could be coupled with archival and near-line storage capabilities to implement policy regulation of information in an enterprise.
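To give a flavor of the file-level mapping step, the sketch below resolves a file offset from an NFS/CIFS-style request through an inode's block map into a block-level LBA that the rest of the data path can handle. The structures are simplified, hypothetical stand-ins (indirect blocks and lock management are omitted).

```c
/* Illustrative (hypothetical) inode-to-block mapping: turn a file offset
 * from a network file system request into the LBA that backs it. */
#include <stdint.h>

#define DIRECT_BLOCKS 12

typedef struct {
    uint32_t block_size;                 /* file system block size in bytes */
    uint64_t block_map[DIRECT_BLOCKS];   /* direct block pointers (LBAs)    */
} inode_t;

/* Returns the LBA backing 'offset' within the file, or 0 if out of range. */
uint64_t file_offset_to_lba(const inode_t *ino, uint64_t offset)
{
    uint64_t idx = offset / ino->block_size;
    if (idx >= DIRECT_BLOCKS)
        return 0;                /* indirect blocks omitted in this sketch */
    return ino->block_map[idx];
}
```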

The picture below sums up the essence of any SNP: the inner circle defines the core storage acceleration and management features (at the file or block level), and the wrapper around it (the outer circle) comprises the various protocols, interfaces, and features that the SNP provides and supports, based upon market and customer requirements.

Ashu Joshi serves as the VP of Technology at iVivity, where he was responsible for the system architecture and design of iDiSX. With over 15 years of computer-industry experience, his areas of expertise include RAID architecture and storage and network protocol stacks. Previously, Joshi was Director of Technology Development at American Megatrends Inc., where he led the firmware product development team. He can be contacted at ashuj@ivivity.com.