Thursday, July 24, 2008

What's new in CLARiiON CX4








EMC's next-generation CLARiiON product line, the CX4, introduces some exciting features that previously belonged only to high-end disk arrays. The first and largest innovation is the UltraFlex architecture. With UltraFlex, you can configure your array with either 10 Gb iSCSI modules or 4/8 Gb FC modules. Just plug and play, and you can turn an FC array into an iSCSI one. There is also a slot reserved for a future FCoE module, and I guess even an InfiniBand module if converged IB technology matures.

Above are two pictures of the CX4 960 CPU and I/O modules. You can see that one CX4 960 controller module can take up to four flex I/O modules.
Other features include Flash SSDs as level 0 storage and energy-smart drives that spin down according to I/O load. This sounds much like the recent changes to the DMX-4.
RecoverPoint is embedded in FLARE to provide local and remote copy, especially for integration with VMware Site Recovery Manager. I will discuss it in my next post :-).
At last, we can now tell how many disks each model can hold directly from its model name :-) There are four models with different capacities: CX4 120, CX4 240, CX4 480, and ultimately CX4 960.
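The naming rule can be written down as a tiny helper. This is only a sketch: the function name `max_drives` is made up for illustration, and it assumes the number in the model name is the maximum drive count, as described above.

```python
import re

def max_drives(model: str) -> int:
    """Return the drive count implied by a CX4 model name, e.g. 'CX4 960' -> 960.

    Hypothetical helper: it only parses the name, it does not query an array.
    """
    match = re.fullmatch(r"CX4[ -]?(\d+)", model.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a CX4 model name: {model!r}")
    return int(match.group(1))

for name in ["CX4 120", "CX4 240", "CX4 480", "CX4 960"]:
    print(name, "holds up to", max_drives(name), "disks")
```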

Tuesday, July 22, 2008

The Seven Tiers of DR


The Seven Tiers concept of DR was originally developed by SHARE (http://www.share.org/) to help DR specialists define the various methods of recovering mission-critical IT systems. From an RTO (Recovery Time Objective) and RPO (Recovery Point Objective) perspective, this definition offers a service-level comparison between risk and cost.


What is a disaster? Everyone who was in China during the first half of 2008 knows exactly what it means.

Tier 0 means no recovery at all when data loss occurs. There is no backup plan, nor any remote backup hardware such as tapes or disks. The only way to guard against data loss in Tier 0 is praying to God :-)

In Tier 1 you have an offsite backup plan, possibly transporting the tapes through what is called PTAM (Pickup Truck Access Method). When data loss occurs, you retrieve the backup tapes by car or some other means of transport. A very old-fashioned way to recover :) What happens if there is a car accident or a horrible traffic jam during recovery? The RTO and RPO are not under your control.

But it is still an effective complement to data replication when bandwidth is limited or data de-duplication is not available. Think about transferring 2 TB of data at only 80 KB/s... Why wouldn't you just get in a car and deliver two LTO-4 tapes by hand?
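To put a number on that, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes) and a link that sustains the full 80 KB/s with no protocol overhead; the function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, an assumption
KB = 10**3   # decimal kilobyte, an assumption

def transfer_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move size_bytes at a constant rate (no overhead, no retries)."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY

days = transfer_days(2 * TB, 80 * KB)
print(f"2 TB at 80 KB/s takes about {days:.0f} days")
```

That works out to roughly 289 days on the wire, which is why a truck full of tapes can still win.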

When we move to Tier 2, we are lucky enough to have a hot site where we can recover our facility, in addition to PTAM tape delivery. So no traffic jam concerns now :) The RTO is under control, but the recovery point is limited by the daily backup: we can only roll back to yesterday and lose everything we did today.

Time to solve the car accident problem. In Tier 3, we have an electronic link between the production site and the hot backup site, such as ATM or an IP network link. Mission-critical or private data can be transferred over TCP/IP rather than in a car that may be driven by a drunk driver. It was reported last year that Citi lost some tapes containing millions of bank account records. That loss is worth much more than the car :-)

When you are running an SAP ECC 6 ERP system and an earthquake occurs, how can you recover the business if you only have backup data on hand at the recovery site? Is your backup specialist also an SAP Basis expert? The recovery time will depend on how soon you can set up the same infrastructure at the hot site. That is why we introduce Tier 4, with an active secondary site running. Often we use host-based replication software to achieve asynchronous remote volume copy. Data loss should be limited to hours.

How about using disk-array-based snapshot/clone technology locally to provide a clone for remote data replication, for example TimeFinder and SRDF/A? In Tier 5, we integrate the data replication technology tightly with the application, for example Oracle Data Guard with SRDF. Asynchronous redo log transfer between the production and backup sites helps us restart the business in minutes.

But there are still some industries that require zero data loss and zero downtime. This tough service level demands that both the application and the data be replicated synchronously between two sites. Many customers apply the Three Datacenter methodology to reduce the performance impact on the production site when the distance is 300 km or more. That means we can use FC over DWDM within the same city (usually 60-80 km) for synchronous replication between the production site and backup site 1.


Then we set up asynchronous replication between backup site 1 and backup site 2 (500 km away) over IP networks, which effectively have no physical distance limit. This design is widely used by EMC and HDS when implementing Business Continuity solutions.
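Why keep the synchronous leg inside the city? A quick sketch of the propagation delay makes it clear. It assumes light travels through fiber at about 200,000 km/s (200 km per millisecond) and counts only one round trip per write; real links add switch, DWDM, and protocol latency on top, so treat these numbers as a lower bound.

```python
FIBER_KM_PER_MS = 200.0  # approx. speed of light in fiber: 200,000 km/s (assumption)

def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_MS

# Distances taken from the Three Datacenter example above.
for km in (80, 300, 500):
    print(f"{km:>4} km: at least {round_trip_ms(km):.1f} ms added to every synchronous write")
```

At 80 km the penalty is under a millisecond per write, which most applications can absorb. At 500 km every write waits at least 5 ms for the remote acknowledgement, so asynchronous replication is the practical choice for the long leg.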

Thursday, July 17, 2008

Welcome to my storage blog

I just want to consolidate what I have experienced and learned over the past several years of my consulting career at D, H, and E.