One of the vendors that really caught my eye at the recent Gestalt IT Field Day was Ocarina Networks. The reason was that before attending the session with Ocarina I knew very little about the inner workings of data compression and de-duplication, at least nowhere near the level that I was about to find out. Unstructured data de-duplication and compression isn’t an area I’ve really thought about too much before, as none of the companies I have worked for so far have had a serious requirement for such functionality outside of email and email archival de-duplication.
The following TechHead video is an introduction to Ocarina Networks (or simply ‘Ocarina’, as they are more commonly known) by company CTO Goutham Rao.
So who would use such data optimization products?
The answer is pretty much any business with a large amount of unstructured data, e.g. oil and gas companies with geophysical data, financial analysis firms, or film and television companies.
So how does it all work?
In this white-board session Goutham provides an excellent and highly interesting insight into what is involved in compressing and de-duping data. I highly recommend that you take a look:
The devices that perform the data optimization magic are dedicated hardware optimizer appliances which come in two sizes:
Ocarina 2400 Storage Optimizer:
8 (2 x Quad) Cores per appliance
Processes: >1.5TB per day (pair)
Ocarina 3400 Storage Optimizer:
8 (2 x Quad) Cores per appliance
Processes: >2.0TB per day (pair)
The optimization software that runs on these appliances can also be run in a virtual machine (VM), though this is better suited to test instances only, as the compression and de-dupe process can be extremely resource intensive.
An independent lab report found that the Ocarina optimization software achieves “57x better data reduction than industry leader NetApp”, which is quite impressive as I’ve personally always regarded NetApp as one of the best at data de-dupe in the market. The lab report can be found here.
Whilst at the Ocarina offices, the attendees of the Gestalt IT Field Day were asked to bring along a USB stick full of data so that the team at Ocarina could demonstrate the types and extent of compression that could be achieved.
I provided a 2GB USB key containing a varied mix of pre-compressed formats such as JPG, TIFF, MP3, PDF and zipped files. The outcome after being put through Ocarina’s ‘optimizers’ was a 29% saving through the use of de-dupe and compression. That is almost a third of the space consumed by these already-compressed files, which was quite impressive and could amount to significant disk cost savings on a much larger volume of data.
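To make the idea of a percentage saving from de-dupe plus compression more concrete, here is a minimal Python sketch. It is not Ocarina’s method: their technology is content-aware, whereas this sketch assumes naive fixed-size chunks, SHA-256 fingerprints to spot duplicate chunks, and zlib for compression. The names `CHUNK_SIZE`, `optimized_size` and `saving_percent` are my own, and the sketch ignores the overhead of the chunk index that a real system would need to store.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, purely illustrative


def optimized_size(data: bytes, chunk_size: int = CHUNK_SIZE) -> int:
    """Bytes needed if each unique chunk is stored once, zlib-compressed.

    Ignores the index/reference overhead a real de-dupe store would add.
    """
    seen = set()
    total = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            continue  # duplicate chunk: only a reference would be kept
        seen.add(digest)
        total += len(zlib.compress(chunk))
    return total


def saving_percent(original: int, optimized: int) -> float:
    """Space saved, as a percentage of the original size."""
    if original == 0:
        return 0.0
    return 100.0 * (original - optimized) / original
```

As a rough sense of scale, `saving_percent(2000, 1420)` gives 29.0, so if the 2GB key was full, a 29% saving works out to roughly 0.58GB reclaimed. On files that are already compressed, a generic approach like this sketch would typically save far less, which is part of why the result above stood out.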
The Take Away
The key points I took away from our session with Ocarina Networks were:
- Their de-dupe and compression technology is content-aware, meaning that there is an element of intelligence in deciding which data types are compressed and/or de-duped, and how.
- This content-awareness also allows for a tiered approach to data storage, with automatic relocation according to criteria such as age or last access date.
- Policies and rules can be set to determine when particular data gets optimized (i.e. de-duped and compressed).
- De-duplication and compression are in fact quite interesting, thanks to Goutham’s fantastic white-board session.
- The user interface proved to be very intuitive, and a novice with the product such as myself could create and run an optimization job easily.
- Ocarina is a relatively small company achieving big things.
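As an illustration of the policy idea in the points above (choosing what gets optimized based on criteria such as last access date), here is a small Python sketch. It is not Ocarina’s policy engine or API; the function name `select_for_optimization` and its parameters are hypothetical, and it simply walks a directory and picks files of given types that have not been accessed for a given number of days.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def select_for_optimization(root, min_idle_days, extensions, now=None):
    """Return files under root with a matching extension that have not
    been accessed for at least min_idle_days (hypothetical policy)."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * SECONDS_PER_DAY
    selected = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in extensions:
            continue  # policy only covers the listed file types
        if path.stat().st_atime <= cutoff:
            selected.append(path)
    return sorted(selected)
```

For example, `select_for_optimization("/data/seismic", 90, {".tiff", ".pdf"})` would list TIFF and PDF files untouched for 90 days under an assumed `/data/seismic` directory. Note that some filesystems are mounted with access-time updates reduced or disabled, in which case last-modified time may be a more reliable criterion.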