Truth in Advertising – Advanced Data Compression in Oracle 11g
You may have seen the following advertisement from Oracle. They claim that Advanced Data Compression for Oracle 11g cuts your data storage requirements in half. Anyone who knows anything about data compression know that this is misleading. Sure, its possible that data compression could cut your data storage requirements in half. But that depends on the kind of data you have and on your environment. As we will see below, compression rates vary greatly depending on the nature of the data.
Let’s take a quick look at the results of an InfoWorld investigation at Oracle Database 11g Advanced Compression Testbed, Methodology, and Results. In this investigation, InfoWorld analyzes two data sets: 1) the test kit that Oracle provides and 2) several tables from the TPC-C order entry benchmark data.
Let’s focus on the industry benchmark data. The first thing to notice is the variation in the compression rates. They vary greatly, depending on the nature of the data. The second thing to notice is that if do your math and figure out the before and after storage numbers, you will see that Advanced Data Compression for Oracle 11g reduces the total storage for these tables from 273,034 MB to 210,808 MB. This is a long way short of the 50% number that Oracle is touting! In fact, its actually saving less than 23%. However, as I have said, compression rates do vary greatly depending on the nature of the data and TPC-C data does contain random strings which do not compress well.
|Table Name||Original Size (MB)||Compression Rate (%)|
The next thing to notice in the InfoWorld article are the performance findings. While Oracle claims that “it runs faster” in the advertisement, the results of the InfoWorld investigation did not back this up. In fact, the InfoWorld investigation observed transactions per second for a mixed workload “falling from 28 to 13, which “represents a nearly 50% performance hit“. However, InfoWorld do point out that their tests are not a true representation of a real-world workload, and that the results are within acceptable parameters.
The point of this blog post is not to question whether Oracle can achieve high compression rates or whether performance is acceptable when compression is turned on. The point of this blog post is to make sure you are aware that compression rates and performance are highly dependent on the nature of your data and the nature of your environment, and don’t let vendors like Oracle tell you otherwise.