Table of contents for IBM Big Data Management Launch 2013
Tim Vincent, CTO of Information Management, came next to talk about DB2 with BLU Acceleration.
He began by identifying several different kinds of workloads and scenarios that the new solution is designed to address, and pointed out that the context for all of this is rapidly changing hardware capabilities and pricing. Memory prices are falling, bandwidth inside machines is increasing, and solid state disks are becoming practical at scale. To take advantage of this you need a new software architecture built around parallelism, column stores, in-memory support and so on. This is what leads to the new DB2 with BLU product, which offers 8-25x faster reporting and analytics and 10x storage savings. It's also the first step on what he sees as a journey, expanding the acceleration into new workloads.
So how does BLU acceleration work?
- Actionable compression using a patented compression technique that preserves order so that much of the predicate logic can be executed against the compressed data and expansion can be left to the very end.
- Dynamic in-memory columnar processing that allows data to move dynamically from storage to memory. This allows databases to exceed available memory without causing problems.
- Parallel Vector Processing that supports multi-core systems and uses Single Instruction Multiple Data (SIMD) processing to run the same command against lots of data in parallel.
- Data Skipping so that you don’t process data that’s irrelevant.
- Concurrent workload management built in, so that performance degrades slowly and evenly as workload is added.
All of this works without indexes, aggregates, tuning or changes to SQL or schemas.
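To make the first and fourth points more concrete, here is a minimal Python sketch of how an order-preserving encoding and data skipping can work together. It is an illustration of the general techniques only, not IBM's patented implementation: the sorted-dictionary encoding, the per-block min/max synopsis, the block size and every function name are assumptions made for this example.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Assign codes in sorted order so comparing codes mirrors comparing values."""
    ordered = sorted(set(values))
    codes = {value: code for code, value in enumerate(ordered)}
    return ordered, codes


def encode(values, codes):
    """Replace each value with its (smaller) integer code."""
    return [codes[value] for value in values]


def build_synopsis(encoded, block_size):
    """Record the (min, max) code of each block; hypothetical stand-in for skip metadata."""
    synopsis = []
    for start in range(0, len(encoded), block_size):
        block = encoded[start:start + block_size]
        synopsis.append((min(block), max(block)))
    return synopsis


def count_at_least(encoded, synopsis, block_size, ordered, threshold):
    """Count rows with value >= threshold without ever decoding a row."""
    # Translate the predicate once into the code domain.
    code_threshold = bisect_left(ordered, threshold)
    matches = 0
    blocks_scanned = 0
    for index, (_, block_max) in enumerate(synopsis):
        if block_max < code_threshold:
            continue  # data skipping: no row in this block can qualify
        blocks_scanned += 1
        start = index * block_size
        block = encoded[start:start + block_size]
        matches += sum(1 for code in block if code >= code_threshold)
    return matches, blocks_scanned


values = [10, 12, 11, 13, 50, 52, 51, 53, 90, 91, 92, 93]
ordered, codes = build_dictionary(values)
encoded = encode(values, codes)
synopsis = build_synopsis(encoded, block_size=4)
result = count_at_least(encoded, synopsis, 4, ordered, threshold=50)
```

Because the codes are assigned in sorted order, the `>=` comparison gives the same answer on codes as on the original values, so nothing has to be expanded until results are returned. The synopsis lets the scan ignore the first block entirely for this predicate.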
For example, 10TB of data is compressed to 1TB in memory. The use of a column store means that only 10GB is accessed, and data skipping narrows that down to 1GB. This is spread across many cores so that each handles only 32MB, and the Single Instruction Multiple Data processing means that even this is handled more quickly.
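The arithmetic behind that example can be sketched as follows. The reduction factors (10x, 100x, 10x) are simply the ratios between the sizes quoted in the talk; the core count of 32 is an assumption inferred from the 1GB and 32MB figures and was not stated, and decimal units are used for simplicity.

```python
# Sizes are in decimal megabytes (1 TB = 1,000,000 MB) to keep the numbers simple.
MB_PER_GB = 1_000
MB_PER_TB = 1_000_000


def blu_funnel(raw_mb, compression_factor, column_factor, skip_factor, cores):
    """Return the data volume remaining after each stage, in MB."""
    compressed = raw_mb / compression_factor   # compression
    columns_read = compressed / column_factor  # only the needed columns are read
    after_skipping = columns_read / skip_factor  # irrelevant data is skipped
    per_core = after_skipping / cores          # work is spread across cores
    return compressed, columns_read, after_skipping, per_core


stages = blu_funnel(
    raw_mb=10 * MB_PER_TB,
    compression_factor=10,   # 10TB -> 1TB, from the talk
    column_factor=100,       # 1TB -> 10GB, from the talk
    skip_factor=10,          # 10GB -> 1GB, from the talk
    cores=32,                # assumed; not stated in the talk
)
for label, size_mb in zip(("compressed", "columns read", "after skipping", "per core"), stages):
    print(f"{label}: {size_mb:,.2f} MB")
```

With these assumptions each core handles 31.25MB, in line with the roughly 32MB quoted, which means each core touches only a tiny fraction of the original 10TB.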
From a practical point of view, BLU is integrated into the base DB2 product. DB2 has been extended to add a column store (in addition to the row and XML stores) and a new core that knows how to access the column store. The new BLU tables are "load and go", with no need to think through compression approaches, indexing, optimization and so on. BLU also automatically manages its own performance over time, so that performance does not degrade as the database is used.
In addition, this release improves DB2 pureScale (a shared-disk architecture), offering rolling maintenance updates without downtime, disaster recovery over large distances and transparent scalability up to 100 nodes. Tim also emphasized the future-proofing they have built in by providing the kind of graph storage you need for schema-less or NoSQL access, as well as support for OLTP and data warehouse type workloads.