The folks at ParAccel announced their 3.0 product recently (details on ParAccel 3.0 here). As I haven’t written about them before, some background first. ParAccel is one of the new class of MPP columnar analytic databases designed to address the challenges raised by the vast volumes implied in the phrase “big data”, the increasing complexity of analytics, the need to be more agile in terms of building and modifying a wide range of analytics, and the need to do all of this faster – getting to actionable insight. These challenges tend to overwhelm standard data architectures with too many scans and over-complex joins. The result is processes that require extensive resources to gather and load data, manage the physical design and do all the tuning needed.
ParAccel’s architecture consists of a set of nodes and a communication fabric:
- Leader Nodes
These control sessions and handle queries – parsing, planning, optimizing and scheduling
- Compute Nodes
These perform all the processing and store the data
- Hot Standby Node
Hot replacement for compute and leader nodes
- Communication Fabric
Using a customized UDP protocol, the fabric is highly optimized to minimize congestion and retransmission
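To make the division of labor concrete, here is a toy sketch of the leader/compute split described above. It is not ParAccel code: the class names, the hash-based distribution of rows and the `total` method are all assumptions made for illustration, and the communication fabric is reduced to plain method calls.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical compute node: stores a slice of the data and processes it."""
    rows: list = field(default_factory=list)

    def partial_sum(self, column):
        # Each node only ever scans its own slice.
        return sum(row[column] for row in self.rows)


class LeaderNode:
    """Hypothetical leader node: distributes data and schedules the work."""

    def __init__(self, compute_nodes):
        self.compute_nodes = compute_nodes

    def load(self, rows, key):
        # Assumed distribution scheme: hash the key to pick a compute node.
        for row in rows:
            idx = hash(row[key]) % len(self.compute_nodes)
            self.compute_nodes[idx].rows.append(row)

    def total(self, column):
        # "Plan": ask every node for a partial result, then combine them.
        return sum(node.partial_sum(column) for node in self.compute_nodes)


rows = [{"id": i, "amount": i * 10} for i in range(1, 7)]
leader = LeaderNode([ComputeNode() for _ in range(3)])
leader.load(rows, key="id")
result = leader.total("amount")
print(result)
```

Because each compute node works only on its own rows, adding nodes shrinks the slice each one has to scan, which is the intuition behind the scalability claims for this kind of architecture.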
ParAccel promises high performance with linear scalability, taking advantage of its fully parallelized implementation. In addition, it is designed to optimize within and between queries to fully utilize the hardware. Native SQL support and an ability to handle very complex joins make it easy to implement and let it support potentially very complex SQL.
With 3.0 you can develop new analytic functions and embed them deeply into the database. These functions can substitute for a query, table or sub-query and take full advantage of ParAccel’s partitioning. They can also be “polymorphic”, mapped to specific datasets only when they are used, allowing for write-once, reuse-many functions. 3.0 includes an array of pre-built mathematical, statistical and data mining functions that can be used directly or embedded into these new functions. ParAccel aims to make it easy to embed potentially very complex analytic functions into the database, where they are treated as first-class objects, allowing the results of an analytic function to be handled just like any other data element.
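The idea of a polymorphic function that stands in for a table can be illustrated with a small sketch. This is plain Python rather than ParAccel's actual extension API, which is not shown here; the `zscore` function, the table layouts and the column names are all hypothetical. The point is only that the function is written once, bound to a dataset and column at call time, and returns something that can be queried like any other table.

```python
from statistics import mean, pstdev


def zscore(rows, column):
    """Hypothetical table function: accepts any table (a list of dicts) and a
    column name, and returns a new table with a z-score column appended."""
    values = [row[column] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    return [
        {**row, f"{column}_z": (row[column] - mu) / sigma if sigma else 0.0}
        for row in rows
    ]


# Two unrelated datasets with different schemas (made-up sample data).
sales = [
    {"store": "a", "revenue": 10.0},
    {"store": "b", "revenue": 20.0},
    {"store": "c", "revenue": 30.0},
]
sensors = [{"id": 1, "temp": 5.0}, {"id": 2, "temp": 7.0}]

# The same function is mapped to each dataset only when it is used.
sales_scored = zscore(sales, "revenue")
sensors_scored = zscore(sensors, "temp")

# The result behaves like any other table, so it can be filtered further.
outliers = [row for row in sales_scored if abs(row["revenue_z"]) > 1.0]
```

In a database setting the equivalent of `zscore(...)` would appear wherever a table or sub-query is allowed, with the engine running it in parallel across partitions rather than over an in-memory list.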