
Event capture and analytics #inext2010


Mike Hoskins talked about event capture and analytics, or the transition from data acquisition to meaningful knowledge: how to build data pipelines that turn client data into meaningful insight at a Business Service Provider (BSP, or SaaS data processor). BSPs have an interesting challenge: they provide a service in the cloud, but their clients keep the data, in various formats, behind firewalls.

Traditionally data is packaged up behind these firewalls and transmitted to the BSP. This is easy for the clients but difficult for the BSPs, as the data is in different formats and just turns up in big packages. A willingness to accept multiple data formats turns out to be critical, but it often creates a large demand for custom data integration because each customer must be onboarded uniquely. The pattern is so common that Pervasive developed an Integration Hub. This sits at the BSP and handles data collection (scheduling, reformatting, validation and profiling), often across a wide range of formats, and data preparation to turn these feeds into something more coherent (aggregation, joining, transforming, matching and linking data). This still requires customers to push data to the BSP, but the resulting data work is handled on an integration platform.
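To make the collection and preparation steps concrete, here is a minimal sketch of what such a hub does with feeds in different formats. It is not Pervasive's product or API: the function names, the two supported formats, and the `customer_id`/`amount` schema are all assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical common schema that every feed is checked against.
REQUIRED_FIELDS = ("customer_id", "amount")


def parse_feed(payload, fmt):
    """Reformat one client feed (CSV or JSON text) into a list of dict records."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    if fmt == "json":
        return json.loads(payload)
    raise ValueError(f"unsupported format: {fmt}")


def is_valid(record):
    """A record is valid when every required field is present and non-empty."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def collect(feeds):
    """Collection step: parse, validate and profile each (client, fmt, payload) feed."""
    valid, profile = [], {}
    for client, fmt, payload in feeds:
        records = parse_feed(payload, fmt)
        good = [r for r in records if is_valid(r)]
        profile[client] = {
            "received": len(records),
            "rejected": len(records) - len(good),
        }
        valid.extend({**r, "client": client} for r in good)
    return valid, profile


def aggregate(records):
    """Preparation step: total the amount per customer across all feeds."""
    totals = {}
    for r in records:
        key = r["customer_id"]
        totals[key] = totals.get(key, 0.0) + float(r["amount"])
    return totals
```

The point of the sketch is the shape of the work: every new format adds a branch to `parse_feed`, which is exactly the per-customer onboarding cost described above.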

Pervasive has moved from this approach to one based on managed agents (as noted briefly in the introductory remarks). These take advantage of the Pervasive DataCloud 2 infrastructure to push agents down behind the customer firewall. The agents automate the process of extracting, transforming and packaging up the data for transmission to the integration hub. Because they sit behind the clients' firewalls, they have access to the source systems and can push the data out through the firewall into the cloud for transmission to the BSP. The agents keep a permanent relationship with the Pervasive DataCloud, so they can be managed from there. They might be pre-configured, or custom developed in the cloud and pushed down to specific client locations.
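The extract, transform and package steps an agent performs can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Pervasive's agents work: the envelope layout, the `agent_id` field and the choice of gzip-compressed JSON with a SHA-256 checksum are all hypothetical.

```python
import gzip
import hashlib
import json


def transform(rows):
    """Normalize column names so every client ships the same field names."""
    return [{key.strip().lower(): value for key, value in row.items()} for row in rows]


def _digest(records):
    """Checksum over a canonical JSON rendering of the records."""
    body = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def package(agent_id, records):
    """Wrap records in a hypothetical envelope and compress it for transmission."""
    envelope = {
        "agent_id": agent_id,
        "count": len(records),
        "sha256": _digest(records),
        "records": records,
    }
    return gzip.compress(json.dumps(envelope, sort_keys=True).encode("utf-8"))


def unpack(blob):
    """Hub side: decompress the package and verify it arrived intact."""
    envelope = json.loads(gzip.decompress(blob))
    if _digest(envelope["records"]) != envelope["sha256"]:
        raise ValueError("checksum mismatch")
    return envelope
```

An agent would run `package(agent_id, transform(rows))` on rows read from the source system and send the resulting bytes outbound; the hub calls `unpack` on receipt.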

Pervasive also has Business Xchange for B2B aggregation. This takes many trading partner end points and aggregates them before feeding them to the BSP. It allows a BSP to have a direct link to its big customers while having a single interface to the aggregator for all the smaller clients. In particular, it allows for manual entry of data by very small clients, which is important because the integration hub assumes all data feeds are automated.

So far all this is about data aggregation, integration and movement, and this approach handles two of the three problems in integration: the variety of end points and formats (problem #1) and the rate of change in these (problem #2). The third problem is volume, the amount of data involved. This can overload the available data loading window, especially when lots of data is combined with, say, complex matching. Pervasive DataRush has been developed as a platform to improve the performance of these activities by taking full advantage of parallelism and multiple cores.
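The general idea of spreading a matching job across cores can be sketched as below. This is not DataRush, which is Pervasive's own platform; it is a generic partition-then-match pattern using Python's standard library, and the record layout (`id`, `key`) and function names are assumptions.

```python
import concurrent.futures as cf
import zlib
from collections import defaultdict


def partition(records, n_parts):
    """Split records by match key so records that could match share a partition."""
    parts = [[] for _ in range(n_parts)]
    for record in records:
        index = zlib.crc32(record["key"].encode("utf-8")) % n_parts
        parts[index].append(record)
    return parts


def match_partition(records):
    """Within one partition, return the keys that occur on more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[record["key"]].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def parallel_match(records, n_parts=4, executor_cls=cf.ProcessPoolExecutor):
    """Match each partition in its own worker and merge the results."""
    matches = {}
    with executor_cls(max_workers=n_parts) as pool:
        for result in pool.map(match_partition, partition(records, n_parts)):
            matches.update(result)
    return matches
```

Because partitions never share a key, the workers need no coordination and the merge is a plain dictionary update. Passing `executor_cls=cf.ThreadPoolExecutor` runs the same logic in threads, which is convenient for small tests; on platforms that spawn worker processes, the process-based default should be called from under an `if __name__ == "__main__":` guard.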

This platform is also available to the BSPs (and others) to build high-performance analytic processing. Most BSPs do some analysis of the data they collect as well as processing it, but this is shifting towards more and more data mining and predictive analytics. As they mine larger amounts of data, they need high-performance ways to analyze it. BSPs who can do this can add tremendous value to the data they collect, then turn around and sell it back to the people who provided the data. Or, better yet, they can wrap these analytics into decisions and sell Decisions as a Service. This requires a platform for building these domain-specific analytics, and this is another area where Pervasive sees DataRush playing.

This was an interesting view of the changing landscape for SaaS software companies (BSPs or not): they need to take the data they collect and turn it into higher-value analytics and decisions.

