I got an update on IBM InfoSphere recently. InfoSphere handles information integration, master data management, lifecycle management, privacy and security, warehousing and big data (the last of these is described in a previous post on IBM’s Big Data platform).
Independent integration is the challenge that IBM sees as being at the heart of data and information management. They define integration as understanding, defining, cleansing, transforming, testing, delivering, securing and archiving information. This needs to support multiple initiatives: a single view of the customer, data warehouses, governance, managing big data and more. Today people often select different integration technologies for each of these projects, resulting in something of a maze of disconnected integration solutions. Many are creating silos of integration to go with their silos of data!
To resolve this, IBM says it offers a comprehensive, incremental and transferable approach: integration technologies that are comprehensive across the information chain from source to delivery, that can be adopted incrementally and that are transferable across multiple projects and multiple platforms. Integrated integration, if you like.
InfoSphere provides a core of information integration with data quality, data lifecycle and data security/privacy wrapped around it to support initiatives across MDM, Data Warehousing and Big Data. This core supports improving applications, consolidating applications, building a single view of the customer, managing governance and establishing information as a service. Components then:
- Information integration – InfoSphere Information Server and InfoSphere Foundation Tools
Integrating and transforming data and content. Functions include automated discovery, glossaries, cleansing, transformation, replication and information as a service delivery. A common metadata layer helps ensure trusted data across these areas.
- Lifecycle, security and privacy – InfoSphere Optim and InfoSphere Guardium
Managing, protecting and securing data. Functions include managing test data (creating realistic but not real data), data growth management, data masking, database monitoring and encryption. This can be used, for instance, to mask or redact personally identifiable data prior to data mining.
- Master Data Management – InfoSphere MDM, InfoSphere Identity Insight
Designed to create trusted views of master data about customers, patients etc. Includes three options – pre-built SOA and MDM models for some domains, modifiable pre-built ones for other areas and a platform for custom MDM models. InfoSphere Identity Insight is an identity fraud detection engine. MDM is integrated with ILOG Rules to handle MDM events and more complex MDM rules.
- Data Warehouse – InfoSphere Warehouse, InfoSphere Warehouse Packs, Netezza
The Warehouse Packs are pre-defined warehouses based on industry models. The focus is on performance (MPP, or Massively Parallel Processing, plus deep compression and workload management), analytics (OLAP, in-database mining) and flexibility (Netezza appliances, InfoSphere flexible appliances and software). SPSS Modeler, for instance, can use the in-database mining routines directly.
- Big Data – InfoSphere Streams and InfoSphere BigInsights
Streaming analytics and “internet-scale” analytics. Both are described in my previous post on IBM’s approach to Big Data.
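To make the masking idea from the lifecycle, security and privacy component a little more concrete, here is a minimal, generic sketch of masking and redacting personally identifiable data before it is handed to a data mining process. This is not InfoSphere Optim or Guardium code and does not use any IBM API; the field names, the salt and the helper functions (`pseudonymize`, `redact_ssn`, `mask_record`) are all hypothetical and chosen purely for illustration.

```python
import hashlib
import re

# Hypothetical pattern for US social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a stable token that cannot be read back."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "id_" + digest[:12]


def redact_ssn(text: str) -> str:
    """Blank out anything in free text that looks like an SSN."""
    return SSN_PATTERN.sub("XXX-XX-XXXX", text)


def mask_record(record: dict) -> dict:
    """Return a copy of the record that is safer to pass to analytics."""
    masked = dict(record)
    masked["customer_id"] = pseudonymize(record["customer_id"])
    masked["name"] = "REDACTED"
    masked["notes"] = redact_ssn(record["notes"])
    return masked


# Made-up sample record (assumed field names, not real data).
sample = {
    "customer_id": "C-1001",
    "name": "Jane Example",
    "notes": "Called about SSN 123-45-6789",
    "purchase_total": 250.0,
}

if __name__ == "__main__":
    print(mask_record(sample))
```

The identifier is hashed rather than dropped so that records for the same customer still line up after masking, which is usually what a mining model needs, while the name and any embedded SSN are removed outright.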
IBM likes to say that it is the only vendor with a presence in all the sub-markets for integration as well as having a leading position in all these segments. Its pitch rests on this combination of experience and scope, as well as on its focus on helping customers implement in an incremental and transferable way.