Thomas Tileston of Warner Bros Home Entertainment (the DVD, games and digital distribution arm of Warner Bros) spoke next about their use of SAS and Teradata. He started with a little history, as he has been using SAS for 20 years or so.
- In the early days analytical data set preparation was 70% of the work (from tapes to analytics-ready data) and 30% was the analysis (distribution, outliers, analysis).
- By the late 90s he was using SAS against Teradata. This brought in issues of relational data, IT doing data transformation etc. Initially he just sucked data out of Teradata and did everything in SAS.
- By 1998 he was thinking about using SQL to access Teradata and using SAS on the results.
- Between 1999 and 2005 he was increasingly creating an initial analytical dataset in Teradata and then doing a little data transformation in SAS and completing analysis.
- Today he uses a single workflow across Teradata/SAS that creates the analytical dataset, runs the analytics, publishes the results and cleans up.
He showed some great worked examples, illustrating with SQL that runs in both SAS and Teradata but runs in seconds on Teradata (when it is close to the data) rather than minutes on SAS (when the data has to be moved). A classic illustration of the value of in-database analytics – keeping the analysis near the data.
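To make the "keep the analysis near the data" point concrete, here is a minimal sketch. It is not one of the examples from the talk: it uses Python's built-in sqlite3 module as a stand-in for Teradata, and the `sales` table and its contents are made up for illustration. It compares pulling every row to the client and aggregating there with letting the database aggregate and returning only the summary.

```python
import sqlite3
from collections import defaultdict

def build_demo_db():
    """Create an in-memory database with a small, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (title TEXT, units INTEGER)")
    rows = [("Title A", 10), ("Title B", 5), ("Title A", 7),
            ("Title C", 3), ("Title B", 1)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn

def totals_client_side(conn):
    """Move every row to the client, then aggregate there."""
    totals = defaultdict(int)
    for title, units in conn.execute("SELECT title, units FROM sales"):
        totals[title] += units
    return dict(totals)

def totals_in_database(conn):
    """Aggregate where the data lives; only the summary rows are moved."""
    query = "SELECT title, SUM(units) FROM sales GROUP BY title"
    return dict(conn.execute(query).fetchall())

conn = build_demo_db()
client = totals_client_side(conn)
pushed = totals_in_database(conn)
```

Both functions return the same totals. The difference is how many rows cross the wire: all of them in the first case, one per title in the second. On a toy table that does not matter, but on warehouse-sized tables it is plausibly where a minutes-versus-seconds gap comes from.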
But what he wants is all the SAS functions he uses available in the Teradata Warehouse so he can do this all the time, not just some of the time. He gave another example where he really wants to use a SAS function but does not want to have to move the data to SAS, run the function and then store the result back in Teradata. This is what excites him about the Teradata/SAS partnership.
Data quality is also an issue. All the master data is in SAP but, as every product is very different, lots of data gets added by hand (genre, actors, executive producers…) and this creates quality problems. To resolve this they have a lot of “rules” about the data to check for completeness and validity. Today problems are corrected through daily updates, but they want to have these rules running in SAP itself (which of course they could do using SAP’s rule engine). Even using SAS for this has helped, though, by enabling business users to get involved in managing these rules.
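As a rough illustration of what completeness and validity "rules" of this kind might look like, here is a small sketch. The field names, the specific rules and the year range are all hypothetical. The talk did not show the actual rules, and this is plain Python rather than SAS or SAP.

```python
def is_present(value):
    """True if the value is neither missing nor blank."""
    return value is not None and str(value).strip() != ""

# Each rule is a (description, check) pair; all of these are made-up examples.
RULES = [
    ("genre is filled in",
     lambda r: is_present(r.get("genre"))),
    ("at least one actor listed",
     lambda r: len(r.get("actors") or []) > 0),
    ("release year is plausible",
     lambda r: isinstance(r.get("release_year"), int)
     and 1900 <= r["release_year"] <= 2100),
]

def check_record(record):
    """Return the descriptions of the rules this record fails."""
    return [name for name, rule in RULES if not rule(record)]

good = {"title": "Title A", "genre": "Drama",
        "actors": ["Actor One"], "release_year": 2007}
bad = {"title": "Title B", "genre": " ", "actors": [], "release_year": 207}

good_failures = check_record(good)
bad_failures = check_record(bad)
```

Keeping the rules as a simple list of named checks is one way to let non-programmers review and extend them, which is in the spirit of getting business users involved.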
Benefits so far from doing data preparation in Teradata, moving most or all of it from SAS to Teradata:
- 36 hours (100% SAS) to 1.25 hours (90% Teradata, 10% SAS)
- 2 hours (100% SAS) to 1 minute (95% Teradata, 5% SAS)
And the increased performance means that new uses of these analytics or more sophisticated versions of them become practical, adding business value.
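For a sense of scale, the two before/after figures work out to speedups of roughly 29x and 120x. A quick check of that arithmetic, with everything converted to minutes:

```python
def speedup(before_minutes, after_minutes):
    """How many times faster the new run is than the old one."""
    return before_minutes / after_minutes

# 36 hours down to 1.25 hours, expressed in minutes
first = speedup(36 * 60, 75)
# 2 hours down to 1 minute
second = speedup(2 * 60, 1)

print(f"first job:  about {first:.1f}x faster")
print(f"second job: about {second:.0f}x faster")
```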
Issues for the future:
- The analyst needs read/write table space in Teradata. They must understand SQL, and this is an issue as analysts don’t usually use SQL. SQL is the pipe connecting Teradata and SAS, and the partnership should help address this by pushing SAS functionality into Teradata.
- The analyst has to become involved in ETL, at least the transformation part, and this means they have to understand the relational data model.
- The analyst has to be able to drive the creation of the analytical dataset. IT can’t do this for them.
Great examples and a nice view of the value of the relationship.