Eugene from SARS, the South African Revenue Service, presented next on how SARS is using BI in revenue collection. He began by pointing out that public sector organizations use BI differently: the focus is on service delivery rather than profits, on taxpayers rather than customers, on enforcement campaigns rather than marketing campaigns, and so on. Of course, public sector organizations still want ROI and operational efficiency, and they still use KPIs for performance management.
SARS has a wide range of core systems as well as a set of external data sources. Initially the IT department just dumped data from the source systems to business users. This was replaced by a more formal information management department that responded to requirements defined by analysis teams but still queried the source systems directly. Capacity constraints led to an enterprise data warehouse (Teradata), but the information management department could not meet the demand for new reports and the like, while business users wanted more control. In the current state, the information management department acts as an enabler for business departments to manage their own BI capabilities. The technical architecture behind this has a primary staging layer for moving data into a production warehouse Operational Data Store and a secondary staging area supporting the BI and data mining warehouses. This two-stage approach allows them to present historical data through the lens of constantly changing business rules. A metadata repository underpins this, and a presentation layer gives users access to reports, cubes, etc.
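One way to picture how a staging layer lets historical data be viewed through changing business rules is sketched below. This is only an illustration under stated assumptions, not SARS's actual design: the record fields, the `size_band_v1`/`size_band_v2` rules, and the thresholds are all hypothetical. The point is that the staged records are kept unchanged and the reporting view is rebuilt from them whenever a rule changes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StagedReturn:
    """A raw record as it sits in staging (hypothetical fields)."""
    taxpayer_id: str
    tax_year: int
    turnover: float


# Two versions of a hypothetical business rule. The thresholds are
# invented for illustration only.
def size_band_v1(record: StagedReturn) -> str:
    return "large" if record.turnover >= 1_000_000 else "small"


def size_band_v2(record: StagedReturn) -> str:
    return "large" if record.turnover >= 500_000 else "small"


def build_reporting_view(
    staged: List[StagedReturn],
    rule: Callable[[StagedReturn], str],
) -> List[Dict[str, object]]:
    """Derive the reporting rows from untouched staged data.

    Because the staged records are never modified, the whole history
    can be re-derived under whichever rule is current.
    """
    return [
        {
            "taxpayer_id": r.taxpayer_id,
            "tax_year": r.tax_year,
            "size_band": rule(r),
        }
        for r in staged
    ]


staged_history = [
    StagedReturn("T001", 2009, 2_400_000.0),
    StagedReturn("T002", 2009, 650_000.0),
    StagedReturn("T003", 2010, 120_000.0),
]

view_under_old_rule = build_reporting_view(staged_history, size_band_v1)
view_under_new_rule = build_reporting_view(staged_history, size_band_v2)
```

Here taxpayer `T002` moves from "small" to "large" when the rule changes, including for past years, without any change to the staged data itself.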
SARS presents strategic summaries, aligned with the KPIs, as dashboards for executives, who are typically considered measurement users. Tactical reports and dashboards are delivered to regional offices, whose users tend to be exploratory. Finally, operational intelligence is delivered to execution users at the operational, branch level. The different levels consume different kinds of analytics.
SARS has learnt not to pursue big-bang projects, to mix business and IT people, to plan for poor data quality and for peak-season volumes, and to manage change. From a business perspective they focus on changing how business people request data and reports, on showing ROI, and on embracing user empowerment and self-service.
They use standard reporting on things like on-time filing, with the ability to drill down into zones, industries and more, as well as self-service reporting on metrics against various dimensions, slicing and dicing, etc. More interestingly, they use various advanced analytics to catch fraud. For instance, a company might under-report its corporate income tax and over-report the VAT it paid so that it continually gets refunds. However, catching this is a challenge because:
- Some critical fields are not mandatory.
- It can be hard to correlate these two kinds of tax return.
- Suspicious activity may have been reported, but only as unstructured text.
- At the end of the day the intent is to find those organizations that are truly suspicious, so data on registration, status and payment rates/timeliness must also be considered.
- Not everyone can be pursued, so they must decide whom to call and whom to audit.
- Finally, there may be linked entities that need to be closed down when a fraudster is found.
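To make the first two challenges concrete, here is a minimal sketch of correlating the two kinds of return by taxpayer. It is not the method SARS described; the data layout, the `screen_refund_pattern` name, and the `min_periods` and `ratio` thresholds are assumptions made up for illustration. A missing (non-mandatory) income field is reported as incomplete rather than guessed at.

```python
from typing import Dict, List, Optional


def screen_refund_pattern(
    declared_income: Dict[str, Optional[float]],
    vat_net_by_period: Dict[str, List[float]],
    min_periods: int = 3,   # hypothetical threshold
    ratio: float = 0.5,     # hypothetical threshold
) -> Dict[str, str]:
    """Label each taxpayer 'review', 'ok' or 'incomplete'.

    vat_net_by_period holds net VAT per period; a negative number
    means a refund was claimed in that period (assumed convention).
    """
    labels: Dict[str, str] = {}
    for taxpayer_id, periods in vat_net_by_period.items():
        income = declared_income.get(taxpayer_id)
        if income is None:
            # The income field is absent or blank, so the two returns
            # cannot be compared for this taxpayer.
            labels[taxpayer_id] = "incomplete"
            continue

        always_refund = (
            len(periods) >= min_periods and all(p < 0 for p in periods)
        )
        total_refund = -sum(p for p in periods if p < 0)

        if always_refund and total_refund > ratio * income:
            labels[taxpayer_id] = "review"
        else:
            labels[taxpayer_id] = "ok"
    return labels


# Invented sample data.
income = {"A": 10_000.0, "B": 900_000.0, "C": None}
vat = {
    "A": [-4_000.0, -3_500.0, -5_000.0],   # refund in every period
    "B": [12_000.0, -2_000.0, 8_000.0],    # mostly paying VAT
    "C": [-100.0, -200.0, -300.0],         # income field missing
}
labels = screen_refund_pattern(income, vat)
```

A simple rule like this only narrows the field. As the list above notes, registration, status and payment history would still need to be considered before treating a taxpayer as truly suspicious.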
Advanced analytics are used in various ways:
- Neural nets predict values, or at least buckets of values, for missing fields
- Statistical techniques identify outliers
- Text mining of the unstructured reports looks for patterns of reporting that would allow early investigation
- All of this feeds into a risk engine that predicts the risk of fraud
- They then predict who is likely to be reached by the call center to prioritize calls to these taxpayers
- Next they predict the likelihood of a successful audit so that the auditors can prioritize their work
- They use association and geospatial data to find clusters of suspicious organizations, linked through directors, audit companies, etc.
- Third-party information on things like houses, assets and travel is brought in to find suspicious mismatches between tax returns and lifestyle
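Two of the techniques in this list can be sketched in a few lines. The talk did not say which algorithms SARS uses, so the choices below are assumptions: outliers are flagged with a modified z-score based on the median absolute deviation (with the commonly used 3.5 cut-off), and linked entities are grouped with a union-find over shared directors. All names and sample data are hypothetical.

```python
from statistics import median
from typing import Dict, List, Set


def robust_outliers(values: Dict[str, float], cutoff: float = 3.5) -> List[str]:
    """Return keys whose modified z-score exceeds the cutoff.

    Modified z-score = 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation. If MAD is zero, nothing is flagged.
    """
    centre = median(values.values())
    mad = median(abs(v - centre) for v in values.values())
    if mad == 0:
        return []
    return sorted(
        key
        for key, v in values.items()
        if abs(0.6745 * (v - centre) / mad) > cutoff
    )


def linked_clusters(directors_by_company: Dict[str, Set[str]]) -> List[List[str]]:
    """Group companies that are connected through shared directors."""
    parent = {company: company for company in directors_by_company}

    def find(company: str) -> str:
        # Follow parent links to the cluster root, shortening the path.
        while parent[company] != company:
            parent[company] = parent[parent[company]]
            company = parent[company]
        return company

    first_company_for: Dict[str, str] = {}
    for company, directors in directors_by_company.items():
        for director in directors:
            if director in first_company_for:
                # Same director seen before: merge the two clusters.
                parent[find(company)] = find(first_company_for[director])
            else:
                first_company_for[director] = company

    clusters: Dict[str, Set[str]] = {}
    for company in directors_by_company:
        clusters.setdefault(find(company), set()).add(company)
    # Only clusters with more than one company are interesting here.
    return sorted(sorted(members) for members in clusters.values() if len(members) > 1)


# Invented sample data: refund-to-turnover ratios and director lists.
refund_ratio = {
    "Co1": 0.10, "Co2": 0.12, "Co3": 0.11,
    "Co4": 0.09, "Co5": 0.13, "Co6": 0.95,
}
directors = {
    "Co1": {"dir_a", "dir_b"},
    "Co2": {"dir_b"},
    "Co3": {"dir_c"},
    "Co4": {"dir_c", "dir_a"},
    "Co5": {"dir_z"},
}

flagged = robust_outliers(refund_ratio)
clusters = linked_clusters(directors)
```

In a pipeline like the one described, results of this kind would be inputs to the risk engine rather than conclusions in themselves. The median-based score is used in this sketch because a single extreme value distorts it less than a mean and standard deviation would.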
This was a great example of using advanced analytics to detect fraud and catch tax evaders.