Spend Analysis III: Common Sense Cleansing
Today I'd like to welcome back Eric Strovink of BIQ who, as I indicated in Part I of this series, is authoring this series on next generation spend analysis and why it is more than just basic spend visibility. Much, much more!
Many observers would acknowledge that there's not a lot of difference between viewing cleansed spend data with SAP BW or Cognos or Business Objects, and viewing cleansed spend data with a custom data warehouse from a spend analysis vendor. They're all OLAP data warehouses; they all have competent data viewers; they all provide visibility into multidimensional data. What has historically differentiated spend analysis from BI systems is the cleansing process itself (along with, in contrast to the BI view, the decoupling of data dimensions from the accounting system).
Because it's hard to distinguish one data warehouse from another, cleansing has become an important differentiator for many spend analysis vendors. The vendor has typically developed a viewpoint as to the relative merits of manual labor/offshore resources, automated tools, custom databases, and so on, and sells its SA product and services around that viewpoint. Unfortunately, all the resulting hype and focus on cleansing services, from both these vendors and the analysts who follow them, has obscured a simple reality -- namely, that effective data cleansing methods have been around for years, are well understood, and are easy to implement.
The basic concept, originated and refined by various consultants and procurement professionals during the early to mid-1990s, is to build commodity mapping rules for top vendors and top GL codes (top means ordered top-down by spending) -- in other words, to apply common sense 80-20 engineering principles to spend mapping. GL mapping catches the "tail" of the spend distribution, albeit approximately; vendor mapping ensures that the most important vendors are mapped correctly; and a combination of GL and vendor mapping handles the case of vendors who supply multiple commodities. If more accuracy is needed, one simply maps more of the top GLs and vendors. Practitioners routinely report mapping accuracies of 95% and above. More importantly, this straightforward methodology enables sourcers to achieve good visibility into a typical spend dataset very quickly, which in turn allows them to focus their spend management efforts (and further cleansing) on the most promising commodities.
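To make the rule hierarchy concrete, here is a minimal sketch of the 80-20 mapping logic described above. The rule tables, vendor names, GL codes, and commodity labels are all hypothetical illustrations, not any vendor's actual implementation; the point is only the precedence order: a vendor-plus-GL rule beats a vendor-only rule, which beats a GL-only fallback.

```python
# Hypothetical 80-20 commodity mapping sketch. Rule contents are
# illustrative only; in practice they'd be built top-down by spend.

# Most specific: rules keyed on (vendor, GL code), for vendors that
# supply multiple commodities.
VENDOR_GL_RULES = {
    ("ACME STAFFING", "6100"): "Contract Labor",
    ("ACME STAFFING", "6450"): "Recruiting",
}

# Vendor-only rules ensure the most important vendors map correctly.
VENDOR_RULES = {
    "DELL": "PCs and Computing",
    "RR DONNELLEY": "Commercial Print",
}

# GL-only rules catch the "tail" of the spend, approximately.
GL_RULES = {
    "6100": "Contract Labor",
    "7200": "Travel",
}

def map_commodity(vendor: str, gl_code: str) -> str:
    """Map one transaction to a commodity, most specific rule first."""
    if (vendor, gl_code) in VENDOR_GL_RULES:
        return VENDOR_GL_RULES[(vendor, gl_code)]
    if vendor in VENDOR_RULES:
        return VENDOR_RULES[vendor]
    return GL_RULES.get(gl_code, "Unmapped")

print(map_commodity("ACME STAFFING", "6450"))  # Recruiting
print(map_commodity("DELL", "7200"))           # PCs and Computing
print(map_commodity("UNKNOWN CO", "7200"))     # Travel
```

Adding accuracy is then just a matter of adding rows to these tables for the next-largest vendors and GL codes, which is the incremental, common-sense process the methodology prescribes.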
Is it necessary to map every vendor? Almost never, although third-party vendor mapping services are readily available if you need them. And, as far as vendor familying is concerned, grouping together multiple instances of the same vendor clears up more than 95% of the problem. Who-owns-whom familying using commercial databases seldom provides additional insight; besides, inside buyers are usually well aware of the few relationships that actually matter. For example, you won't get any savings from UTC simply by buying from both Carrier and Otis Elevator. And it would be a mistake to group individual Hilton Hotels under their corporate parent, since they are franchisees.
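Grouping multiple instances of the same vendor is largely a name-normalization exercise. The sketch below is one hypothetical way to do it (the suffix list and normalization rules are illustrative starting points, not a complete solution): uppercase the name, strip punctuation, and drop trailing legal suffixes so that spelling variants collapse to a single family key.

```python
import re

# Hypothetical vendor-familying sketch: collapse multiple spellings of
# the same vendor into one family key. The suffix list is illustrative,
# not exhaustive.
LEGAL_SUFFIXES = {"INC", "INCORPORATED", "CORP", "CORPORATION",
                  "CO", "COMPANY", "LLC", "LTD"}

def family_key(raw_name: str) -> str:
    """Normalize a raw vendor string to a family key."""
    # Uppercase, replace punctuation with spaces, split into tokens.
    tokens = re.sub(r"[^A-Z0-9 ]", " ", raw_name.upper()).split()
    # Drop trailing legal suffixes ("Inc.", "Corp", etc.).
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

for name in ["Dell Inc.", "DELL INCORPORATED", "Dell, Inc",
             "Otis Elevator Co."]:
    print(name, "->", family_key(name))
# The three Dell variants collapse to one family; Otis Elevator
# remains its own family -- no who-owns-whom database required.
```

Note that this deliberately does not roll Otis or Carrier up into UTC; as argued above, that kind of corporate-parent familying rarely changes a sourcing decision.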
[N.B. There are of course cases where insufficient data exist to use classical mapping techniques. For example, if the dataset is limited to line item descriptions, then phrase mapping is required; if the dataset has vendor information only, then vendor mapping is the only alternative. Commodity maps based on insufficient data are inaccurate commodity maps, but they are better than nothing.]

80-20 logic also applies to the overall spend mapping problem. Consider a financial services firm with an indirect spend base. Before even looking at the data, every veteran sourcer knows where to look first for potential savings: contract labor, commercial print, PCs and computing, and so on. Here is a segment of the typical indirect spending breakdown, originally published by The Mitchell Madison Group:
If you have limited resources, it can be counterproductive to start mapping commodities that likely won't produce savings, when good estimates can often be made as to where the big hits are likely to be. If you can score some successes now, there will be plenty of time to extend the reach of the system later. If there are sufficient resources to attack only a couple of commodities, it makes sense to focus on those commodities alone, rather than to attempt to map the entire commodity tree.
The bottom line is that data cleansing needn't be a complex, expensive, offline process. By applying common sense to the cleansing problem, i.e. by attacking it incrementally and intelligently over time, mapping rules can be developed, refined, and applied when needed. In fact, whether you choose to have an initial spend dataset created by outside resources, or you decide to create it yourself, the conclusion is the same: cleansing should be an online, ongoing process, guided by feedback and insight gleaned directly (and incestuously) from the powerful visibility tools of the spend analysis system itself. And, as a corollary, cleansing tools must be placed directly into the hands of purchasing professionals so that they can create and refine mappings on-the-fly, without any assistance from vendors or internal IT experts.
Next: Defining "Analysis"