AI is NOT Failing Because of a Lack of Forward Positioned Data

Lack of forward positioned data is NOT the problem.

(It is a problem, but not the biggest one!)

An AI agent making 1000X the decisions IS!

Right now, while the big AI players have achieved 80% to 90% “accuracy” on their carefully designed synthetic benchmarks, when applied to real world problems, accuracy in many domains drops to 25% (or worse, as at most 20% of code generated by an AI survives into a production application once it gets reviewed by a senior developer who finds a plethora of security issues, boundary condition errors, and code that, frankly, just doesn’t solve the problem at all).

THIS MEANS THAT THE AI IS MAKING 750X MORE WRONG DECISIONS THAN THE HUMAN!

That’s a LOT of mistakes.

Meanwhile, give an expert human

a) always available forward positioned data and Augmented Intelligence applications to process it (so all the data the expert human needs to make the decision is at her fingertips)

b) A-RPA (Automation) software that is best-of-breed and capable of immediately executing any decision the human makes (possibly using the forward positioned data and appropriate augmented intelligence outputs)

And that human will make 100X the decisions she’s making now, and get 95% of them correct. So if you hire 10 humans, you will match the AI’s 1000X decision volume with 15X fewer errors (5% vs 75%).

When you consider that ten humans will cost considerably less than AI once you factor in the rapidly rising token costs and the costs of dealing with the 15X increase in errors the AI will bring, Augmented Intelligence powered by Forward Deployed Data and a small team of humans will be a LOT more productive than you ever thought possible.
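The arithmetic behind this comparison can be sketched in a few lines. This is only a back-of-the-envelope check using the figures quoted above (1000X decisions at roughly 25% accuracy for the AI, ten humans at 100X each and 95% accuracy); the function and variable names are illustrative, not part of any tool.

```python
# Back-of-the-envelope check of the error comparison, using the figures
# quoted in the text. All names here are illustrative assumptions.

def wrong_decisions(decisions: float, accuracy: float) -> float:
    """Expected number of wrong decisions for a given volume and accuracy."""
    return decisions * (1.0 - accuracy)

BASELINE = 1  # one decision made by one expert human today

# AI agent: 1000X the decisions at roughly 25% real-world accuracy.
ai_wrong = wrong_decisions(1000 * BASELINE, 0.25)

# Ten augmented humans: each makes 100X the decisions at 95% accuracy.
human_wrong = wrong_decisions(10 * 100 * BASELINE, 0.95)

error_ratio = ai_wrong / human_wrong

print(f"AI wrong decisions per baseline decision:    {ai_wrong:.0f}")
print(f"Ten augmented humans, wrong decisions:       {human_wrong:.0f}")
print(f"AI errors relative to the humans:            {error_ratio:.0f}X")
```

At the same decision volume, the AI produces about 750 wrong decisions for every 50 the human team produces.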

The world is not binary, flat, or stable!

It’s multi-state, curved, and chaotic.

You need fuzzy math, fractal geometry, and non-linear differential equations to describe it.

Similarly, the supply chain world we built is not a predictable single source flatland (as the work of Edwin Abbott Abbott in 1884 should have made clear to you).

You need multi-state logic, multiple (supply) chains and multiple methods for managing them.

And these DO NOT fit into a 2 x 2 grid! It’s this ongoing lie that ultimately leads to failure and organizations bringing in one consultancy* after another, and one platform after another, in an attempt to fix problems which never go away.

Every distinct dimension that needs to be considered in classification and decision making is a distinct dimension that needs to be taken into account in any methodology or “map” presented to you (and multiplies the number of “buckets” you need for classification). So if you have three dimensions, you need at least 2 * 2 * 2 = 8 buckets in your classification scheme (as you will have at least 2 values per dimension you differentiate on, and that’s assuming each dimension you are differentiating on is a binary decision; if it were ternary, e.g., you were classifying each dimension on high, medium, low or red, yellow, green, then you would have 3 * 3 * 3 = 27 buckets).

That’s why every single analyst quadrant map that attempts to assess a vendor, product, or service on more than 2 dimensions is an ultimate failure. (That’s why SolutionMap works — it’s just tech vs customer sentiment, not innovation, service, tech, market fit, market strategy, product strategy, industry strategy, geographic strategy, product viability, pricing, track record, execution, operations, and customer experience randomly squished into two meaningless composite values using absurd average weightings that are equivalent to taking the average weight of an apple, BMX bike, and a cruise ship.)

Mathematically, this would require a 14-D hypercube with 16,384 sub-cubes. And that’s why you don’t measure everything, only what counts! But try as you might, you are usually going to end up with at least 3 independent dimensions that are critical to any problem you work on. But that’s not a bad thing! [Remember, the 3-sided triangle is the most stable shape with area in flatland (where analysts and consultants still love to live to this day), and the 4-sided tetrahedron (pyramid) you can make from 4 triangles in 3-D is one of the most fundamentally stable shapes there is (and atomic bonding proves this).]
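The bucket counts quoted in this section (8, 27, and 16,384) all come from the same rule: the number of values per dimension raised to the number of dimensions. A minimal sketch, with a hypothetical helper name:

```python
def bucket_count(dimensions: int, levels: int = 2) -> int:
    """Number of classification buckets when every dimension has `levels` values."""
    return levels ** dimensions

print(bucket_count(3))       # three binary dimensions
print(bucket_count(3, 3))    # three ternary dimensions (e.g. high / medium / low)
print(bucket_count(14))      # fourteen binary dimensions
```

The growth is exponential in the number of dimensions, which is why a two-axis map cannot absorb a third (let alone a fourteenth) dimension without throwing information away.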

Since, when it comes to Procurement, the 3 most critical dimensions are complexity, risk, and organizational impact of what you’re buying, proper Procurement is dictated by a pocket cube. The Busch-Lamoureux Exact Purchasing pocket cube to be precise.
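To make the shape of a pocket cube concrete, here is a minimal sketch that enumerates the cells of a 2 x 2 x 2 classification over the three dimensions named above. The "low" / "high" labels are an assumption made for illustration only; they are not taken from the Busch-Lamoureux model itself.

```python
from itertools import product

# The three dimensions named in the text.
DIMENSIONS = ("complexity", "risk", "organizational impact")

# Assumption for illustration: each dimension is split into two levels.
LEVELS = ("low", "high")

# Every cell of the cube is one combination of levels across the dimensions.
cells = [
    dict(zip(DIMENSIONS, combo))
    for combo in product(LEVELS, repeat=len(DIMENSIONS))
]

for number, cell in enumerate(cells, start=1):
    print(number, cell)
```

Each of the eight cells is a distinct buying situation, which is the point: a 2 x 2 grid would force pairs of these cells into the same box.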

So if anyone else claims their updated Kraljic matrix will work for you, just shut the door. Don’t bother arguing. If they won’t accept real-world reality, you won’t get a real-world solution. Find someone who understands the complexity and can build you a platform to address it, with as much automation as can be brought to bear. (And quite a bit can be brought to bear, as per our series on operationalizing the pocket cube.) That’s how you will succeed. The old fashioned way: define the problem, use Human Intelligence (HI) to address the problem, and design processes and systems to execute the solution as efficiently as possible. The fundamentals don’t change, and anyone who says otherwise is a scam artist trying to sell you (silicon) snake oil. Don’t buy it.

* Now big consultancies won’t tell you this because if you get it right the first time, they can’t continue to sell you consulting hours, which is their ultimate goal.

Sourcing and Procurement Are NOT The Same

And they are definitely NOT interchangeable, as per a recent article by Paul Martyn (the Sourcing Optimization Grand Master) on LinkedIn.

As per his article,

  • sourcing is strategic
  • procurement is transactional

And this is why they are not only not the same, as per Paul’s article, but not interchangeable.

In the age of AI (Hype), this distinction becomes doubly important!

As technology advances rapidly, humans become less and less important in Procurement, as rapid advances in automation allow more and more of the tactical process to be completely automated (as A-RPA allows exceptions to be learned and future manual intervention requirements to be eliminated), but more and more important in sourcing, as Gen-AI repeatedly proves just how Astonishingly Inept modern Artificial Idiocy is.

Many will argue that sourcing is tactical because modern software can assemble RFXs from existing specs, automatically select suppliers from your SXM and/or ERP, automatically distribute them, automatically validate the returned RFXs, eliminate vendors who don’t meet absolute requirements, analyze the responses against market data for validity, build and execute multi-objective models, and recommend and award. And while that certainly sounds like sourcing, it’s not. It is sourcing execution. The tactical part that has to be done to support the strategic, but NOT the strategic.

The strategic is creating the specs, identifying the real organizational requirements, determining the requirements for supplier inclusion, validating the suppliers, determining the proper (multi-round) event type, validating the generated RFXs, analyzing the responses for hidden risks and traps and idiosyncrasies, defining the right trade-off models, selecting and modifying the right award scenario, overseeing the negotiation, etc. Every part of the process that requires an actual decision with Human Intelligence.

This is because, as Paul points out, a dumb machine doesn’t understand:

  • lowest cost vs resilience
  • incumbent vs challenger
  • standardization vs innovation
  • savings vs service
  • global leverage vs local agility

Or any other trade-off that can’t be completely quantified and captured in fail-safe rules.

Systems can, and should, support all tactical bit-pushing — especially since we were promised they would do so over 40 years ago when the big push was made for every person and business to adopt them — but, like IBM said in 1979, a computer should never (EVER) make a decision. And that most definitely includes Sourcing decisions!

If You’re Spending 250K Annually Per Engineer On AI …

Then not only are you contributing to planetary destruction (through the generation of between 1.32 tons (high end models, 1 joule per token) and 84 tons (low end models, 2 joules per token) of CO2 to power those data centres, which is about 0.2 to 12.7 times the average individual carbon footprint, with an expectation of 7 to 11 tons (Source), and the utilization of 300,000 gallons to 5,000,000 gallons of water a day to keep those servers cool, or a town’s worth of water every day)!

BUT YOU ARE NEEDLESSLY WASTING 400K+ A YEAR

1. Less than 20% of AI generated code survives unscathed in a commercial enterprise software product once senior developers weed out all the security errors, boundary condition errors, and generated code that doesn’t even solve the problem. So, that’s 200K of 250K down the drain as only 20% of output is usable.

2. Having to fix AI generated slop will consume 80% of a good senior developer’s time — a developer you should also be paying 250K a year.

End result, you’re losing 200K + 200K per developer you force AI coding tools upon!
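The 200K + 200K figure follows directly from the two points above. A minimal sketch of that arithmetic, where the constant names are illustrative and the percentages are the ones quoted in the text:

```python
# Figures quoted in the text; names are illustrative assumptions.
AI_SPEND_PER_ENGINEER = 250_000   # annual AI spend per engineer
SENIOR_DEV_SALARY = 250_000       # annual salary of the senior developer
USABLE_OUTPUT_FRACTION = 0.20     # share of AI-generated code that survives review
REWORK_TIME_FRACTION = 0.80       # share of senior time spent fixing AI output

# Spend on output that never reaches production.
wasted_ai_spend = AI_SPEND_PER_ENGINEER * (1 - USABLE_OUTPUT_FRACTION)

# Senior developer salary consumed by cleanup instead of new work.
wasted_dev_time = SENIOR_DEV_SALARY * REWORK_TIME_FRACTION

total_waste = wasted_ai_spend + wasted_dev_time

print(f"Wasted AI spend:       {wasted_ai_spend:,.0f}")
print(f"Wasted developer time: {wasted_dev_time:,.0f}")
print(f"Total per developer:   {total_waste:,.0f}")
```

Changing either fraction shows how sensitive the total is to the survival rate of generated code.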

But hey, it’s your money. If you want to p!ss it away so NVIDIA’s CEO can get richer selling more GPUs we don’t need, that’s up to you!

The linked article contains some metrics, but here are a few others.

  • token prices vary widely, from an average of around 50c/M tokens on the smallest, cheapest models to $75/M tokens (or higher) for higher end “workhorse” models
  • energy processing requirements per token are estimated to be between 1 joule and 2 joules
  • you can buy 14.3 billion tokens at the median of around $17.5/M tokens (and 35 times that at the lower end)
  • processing 14.3 billion tokens will take about 4000 kWh @ 1 joule/token
  • on an average NA grid, expect to produce 500 to 600 g of CO2 per kWh (since most of our grids are still dirty)
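The metrics above chain together as budget, then tokens, then energy, then CO2. The sketch below works through that chain for the median price point only, using the figures listed (250K budget, $17.5/M tokens, 1 joule per token, 500 to 600 g of CO2 per kWh). The function names are hypothetical, and the per-token energy figure is the estimate quoted above, not a measured value.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules

def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """Tokens purchasable for a budget at a given price per million tokens."""
    return budget_usd / price_per_million_usd * 1e6

def energy_kwh(tokens: float, joules_per_token: float) -> float:
    """Energy to process the tokens, converted from joules to kWh."""
    return tokens * joules_per_token / JOULES_PER_KWH

def co2_tonnes(kwh: float, grams_per_kwh: float) -> float:
    """CO2 emitted for the energy used, converted from grams to tonnes."""
    return kwh * grams_per_kwh / 1e6

tokens = tokens_for_budget(250_000, 17.5)   # median price from the list above
kwh = energy_kwh(tokens, 1.0)               # estimate: 1 joule per token
co2_low = co2_tonnes(kwh, 500)
co2_high = co2_tonnes(kwh, 600)

print(f"Tokens: {tokens / 1e9:.1f} billion")
print(f"Energy: {kwh:,.0f} kWh")
print(f"CO2:    {co2_low:.2f} to {co2_high:.2f} tonnes")
```

Swapping in the low end price or the 2 joule estimate scales every downstream figure proportionally, which is where the wide range in the emissions estimate comes from.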

The Bullshit Filter for Enterprise AI Startups consists of 12 Questions!

Not 11!

Backing up, earlier this year Jason Busch published his 11-Question Bullshit Filter for Enterprise AI startups. This was, and is, needed because the vast majority of Enterprise AI startups are bullshit (especially in FinTech and Procurement) and the sooner you figure that out, the better.

I was hoping that, by now, the AI startup scene would start crashing due to over investment, lack of returns (only 6% of AI implementations have generated an ROI), and, generally, lack of usefulness. (AI can serve up your data, show you complexity and even help with automating some tasks, but it can’t make decisions and, due to lack of anything close to intelligence, can’t even do basic tasks without your oversight.) But, even worse, these solutions are still multiplying like Fibonacci’s rabbits and their claims are getting more outlandish by the day. (How many times do we have to tell you AI Employees Aren’t Real, you should NOT engage any vendor selling “AI Employees”, because you definitely do NOT want AI Employees.)

So, since they are flooding our space with BS marketing and making ridiculous claims about what their useless apps can do, it’s more critical than ever that you be able to suss out the BS claims from the non-BS claims. (Hint: 95% are BS claims, so it won’t be easy!)

We’ll start with Jason’s 11 filters, which we’ll number 12 down to 2, because he left out the most important filter, and the one that, if it fails, allows you to skip the next 11.

Filter 12: Founder DNA
Can they build and sell? Likely not. Chances are, if they’ve cut through the noise and reached you, they can only sell. And if you did find a builder, they won’t survive long enough to support you if they can’t sell.

Filter 11: Motivation
Is failure unacceptable? (Every startup team will say it is, but unless every founder has a reason they simply cannot accept failure, when the going gets tough … the tough get going … and quit.)

Filter 10: Interface
Is it designed for those who will ACTUALLY be using it?

Filter 09: Categorization
Does the product actually do something new? Is there a strong reason for the market to adopt it?

Filter 08: “Found Money”
Are there instant benefits that sell themselves on the first demo?

Filter 07: Displacement
Does the product work around or replace a solution that buyers hate?

Filter 06: Functional Bonds
Does the solution cross boundaries that increase value beyond peers?

Filter 05: Data Delta
Is there a “data” strategy to exploit the delta between what humans can easily consume and what AI can leverage (and summarize into something useful for human data ingestion)?

Filter 04: “Messy Middle”
Can the solution ingest external “dark data” and turn it into actionable insights without requiring a(n extensive) manual data-cleansing project? (Quick review and correction is okay.)

Filter 03: Connect the Dots
Does the app bridge the gap between “Watercooler Data” and “System of Record Data” (ERP/PO) to explain the why behind an analysis or recommendation?

Filter 02: “Show Your Work” Audit
Can the user drill into any output, see each and every step the AI took, drill down to the source data, and verify that everything is correct, accurate, and no data was changed?

These are all great filters, but there’s no point going through them if you don’t check the most important filter first:

Filter 01: Is it LLM-based?
If yes, move along. Don’t waste any time.

Most of the failures in the age of AI come from Gen-AI LLMs that promise the world and don’t even deliver a pile of dirt. That hallucinate on every other query. That burn up thousands of dollars of tokens to deliver less than fresh MBA interns, with no real-world experience and no clue to share, deliver on their first day, no less.

Even worse, the majority of these players are simply wrapping third party LLMs in the creation of their “solution”. That’s not a solution at all. That’s an unmitigated disaster waiting to happen!

In the rare case an LLM actually offers a partial solution, it is best to go straight to one of the major providers. That way, you know who’s responsible when something goes wrong and don’t have to worry about providers playing the blame game and pointing fingers at each other.