Day 2 notes from Web 2.0 Summit in San Francisco, CA:
[my analysis and notes are in these square brackets.]
Cops and Robbers Las Vegas Style, with Jeff Jonas, chief scientist of the IBM Entity Analytic Solutions group
- has some very interesting ideas on the “database of intentions” concept
- founded SRD in 1983
- worked with the gaming business and the MIT Team
- sold to IBM Jan 2005
- Now chief scientist at IBM analytics
Rather than using the notes I had typed, I’m going to repost Jeff’s excellent summary from his blog:
I was invited to speak at the Web 2.0 Summit last week in San Francisco. Believe it or not I actually presented 41 charts in less than 10 minutes. This kind of general session presentation was called a Show Me/High Order Bits. That’s right, the essence of my life’s work in just 10 minutes … the thrill! [Note: The formal title was: “Cops and Robbers Las Vegas Style.”]
If you did not make this most amazing summit with a most amazing cast of attendees or were there and missed my auctioneer-inspired delivery, here are the key points I covered:
0. I first showed a picture of a fire breather from my last New Year’s eve party – but that is not important right now.
1. I showed a surveillance video of a casino scam involving a corrupt dealer – resulting in a $250,000 loss in 15 minutes. If the dealer had the same address in the payroll system as the “high roller” had in the loyalty club and comp systems (free rooms, meals, etc.) … who would know?
2. I introduced the concept of “Corporate Amnesia.” This occurs when one part of the organization makes a decision which very clearly did not account for other key data sitting elsewhere in the enterprise e.g., your marketing department is mailing offers to a person currently in jail for stealing from you!
3. “Perception Isolation” is the leading cause of Corporate Amnesia. Think of each operational system as a distinct enterprise perception. Notably, each perception is isolated from the others.
4. Enterprise intelligence requires persistent context. There is no way to get smart if perceptions are not integrated. When perceptions are integrated and stored in a database … this is persistent context. Think of this like a brain. You need a brain to be smart … duh!
6. Then treat data as a query. And thus I introduced a 1st principle for enterprise intelligence: If you do not process every new piece of key data (perception) first like a query … then you will not know if it matters … until someone asks.
7. Treating data like a query beats periodically boiling the ocean when attempting to achieve real time intelligence.
8. Then, also treat queries as data. This means if one wishes to have a query persist, it must be persisted in the same data space as the data itself. Which leads to the 2nd principle for enterprise intelligence: Treat queries like data to avoid having to ask every question every day.
9. While constructing context (real time receipt of perceptions from across the different operational systems) this happens to be the most ideal time for this librarian function to exhibit enterprise awareness. Which leads to the 3rd principle for enterprise intelligence: Enterprise intelligence is computationally most efficient when performed at the moment the observation is perceived.
10. This is the world I sometimes refer to as “Perpetual Analytics.” A world where the “data finds the data … and the relevance finds the user.”
11. And this stuff really works … and at scale. In fact, in a benchmark center this was found to scale to over 3 billion historical observations while handling the real-time ingestion of more than 2,000 perceptions a second.
12. This has privacy consequences. For example: (a) What perceptions can or should be placed into context (in one brain)?; (b) What if perceptions are contextualized for one mission, then re-purposed later for another?; (c) What if someone steals the brain?; and (d) What if the librarian is corrupt?
13. I worry about these things. And I spend about 40% of my time thinking about the privacy and civil liberties consequences of such systems. Which prompted one of my more recent inventions: a new class of technology I call “Analytics in the Anonymized Data Space.” Basically, instead of transferring perceptions from the various senses (an organization’s operational systems) that are human readable … the perceptions are anonymized first before being handed to the librarian for contextualization in the brain. The Reader’s Digest explanation of anonymization is basically this: if you take a pig and a grinder and make a sausage, even if I give you the sausage and the grinder you are not going to be able to make a pig. The cool thing about this new technology is that the librarian can still construct and persist context and discover relevance without actually handling human meaningful data.
14. So I summarized with the main thinking towards enterprise intelligence: (a) Without persistent context … you have no brain; (b) Treat data and queries with equal rights to improve awareness; (c) More intelligence is possible when thinking based on streaming perceptions; and (d) From a privacy perspective: More or less perceptions, that is the question (there is an important policy discussion that needs to take place about just how many – more versus less – perceptions should be permitted to be put in the brain).
15. While this approach to enterprise intelligence was born in Las Vegas … today it plays a role in national security, financial services, health care, etc. And much of the focus of my current activity is towards using this technology to deliver new threat and fraud intelligence solutions in these and other areas.
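[A note from me on point 13: the talk did not say which technique sits behind “Analytics in the Anonymized Data Space,” so what follows is only my own minimal sketch of the pig-and-sausage idea. I am assuming a keyed one-way hash (HMAC-SHA-256) stands in for the grinder. The salt, the field names, and the sample records are all hypothetical.]

```python
import hashlib
import hmac

# Hypothetical secret shared by the source systems. This is an assumption of
# the sketch, not something described in the talk.
SECRET_SALT = b"hypothetical-shared-salt"


def anonymize(value: str) -> str:
    """One-way transform: equal inputs still match, but the input is not recoverable."""
    # Normalize case and whitespace so trivially different spellings still match.
    normalized = " ".join(value.lower().split())
    return hmac.new(SECRET_SALT, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict) -> dict:
    """Anonymize every field before the record leaves its source system."""
    return {field: anonymize(value) for field, value in record.items()}


# Two hypothetical records from different operational systems.
payroll = {"name": "Pat Dealer", "address": "12 Desert Rd"}
loyalty = {"name": "Chris Roller", "address": "12  desert rd"}

anon_payroll = anonymize_record(payroll)
anon_loyalty = anonymize_record(loyalty)

# The "librarian" compares only digests and never sees a readable value.
shared_fields = [f for f in anon_payroll if anon_payroll[f] == anon_loyalty.get(f)]
print(shared_fields)
```

[In this toy version the librarian can still notice that the dealer and the high roller from point 1 share an address, even though it only ever handles digests. A real system would need far more careful normalization and key management than this.]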
To my shock at this point I had completed 36 charts and still had 1.5 minutes left. As I thought this was in fact a possibility, I quickly moved into what I called the bonus section!
Bonus Picture 1. I showed a picture of a chimpanzee with the words “99.4 percent human.” The point being: If a .6% difference matters this much … no wonder traditional information systems lack so much intelligence! Net net, in intelligence systems very tiny little increments of accuracy make the entire difference between being dumb and smart.
Bonus Picture 2. And it may go without saying, that in such systems as this … the more observations one has the better the context. In fact, many times new observations will contain the evidence to improve or fix earlier contextualizations.
Bonus Picture 3. And this brings us to the crucial concept of “Sequence Neutrality.” Meaning despite the order of the observations (records A, B, C received in that order versus arriving in the order C, B, A) the end state is the same. If you cannot process information with sequence neutrality then you get “data drift” – meaning you hold contradictory content which must be reconciled eventually or accuracy erodes. This is a common reason data warehouses must be reloaded. Almost no systems possess this sequence neutrality property. Notably, it is virtually essential at scale because it eventually becomes impossible to tear very large databases down to reload them every week, month, or quarter.
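[A note from me on sequence neutrality: here is a small sketch of the property, under my own assumptions. Records are treated as the same entity when they share any feature, and the function `resolve` and the sample records are hypothetical. This toy only ever merges; a real system would also have to split entities apart when new evidence contradicts an earlier match.]

```python
from itertools import permutations


def resolve(records):
    """Incrementally group record ids that share at least one feature."""
    entities = []  # list of (record_ids, features); feature sets stay disjoint
    for rec_id, features in records:
        ids, feats = {rec_id}, set(features)
        untouched = []
        for e_ids, e_feats in entities:
            if e_feats & feats:
                # The new record links to this entity: fold the entity in.
                ids |= e_ids
                feats |= e_feats
            else:
                untouched.append((e_ids, e_feats))
        untouched.append((ids, feats))
        entities = untouched
    return {frozenset(ids) for ids, _ in entities}


# Hypothetical observations. A and C have nothing in common until B arrives.
records = [
    ("A", {"name:pat", "phone:555-0100"}),
    ("B", {"phone:555-0100", "addr:12 desert rd"}),
    ("C", {"addr:12 desert rd", "name:chris"}),
    ("D", {"name:lee"}),
]

# Run every arrival order and collect the distinct end states.
end_states = {frozenset(resolve(list(order))) for order in permutations(records)}
print(len(end_states))
```

[Every one of the 24 arrival orders ends in the same grouping: A, B, and C together, D on its own. When B arrives last it repairs the earlier, incomplete picture of A and C as strangers, which is the “new observations fix earlier contextualizations” point from Bonus Picture 2.]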
Closing thought. After working on designing sequence neutrality into my technologies, I have discovered there are some cases where a new record (perception) will necessitate so much recontextualization, it cannot be done in real time. Drats! That means the system must either be periodically reloaded or alternatively go offline into a maintenance mode (i.e., deep sleep) to remedy the situation. But alas, that is why humans sleep too – deep recontextualization that could not be handled on the fly. Our dreams are the byproduct of this necessary re-shuffling. Or so I have concluded!
This post is now the shortest read about my enterprise intelligence information theory.
I plan on blogging about “why perception isolation is the leading cause of corporate amnesia” very soon.
[What were my key take-aways from this talk? That Jonas is addressing the issue we see in our company and in every company I have ever looked at, where there are many disconnected and different systems that “perceive” the environment, partners, customers, and developers and no central “brain” making sense of all of the perceptions and reconciling them. This is more than a data warehouse. This is a sophisticated system in the middle that pulls in all inputs, puts them together in context, and then matches new inbound data against that entire context before deciding what to do about it and how important it is. This is not single sign-on. Or globally unique identities. Or data warehousing. Or record merging. It is a more holistic and comprehensive view of the enterprise where the premise is: “The enterprise should treat me (the customer/partner/developer/supplier) in one consistent way all the time.” I’m looking forward to spending some more time with Jonas to understand how these principles might be brought to bear in a real enterprise today.]
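[To make the “data as a query, queries as data” principles concrete for myself, I put together the sketch below. It is my own illustration and not Jonas's implementation: the `Context` class, its `ingest` method, and the sample records are all hypothetical. The assumption is that observations and standing queries live in one shared index, and that every arriving item is looked up against that index before it is stored.]

```python
from collections import defaultdict


class Context:
    """Toy persistent context: observations and standing queries share one index."""

    def __init__(self):
        self.index = defaultdict(list)  # feature -> [(kind, label), ...]

    def ingest(self, kind, label, features):
        """Look the new item up like a query first, then persist it like data."""
        hits = [
            (label, other_label, feature)
            for feature in features
            for _other_kind, other_label in self.index[feature]
        ]
        for feature in features:
            self.index[feature].append((kind, label))
        return hits


ctx = Context()
first = ctx.ingest("data", "payroll: dealer", ["addr:12 desert rd"])
# A standing question is stored in the same space as the data.
stored_query = ctx.ingest("query", "investigator watch", ["addr:99 strip ave"])
second = ctx.ingest("data", "loyalty: high roller", ["addr:12 desert rd"])
third = ctx.ingest("data", "hotel: new guest", ["addr:99 strip ave"])

print(first)   # nothing in context yet, so no hits
print(second)  # the new record finds the earlier payroll record
print(third)   # the new record finds the investigator's stored query
```

[The second record surfaces the dealer match the moment it arrives, without anyone asking. The third record finds the investigator's stored question, so the investigator does not have to re-ask it every day.]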
[And by the way Jeff, your compression of ten years into ten minutes (no…8 1/2 minutes!) proved my Theory of Constraint which states that “constraints are a key driver of both creativity and clarity and so should be welcomed to any project.” This has seemed to be true no matter where I have been or what project I have worked on. It applies equally well to startups too – it seems that the less cash and resources they have, the more focused, creative, and disciplined they are.]