IBM SDU Refines and Redefines Enterprise Search

By Charles King, Pund-IT, Inc.  March 20, 2019

“Time is money” has been a central tenet of business technology for decades, from the mechanical calculators ubiquitous in office environments during the first half of the twentieth century to the servers and systems that became central to transaction processing and other applications in the latter half. Speeding and automating both simple and complex labor-intensive tasks enabled companies to decrease costs and increase efficiencies while becoming more competitive and profitable.

But as once-exceptional technologies become increasingly commonplace and commoditized, it’s easy to forget a central point: even amazing technologies don’t fix every problem organizations can and will confront. That’s as true for traditional solutions as it is for more recent developments, including eCommerce and customer relationship management (CRM) applications, as well as wide-ranging, broadly available technologies, like search.

That last point – search – is central to a new offering IBM recently added to its Watson Discovery portfolio: Smart Document Understanding (SDU). Let’s consider what SDU is and does and why that will be welcomed by numerous enterprises.

The problem with search

Search is a settled technology, right? I mean, search engines have been around for decades, were key to the Internet’s development and evolution, landed Google in court for antitrust, and drive billions of dollars in advertising and other revenues. So, what could IBM or anyone else do to make search different or better than it already is?

The problem isn’t with search so much as it is with what information is being searched. That is, traditional search engines are great for crawling, indexing and querying the relatively homogeneous information that constitutes web sites and online data. However, they’re less effective at dealing with the masses of heterogeneous structured (documents) and unstructured (image, video and sound files) information that businesses store in various on-premises and cloud locations.

But what about the “big data” platforms and products everyone was talking about a few years back? Those can be great for managing and searching certain kinds of data and data repositories, but complex processes and enterprise information infrastructures sometimes require more hands-on efforts that impact the effectiveness of conventional solutions. In other words, the more diverse and dispersed an organization’s data resources are, the less likely they can be fully managed or exploited with existing search tools.

The IBM SDU solution

In a blog post introducing IBM’s SDU, Donna Romer, VP of Watson Platform, Offering Management, noted a pair of interesting challenges where Smart Document Understanding can be applied. The first was a situation that an IBM customer, U.S. Bank, encountered: creating pricing schemas for credit card and debit card transaction services that can be easily and transparently customized for business customers. The second was to find ways to improve and speed the ways that business documents are prepared for training artificial intelligence (AI) solutions.

How did IBM help U.S. Bank? The company and Elavon, one of its subsidiaries, decided to develop a pilot and test program for a statement analysis offering capable of analyzing prospect billing statements in real time and generating optimized pricing proposals. Using Watson Discovery with SDU, the team cut the time required for proposal creation from 10 days to 2 minutes, radically improving sales processes for both U.S. Bank sales reps and the merchants they serve.
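To put the scale of that improvement in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the source does not say whether “10 days” means calendar days or business days, so both readings are shown, and the 8-hour working day is an assumption.

```python
# Rough speedup for the proposal turnaround cited above (10 days -> 2 minutes).
# Assumption: the source does not define "days", so two readings are computed.

MINUTES_PER_CALENDAR_DAY = 24 * 60
MINUTES_PER_BUSINESS_DAY = 8 * 60  # assumed 8-hour working day


def speedup(before_days: float, after_minutes: float, minutes_per_day: int) -> float:
    """Return how many times faster the new turnaround is than the old one."""
    return (before_days * minutes_per_day) / after_minutes


calendar_speedup = speedup(10, 2, MINUTES_PER_CALENDAR_DAY)
business_speedup = speedup(10, 2, MINUTES_PER_BUSINESS_DAY)

print(f"Calendar-day reading: {calendar_speedup:,.0f}x faster")
print(f"Business-day reading: {business_speedup:,.0f}x faster")
```

Under either reading, the reduction is on the order of thousands of times.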

What about applying Watson Discovery with SDU to documents used for machine learning for AI training? Consider that AI training often requires thousands of documents that must be ingested and annotated, and those enrichments tested before they can be used to support successful machine learning.

Smart Document Understanding leverages advances from IBM Research, as well as the company’s recently introduced Corpus Conversion Service, an AI-based cloud service that can ingest 100,000 PDF pages per day (with accuracy above 97 percent) and then train and apply advanced machine learning models to extract content from the documents at scale.

SDU allows Watson Discovery customers to visually train AI to understand documents, to distinguish textual elements, to extract valuable information and to exclude “noise” like headers and footers. That’s impressive on its own, but SDU also requires no technical training to use. Instead, a visual interface allows workers to point and click on elements such as titles, subtitles, headers and footers in training documents. The Watson system then displays how it understands the fields so staff can correct and resubmit documents if necessary.
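The idea of labeling document elements and then excluding “noise” can be illustrated with a small conceptual sketch. This is not IBM’s API or implementation: the `Element` class, the label names and the sample page are all hypothetical, invented only to show what filtering on labeled fields looks like.

```python
from dataclasses import dataclass

# Hypothetical labels a reviewer might assign by pointing and clicking.
# These names are illustrative and are not taken from Watson Discovery.
NOISE_LABELS = {"header", "footer"}


@dataclass
class Element:
    label: str  # e.g. "title", "subtitle", "text", "header", "footer"
    text: str


def extract_content(elements):
    """Keep elements that carry content; drop those labeled as noise."""
    return [e for e in elements if e.label not in NOISE_LABELS]


# Invented sample page, for illustration only.
page = [
    Element("header", "Example Corp. Confidential"),
    Element("title", "Merchant Statement"),
    Element("text", "Total card volume for the period"),
    Element("footer", "Page 1 of 3"),
]

kept = extract_content(page)
for element in kept:
    print(f"{element.label}: {element.text}")
```

In this toy version the labels are supplied by hand. The point of SDU, as described above, is that a model learns to assign such labels itself after staff have annotated and corrected a set of training documents.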

In essence, Watson Discovery with SDU can be used to significantly speed document-based machine learning preparation for AI training. Plus, SDU’s point-and-click classification can also be applied to images, spreadsheets, PDFs and optical character recognition (OCR) content. As a result, Watson Discovery with SDU can also be used to train AI systems to recognize and ferret out valuable “small data” information assets contained in and typically obscured by massive volumes of case files, internal reporting documents, historical customer data, past transaction and interaction files and other business documents.

Final analysis

IBM’s addition of Smart Document Understanding to Watson Discovery highlights a pair of interesting points. The first is that within IT, few things are ever really finished or settled. That squares with the fact that technologies are tools that, with evolutionary refinement, can be successfully applied to a growing number and variety of problems.

The second is that time is still, and probably always will be, money when it comes to business. A notable point to consider about Watson Discovery with SDU is how it can demonstrably benefit both old-school processes like sales proposal creation for U.S. Bank and emerging efforts, including document-based machine learning for AI and searching for valuable “small data” assets.

Those are the kinds of problems that IBM’s new solution is solving today. It won’t be surprising if organizations find new ways to use IBM’s Watson Discovery with SDU in the months and years ahead.

© 2019 Pund-IT, Inc. All rights reserved.