Mind to Market

Wednesday, April 04, 2007

Terminology vs Knowledge

In our perpetual climb up the value chain, beginning with the bits and bytes of data, then moving to recognizable information and finally to knowledge, terminology fits right in the middle. Terms, or words, are recognizable slices of information that may be organized into a hierarchy or system. In fields such as medicine or biology, with long histories of terminology development, many different systems of terms have cropped up because sub-groups of users had different needs and worked in isolation from each other. As a result, there is much duplication, overlap and confusion in terminology use, even within a single knowledge domain.

The need to interface between various networks of healthcare payers and providers has driven the demand for order in this confusion, resulting in systems such as Current Procedural Terminology (CPT), SNOMED Clinical Terms, and the International Classification of Diseases (ICD), all developed to help healthcare providers find a common vocabulary to describe the services they provide.

Selecting a standard terminology provides a common framework and is a significant step forward, but these are merely words: text strings in a matrix of other text strings, without the ability to transfer significant quantities of knowledge. Humans naturally make the connection between a term and the object it represents, and in fact expect that term to be imbued with the requisite information, but such is not the case with computers. To a computer, "Joe Smith" is just a nine-character text string.

Object-oriented programming (OOP) sought to change all that, providing the knowledge underpinnings to turn simple terms into full-fledged objects that behave as the things they represent do, or at least to the degree necessary to fulfill the requirements of the software. And there is the question: what exactly are the requirements? For a terminology management system, the requirement is to standardize and organize the diverse terminologies, and there it stops. Most knowledge management systems aspire to loftier goals, such as supporting decision-making processes.

And thus we have a grey area: terminology management systems that aspire to be knowledge management systems. Or users who want them to be. A successful terminology management system is one that includes and classifies as many terms in the domain as possible, whereas a successful knowledge management system is one that supports as many functions in the domain as possible.

Let's take, for example, an anatomical terminology management system. If it includes a complete catalog of anatomical parts, including synonyms and locations, it will fulfill the requirements of users who wish to know what term to use and in what context. However, even if we know that the femur is attached to the hip and is in the leg, the terminology management system may not indicate that it is a bone or that it could suffer a fracture. That is the type of information that would be contained within a knowledge management system.
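To make the femur example concrete, here is a minimal sketch in Python. The class and field names (TermEntry, KnowledgeObject, can_suffer and so on) are hypothetical and invented for illustration; they are not taken from any real terminology or knowledge management product. The first class holds only what a terminology catalog would record, and the second wraps it with the kind of information a knowledge system would add.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TermEntry:
    """What a terminology catalog records: names, synonyms and location."""
    preferred: str
    synonyms: List[str] = field(default_factory=list)
    located_in: str = ""
    attached_to: List[str] = field(default_factory=list)


@dataclass
class KnowledgeObject:
    """Hypothetical knowledge-level wrapper around a catalog entry."""
    term: TermEntry
    kind: str  # e.g. "bone"; the catalog entry alone does not say this
    possible_conditions: List[str] = field(default_factory=list)

    def can_suffer(self, condition: str) -> bool:
        # True only if the condition was explicitly modeled for this object.
        return condition in self.possible_conditions


# Catalog level: enough to pick the right word in the right context.
femur_term = TermEntry(
    preferred="femur",
    synonyms=["thigh bone"],
    located_in="leg",
    attached_to=["hip"],
)

# Knowledge level: the same term, now modeled as a bone that can fracture.
femur = KnowledgeObject(femur_term, kind="bone", possible_conditions=["fracture"])
```

Asking femur.can_suffer("fracture") only has an answer at the second level; the catalog entry on its own has no field that could hold it.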

Perhaps the most effective knowledge management systems would be ones that incorporate a terminology catalog at the front end: a place where every term could be found and where, once a term is found, the system could capture the information needed to model the object it names. As terminology management systems finally catch on and fulfill the needs of the industry, knowledge management systems will not be far behind.


Friday, December 29, 2006

I'm Feeling Lucky?

The original Google interface, back when they were just a search engine, had a text entry field and two buttons: Google Search and I'm Feeling Lucky. Most of us click on the Google Search button by instinct, perhaps because we've been unsuccessful with the IFL button or perhaps because we just can't believe that it could actually work. The IFL button bypasses the results page and takes you directly to the first web site returned for your query. Could save a lot of time, right? If you are looking for a popular or unambiguous page, say "iPod Shuffle," you may expect to be taken right to Apple's iPod Shuffle page, and you'd be right (as of the date of this blog). The one person I talked to who actually used this feature said "when I'm doing a search for http://www.nytimes.com/ it always gets me there." I'm sure people who are a bit more accomplished at using a browser use it too...

But any search that is reasonably ambiguous will require some manual filtering to get you to the site you want. Entering "Steve Connolly" will get you not the famous blog you see here but rather the site of the well-known Elvis impersonator (although we've never been seen in the same room together). In fact, since many of our queries are ambiguous, we prefer the option of manually filtering the results before plunging onto a web page. But what if your workflow required numerous nested queries? A hybrid automatic/manual process would improve accuracy, but it would slow the process and consume valuable resources. One solution is to reduce ambiguity by standardizing the vocabulary; in the biomedical case, for example, by using standard vocabularies such as ICD-10 or UMLS. By agreeing to a standard vocabulary, users can quickly determine whether the term "Steve Connolly" refers to someone who writes a blog or someone who croons in a Vegas lounge. Would this put the I'm Feeling Lucky button out of business? Not yet. Although we've agreed to a single definition of the term, there may be many references to it, and these references would all be returned by the query. We have, however, significantly cut down on the inaccuracies in our results.

The next step is to standardize the associations to references. Google relies on the sheer number of links connecting two terms, which can be misleading. Google bombs are examples of how these associations can be manipulated; one well-known example is the term "miserable failure," which has been linked by many individual Web sites to the biography of George W. Bush. Thus, at the time of this blog, a query of "miserable failure" will return the biography as the top-ranked result. Semantic Web technologies seek to provide more explicit associations between terms, replacing statistically based results with results that are definitive.

Once we've decided what a "Steve Connolly" is, we can ask what a "Steve Connolly" does. Having established a subject, we can query for associations, or predicates, and for the objects connected to those predicates. Implementing the subject, predicate, object format of the Semantic Web is neither easy nor straightforward, and thus the original Web is in no danger of disappearing. But given time and effort, the two original Google buttons, Google Search and I'm Feeling Lucky, will gradually merge into one: I'm Feeling Google?
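The subject, predicate, object idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifiers (steve_connolly_blogger, steve_connolly_performer), the predicate names and the match function are all made up for this post, and a real Semantic Web system would use RDF with proper URIs rather than bare strings.

```python
from typing import List, Optional, Tuple

# One statement = (subject, predicate, object).
Triple = Tuple[str, str, str]

# Hypothetical identifiers: once the vocabulary is standardized, the
# ambiguous string "Steve Connolly" becomes two distinct subjects.
TRIPLES: List[Triple] = [
    ("steve_connolly_blogger", "has_name", "Steve Connolly"),
    ("steve_connolly_blogger", "writes", "Mind to Market"),
    ("steve_connolly_performer", "has_name", "Steve Connolly"),
    ("steve_connolly_performer", "impersonates", "Elvis"),
]


def match(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
    triples: Optional[List[Triple]] = None,
) -> List[Triple]:
    """Return every triple that agrees with the fields that were given.

    A field left as None acts as a wildcard.
    """
    source = TRIPLES if triples is None else triples
    return [
        t
        for t in source
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# What is a "Steve Connolly"? Two subjects share the name.
candidates = [s for s, _, _ in match(predicate="has_name", obj="Steve Connolly")]

# What does this particular one do? An explicit answer, not a ranked guess.
writes = match(subject="steve_connolly_blogger", predicate="writes")
```

The name lookup still returns two candidates, which is the ambiguity a results page exists to resolve. Once a subject is chosen, though, the predicate query returns exactly the statements that were asserted about it.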
