May 11, 2003: Got Enterprise Metadata?

Check out EContent Magazine's article on content integration by Tony Byrne of CMSWatch fame. Tony paints a great picture of the issues surrounding integrating content, data, and applications across departmental silos or "stovepipes," as he calls them, and describes some of the software tools that hope to address these problems.

His article nicely illustrates many enterprise IA issues; in fact, it's one of the few things I've read that touches on how metadata might (or might not) be used to integrate content across the enterprise's silos. ContextMedia's InterChange is one of the more promising CI tools that Tony discusses. It "builds and maintains a major central metadata store... Capturing all that metadata--and normalizing it around the Dublin Core or some other universal schema--is the first critical task of any ContextMedia implementation."

Yow. That's nice and all, but it sounds pretty damned ambitious. In fact, all enterprise-wide metadata initiatives sound pretty damned ambitious. Strange for me: I'm a librarian by background, but I'm increasingly finding myself advising that such initiatives be delayed or avoided altogether. They're just too difficult and expensive for most enterprises to take on. Two main reasons: metadata interoperability and metadata merging.

Interoperability is a syntactic exercise, helping reduce problems of using inconsistent metadata attributes. For example, if the enterprise manages to get all of its major content silos to use a common schema like Dublin Core, now the metadata attribute for content creators will always be listed as "creator," never "author," "source," or other variations. There are direct benefits to content managers and to users who are searching enterprise content with interoperable metadata, but achieving interoperability through a standard metadata schema is very difficult.
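To make the syntactic side concrete, here is a minimal sketch in Python of what attribute-level normalization might look like. The field names, the sample records, and the `FIELD_CROSSWALK` table are all made up for illustration; a real crosswalk would have to be negotiated silo by silo, which is exactly the hard part.

```python
# Hypothetical mapping from each silo's local field names to shared element names.
# Assumption: in these silos, "author" and "source" both mean the content creator.
FIELD_CROSSWALK = {
    "author": "creator",
    "source": "creator",
    "creator": "creator",
    "headline": "title",
    "title": "title",
}


def normalize_record(record):
    """Return a copy of record with known field names rewritten to the shared schema.

    Unknown field names are lowercased and passed through unchanged.
    """
    normalized = {}
    for field, value in record.items():
        key = FIELD_CROSSWALK.get(field.lower(), field.lower())
        normalized[key] = value
    return normalized


# Two made-up records from different departments.
marketing = {"Author": "J. Smith", "Headline": "Q2 launch plan"}
engineering = {"source": "A. Jones", "title": "Build notes"}

print(normalize_record(marketing))
print(normalize_record(engineering))
```

Note that this only renames the attributes. It does nothing about the values stored in them.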

Even more difficult is merging that metadata semantically. Achieving interoperability doesn't amount to much if synonymous metadata values don't match. For example, the enterprise's departments may all agree to a standard attribute called "subject," but department X fills that subject field with the value "cell phone," department Y inserts "mobile phone," and department Z calls it "wireless device." See the problem? Achieving vocabulary control in the politicized environment of the large, distributed enterprise can be a nightmare.

There are techniques like cross-walking vocabularies, creating switching vocabularies, and meta-thesauri that can be used to semantically map these metadata values, but the only organizations I've seen take this on are dealing with a fairly homogeneous content collection, audience, and vocabulary, all within a single domain. The best known example is the National Library of Medicine's UMLS meta-thesaurus; it's the product of many years of effort and, I'm sure, cost a pretty penny to develop. Still, it addresses a content domain (medicine) covered by fairly standardized vocabularies and used by a mostly homogeneous audience. This environment is a bit more appropriate for semantic merging than what you'll find on your typical wildly heterogeneous enterprise intranet.
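As a rough illustration of the switching-vocabulary idea, here is a small Python sketch that maps each department's subject value onto one preferred term. The `SWITCHING_VOCABULARY` table, the choice of "mobile phone" as the preferred term, and the sample records are all assumptions for the sake of the example; none of this reflects how UMLS or any particular product does it.

```python
# Hypothetical switching vocabulary: each local subject value points to one preferred term.
SWITCHING_VOCABULARY = {
    "cell phone": "mobile phone",
    "mobile phone": "mobile phone",
    "wireless device": "mobile phone",
}


def to_preferred(term):
    """Map a subject value to its preferred term, or None if it is not in the vocabulary."""
    return SWITCHING_VOCABULARY.get(term.strip().lower())


def merge_subjects(records):
    """Group record ids by preferred subject term; collect ids with unmapped values separately."""
    merged = {}
    unmapped = []
    for rec in records:
        preferred = to_preferred(rec["subject"])
        if preferred is None:
            unmapped.append(rec["id"])
        else:
            merged.setdefault(preferred, []).append(rec["id"])
    return merged, unmapped


# Made-up records from departments X, Y, and Z, plus one value nobody has mapped yet.
records = [
    {"id": "X-1", "subject": "cell phone"},
    {"id": "Y-7", "subject": "Mobile Phone"},
    {"id": "Z-3", "subject": "wireless device"},
    {"id": "Z-4", "subject": "pager"},
]

merged, unmapped = merge_subjects(records)
print(merged)
print(unmapped)
```

The lookup itself is trivial. The expensive part is deciding what goes in the table: "wireless device" is arguably broader than "mobile phone," and someone has to get departments X, Y, and Z to agree on that call for every term.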

My advice is to focus on more realistic areas, like enterprise search, that can provide immediate benefits to helping users find information. But I'm still a librarian at heart, so I'm anxious to hear of examples of successful enterprise-wide metadata initiatives. So let's hear 'em!

Comment: Madonnalisa (May 12, 2003)

Just my $0.02...I agree for the most part with what you've said, Lou, but somehow it's very difficult to make generalizations about content/business/technologies.

- Centralization: I agree, you can't centralize everything. The purpose of technology is to support and enhance business process...not scrap tools that actually worked for various departments...the Data Management world evangelized this in the '80s and '90s, but now many are starting to say that decentralization could be just as effective..."it depends" mostly on management support and the project implementers' execution of that vision. I guess folks need to step back and ask why they want to centralize in the first place; usually there are many motivations for centralization, like control, accountability, and legislative regulations.

- Value of content: Sometimes we get so fixated on content that we forget to ask whether the content is high-value. An Enterprise IA really needs to review the value of the content that these systems are supposed to support; data managers also need to be involved, since time-critical or high-value content may exist in structured systems. As much as I hate committees/stewards, I think they could be very useful in understanding how information impacts how employees get their work done. Not all information is useful for supporting business processes. For example, as a video content manager, I don’t care that chicken fingers were served at the company cafeteria on Monday.

- Enterprise Metadata: I still think it’s possible with the appropriate people involved and a staged implementation approach. Yes, EM is intimidating, but just think of all those data managers out there who have worked with ERPs & CRMs; in successful environments they were able to deploy that same concept of enterprise information management. If you show the value of proper metadata use for various content types, this should not be too difficult. For example, start with web page metadata, then move to metadata for images or video, then eventually to full content management systems.

- Semantics: This will always be a problem with digital management and retrieval of information. Natural language processing is quite a challenging task. I see hope in the formal use of RDF and ontologies in enterprise information architectures.

- Training & education: This is critical if you centralize. It takes time to prove the value of metadata, but with training/education/documentation, the folks who would be creating metadata will be able to see the value right away.

- ROI: The only other benefit of centralization I can think of is the cost-effectiveness of managing the data centrally with one system (rather than trying to create integration points with every system)...but then again, those cost savings don't mean anything if centralization hinders end-user interaction with the system. If you have time-critical data that needs to be accessible to the system users in real time, an organization really needs to figure out the value of centralized vs. decentralized models.

- Search: I believe that search on its own can't solve your problems. What is important is that there are complementary tools, in our case “some” metadata standards and a controlled vocabulary that is managed by a "neutral" party within an organization (and of course training & documentation). Another set of tools that could eventually become effective in supporting content integration is the technologies revolving around auto-extraction of metadata...that area is not quite sophisticated right now...but who knows how this could impact metadata creation/use. For the immediate future, I see federated searching (my definition: passing the query to various applications that contain content/information and returning a blended result from all the sources) being one of those tools that can effectively mimic integrated content.
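The federated searching definition above can be sketched in a few lines of Python. Everything here is a stand-in: `make_source` fakes a searchable application with an in-memory list of titles, the scoring is a toy word count, and blending by raw score assumes the scores are comparable across sources, which is rarely true of real search engines.

```python
def make_source(name, documents):
    """Build a stand-in search function over an in-memory list of titles (hypothetical source)."""
    def search(query):
        q = query.lower()
        hits = []
        for title in documents:
            words = title.lower().split()
            # Toy relevance score: fraction of words in the title that equal the query.
            score = words.count(q) / len(words)
            if score > 0:
                hits.append({"source": name, "title": title, "score": score})
        return hits
    return search


def federated_search(query, sources):
    """Pass the query to every source and return one blended list, highest score first."""
    results = []
    for search in sources:
        results.extend(search(query))
    return sorted(results, key=lambda hit: hit["score"], reverse=True)


# Two made-up repositories standing in for separate enterprise applications.
intranet = make_source("intranet", ["Metadata style guide", "Cafeteria menu"])
video_library = make_source("video library", ["Metadata for video assets", "Launch event footage"])

for hit in federated_search("metadata", [intranet, video_library]):
    print(hit["source"], "-", hit["title"])
```

Each source keeps its own content and its own index; only the query and the results cross the boundary, which is why this approach can mimic integrated content without a central metadata store.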

Comment: Lou (May 15, 2003)

Lisa, thanks for your thoughtful posting!

I can't say that I disagree with anything you say here, but I'm glad that your optimism about enterprise metadata is balanced by a healthy respect for starting with low-hanging fruit and phasing in from there. This really can and should be true for any aspect of EIA. Still, I see search as the natural starting point for *most* enterprises; an easy place to get some early wins while ramping up for tougher challenges, like enterprise metadata...

Comment: ML (May 16, 2003)

I guess it will more than likely start with search :) Check out this recent CNET article on IBM database integrator:


Comment: Sean (Oct 24, 2003)

Entopia has a very unique enterprise search methodology, which I think you will find interesting. We provide an infrastructure solution, K-Bus, with an embedded search engine. Our infrastructure solution connects to enterprise repositories and applications and formulates a metadata layer to search across and build applications on using an open standards API. We focus on the social context of content and identify experts, and return not only a social network analysis but also a visual representation of the search results by keyword and topic (semantic).

If you would like further information, please let me know.

I enjoyed this thread.

Comments are now closed for this entry.

Comment spam has forced me to close comment functionality for older entries. However, if you have something vital to add concerning this entry (or its associated comments), please email your sage insights to me (lou [at] louisrosenfeld dot com). I'll make sure your comments are added to the conversation. Sorry for the inconvenience.