
Oct 19, 2003: The Enterprise Metadata Nut: Cracked?

In my take on enterprise information architecture, enterprise-wide metadata development is the most ambitious, "way off" component of the architecture. It's Really Hard to get different business units to agree to a single metadata schema. It's Really, Really Hard to get them to then populate those metadata attributes with semantically consistent values. I've already yammered on this topic in a past Bloug entry, so I won't get into it further here. However, I can at least offer you a simple diagram (40kb PDF file) to explain the enterprise metadata situation as I see it.

But some really smart people I know, including Joseph Busch, Bob Boiko, and Michael Crandall, seem to be fans of SchemaLogic's SchemaServer product. According to SchemaLogic's site, SchemaServer offers:

  • "Support for shared schema and local variations via a generalized model
  • "Vocabulary management enables conceptual interoperability and cuts system management workload
  • "Change management enables data stewards/stakeholders to track dependencies and ensure availability
  • "Distributed collaboration accelerates problem resolution and improves responsiveness
  • "Synchronization of new or changed schema across target systems cuts cycle-time and maintenance expenses"

Sure sounds nice, but is it the solution to the enterprise metadata headache? Who knows, but diagnosis is half the battle, and at least their copywriter seems to understand our pain enough to articulate it well.

Bonus points for SchemaLogic: they seem to at least be aware that the field of IA exists, addressing "enterprise information architects," among others. That's more than can be said for most vendors.

Oh yeah, other vendors: Context Media seems to be competing in the same space. Their Interchange product "facilitates discovery and access to unstructured content through the portal by normalizing diverse taxonomies." Again, sounds great, but...

Is anyone really making their metadata attributes interoperable and merging the semantic aspects of their metadata values in an enterprise setting? Anyone successfully using these products or something else? Applying them to semi-structured text (not data, which is a simpler challenge)?

If this sounds like you, you can pass go, collect $200, and enter the Information Architects Hall of Fame, temporarily housed in a lovely corner of my recently rebuilt basement in Ann Arbor, Michigan. Just be sure to tell us how you did it, ok?

If you did pull off this amazing feat, how long did it take? What kind of metadata attributes did you develop and apply? The relatively easy ones, like audience labels, or the more-painful-than-a-visit-to-the-dentist ones, like subjects? The content: how much and what kind? How many business units supplied said content? And when did you get released from the nice white padded room?


Comment: ML (Oct 20, 2003)

Great way to describe the problem with enterprise metadata. The challenge in an enterprise is who would really own the "semantic merging," because this is where I see the politics/territory issues rearing their ugly heads.

One thing I've noticed in setting up metadata is that you need some kind of data dictionary explaining how the values would be populated (or whether there is a vocabulary for them).

The other thing missing in general about IA...if this is the progression from wireframes and sitemaps, how do we describe the gap between that type of IA and the enterprise metadata/vocab IA? Not that I'm saying there's an us vs. them...it's how can we help our colleagues understand how they inter-relate?

Comment: vanderwal (Oct 22, 2003)

Lou, an excellent overview of the problems. The crosswalk and normalizing of metadata terms can also have downfalls, but there can be workarounds (although not elegant). On a long-term project a couple of years ago, we were trying to build a taxonomy with a thesaurus and crosswalks (in data architecture terms, normalizing).

One problem we ran into was that the content owners and the longtime users of the metadata (variable names) did not understand the unified system.

Another issue was the cardinality of the terms. For example, one group had 12 possible categorical types of weather condition, while others had 5. When we tried marrying these 8 content providers, we ended up with 2 categories: sunny and not sunny.

To work around both of these issues, we built a metadata viewer that would show each native category alongside its normalized category. When a user wanted to work with a content area they were familiar with and tie it to other areas, they were given the option of using the standardized categories for all, or the option of using one of their germane categories for the primary content area to extract joined information from other content areas.

For example, a user familiar with the taxonomy from content area X could choose to use its germane categories (as they had 15 years working with that metadata structure) and tie it to the Y and Z content areas to pull data and reports with fatality and weather information. They could narrow the search based on the 12 weather conditions, but the retrieval would extract information from Y and Z using only sunny and not sunny. This was not perfect, but it was workable. It allowed users to find information that was helpful, using a base of information they were familiar with as a framework.

In pulling the content from new repositories in, this method helped us build the crosswalks as well as normalize the new information more easily.
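The workaround described in this comment can be sketched roughly as follows. This is a minimal illustration, not the system that was actually built: the category names, the records, and the `search` function are all hypothetical, and it assumes content areas Y and Z are queried only through the two normalized categories.

```python
# Hypothetical crosswalk: native weather categories used by content area X,
# each mapped to one of the two normalized categories shared by all areas.
CROSSWALK_X = {
    "clear": "sunny",
    "rain": "not sunny",
    "snow": "not sunny",
    "fog": "not sunny",
}

# Hypothetical records. Area X is searched by its native categories;
# areas Y and Z are searched by the normalized categories only.
RECORDS = {
    "X": [
        {"id": "x1", "weather": "rain"},
        {"id": "x2", "weather": "clear"},
        {"id": "x3", "weather": "snow"},
    ],
    "Y": [
        {"id": "y1", "weather": "not sunny"},
        {"id": "y2", "weather": "sunny"},
    ],
    "Z": [
        {"id": "z1", "weather": "not sunny"},
    ],
}


def search(native_category, primary="X", others=("Y", "Z")):
    """Filter the primary area by a native category and the other
    areas by that category's normalized equivalent."""
    normalized = CROSSWALK_X[native_category]
    hits = [r["id"] for r in RECORDS[primary] if r["weather"] == native_category]
    for area in others:
        hits += [r["id"] for r in RECORDS[area] if r["weather"] == normalized]
    return hits


print(search("rain"))  # ['x1', 'y1', 'z1']
```

Note the loss of precision the comment describes: a search for "rain" excludes the "snow" record in X, but it pulls in everything marked "not sunny" from Y and Z.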

Comment: vanderwal (Oct 22, 2003)

ML, I have come to use visual IA and LIS IA as a demarcation between the different camps. The visual IAs are often information designers too. Knowing which IA frame of reference is in play is important when discussing IA. At times the discussions encompass both camps, which gets confusing.

A couple of years ago I was arguing that Web development as a whole needed a language for criticism, much like literary criticism and media criticism, to provide a common ground for constructive feedback. I am ever more convinced that IA needs this framework formalized.

Comment: David Locke (Nov 19, 2003)

In terms of coming to a common definition for semantics, the people doing conformed dimensions are essentially doing the same work. They start with a traditional OLAP data warehouse and work towards a real-time data warehouse.

There is a constant loss of data. They cannot accept aggregates right off the top. Then, where they cannot reach agreement on a conformed dimension, that data is lost as well. Then, you have departmental data that will never have any meaning outside that department. In the end, lots of data is lost. You can do a real-time query as long as you realize that the answer is not going to be particularly meaningful.

The problem areas that the conformed dimensions people find are going to be the same ones that cause problems for taxonomies. The two communities are probably talking about the same terms.

Comment: ML (Jan 5, 2004)

So this takes me to the IA Summit 2004...I tackled some thoughts with my colleague Sara Rice...and we'll be presenting on whether or not we need to be at the table to talk "metadata" and how it could involve visual IAs as well. Short plug for "Enterprise metadata & information architecture." I hope you all can be there...maybe we can really bang away at this discussion.
