

INFORMATION MANAGEMENT
Classification (or subject indexing, as some call it) of content and assets is something that we all do to one degree or another. Classification can mean writing a label on the back of a picture of a favourite pet, or it can mean applying a web bookmark to a recommended site using Delicious. In the work environment, the ever-increasing burden of managing digital information can be a nightmare. Information, if it is to be of any value whatsoever, has to be stored and managed in a way that makes it easily discoverable when needed.
Bear in mind, too, that the information seeker might not actually know what they are looking for. Edenius and Burgerson (2003) point out that information is, by its nature, new and not known before, so how can one know in advance what relevant information is available and where to look for it?
Classification can also mean the design and application of controlled vocabularies. These are lists of relevant terms – relevant to the classifier or information creator and also relevant to the anticipated information seeker. They can be flat A–Z lists, or they can be complex, multi-relational hierarchies. Different vocabularies can be “mapped together”, forming “bonds” between suitable terms, so that a document classified with a term from one vocabulary automatically becomes classified with any mapped terms from other vocabularies. When people search for content, the information returned is whatever has been classified with terms that match the search words used. Easy.
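To make the mechanics concrete, here is a minimal Python sketch of two flat vocabularies, one mapping between them, and a search that returns whatever is classified with a matching term. Everything in it is an assumption made for illustration: the vocabulary names, the terms, the document names, and the helper functions `build_bonds`, `classify` and `search` are hypothetical and do not describe any particular product.

```python
from collections import defaultdict

# Hypothetical vocabularies: each is a flat set of permitted terms.
VOCABULARIES = {
    "hr": {"annual leave", "recruitment"},
    "finance": {"holiday pay", "payroll"},
}

# Hypothetical mappings ("bonds") between terms in different vocabularies.
MAPPINGS = [
    (("hr", "annual leave"), ("finance", "holiday pay")),
]


def build_bonds(mappings):
    """Return a symmetric lookup from each term to the terms it is mapped to."""
    bonds = defaultdict(set)
    for left, right in mappings:
        bonds[left].add(right)
        bonds[right].add(left)
    return bonds


def classify(index, bonds, doc_id, vocabulary, term):
    """Classify a document with a term and with any directly mapped terms."""
    if term not in VOCABULARIES[vocabulary]:
        raise ValueError(f"{term!r} is not in the {vocabulary!r} vocabulary")
    key = (vocabulary, term)
    for _, mapped_term in {key} | bonds.get(key, set()):
        index[mapped_term].add(doc_id)


def search(index, query):
    """Return documents classified with a term that matches the search words."""
    return sorted(index.get(query.strip().lower(), set()))


bonds = build_bonds(MAPPINGS)
index = defaultdict(set)
classify(index, bonds, "leave-policy.doc", "hr", "annual leave")
classify(index, bonds, "salary-run.xls", "finance", "payroll")

print(search(index, "Holiday pay"))  # ['leave-policy.doc']
```

The document was only ever classified with the HR term, yet a search using the finance term still finds it, because the mapping carried the classification across. This sketch matches whole terms exactly and follows direct mappings only; a hierarchical vocabulary would also need broader and narrower relationships.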
Now, Clay Shirky (2005) might argue that controlled vocabularies force classifiers to guess how their information users think.
Others suggest that the construction of controlled vocabularies has too often ignored their human context, and the context in which they will need to function (Mai, 2008). Voß (2007) hits the nail on the head: “The main purpose of subject indexing is to construct a representation of a resource that is being tagged”. That is, the information to be classified needs to be described, and described in a way that will be compatible with the search questions asked by the information seeker.
Our approach to the development of controlled vocabularies is essentially constructivist, based on sound psychological principles. People construct representations of their work, environment, colleagues, documents – of their knowledge – every day in their communications with others. So, why scan a document title for “key words” to use as its descriptor when those may not be the words and phrases that someone might use when making a mundane reference to the document? Human discourse in interaction is the key, and that is the fundamental basis of our approach to controlled vocabularies.