What is linked data? (Note that I’m ignoring the specifics of RDF, on which linked data depends.)

The “data” of linked data is metadata on the web that describes documents and resources; the linked part refers to the links that exist between metadata items. If this seems a little abstract, consider the following:

I own ten books related to my four interests:

  • Anglo-Saxon language (properly, Old English)
  • The history of Winchester, England
  • Computer programming
  • Cookery

The titles I own are:

Arnow, G. W. D. M. (1998). Introduction to Programming Using Java: An Object-Oriented Approach. Addison-Wesley.

Arnow, D., Dexter, S., & Weiss, G. (2003). Introduction to Programming Using Java: An Object-Oriented Approach (2nd ed.). Addison Wesley.

Fearnley-Whittingstall, H., & Carr, F. (2008). The River Cottage Family Cookbook. Ten Speed Press.

Gamma, E., Helm, R., Johnson, R., & Vlissides, J. M. (1994). Design Patterns: Elements of Reusable Object-Oriented Software (illustrated edition.). Addison-Wesley Professional.

Hagen, A. (2006). Anglo-Saxon Food & Drink. Anglo-Saxon Books.

Hawkes, B. A. L. M. A. S. C. (1970). Two Anglo-Saxon Cemeteries at Winnall, Winchester, Hampshire. Maney Publishing.

Hervey, T. (2007). The Bishops Of Winchester In The Anglo-Saxon And Anglo-Norman Periods. Kessinger Publishing, LLC.

Mitchell, B., & Robinson, F. C. (2007). A Guide to Old English (7th ed.). Wiley-Blackwell.

Sweet, H. (1982). Sweet’s Anglo-Saxon Primer (9th ed.). Oxford University Press, USA.

Sweet, H. (2008). An Anglo-Saxon Primer (3rd ed.). Tiger Xenophon.

I want some way of keeping track of my book collection, so I create a catalogue of RDF files where I tag the various books with their topics:

  • Books by Arnow, Arnow et al. and Gamma et al. are tagged as Computer Science
  • Books by Fearnley-Whittingstall and Hagen are tagged as Cookery
  • Books by Hawkes and Hervey are tagged as History — Winchester
  • Books by Mitchell & Robinson and Sweet are tagged as Language — Old English

Immediately, I see that I have several editions of the same book, so I add a simple SameAs relation between these books: in the RDF metadata of each edition, I add the URL of the other edition. Arnow (1998) and Arnow et al. (2003) link to one another in this way, as do Sweet (1982) and Sweet (2008). I can now easily see which books are related by following a link (technically, “dereferencing” a URL).
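To make the SameAs step concrete, here is a minimal sketch in Python rather than RDF. The example.org URLs, the dictionary layout and the function names are all my own assumptions for illustration; a real catalogue would store these links in RDF files.

```python
# Hypothetical base URL for the metadata records; not a real catalogue.
BASE = "http://example.org/books/"

# Each record is keyed by its (assumed) URL and holds a set of SameAs links.
catalogue = {
    BASE + "arnow-1998": {"sameAs": set()},
    BASE + "arnow-2003": {"sameAs": set()},
    BASE + "sweet-1982": {"sameAs": set()},
    BASE + "sweet-2008": {"sameAs": set()},
}


def link_same_as(catalogue, url_a, url_b):
    """Record a SameAs link in both records, one explicit link per direction."""
    catalogue[url_a]["sameAs"].add(url_b)
    catalogue[url_b]["sameAs"].add(url_a)


def other_editions(catalogue, url):
    """Return the URLs that a record lists as the same work."""
    return sorted(catalogue[url]["sameAs"])


link_same_as(catalogue, BASE + "arnow-1998", BASE + "arnow-2003")
link_same_as(catalogue, BASE + "sweet-1982", BASE + "sweet-2008")

print(other_editions(catalogue, BASE + "sweet-1982"))
```

Starting from either edition, looking up its SameAs set gives the URL of the other edition, which is all that “following a link” amounts to here.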

The book on design patterns is so fundamentally important within computer science that I add a SeeAlso link to this book from the other computer science titles; in the same way, I can choose to add a SeeAlso relation between the other books tagged with the same tags, allowing me to easily access each title from a related title.

Because my interest in Winchester primarily relates to the Anglo-Saxon period, and especially to the linguistic and onomastic aspects of its history, I find it useful to link (SeeAlso) the titles on Winchester to the books on Old English. At the same time, I also add a SeeAlso link to the book on Anglo-Saxon cookery from each of the titles on Winchester and Old English.
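The SeeAlso links described so far can be sketched in the same spirit. This is only an illustration in Python: the short identifiers, the shortened tag labels and the dictionary layout are my own assumptions, not part of any RDF vocabulary.

```python
# Hypothetical short identifiers for the ten books, with shortened tag labels.
tags = {
    "arnow-1998": "Computer Science",
    "arnow-2003": "Computer Science",
    "gamma-1994": "Computer Science",
    "fearnley-whittingstall-2008": "Cookery",
    "hagen-2006": "Cookery",
    "hawkes-1970": "Winchester",
    "hervey-2007": "Winchester",
    "mitchell-robinson-2007": "Old English",
    "sweet-1982": "Old English",
    "sweet-2008": "Old English",
}

# One set of outgoing SeeAlso links per book.
see_also = {book: set() for book in tags}

# 1. Link every book to the other books that share its tag.
for a in tags:
    for b in tags:
        if a != b and tags[a] == tags[b]:
            see_also[a].add(b)

# 2. Link the Winchester titles and the Old English titles to each other.
winchester = [b for b, t in tags.items() if t == "Winchester"]
old_english = [b for b, t in tags.items() if t == "Old English"]
for w in winchester:
    for o in old_english:
        see_also[w].add(o)
        see_also[o].add(w)

# 3. Point each Winchester and Old English title at the Anglo-Saxon cookery book.
#    This direction only; the text does not mention links back from that book.
for book in winchester + old_english:
    see_also[book].add("hagen-2006")

print(sorted(see_also["hervey-2007"]))
```

Each step adds only outgoing links, so a link in the opposite direction exists only where a step adds it explicitly.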

Based on this, I can at any time explore my book collection in a novel way; from any given starting point, I have numerous avenues to explore. I have a simple way to see that there are several editions of a title, and that the titles in my collection relate to a number of topics, which typically interlink. It is difficult to find a link between computer science and either cookery or Anglo-Saxon history and language, but I am sure that more formal analyses within computational linguistics would fit into the model I have described in an understandable fashion.

It is worth noting that it is debatable whether my use of SeeAlso and SameAs is, strictly speaking, correct, but it illustrates the point about enriching a collection of metadata with links. More information about metadata schemas for linked data can be found in the links section below.

It is also worth noting that this interlinking is two-way, and that this leads to redundancy: to get from A to B you need one explicit link, and to get from B to A you need another. This isn’t really a problem, because the data-storage overhead is minimal and the dereferencing of URLs can be done in such a way that the redundancy does not create unnecessary work (by, for example, not dereferencing URLs that have already been visited).
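The idea of not dereferencing URLs that have already been visited can be sketched as a breadth-first walk that keeps a set of visited records. This is a minimal illustration in Python, assuming the links have already been loaded into a dictionary; the identifiers and the explore function are hypothetical, and a real implementation would fetch each URL rather than look it up in memory.

```python
from collections import deque

# Hypothetical two-way link data: identifier -> identifiers it links to.
links = {
    "arnow-1998": ["arnow-2003", "gamma-1994"],
    "arnow-2003": ["arnow-1998", "gamma-1994"],
    "gamma-1994": ["arnow-1998", "arnow-2003"],
}


def explore(links, start):
    """Walk the links breadth-first, visiting each record exactly once."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        # In a real system, this is where the URL would be dereferenced.
        order.append(current)
        for target in links.get(current, []):
            # Skip anything already seen, so redundant back-links cost nothing.
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return order


print(explore(links, "arnow-1998"))
```

Although the three records hold six links between them, each record is processed only once, so the back-links add storage but no extra dereferencing.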


RDF homepage (for RDF basics, schema and ontologies)

