Multiple itemtypes in Microdata

There has been a lot of discussion recently around HTML5’s microdata proposal, and how it relates to W3C’s earlier RDFa standard, which is currently being updated for HTML5. Microdata solves many of RDFa’s use cases in a much simpler way, but there are other use cases it cannot solve. This is because microdata assumes a world with very few vocabularies, or even just a single one; mixing vocabularies on a single item is rather difficult. Jeni Tennison has an excellent statement of the problem, along with a proposed solution.

In this post I put forward another proposal for addressing at least part of the problem.

The problem: microdata is limited to a single itemtype per element.

Why is this a problem? Because it makes mixing vocabularies really hard. If I decide to mark up an address with schema.org’s PostalAddress, then I can’t easily add markup for microdata’s built-in vCard vocabulary. I’ll have to repeat content in order to use both vocabularies. This design benefits the Google-backed schema.org; more focused special-purpose vocabularies, or open alternatives to schema.org with a transparent development process, will have a hard time.

An example. So let’s assume I have this address and want to mark it up with microdata:

<div>
    <span>26 Dun Aengus</span>,
    <span>Galway</span>,
    <span>Ireland</span>.
</div>

Then here’s how I would do it with schema.org terms:

<div itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">26 Dun Aengus</span>,
    <span itemprop="addressLocality">Galway</span>,
    <span itemprop="addressCountry">Ireland</span>.
</div>

And here with vCard terms:

<div itemscope itemtype="http://microformats.org/profile/hcard">
    <span itemprop="street-address">26 Dun Aengus</span>,
    <span itemprop="locality">Galway</span>,
    <span itemprop="country-name">Ireland</span>.
</div>

It is clear why combining both versions into a single one is difficult. Microdata uses short property names like itemprop="street-address". If an element had multiple itemtypes, then it would be impossible to tell which itemtype the street-address property belongs to. Assuming that it belongs to both types would be dangerous; there could be cases where a property exists in both vocabularies but with different meaning. The restriction to a single type prevents such ambiguity.
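The ambiguity can be made concrete with a minimal Python sketch. The two vocabularies below, their URLs, and their property definitions are entirely hypothetical, invented only to show a short name that exists in both with different meanings; `owning_types` is likewise just an illustrative helper.

```python
# Hypothetical vocabularies (not real ones): both define the short
# name "title", but with different meanings.
VOCABULARIES = {
    "http://example.org/vocab-a#Thing": {
        "label": "human-readable label",
        "title": "job title of a person",
    },
    "http://example.org/vocab-b#Thing": {
        "name": "name of the thing",
        "title": "title of a published work",
    },
}


def owning_types(prop_name, itemtypes):
    """Return the itemtypes whose vocabulary defines prop_name."""
    return [t for t in itemtypes if prop_name in VOCABULARIES.get(t, {})]


both = list(VOCABULARIES)
# With two itemtypes on one item, "title" has two candidate owners,
# so a consumer cannot tell which meaning was intended.
print(owning_types("title", both))
# With a single itemtype there is exactly one candidate.
print(owning_types("title", both[:1]))
```

With a single itemtype the lookup can never return more than one owner, which is the guarantee the current restriction provides.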

Multiple itemtypes without ambiguity: Here’s the proposal. I’ll start by creating an item that has all the properties from both versions—I’m omitting the itemtypes for now to avoid ambiguity:

<div itemscope>
    <span itemprop="streetAddress street-address">26 Dun Aengus</span>,
    <span itemprop="addressLocality locality">Galway</span>,
    <span itemprop="addressCountry country-name">Ireland</span>.
</div>

Without itemtype, this generates an untyped item with six properties:

  • itemtype: none
  • property: streetAddress = 26 Dun Aengus
  • property: street-address = 26 Dun Aengus
  • property: addressLocality = Galway
  • property: locality = Galway
  • property: addressCountry = Ireland
  • property: country-name = Ireland
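As a rough illustration of how the six properties come about, here is a simplified Python sketch built on the standard library’s `html.parser`. It is not a conforming microdata parser: it assumes a single flat item whose properties are plain text, and the class and function names are made up for this example.

```python
from html.parser import HTMLParser

ADDRESS_HTML = """
<div itemscope>
    <span itemprop="streetAddress street-address">26 Dun Aengus</span>,
    <span itemprop="addressLocality locality">Galway</span>,
    <span itemprop="addressCountry country-name">Ireland</span>.
</div>
"""


class FlatItemParser(HTMLParser):
    """Collect properties of one flat item (text values only, no nesting)."""

    def __init__(self):
        super().__init__()
        self.properties = []  # (name, value) pairs in document order
        self._names = None    # names on the itemprop element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        itemprop = dict(attrs).get("itemprop")
        if itemprop:
            # itemprop holds a whitespace-separated list of names
            self._names = itemprop.split()
            self._text = []

    def handle_data(self, data):
        if self._names is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._names is not None:
            value = "".join(self._text).strip()
            # every listed name becomes its own property with the same value
            self.properties.extend((name, value) for name in self._names)
            self._names = None


def extract_properties(html):
    parser = FlatItemParser()
    parser.feed(html)
    parser.close()
    return parser.properties


for name, value in extract_properties(ADDRESS_HTML):
    print(f"{name} = {value}")
```

Each `itemprop` with two names yields two properties that share one value, which is why three elements produce six properties.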

The altitem property. Microdata would get a new built-in property, called altitem. Let’s add an additional element with this property into the untyped item:

<meta itemprop="altitem"
      content="http://schema.org/PostalAddress streetAddress
               addressLocality addressCountry">

What’s going on here? The idea is that altitem takes a whitespace-separated list. When added to an item, it creates a new “alternate item” whose itemtype is the first element of the list. Then it looks at the rest of the list, which should be property short-names. It copies any of these named properties from the original item to the new item. So, we’d end up with a second item besides the type-less original item. This second item has:

  • itemtype: http://schema.org/PostalAddress
  • property: streetAddress = 26 Dun Aengus
  • property: addressLocality = Galway
  • property: addressCountry = Ireland

Which is exactly the same as the original schema.org item from above. Creating the vCard item is just another property:

<meta itemprop="altitem"
      content="http://microformats.org/profile/hcard street-address
                locality country-name">

This gives us:

  • itemtype: http://microformats.org/profile/hcard
  • property: street-address = 26 Dun Aengus
  • property: locality = Galway
  • property: country-name = Ireland

So now we’d have three items in total: the original untyped item, and the two typed alternate items.
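The expansion described above can be sketched in a few lines of Python. This is only one possible reading of the proposal: the dict-based item representation and the name `expand_altitems` are assumptions for illustration, the original item is left untouched (including its altitem properties), and copied properties keep the original item’s order, which the proposal leaves open.

```python
def expand_altitems(item):
    """Return the original item followed by one alternate item per altitem.

    `item` is a dict with "type" (str or None) and "properties",
    a list of (name, value) pairs in document order.
    """
    items = [item]
    for name, value in item["properties"]:
        if name != "altitem":
            continue
        tokens = value.split()
        if not tokens:
            continue  # empty altitem value: nothing to create
        # first token is the itemtype, the rest are property short-names
        alt_type, wanted = tokens[0], set(tokens[1:])
        copied = [(n, v) for n, v in item["properties"] if n in wanted]
        items.append({"type": alt_type, "properties": copied})
    return items


address = {
    "type": None,
    "properties": [
        ("streetAddress", "26 Dun Aengus"),
        ("street-address", "26 Dun Aengus"),
        ("addressLocality", "Galway"),
        ("locality", "Galway"),
        ("addressCountry", "Ireland"),
        ("country-name", "Ireland"),
        ("altitem", "http://schema.org/PostalAddress streetAddress\n"
                    "    addressLocality addressCountry"),
        ("altitem", "http://microformats.org/profile/hcard street-address\n"
                    "    locality country-name"),
    ],
}

for it in expand_altitems(address):
    print(it["type"], [n for n, _ in it["properties"]])
```

Splitting on any whitespace means the line break inside the `content` attribute is harmless.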

What’s nice about this:

  • It doesn’t require any new syntax, just a new property.
  • Multiple types generate multiple items, which are visible in the microdata API just like normal items.
  • It plays well with itemref, so the altitem declaration doesn’t have to be repeated if I have several postal addresses on the page.
  • It plays well with a copy-and-paste style of web development. “If you want to use myVocab together with another vocab, just paste this snippet into your item and add the appropriate itemprops…”

Issues. Quite a few details would still have to be worked out:

  • What happens to properties with full URL names? I guess they should always be copied to all items.
  • What happens to itemid? I guess all items should receive the same itemid from the original item.
  • In microdata, itemtype is inherited by nested sub-items. I’m not sure how this should work if altitem is present.
  • Properties within a microdata item are ordered; there’s a question whether the order in altitem or in the original item should take precedence when alternate items are generated.
  • Would it be worth having a dedicated microdata attribute for this?
  • Would microdata clients actually implement this? There is a risk that too many implementers would take shortcuts and just implement the basic case and ignore altitem.
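To see what some of these choices would mean in practice, here is a hedged Python sketch that applies the tentative answers from the list above: properties with full URL names are copied to every alternate item, the alternate item shares the original’s itemid, and the ordering question is exposed as a switch. Everything here is an assumption for illustration: the function name, the dict layout, the rule that a name containing “:” counts as a URL, the placement of URL-named properties last in altitem order, and the example itemid and property URL.

```python
def build_alternate(item, altitem_value, order="item"):
    """Build one alternate item from an altitem value.

    order="item":    keep the original item's property order
    order="altitem": follow the order of names in the altitem value
    """
    tokens = altitem_value.split()
    if not tokens:
        raise ValueError("empty altitem value")
    alt_type = tokens[0]
    short_names = list(dict.fromkeys(tokens[1:]))  # drop duplicate names
    props = item["properties"]

    # Assumption: a property name containing ":" is a full URL name.
    url_props = [(n, v) for n, v in props if ":" in n]

    if order == "item":
        keep = set(short_names)
        copied = [(n, v) for n, v in props if n in keep or ":" in n]
    elif order == "altitem":
        copied = [(n, v) for wanted in short_names
                  for n, v in props if n == wanted]
        copied += url_props  # arbitrary choice: URL-named properties go last
    else:
        raise ValueError(f"unknown order: {order!r}")

    # The alternate item receives the same itemid as the original.
    return {"type": alt_type, "id": item.get("id"), "properties": copied}


item = {
    "type": None,
    "id": "urn:example:address-1",  # hypothetical itemid
    "properties": [
        ("streetAddress", "26 Dun Aengus"),
        ("http://example.org/vocab#note", "home"),  # hypothetical URL property
        ("addressLocality", "Galway"),
    ],
}
value = "http://schema.org/PostalAddress addressLocality streetAddress"

print(build_alternate(item, value, order="item")["properties"])
print(build_alternate(item, value, order="altitem")["properties"])
```

The two calls return the same properties in different orders, which is exactly the difference an implementation would have to pin down.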

Summary: This post shows how multiple itemtypes could be supported in microdata without introducing new syntax, without making the common case of a single vocabulary more complex for authors, and without fundamentally changing the data model.


4 Responses to Multiple itemtypes in Microdata

  1. Xi Bai says:

    Thanks for this interesting proposal. Properties derived from different vocabs are grouped via altitems, neat! The thing is I think it does not solve the exceptional issue you mentioned (if full URIs are not used in itemprops):

    “there could be cases where a property exists in both vocabularies but with different meaning.”

    Does it?

    • @Xi: In that case, the author can only use the clashing property from one vocabulary. It is not ambiguous, but the author has to choose.

      • Xi Bai says:

        Hi, Richard,

        Point taken. In that unusual but possible scenario, it would probably be better if publishers could declare and use a local alias for clashing properties in @content (e.g., content="http://microformats.org/profile/hcard street-address: sa locality country-name"), with the alias mapped to the full URI when an RDF/microdata parser is applied. That may be too complicated, however.

        • There’s a trade-off between power and complexity. Semantic clashes between property names do occur, but they are rare, and I’m not sure it’s worth worrying much about.