Semantic Web compatibility

It should be possible to express a hyperglossary in a way that is compatible with established Web standards, specifically SKOS, RDFa-Lite and WebAnnotation.

The first part is easy: embedding the glossary itself in an HTML document.

Suppose a web page http://example.com/glossary/first-term represents a single term. It’s easy to make it into a SKOS concept, thus:

<html lang="en" typeof="skos:Concept">
<head>
</head>
<body>
<div>
    <h1 property="skos:prefLabel">My first glossary term</h1>
    <div property="skos:definition">
        <p>The definition of my first term.</p>
    </div>
    <p>This term specializes <a property="skos:broader" resource="second-term" href="second-term">another term</a></p>
</div>
</body>
</html>

The key attributes would be inserted in the WordPress template.
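
To make the split between constant attributes and per-term content concrete, here is a minimal sketch. It is written in Python purely for illustration (a real WordPress template would be PHP), and the function name render_term and its parameters are hypothetical, not part of any existing plugin.

```python
from html import escape


def render_term(label, definition, broader_slug=None, broader_label=None):
    """Render the body markup for one glossary term.

    The RDFa attributes (property="skos:...") are constant; only the
    label, definition and optional broader term vary per page.
    """
    parts = [
        "<div>",
        f'    <h1 property="skos:prefLabel">{escape(label)}</h1>',
        '    <div property="skos:definition">',
        f"        <p>{escape(definition)}</p>",
        "    </div>",
    ]
    if broader_slug:
        slug = escape(broader_slug, quote=True)
        text = escape(broader_label or broader_slug)
        parts.append(
            f'    <p>This term specializes <a property="skos:broader" '
            f'resource="{slug}" href="{slug}">{text}</a></p>'
        )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_term("My first glossary term",
                      "The definition of my first term.",
                      "second-term", "another term"))
```

The typeof="skos:Concept" on the html element is left out here because it belongs to the page shell rather than to the per-term content.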

It is also possible (but not necessary) to have multiple terms in one page:

http://example.com/glossary could contain

<html lang="en" vocab="http://www.w3.org/2004/02/skos/core#" typeof="ConceptScheme">
<head>
  <base href="http://example.com/glossary/">
</head>
<body>
<div>
    <h1>A glossary with a few entries</h1>
    <div resource="first-term" typeof="Concept">
        <h2 property="prefLabel">My first glossary term</h2>
        <p property="definition">The definition of my first term.</p>
        <span property="inScheme" resource="."></span>
    </div>
    <div resource="second-term" typeof="Concept">
        <h2 property="prefLabel">My second glossary term</h2>
        <p property="definition">The definition of my second term.</p>
        <span property="broader" resource="first-term"></span>
        <span property="inScheme" resource="."></span>
    </div>
</div>
</body>
</html>

(Here I used the vocab attribute to avoid repeating the skos: prefix.)

Note that the HTML tags are irrelevant. The element could be <a>, <dt>, <span>, whatever.

What matters are: a unique identifier given by the resource attribute (it would be good practice to also use an id="..."), which may be a full URL or relative to the page URL; and the typeof and property attributes with those exact literal values.

(By email, I had sent the resource as #first-term instead of first-term, but here I’m supposing that they also exist as independent pages.)
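
To illustrate that only the resource, typeof and property attributes matter, here is a rough sketch of an extractor that ignores tag names entirely. It is not a conforming RDFa processor: it handles only the small subset used in the glossary examples above, it does not expand vocab or prefixes (property names are reported as written), and the names extract_triples and _Extractor are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"base", "br", "hr", "img", "input", "link", "meta"}


class _Extractor(HTMLParser):
    """Collects (subject, property, value) tuples; tag names are never inspected."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.triples = []
        self.stack = []  # one entry per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        a = dict(attrs)
        subject = self.stack[-1]["subject"] if self.stack else self.base_url
        literal = None
        if "property" in a and "resource" in a:
            # Link to another resource, e.g. broader or inScheme.
            obj = urljoin(self.base_url, a["resource"])
            self.triples.append((subject, a["property"], obj))
        elif "property" in a:
            # Literal value: gather the text until the element closes.
            literal = {"subject": subject, "property": a["property"], "text": []}
        elif "typeof" in a and "resource" in a:
            # A new typed subject, e.g. a Concept.
            subject = urljoin(self.base_url, a["resource"])
            self.triples.append((subject, "rdf:type", a["typeof"]))
        self.stack.append({"subject": subject, "literal": literal})

    def handle_data(self, data):
        for entry in self.stack:
            if entry["literal"] is not None:
                entry["literal"]["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        literal = self.stack.pop()["literal"]
        if literal is not None:
            text = "".join(literal["text"]).strip()
            self.triples.append((literal["subject"], literal["property"], text))


def extract_triples(html, base_url):
    parser = _Extractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    sample = """
    <dl resource="first-term" typeof="Concept">
        <dt property="prefLabel">My first glossary term</dt>
        <dd property="definition">The definition of my first term.</dd>
    </dl>
    """
    for triple in extract_triples(sample, "http://example.com/glossary/"):
        print(triple)
```

The sample uses <dl>, <dt> and <dd> instead of <div>, <h2> and <p>; the extracted statements are the same either way.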

The skos:broader property gives thesaurus-like relationships between concepts. Other types of relationships between concepts would require including other ontologies, tbd.
I’m less sure about tagging, but there are a few ontologies for tagging, such as https://lov.linkeddata.es/dataset/lov/vocabs/tag

The more difficult part is how to say that a text fragment refers to that concept.
(The following examples could be on another page, or even be internal links in the glossary description.)
Many vocabularies (e.g. foaf) define the topic of a document, but the notion of a text fragment is less common. Let’s use that of WebAnnotation.

So what I propose is as follows: the HTML fragment containing the reference to a concept would mostly need to have an id attribute. It may or may not be an <a> element with an href.

<p>text with a  <span id="target1">simple span</span> or even a <a id="target2" href="http://example.com/my-glossary#second-term">simple href</a></p>

We could mark the fact that this is a glossary reference with a class, as microformats do; but to make it visible to the semantic web, we’d independently have a WebAnnotation stanza elsewhere in the same document:

<span resource="#target1_anno" vocab="http://www.w3.org/ns/oa#" typeof="Annotation">
    <span property="hasTarget" resource="#target1" typeof="SpecificResource">
        <!-- an empty resource refers to the current document -->
        <span property="hasSource" resource=""></span>
        <span property="hasSelector" typeof="CssSelector">
            <span property="rdf:value" content="#target1" lang=""></span>
        </span>
    </span>
    <!-- full IRI: a bare "identifying" would resolve relative to the page URL -->
    <span property="motivatedBy" resource="http://www.w3.org/ns/oa#identifying"></span>
    <span property="hasBody" resource="http://example.com/glossary/second-term"></span>
</span>

There are other ways to express that stanza, but this is a simple one, albeit verbose. Every occurrence of ‘target1’ is a placeholder, as is the hasBody URL pointing at the concept; everything else would be constant.
Many libraries can extract information from there. dokie.li is a good example. There is more documentation I need to digest here.
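
Since the stanza is mostly boilerplate, it could be generated from just the target id and the concept URL. The following is a minimal sketch under that assumption; annotation_stanza is a hypothetical helper, not an existing library function. Two details are my own choices and should be checked against an RDFa processor: the motivation is written as the full IRI http://www.w3.org/ns/oa#identifying so that it cannot resolve relative to the page, and hasSource uses an empty resource to mean the current document.

```python
from html import escape

OA = "http://www.w3.org/ns/oa#"  # Web Annotation vocabulary namespace


def annotation_stanza(target_id, concept_url):
    """Return the RDFa stanza linking the element with id=target_id to a concept."""
    t = escape(target_id, quote=True)
    body = escape(concept_url, quote=True)
    return f'''<span resource="#{t}_anno" vocab="{OA}" typeof="Annotation">
    <span property="hasTarget" resource="#{t}" typeof="SpecificResource">
        <span property="hasSource" resource=""></span>
        <span property="hasSelector" typeof="CssSelector">
            <span property="rdf:value" content="#{t}" lang=""></span>
        </span>
    </span>
    <span property="motivatedBy" resource="{OA}identifying"></span>
    <span property="hasBody" resource="{body}"></span>
</span>'''


if __name__ == "__main__":
    print(annotation_stanza("target1", "http://example.com/glossary/second-term"))
```

A template or plugin could call such a helper once per glossary reference and append the results at the end of the page, keeping the running text free of annotation markup.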

It is also possible to propose such annotations from outside, by wrapping them in ActivityPub. I’ll develop that later.

Rejected alternatives:

  1. Put the annotation span in the text, which would be more conventional RDFa, but probably less legible HTML:
    <p>text with a <span resource="#anno" vocab="http://www.w3.org/ns/oa#" typeof="Annotation"><span property="hasTarget" id="target1" resource="#target1" typeof="SpecificResource">not so simple annotation<span property="hasSource" resource="#"></span><span property="hasSelector" typeof="CssSelector"><span property="rdf:value" content="span#target1" lang=""></span></span></span><span property="motivatedBy" resource="http://www.w3.org/ns/oa#identifying"></span><span property="hasBody" resource="http://example.com/my-glossary#second-term"></span></span></p>
  2. Use another, simpler vocabulary for the link between fragment and concept, while still using WebAnnotation for the fragment:
    <p>text with a <span vocab="http://www.w3.org/ns/oa#"  resource="#target1" typeof="SpecificResource"><a property="dc:subject" href="http://example.com/my-glossary#second-term">simple span</a></span></p>

However, the SpecificResource here is quite incomplete, and we don’t know whether any tool can use this combination of vocabularies (or any similar one, such as sioc:topic).