The Context Markup Language (CML) and Context Markup Definitions (CMLD) provide mechanisms to embed and reference machine-readable semantic context within web content, enabling software agents, such as Large Language Models (LLMs), to disambiguate terms accurately. CML uses inline HTML elements and attributes, while CMLD defines an external JSON-based format, both leveraging URIs and Decentralized Identifiers (DIDs) to support Human-Centric AI (HCAI) principles of accuracy, transparency, fairness, and trust for natural persons.

This is an unofficial draft specification for CML and CMLD, proposed for discussion within a W3C Community Group. It has not been endorsed by the W3C or any official standards body. Feedback is welcome via the associated Community Group repository or mailing list.

Introduction

Software agents, particularly LLMs, struggle to disambiguate terms with multiple meanings (e.g., "thongs" as footwear or underwear, "physician" as a person or place). Existing semantic web standards such as RDFa and JSON-LD can be complex for authors and are not designed specifically for LLMs or HCAI. The Context Markup Language (CML) provides inline HTML markup, while Context Markup Definitions (CMLD) offers an external JSON-based format. Both use URIs and DIDs to identify context precisely, supporting the HCAI principles of accuracy, transparency, fairness, and trust.

CML embeds context directly in HTML for fine-grained disambiguation, while CMLD enables reusable, maintainable context definitions referenced by multiple documents. This specification defines the syntax, processing models, and HTML integration for both, with examples for web authors and agent developers.

Terminology

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL are to be interpreted as described in [[RFC2119]].

CML
Context Markup Language, an inline HTML markup language for embedding semantic context.
CMLD
Context Markup Definitions, an external JSON-based format for context definitions.
Software Agent
A program, such as an LLM, that processes web content for tasks like disambiguation.
URI
A Uniform Resource Identifier, as defined in [[RFC3986]].
DID
A Decentralized Identifier, as defined in [[DID-CORE]].
Human-Centric AI (HCAI)
AI systems prioritizing the needs, rights, and well-being of natural persons, per [[EU-AI-ACT]].

Namespace

The CML namespace is https://w3c.github.io/context-markup/ns#. CML elements and attributes MUST use this namespace to avoid conflicts with other vocabularies, such as the Chemical Markup Language, which shares the CML acronym. In HTML, CML elements are prefixed with cml: (e.g., <cml:context>).

The namespace URI is a placeholder and will be finalized upon W3C Community Group adoption.

CML Syntax

CML introduces two HTML elements and one attribute for inline context:

<cml:context> Element

The <cml:context> element wraps a term and provides an RDF-like triple (subject, predicate, object).

Attributes
  • subject: A URI or DID identifying the term’s concept (REQUIRED).
  • predicate: A URI defining the relationship (REQUIRED).
  • object: A literal or URI providing the context value (REQUIRED).
  • lang: An optional language tag, per [[BCP47]].
<p>
  Wear <cml:context subject="https://wikidata.org/entity/Q12955949" predicate="http://schema.org/definedAs" object="Flip-flops" lang="en">thongs</cml:context> for a beach day.
</p>
      

<cml:link> Element and cml:href Attribute

The <cml:link> element, or the cml:href attribute placed on another element such as <a>, references an external context resource. The optional rel attribute follows the conventions for link relations in [[HTML]].

Attributes
  • cml:href: A URI or DID pointing to a context resource (REQUIRED).
  • rel: An optional relationship type (e.g., context), per [[HTML]].
<p>
  Buy our <cml:link cml:href="https://wikidata.org/entity/Q380339" rel="context">thongs</cml:link> for comfort.
</p>
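Because cml:href (like subject) accepts either a URI or a DID, an agent needs to tell the two apart before attempting resolution. The sketch below is one possible way to do that in Python. The function name classify_reference is hypothetical, the DID pattern is a simplified approximation of the syntax in [[DID-CORE]], and restricting URIs to http and https is an assumption made for illustration; this specification does not impose it.

```python
import re
from urllib.parse import urlparse

# Simplified approximation of did:<method>:<method-specific-id>.
# Assumption: this is looser than the full DID grammar in some respects.
DID_PATTERN = re.compile(r"^did:[a-z0-9]+:[A-Za-z0-9._:%-]*[A-Za-z0-9._%-]$")


def classify_reference(value: str) -> str:
    """Return 'did', 'uri', or 'invalid' for a cml:href or subject value."""
    if DID_PATTERN.match(value):
        return "did"
    parsed = urlparse(value)
    # Assumption: only http(s) URIs with a host are treated as resolvable.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "uri"
    return "invalid"
```

An agent could call classify_reference on each reference and route DIDs to a DID resolver and URIs to an HTTP client, skipping anything reported as invalid.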
      

CMLD Syntax and File Extension

CMLD files are JSON documents containing an array of context definitions, compatible with JSON-LD. They MUST use the .ctx extension for simplicity. Alternative extensions were considered, but .ctx is recommended pending W3C Community Group finalization.

{
  "@context": "https://w3c.github.io/context-markup/ns#",
  "definitions": [
    {
      "id": "thongs-footwear",
      "subject": "https://wikidata.org/entity/Q12955949",
      "predicate": "http://schema.org/definedAs",
      "object": "Flip-flops",
      "lang": "en"
    }
  ]
}
    
id
A unique identifier (REQUIRED).
subject
A URI or DID (REQUIRED).
predicate
A URI defining the relationship (REQUIRED).
object
A literal or URI (REQUIRED).
lang
An optional language tag, per [[BCP47]].
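The field requirements above can be checked mechanically. The following Python sketch shows one way an author or agent might validate a CMLD file before using it. The function name validate_cmld and the wording of the messages are hypothetical, and rejecting duplicate ids is an assumption drawn from the requirement that id be unique.

```python
import json

# Fields marked REQUIRED in the list above; lang is optional.
REQUIRED_FIELDS = ("id", "subject", "predicate", "object")


def validate_cmld(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    definitions = document.get("definitions") if isinstance(document, dict) else None
    if not isinstance(definitions, list):
        return ["missing 'definitions' array"]

    problems = []
    seen_ids = set()
    for index, definition in enumerate(definitions):
        if not isinstance(definition, dict):
            problems.append(f"definitions[{index}] is not an object")
            continue
        for field in REQUIRED_FIELDS:
            value = definition.get(field)
            if not isinstance(value, str) or not value:
                problems.append(f"definitions[{index}] lacks required field '{field}'")
        identifier = definition.get("id")
        if isinstance(identifier, str) and identifier:
            # Assumption: "unique" means unique within one CMLD file.
            if identifier in seen_ids:
                problems.append(f"duplicate id '{identifier}'")
            seen_ids.add(identifier)
    return problems
```

Returning a list of messages rather than raising on the first error lets a caller report every problem in a file at once.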

HTML Integration

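CMLD files are referenced using a <link> element with a proposed rel="context" value, and terms are linked to definitions via a data-cml-id attribute. Both the rel="context" value and the data-cml-id attribute are custom extensions introduced by this specification and are not defined in [[HTML]].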

<head>
  <link rel="context" href="context.ctx" type="application/json">
</head>
<body>
  <p>Wear <span data-cml-id="thongs-footwear">thongs</span> for a beach day.</p>
</body>
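To illustrate how an agent might discover these references, the sketch below uses Python's standard html.parser module to collect the href of every <link rel="context"> and the text of every element carrying data-cml-id. The class and function names are hypothetical, and the handling of nested elements is deliberately simple (an annotated element nested inside another annotated element is not tracked separately).

```python
from html.parser import HTMLParser


class ContextReferenceParser(HTMLParser):
    """Collects CMLD file references and annotated terms from an HTML page."""

    def __init__(self):
        super().__init__()
        self.context_files = []  # href values of <link rel="context">
        self.terms = []          # (data-cml-id value, element text) pairs
        self._active = None      # annotated element currently being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "context" in rel_tokens and attributes.get("href"):
            self.context_files.append(attributes["href"])
        elif self._active is not None:
            # Count same-named children so the matching end tag is found.
            if tag == self._active["tag"]:
                self._active["depth"] += 1
        elif attributes.get("data-cml-id"):
            self._active = {"tag": tag, "id": attributes["data-cml-id"],
                            "text": [], "depth": 0}

    def handle_data(self, data):
        if self._active is not None:
            self._active["text"].append(data)

    def handle_endtag(self, tag):
        if self._active is not None and tag == self._active["tag"]:
            if self._active["depth"] > 0:
                self._active["depth"] -= 1
            else:
                text = "".join(self._active["text"]).strip()
                self.terms.append((self._active["id"], text))
                self._active = None


def find_context_references(html: str):
    """Return (context_files, terms) found in the given HTML text."""
    parser = ContextReferenceParser()
    parser.feed(html)
    parser.close()
    return parser.context_files, parser.terms
```

Applied to the markup above, this would yield the file reference context.ctx and the pair ("thongs-footwear", "thongs").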
    

Processing Model

CML Processing

Agents processing CML MUST:

  1. Identify <cml:context> and <cml:link> elements or cml:href attributes in the HTML DOM.
  2. Extract attributes (subject, predicate, object, cml:href) and map to RDF triples or JSON objects.
  3. Resolve URIs or DIDs, per [[DID-CORE]].
  4. Use context for disambiguation, aligning with HCAI principles.

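Agents SHOULD handle malformed markup gracefully, for example by skipping a <cml:context> element that lacks a REQUIRED attribute rather than aborting processing.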
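Steps 1 and 2 of the CML processing model can be sketched with Python's standard html.parser module, as below. The class name CMLExtractor, the helper extract_cml, and the choice to count skipped elements are hypothetical; URI or DID resolution (step 3) and the disambiguation itself (step 4) depend on the agent and are not shown.

```python
from html.parser import HTMLParser

# Attributes marked REQUIRED on <cml:context>.
REQUIRED = ("subject", "predicate", "object")


class CMLExtractor(HTMLParser):
    """Extracts triples from <cml:context> and references from cml:href."""

    def __init__(self):
        super().__init__()
        self.triples = []  # one dict per well-formed <cml:context>
        self.links = []    # cml:href values from any element
        self.skipped = 0   # malformed <cml:context> elements ignored

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "cml:context":
            if all(attributes.get(name) for name in REQUIRED):
                triple = {name: attributes[name] for name in REQUIRED}
                if attributes.get("lang"):
                    triple["lang"] = attributes["lang"]
                self.triples.append(triple)
            else:
                # Graceful handling: ignore the element, keep parsing.
                self.skipped += 1
        if attributes.get("cml:href"):
            self.links.append(attributes["cml:href"])


def extract_cml(html: str) -> CMLExtractor:
    """Parse HTML text and return the extractor holding the results."""
    extractor = CMLExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor
```

Counting skipped elements instead of raising an exception is one way to satisfy the graceful-handling recommendation while still letting the agent log how much markup was unusable.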

CMLD Processing

Agents processing CMLD MUST:

  1. Load the CMLD file referenced by <link rel="context">.
  2. Parse the JSON and extract definitions.
  3. Match data-cml-id attributes to definition ids.
  4. Map attributes to RDF triples or JSON objects.
  5. Resolve URIs or DIDs, per [[DID-CORE]].
  6. Use context for disambiguation, aligning with HCAI principles.

Agents SHOULD handle malformed files or unmatched IDs gracefully, for example by treating an annotated term with no matching definition as ordinary text.
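Steps 2 through 4 of the CMLD processing model, together with the graceful handling of malformed files and unmatched IDs, might look like the following Python sketch. The function name resolve_terms is hypothetical, and returning None for an unmatched ID is an assumed convention; fetching the file (step 1) and resolving URIs or DIDs (step 5) are left to the agent.

```python
import json

REQUIRED_FIELDS = ("id", "subject", "predicate", "object")


def resolve_terms(cmld_text: str, term_ids: list[str]) -> dict:
    """Map each data-cml-id value to a (subject, predicate, object) tuple.

    Unmatched ids map to None; a malformed file matches nothing.
    """
    try:
        document = json.loads(cmld_text)
        definitions = document.get("definitions", [])
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or the top-level value is not an object.
        definitions = []
    if not isinstance(definitions, list):
        definitions = []

    index = {}
    for definition in definitions:
        if not isinstance(definition, dict):
            continue
        if not all(field in definition for field in REQUIRED_FIELDS):
            continue
        if not isinstance(definition["id"], str):
            continue
        # Assumption: on duplicate ids, the first definition wins.
        index.setdefault(
            definition["id"],
            (definition["subject"], definition["predicate"], definition["object"]),
        )
    return {term_id: index.get(term_id) for term_id in term_ids}
```

A caller would pass the data-cml-id values found in the page and treat any term whose result is None as unannotated.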

Integration with Semantic Web and DIDs

CML and CMLD map to RDF triples, ensuring interoperability with RDFa and JSON-LD. URIs reference ontologies like Schema.org or Wikidata (e.g., https://schema.org/Physician, https://wikidata.org/entity/Q12955949). DIDs (e.g., did:example:123) link to DID documents, per [[DID-CORE]], supporting decentralized context.

Human-Centric AI Alignment

CML and CMLD support the HCAI principles of accuracy, transparency, fairness, and trust for natural persons, per [[EU-AI-ACT]].

Examples

Disambiguating "thongs"

CML inline:

<p>
  Wear <cml:context subject="https://wikidata.org/entity/Q12955949" predicate="http://schema.org/definedAs" object="Flip-flops" lang="en">
    <a cml:href="did:example:Footwear123" href="https://example.com/flipflops">thongs</a>
  </cml:context> for a beach day.
</p>
<p>
  Our <cml:context subject="https://wikidata.org/entity/Q380339" predicate="http://schema.org/definedAs" object="Underwear">
    <cml:link cml:href="https://example.com/underwear" rel="context">thongs</cml:link>
  </cml:context> offer comfort.
</p>
      

CMLD file (context.ctx):

{
  "@context": "https://w3c.github.io/context-markup/ns#",
  "definitions": [
    {
      "id": "thongs-footwear",
      "subject": "https://wikidata.org/entity/Q12955949",
      "predicate": "http://schema.org/definedAs",
      "object": "Flip-flops",
      "lang": "en"
    },
    {
      "id": "thongs-underwear",
      "subject": "https://wikidata.org/entity/Q380339",
      "predicate": "http://schema.org/definedAs",
      "object": "Underwear"
    }
  ]
}
      

CMLD HTML:

<head>
  <link rel="context" href="context.ctx" type="application/json">
</head>
<body>
  <p>Wear <span data-cml-id="thongs-footwear">thongs</span> for a beach day.</p>
  <p>Our <span data-cml-id="thongs-underwear">thongs</span> offer comfort.</p>
</body>
      

Disambiguating "physician"

CML inline:

<p>
  Dr. Smith, a <cml:context subject="https://wikidata.org/entity/Q39631" predicate="http://schema.org/definedAs" object="Medical doctor" lang="en">
    <a cml:href="did:example:Doc123" href="https://example.com/dr-smith">physician</a>
  </cml:context>, specializes in cardiology.
</p>
<p>
  Visit our <cml:context subject="https://schema.org/Physician" predicate="http://schema.org/definedAs" object="Medical practice">
    <cml:link cml:href="https://example.com/clinic" rel="context">physician</cml:link>
  </cml:context> at 123 Main St.
</p>
      

CMLD file excerpt:

{
  "definitions": [
    {
      "id": "physician-person",
      "subject": "https://wikidata.org/entity/Q39631",
      "predicate": "http://schema.org/definedAs",
      "object": "Medical doctor",
      "lang": "en"
    },
    {
      "id": "physician-place",
      "subject": "https://schema.org/Physician",
      "predicate": "http://schema.org/definedAs",
      "object": "Medical practice"
    }
  ]
}
      

CMLD HTML:

<p>Dr. Smith, a <span data-cml-id="physician-person">physician</span>, specializes in cardiology.</p>
<p>Visit our <span data-cml-id="physician-place">physician</span> at 123 Main St.</p>
      

Disambiguating "bank"

CML inline:

<p>
  Deposit money at the <cml:context subject="https://wikidata.org/entity/Q22687" predicate="http://schema.org/definedAs" object="Financial institution" lang="en">
    <a cml:href="did:example:Bank123" href="https://example.com/bank">bank</a>
  </cml:context> on Main St.
</p>
<p>
  Fish along the <cml:context subject="https://wikidata.org/entity/Q319686" predicate="http://schema.org/definedAs" object="Riverbank">
    <cml:link cml:href="https://example.com/riverbank" rel="context">bank</cml:link>
  </cml:context> of the river.
</p>
      

CMLD file excerpt:

{
  "definitions": [
    {
      "id": "bank-financial",
      "subject": "https://wikidata.org/entity/Q22687",
      "predicate": "http://schema.org/definedAs",
      "object": "Financial institution",
      "lang": "en"
    },
    {
      "id": "bank-river",
      "subject": "https://wikidata.org/entity/Q319686",
      "predicate": "http://schema.org/definedAs",
      "object": "Riverbank"
    }
  ]
}
      

CMLD HTML:

<p>Deposit money at the <span data-cml-id="bank-financial">bank</span> on Main St.</p>
<p>Fish along the <span data-cml-id="bank-river">bank</span> of the river.</p>
      

Disambiguating "apple"

CML inline:

<p>
  Eat an <cml:context subject="https://wikidata.org/entity/Q89" predicate="http://schema.org/definedAs" object="Fruit" lang="en">
    <a cml:href="did:example:Fruit123" href="https://example.com/apple-fruit">apple</a>
  </cml:context> for a healthy snack.
</p>
<p>
  The new <cml:context subject="https://wikidata.org/entity/Q312" predicate="http://schema.org/definedAs" object="Technology company">
    <cml:link cml:href="https://example.com/apple-inc" rel="context">Apple</cml:link>
  </cml:context> phone is innovative.
</p>
      

CMLD file excerpt:

{
  "definitions": [
    {
      "id": "apple-fruit",
      "subject": "https://wikidata.org/entity/Q89",
      "predicate": "http://schema.org/definedAs",
      "object": "Fruit",
      "lang": "en"
    },
    {
      "id": "apple-company",
      "subject": "https://wikidata.org/entity/Q312",
      "predicate": "http://schema.org/definedAs",
      "object": "Technology company"
    }
  ]
}
      

CMLD HTML:

<p>Eat an <span data-cml-id="apple-fruit">apple</span> for a healthy snack.</p>
<p>The new <span data-cml-id="apple-company">Apple</span> phone is innovative.</p>
      

Implementation Considerations

CML and CMLD face adoption challenges:

Security and Privacy Considerations

CML and CMLD introduce minimal security risks:

Conformance

Conformance requirements use [[RFC2119]] terminology. Implementations MUST conform to the syntax and processing models defined in this specification.

Acknowledgments

Thanks to the W3C Semantic Web Community, Decentralized Identifier Working Group, and HCAI researchers for inspiration. Special acknowledgment to Wikidata and Schema.org for context resources.