ActivityPub is a standard for publishing structured social network data on the Web in JSON-LD format. This document describes various methods for discovering the ActivityPub object described by an HTML page, and conversely the HTML page for an ActivityPub object.
This is a draft of the Social Web Incubator Community Group (SocialCG) Discovery Task Force.
ActivityPub is a standard for publishing structured social network data on the Web in JSON-LD format and sharing that data from client to server and from server to server. This document describes several methods for discovering the ActivityPub object described by an HTML page, and conversely the HTML page for an ActivityPub object.
Social data in the ActivityPub model is a "resource" like a person, image, or place. That resource has an ActivityPub JSON-LD representation (if it doesn't, it's not covered by this document!) and may have an HTML representation. The ActivityPub representation and the HTML representations each have an URL -- possibly the same URL.
Some resources have a relationship to another resource, which is its author or creator. The author resource has an ActivityPub JSON-LD representation (again, if it doesn't, it's not covered by this document!) and may have an HTML representation.
A resource may have an author resource with an ActivityPub representation, but not have its own ActivityPub representation. An example is an article published in a content-management system (CMS) that is ascribed to an actor with an ActivityPub account who wants to receive credit and/or feedback for the work.
Discovery, in this document, means starting with the URL or content of one representation, and identifying another representation. Each section of the document covers a different starting data type, and a different goal.
In this document, the terms "publisher" and "consumer" are used as in Activity Streams 2.0 Core. The terms are extended to include implementations that publish or consume HTML representations of resources with ActivityPub representations.
In this document, several methods of discovery are described. Different methods are implemented by different publishers and consumers, and have different trade-offs in terms of complexity, performance, and reliability.
A section on verification explains how to verify that the discovered information is accurate.
The final section defines best practices for publishers to maximize interoperability and minimize development effort.
A consumer may start with the full contents of an HTML
document, including markup and other content. For example, a
browser-based application may have access to the HTML loaded in
the browser window. A consumer with an HTML document may be
able to extract the URL from the environment -- for example,
using the document.location property in a
JavaScript environment. Using the document content for
discovery can return the ActivityPub equivalent without the
HTTP requests that discovery by URL requires, saving some time
and network traffic.
The
link element is a metadata element used in the
<head> section of an HTML document. It
provides links for the whole document, using a number of
different link relations.
To indicate its equivalent ActivityPub object, the HTML
page at https://html.example/watch/video-1.html
could include the following link element:
<!doctype html>
<html>
<head>
<title>Video 1</title>
<link
rel="alternate"
type="application/activity+json"
href="https://ap.example/api/descriptors/video-1.jsonld" />
</head>
<body>
<!-- rest of the page -->
</body>
</html>
Consumers need to parse the HTML to find the
link element with the alternate
relation and an ActivityPub-compatible media type as
type. This can be slow and complicated.
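The parsing step can be sketched with Python's standard html.parser module. This is a minimal illustration, not a reference implementation: the names AP_TYPES, AlternateLinkFinder, and find_activitypub_links are hypothetical, and the set of media types treated as ActivityPub-compatible is an assumption.

```python
from html.parser import HTMLParser

# Media types treated as ActivityPub-compatible in this sketch (an assumption).
AP_TYPES = {
    "application/activity+json",
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
}


class AlternateLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> elements with an AP type."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        media_type = attributes.get("type", "").strip().lower()
        if "alternate" in rels and media_type in AP_TYPES and attributes.get("href"):
            self.candidates.append(attributes["href"])


def find_activitypub_links(html_text):
    """Returns candidate ActivityPub URLs in document order."""
    finder = AlternateLinkFinder()
    finder.feed(html_text)
    return finder.candidates
```

The function returns candidates only; each one is still a claim that needs verification.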
Some servers may include a link element
with an alternate relation and with a JSON
type or JSON-LD type that does not link to an ActivityPub
resource.
<!doctype html>
<html>
<head>
<title>Video 1</title>
<link
rel="alternate"
type="application/json"
href="https://api.example/unrelated/videodescriptor.json" />
</head>
<body>
<!-- rest of the page -->
</body>
</html>
The
a element is an element used in the
<body> section of an HTML document. It can
be used to define relationships with other documents, with
the benefit that the link is (usually) visible and clickable
by a reader.
To indicate its equivalent ActivityPub object, the HTML
page at
https://html.example/profiles/person-1.html
could include the following a element:
<!doctype html>
<html>
<head>
<title>Person 1</title>
</head>
<body>
<a
rel="alternate"
type="application/activity+json"
href="https://ap.example/users/person-1.jsonld" >
Actor data for Person 1
</a>
<!-- rest of the page -->
</body>
</html>
Consumers will need to parse the HTML to find the
a element with the alternate
relation and an ActivityPub-compatible media type as
type. This can be even slower and more complicated
than with the link element. The link
element is usually in the first few kilobytes of a document,
and will usually be nested only 2 levels below the document
in the DOM tree. An a element may be anywhere in
the body, possibly nested very deep in the
tree.
As with the link element, some servers may
include an a element with an
alternate relation and with a JSON type or
JSON-LD type that does not link to an ActivityPub
resource.
In addition, many content management systems allow end
users to set rel and other properties on
a elements, which may result in false matches.
Even more than with other methods, using the a
element for discovery requires verification.
HTML documents can include JSON-LD data in a
<script> element in the
<head> section of the document. This data
can be used to provide metadata about the document, including
its equivalent ActivityPub object.
Given a page that shows an image at
https://html.example/gallery/image-17.html, the
HTML for the page could look like this:
<!DOCTYPE html>
<html lang="en">
<head>
<title>Image 17</title>
<script type="application/ld+json">
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Image",
"id": "https://ap.example/api/images/image-17.jsonld",
"url": {
"type": "Link",
"mediaType": "text/html",
"href": "https://html.example/gallery/image-17.html"
}
}
</script>
</head>
<body>
<h1>Image 17</h1>
<p><img src="https://html.example/images/image-17.png"></p>
</body>
</html>
This embedded JSON-LD specifies that an ActivityPub object
with the ID
https://ap.example/api/images/image-17.jsonld
exists, and that it has an HTML page url at
https://html.example/gallery/image-17.html, that
is, the current page's URL. This is a roundabout, but clear,
way to specify the ActivityPub ID of the current page.
Consumers need to parse the HTML page, and the embedded JSON-LD, to extract the ActivityPub object ID. An advantage of this technique is that other properties of the ActivityPub object can be embedded as well; however, to confirm those properties, the consumer will still need to fetch the object from its canonical URL, the ID.
Complicated structures for the url property
may make it hard to confirm that the object's URL is the
same as the current page's.
Embedded JSON-LD is very popular for embedding Schema.org metadata. This can lead to false positives when looking for ActivityPub objects.
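The extraction described in this section can be sketched as follows, using only the Python standard library. The helper names are hypothetical, the page URL is compared with a plain string match (no URL normalization), and the context check is a simple assumption for filtering out Schema.org and other non-Activity Streams data.

```python
import json
from html.parser import HTMLParser

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


class JsonLdScriptCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data


def uses_as2_context(obj):
    context = obj.get("@context")
    contexts = context if isinstance(context, list) else [context]
    return AS2_CONTEXT in contexts


def url_values(obj):
    """Yields every URL in the url property (string, Link object, or array)."""
    value = obj.get("url", [])
    for item in value if isinstance(value, list) else [value]:
        if isinstance(item, str):
            yield item
        elif isinstance(item, dict) and isinstance(item.get("href"), str):
            yield item["href"]


def embedded_activitypub_id(html_text, page_url):
    """Returns the id of an embedded AS2 object whose url is page_url, or None."""
    collector = JsonLdScriptCollector()
    collector.feed(html_text)
    for block in collector.blocks:
        try:
            obj = json.loads(block)
        except ValueError:
            continue  # malformed JSON: ignore this block
        if not isinstance(obj, dict) or not uses_as2_context(obj):
            continue  # for example, Schema.org metadata
        if page_url in url_values(obj) and isinstance(obj.get("id"), str):
            return obj["id"]
    return None
```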
These discovery techniques require an URL as input. Consumers may start with URLs if they are extracting links from RSS feeds or microblogging content, or when converting from other social networking platform content.
Content negotiation is a catch-all term for ways of negotiating the representation of a resource through the HTTP protocol. In this document, it will specifically cover proactive negotiation using the Accept header.
Given the URL for an HTML document, such as
https://mixed.example/some/path/to/note-1, a
consumer could attempt to retrieve the corresponding
ActivityPub JSON-LD object using this HTTP request:
GET /some/path/to/note-1 HTTP/1.1
Host: mixed.example
Accept: application/activity+json, application/ld+json, application/json
A compliant server may respond with the ActivityPub JSON-LD object in the body of the response:
HTTP/1.1 200 OK
Content-Type: application/activity+json
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://mixed.example/some/path/to/note-1",
"type": "Note",
"content": "This is a note."
}
This is typically used when the ActivityPub server and the HTML server are implemented in the same software package. Because this has historically been the case for many implementations, some consumers expect this behavior to be the default.
Alternately, the server may respond with a 308
Permanent Redirect to indicate the location of the
JSON-LD representation.
HTTP/1.1 308 Permanent Redirect
Location: https://mixed.example/different/path/to/note-1.jsonld
If the server does not support content negotiation, it
may respond with a 406 Not Acceptable status
code.
HTTP/1.1 406 Not Acceptable
Content-Type: text/plain
No representation matching this request could be found.
Less compliant servers may ignore the
Accept header altogether and return the HTML
content regardless:
HTTP/1.1 200 OK
Content-Type: text/html
<html>
<head>
<title>Note 1</title>
</head>
<body>
<p>This is a note.</p>
A more difficult failure mode to detect arises when the
server does not support ActivityPub, but does support
content negotiation for another JSON format. Such a server
returns a 200 OK status code with a JSON
object that does not use JSON-LD, or a JSON-LD object that
does not use the Activity Streams 2.0 vocabulary:
HTTP/1.1 200 OK
Content-Type: application/json
{
"property": "value",
"otherProperty": "otherValue"
}
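The outcomes above can be told apart with a small classifier. The sketch below works on an already-received response (status code, headers, body text), so it makes no assumption about the HTTP client. The function name and outcome labels are hypothetical, and treating the other 3xx codes like 308 is an assumption.

```python
import json

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"
JSON_TYPES = ("application/activity+json", "application/ld+json", "application/json")


def classify_response(status, headers, body):
    """Returns (outcome, detail) for the response to a content-negotiated GET."""
    names = {name.lower(): value for name, value in headers.items()}
    if status in (301, 302, 303, 307, 308):
        return ("redirect", names.get("location"))
    if status == 406:
        return ("not-negotiable", None)
    if status != 200:
        return ("error", None)
    # Drop parameters such as charset or profile before comparing.
    media_type = names.get("content-type", "").split(";")[0].strip().lower()
    if media_type not in JSON_TYPES:
        return ("ignored-accept", media_type)  # for example, text/html
    try:
        document = json.loads(body)
    except ValueError:
        return ("invalid-json", None)
    context = document.get("@context") if isinstance(document, dict) else None
    contexts = context if isinstance(context, list) else [context]
    if AS2_CONTEXT in contexts:
        return ("activitypub", document)
    return ("other-json", document)  # JSON, but not Activity Streams 2.0
```

Only the "activitypub" outcome yields a usable object; a "redirect" outcome gives a URL to fetch and classify in turn.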
The HTTP Link header can be used to indicate an alternative representation of a resource. A consumer can use this header to discover the ActivityPub JSON-LD object for an HTML page.
Given the URL for an HTML document, such as
https://html.example/user/test1/article-1, the
consumer can use an HTTP HEAD request to get the
headers for the resource, which will hopefully include the
Link header:
HEAD /user/test1/article-1 HTTP/1.1
Host: html.example
A compliant server will respond with the headers for the resource:
HTTP/1.1 200 OK
Link: <https://ap.example/api/articles/article-1.jsonld>; rel="alternate"; type="application/activity+json"
The link header with the alternate relation
type, and an ActivityPub-compatible media type, indicates
that the ActivityPub JSON-LD object is available at the
linked URL.
This can be a very efficient method of discovery, since the consumer does not need to download the entire HTML document and parse its contents.
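Reading the header value can be sketched with two regular expressions. This is a simplified parser written for illustration: it handles quoted and unquoted parameter values, but not escaped quotes inside quoted strings, and the function names are hypothetical.

```python
import re

# <target> followed by zero or more ;name=value parameters.
LINK_PATTERN = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)'
)
PARAM_PATTERN = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_link_header(value):
    """Returns a list of (target, params) pairs from a Link header value."""
    links = []
    for match in LINK_PATTERN.finditer(value):
        params = {}
        for name, quoted, token in PARAM_PATTERN.findall(match.group(2)):
            # Keep the first occurrence of a repeated parameter.
            params.setdefault(name.lower(), quoted or token)
        links.append((match.group(1), params))
    return links


def find_alternate(value, media_type="application/activity+json"):
    """Returns the first rel="alternate" target with the given type, or None."""
    for target, params in parse_link_header(value):
        rels = params.get("rel", "").lower().split()
        if "alternate" in rels and params.get("type", "").lower() == media_type:
            return target
    return None
```

Passing media_type="text/html" lets the same helper look for an HTML alternate instead.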
Servers may also include the Link header in
the response to a GET request for the HTML
page.
GET /user/test1/article-1 HTTP/1.1
Host: html.example
A compliant server will respond with the headers and the body of the resource:
HTTP/1.1 200 OK
Link: <https://ap.example/api/articles/article-1.jsonld>; rel="alternate"; type="application/activity+json"
Content-type: text/html
<html>
<head>
...
Some servers may return the full body of the HTML
document in response to a HEAD request,
without including a Link header.
HTTP/1.1 200 OK
Content-type: text/html
<html>
<head>
...
Webfinger
is a standard for discovering metadata about a resource
identified with an URL. Finding the ActivityPub URL for an
actor identified with an acct: URL is well
documented in the
ActivityPub and Webfinger report. However, Webfinger can
be used to find metadata about other resources, including
HTML pages with https: URLs.
Given an URL for a document, like
https://html.example/group-1.html, a GET request
can be made to an URL in the /.well-known/ path
of the domain for the URL, as follows:
GET /.well-known/webfinger?resource=https%3A%2F%2Fhtml.example%2Fgroup-1.html HTTP/1.1
Host: html.example
Note that the /.well-known/webfinger path is
fixed and required for Webfinger.
A compliant server will respond with the metadata for the resource:
HTTP/1.1 200 OK
Content-Type: application/jrd+json
{
"subject": "https://html.example/group-1.html",
"links": [
{
"rel": "alternate",
"type": "application/activity+json",
"href": "https://ap.example/api/groups/group-1.jsonld"
}
]
}
The JRD JSON format includes a
number of properties, as defined in the Webfinger RFC
7033. The relevant data structure in this example is the
object in the links array with the
rel property set to alternate and
the type property set to
application/activity+json, an
ActivityPub-compatible media type. The href
property of this link is the URL of the ActivityPub equivalent
for the HTML page.
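Building the query URL and reading the JRD can be sketched as below. The function names are hypothetical, and the sketch assumes the Webfinger endpoint is served over https on the same host as the resource URL.

```python
from urllib.parse import quote, urlsplit


def webfinger_url(resource):
    """Builds the Webfinger query URL on the host named by an https: resource."""
    host = urlsplit(resource).netloc
    # safe="" percent-encodes ":" and "/" so the resource is a single parameter.
    return "https://" + host + "/.well-known/webfinger?resource=" + quote(resource, safe="")


def jrd_link(jrd, rel, media_type):
    """Returns the href of the first link with the given rel and type, or None."""
    for link in jrd.get("links", []):
        if link.get("rel") == rel and link.get("type") == media_type:
            return link.get("href")
    return None
```

For the example above, jrd_link(jrd, "alternate", "application/activity+json") picks out the ActivityPub URL from the parsed response.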
Not all Webfinger-aware servers return JRD documents for
https URLs. Others might only return JRD
documents for URLs that represent actors, such as
registered users.
As with other link-relation-based discovery mechanisms,
like the HTTP Link header or the <link> element, a
JSON or JSON-LD media type in the link's type
property might not indicate an ActivityPub URL, but some
other JSON or JSON-LD object.
These techniques require the ActivityPub object's URL as
input. Often, the object's URL is obtained either as a property
of another ActivityPub object, or from the id
property of the ActivityPub JSON-LD document.
If these techniques aren't successful, the consumer can use the URL to fetch the ActivityPub JSON-LD document, and then use a discovery technique that takes a document as input.
It's possible for the HTML and JSON-LD representations of an object to be found at the same URL.
Given the URL for an ActivityPub object, such as
https://mixed.example/some/path/to/note-1, a
consumer could attempt to retrieve the corresponding HTML
resource using this HTTP request:
GET /some/path/to/note-1 HTTP/1.1
Host: mixed.example
Accept: text/html
A compliant server may respond with the HTML document in the body of the response:
HTTP/1.1 200 OK
Content-Type: text/html
<html>
<head>
<title>Note 1</title>
</head>
<body>
<p>This is a note.</p>
Alternately, the server may respond with a 308
Permanent Redirect to indicate the location of the
HTML representation.
HTTP/1.1 308 Permanent Redirect
Location: https://mixed.example/different/path/to/note-1.html
If the server does not support content negotiation, it
may respond with a 406 Not Acceptable status
code.
HTTP/1.1 406 Not Acceptable
Content-Type: text/plain
No representation matching this request could be found.
Less compliant servers may ignore the
Accept header altogether and return the
JSON-LD content regardless:
HTTP/1.1 200 OK
Content-Type: application/activity+json
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://mixed.example/some/path/to/note-1",
"type": "Note",
"content": "This is a note."
}
It is possible to use the Link header to
identify an HTML page related to a given ActivityPub JSON-LD
resource. A Link header with the
alternate link relation and a type
equal to text/html indicates an HTML page
representing the same object.
The advantage of this technique is that it does not
require downloading and parsing the JSON-LD content of the
ActivityPub object. The Link header has fewer
options for formatting than other methods such as the
url property, for example, making it slightly
easier for consumers.
Given an ActivityPub JSON-LD object at
https://ap.example/some/path/person-1.jsonld,
a consumer could use a HEAD HTTP request to
get the relevant headers for the resource:
HEAD /some/path/person-1.jsonld HTTP/1.1
Host: ap.example
The publisher would respond with the HTTP headers,
including a Link header:
HTTP/1.1 200 OK
Content-Type: application/activity+json
Link: <https://html.example/profiles/person-1.html>; rel="alternate"; type="text/html"
Some non-compliant HTTP servers will send the full body
of the resource in the response to the HEAD
request.
The Webfinger protocol can be used to find an HTML page related to an ActivityPub object in a number of ways.
The consumer can identify the resource for a Webfinger
query in two ways. First, the id property,
usually an https URL, can be passed as the
resource parameter for the Webfinger query.
Alternately, if the ActivityPub object is an actor, an
acct URL in the format
acct:username@domain.example can be constructed
using the technique for
Webfinger reverse discovery. This acct URL
can be used as the resource parameter for the
Webfinger query.
The publisher can provide a link to the HTML representation of the object in the JRD output of the Webfinger query in at least two ways.
First, the links property of the output
object can contain a link object with a rel
property set to alternate and the
type property set to text/html. If
such a link exists, its href property is the URL
of the related HTML page.
Second, the links property of the JRD output
object may include an object with a rel property
set to http://webfinger.net/rel/profile-page.
This is defined to be "the main home/profile page that a
human should visit when getting info about that webfinger
account." (https://webfinger.net/rel/)
It is not guaranteed to be HTML, but a type
property can further define that. Per the definition, "it's
likely text/html if it's for users."
An advantage of using Webfinger for discovery is that it
is widely implemented by ActivityPub publishers to enable
using acct URLs as identities.
Given an ActivityPub Place object at
https://ap.example/geo/place-7.jsonld, a
consumer could use a Webfinger query to find the HTML page
for the object:
GET /.well-known/webfinger?resource=https%3A%2F%2Fap.example%2Fgeo%2Fplace-7.jsonld HTTP/1.1
Host: ap.example
Note that the /.well-known/webfinger path
is fixed and required for Webfinger.
The publisher could return the following JRD output:
{
"subject": "https://ap.example/geo/place-7.jsonld",
"links": [
{
"rel": "alternate",
"type": "text/html",
"href": "https://html.example/map/nl/ams/17921.html"
}
]
}
In this example, the links property of the
JRD object contains a single object with a rel
property set to alternate and a
type property set to text/html.
The href property of this object is the URL of
the HTML page representing the object.
Alternately, given an ActivityPub Person
object at
https://ap.example/profiles/person-19.jsonld,
the consumer could construct an acct URL as
acct:person-19@ap.example and use it as the
resource parameter for the Webfinger
query:
GET /.well-known/webfinger?resource=acct%3Aperson-19%40ap.example HTTP/1.1
Host: ap.example
Note that the /.well-known/webfinger path
is fixed and required for Webfinger discovery.
The publisher could return the following JRD output:
{
"subject": "acct:person-19@ap.example",
"links": [
{
"rel": "http://webfinger.net/rel/profile-page",
"type": "text/html",
"href": "https://html.example/profiles/person-19.html"
}
]
}
In this output, the
http://webfinger.net/rel/profile-page
relationship identifies an HTML page for the
Person object.
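Both ways of reading the JRD output can be combined in one helper. In this sketch the function names are hypothetical, and preferring the alternate relation over the profile-page relation is an assumption rather than a rule from this report. A profile-page link without a type is accepted, since the type is optional there.

```python
from urllib.parse import quote

PROFILE_PAGE = "http://webfinger.net/rel/profile-page"


def acct_query_url(username, domain):
    """Builds the Webfinger query URL for acct:username@domain."""
    resource = "acct:" + username + "@" + domain
    return "https://" + domain + "/.well-known/webfinger?resource=" + quote(resource, safe="")


def html_page_from_jrd(jrd):
    """Returns the URL of the HTML page named in a JRD document, or None."""
    links = jrd.get("links", [])
    for link in links:
        if link.get("rel") == "alternate" and link.get("type") == "text/html":
            return link.get("href")
    for link in links:
        # An absent type is treated as text/html for profile pages (assumption).
        if link.get("rel") == PROFILE_PAGE and link.get("type", "text/html") == "text/html":
            return link.get("href")
    return None
```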
Some servers may not return JRD documents for
https URLs. Others might only return JRD
documents for URLs that represent actors, such as
registered users.
These techniques require the ActivityPub JSON-LD document as the input for the process. The document can be obtained through delivery via the ActivityPub protocol, or through the ActivityPub API, or by other means.
If none of these techniques are successful, the consumer can
obtain the URL of the object from the id property,
and then try one or more of the techniques that require an
URL.
url property
ActivityPub objects can have an optional url
property, which "[i]dentifies one or more links to
representations of the object." The property is the preferred
way to indicate a corresponding HTML page for an ActivityPub
object.
As with many Activity Vocabulary properties, this can have several formats:
A string containing the URL.
A Link object. This structure is used to
provide additional information about the link, including
the mediaType. For an equivalent HTML
representation, the mediaType property will be
"text/html". The href property of the
Link object is the URL.
An array of strings or Link objects.
In this example, the url property is only a
string.
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/some/path/person-1.jsonld",
"type": "Person",
"name": "Person One",
"url": "https://html.example/profile/person-1.html"
}
In the next example, the url property is a
full Link-type object with
mediaType property equal to "text/html".
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/geo/place-17.jsonld",
"type": "Place",
"nameMap": {
"en": "Berlin"
},
"url": {
"type": "Link",
"mediaType": "text/html",
"href": "https://html.example/map/de/ber/ber.html"
}
}
In this final example, the url property is
an array of Link-type objects with different
mediaType properties.
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/photos/gallery/image-3.jsonld",
"type": "Image",
"summary": "Jason and Carol at the lake house",
"url": [
{
"type": "Link",
"mediaType": "text/html",
"href": "https://html.example/gallery/3.html"
},
{
"type": "Link",
"mediaType": "image/webp",
"href": "https://upload.example/files/08/17/2021/lakehouse.webp"
}
]
}
url property failure
When the url property is only a string, it
may not represent an HTML page. Especially for objects with
binary content types, like Image,
Video, and Audio, the
url property is often used for the URL of the
respective binary representation of the object.
The mediaType of Link-type
objects in the url property is not always
defined, and when it is defined, it is not always
"text/html".
The property is defined only for representations of the
current object. However, the Link-type object
can have a link relation property, rel.
Publishers may misuse the url property to
include links that aren't a representation of the object,
but instead a related object, like "next" or "author".
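A consumer can handle the three formats and the failure modes above with one normalizing function. The sketch below is one possible policy, not the only one: links explicitly typed "text/html" are ranked first, untyped strings and links are kept as weaker candidates, and links carrying a rel other than "alternate" are dropped. The function name is hypothetical.

```python
def html_urls(obj):
    """Returns candidate HTML page URLs from an object's url property."""
    value = obj.get("url")
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    strong, weak = [], []
    for item in items:
        if isinstance(item, str):
            weak.append(item)  # bare string: media type unknown
        elif isinstance(item, dict) and isinstance(item.get("href"), str):
            rel = item.get("rel")
            rels = rel if isinstance(rel, list) else ([rel] if rel else [])
            if rels and "alternate" not in rels:
                continue  # for example "next" or "author": not a representation
            media_type = item.get("mediaType")
            if media_type == "text/html":
                strong.append(item["href"])
            elif media_type is None:
                weak.append(item["href"])
            # Any other media type (such as image/webp) is not an HTML page.
    return strong + weak
```

Weak candidates in particular may point at binary content, so they need verification before use.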
Publishers of HTML representations and ActivityPub representations include data or metadata to help with discovery of related representations or resources. This data is a claim that the linked resource really has the relationship stated.
Unfortunately, not all claims are true. Consumers need to verify the claims made by publishers, using the verification techniques described here. Some techniques are direct and can be used with confidence; others are heuristics that provide some level of support to the claims, but are not foolproof.
Verification is a process of confirming that an HTML page and an ActivityPub object represent the same resource. This is necessary to ensure that the publisher of one representation is not falsely connecting two unrelated resources.
The most reliable verification method is two-way discovery. This consists of first doing discovery in one direction, and then doing discovery in the other direction with the results. For example, doing discovery from an HTML page to an ActivityPub object, and then doing discovery from the ActivityPub object to, hopefully, the same HTML page.
Given an HTML page with the URL
https://html.example/downloads/image-14.html,
the consumer could discover the ActivityPub JSON-LD URL
using the Link header method:
HEAD /downloads/image-14.html HTTP/1.1
Host: html.example
The publisher would respond with the HTTP headers:
HTTP/1.1 200 OK
Content-Type: text/html
Link: <https://ap.example/api/images/image-14.jsonld>; rel="alternate"; type="application/activity+json"
Then, the consumer could fetch the ActivityPub object
at
https://ap.example/api/images/image-14.jsonld
and look for the URL of the HTML page in the
url property:
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/api/images/image-14.jsonld",
"name": "Image 14",
"type": "Image",
"url": "https://html.example/downloads/image-14.html"
}
By comparing the URL in the ActivityPub object with the original HTML page URL, the consumer can confirm that the two representations relate to the same resource.
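The return half of this check can be sketched as a function that takes the original page URL, the discovered ActivityPub ID, and a way to fetch objects. The fetch_object parameter is a hypothetical callable supplied by the consumer (for example, a wrapper around its HTTP client). URLs are compared as exact strings, which is a simplification.

```python
def claimed_urls(ap_object):
    """Lists every URL named by the object's url property."""
    value = ap_object.get("url", [])
    urls = []
    for item in value if isinstance(value, list) else [value]:
        if isinstance(item, str):
            urls.append(item)
        elif isinstance(item, dict) and isinstance(item.get("href"), str):
            urls.append(item["href"])
    return urls


def verify_two_way(page_url, discovered_id, fetch_object):
    """True when the object at discovered_id links back to page_url."""
    ap_object = fetch_object(discovered_id)
    if not isinstance(ap_object, dict):
        return False  # nothing usable was returned
    if ap_object.get("id") != discovered_id:
        return False  # the fetched document describes some other object
    return page_url in claimed_urls(ap_object)
```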
Two-way discovery can fail if the discovered representation does not have a path back to the origin representation.
Two-way discovery can become complicated if more than one discovered representation is found, which may connect back to more than one original representation. Verifying multiple relationships, and ignoring those that cannot be verified, complicates this process for the consumer.
Another mechanism for verifying a discovery process is to compare the origins of the original representation's URL and the discovered representation's URL. The origin is the combination of the scheme, host, and port of a URL. If the origins are the same, the publisher of both the HTML and the ActivityPub JSON-LD is probably the same, so the discovery results are probably reliable.
Content negotiation without a redirect will always have the same origin, as the URL of the HTML page and the URL of the JSON-LD representation are the same.
Given an HTML page with the URL
https://mixed.example/profiles/person-3, the
consumer could discover the ActivityPub JSON-LD URL using
the <link> element method:
<link
rel="alternate"
type="application/activity+json"
href="https://mixed.example/api/person/person-3" />
The href property of the <link>
element is the URL of the ActivityPub JSON-LD
representation. The origin of the URL of the HTML page is
https://mixed.example, and the origin of the
URL of the ActivityPub JSON-LD representation is
https://mixed.example, so the origins match
and the discovery is verified.
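The origin comparison can be sketched with urllib.parse from the Python standard library. The function names are hypothetical. Filling in the default port for http and https, so that an explicit ":443" matches an omitted port, is a choice made in this sketch.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Returns the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(first_url, second_url):
    """True when both URLs share scheme, host, and port."""
    return origin(first_url) == origin(second_url)
```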
Same origin verification assumes that a single
publisher controls an entire domain. Although this is
often true for machine-readable formats like JSON-LD,
having multiple publishers in control of parts of a
domain is more common for HTML documents. For example,
documents with URLs starting with
https://html.example/home/user1/ might be
created by one user, and those starting with
https://html.example/home/user2/ might be
created by another. A carefully crafted <link> or
other mechanism could be used by one user to link their
HTML page to an ActivityPub object created by
another.
Same origin verification will give a false negative if the publisher is using different domains for HTML pages and ActivityPub JSON-LD objects. This can happen if ActivityPub features are added on to an existing published Web site, or if the publisher needs to keep the domains separate for implementation reasons. If same origin verification gives a negative result, other methods such as two-way verification should be used.
Another means of verification, or more precisely an excuse for skipping verification, is an allowlist. This is a list of origins or, possibly, other properties of the representation that can be used to confirm trust in the publisher and skip verification.
Assuming that each origin is controlled by a single publisher, if the consumer trusts the publisher, they can skip verification of discovery when the original representation has an URL with that origin.
Given an HTML page with the URL
https://html.example/profiles/person-3, the
consumer could discover the ActivityPub JSON-LD URL using
the Embedded JSON-LD method:
<script type="application/ld+json">
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/api/person/person-3",
"type": "Person",
"name": "Person Three",
"url": "https://html.example/profiles/person-3"
}
</script>
Given that the consumer trusts the publisher of
https://html.example, they can skip
verification of the discovery process, and accept
https://ap.example/api/person/person-3 as
the ActivityPub JSON-LD representation's URL.
Maintaining an allowlist is time-consuming. The number of domains with Web sites is in the hundreds of millions; identifying even a tiny fraction to be trusted takes a lot of human effort.
Depending on allowlists as the only means of verification severely limits the number of domains that can be interacted with.
Verification of author discovery is necessary to ensure that attackers cannot maliciously ascribe content to an actor that did not create it.
Unfortunately, the only current way to fully verify the
authorship of an HTML object is by scanning the
outbox property of the actor object. With tens
or hundreds of thousands of items in the outbox not
unusual, this is a time-consuming process that is subject
to possible errors.
outbox properties in ActivityPub are
OrderedCollection objects, often with
OrderedCollectionPage objects that represent
pages of content. Scanning this collection from newest to
oldest members, the consumer can look for
Create activities with the url
property set to the HTML representation URL being verified,
or for activity objects with the url property
set to the HTML representation being verified.
The consumer has discovered that
https://ap.example/user/person-6.jsonld is
the author of the resource represented by the HTML
document at
https://html.example/blog/article-9.html.
The consumer retrieves the ActivityPub JSON-LD for the
person:
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/user/person-6.jsonld",
"type": "Person",
"name": "Person Six",
"inbox": "https://ap.example/user/person-6/inbox",
"outbox": "https://ap.example/user/person-6/outbox",
"following": "https://ap.example/user/person-6/following",
"followers": "https://ap.example/user/person-6/followers",
"liked": "https://ap.example/user/person-6/liked"
}
The consumer then fetches the URL that is the value of
the outbox property:
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/user/person-6/outbox",
"type": "OrderedCollection",
"totalItems": 3803,
"first": "https://ap.example/user/person-6/outbox/page/39"
}
It fetches the first page of the collection:
{
"@context": "https://www.w3.org/ns/activitystreams",
"id": "https://ap.example/user/person-6/outbox/page/39",
"type": "OrderedCollectionPage",
"partOf": "https://ap.example/user/person-6/outbox",
"next": "https://ap.example/user/person-6/outbox/page/40",
"items": [
{
"type": "Create",
"actor": "https://ap.example/user/person-6.jsonld",
"object": {
"id": "https://ap.example/object/article-10.jsonld",
"type": "Article",
"name": "Article Ten",
"url": "https://html.example/blog/article-10.html"
}
},
{
"type": "Create",
"actor": "https://ap.example/user/person-6.jsonld",
"object": {
"id": "https://ap.example/object/article-9.jsonld",
"type": "Article",
"name": "Article Nine",
"url": "https://html.example/blog/article-9.html"
}
},
{
"type": "Create",
"actor": "https://ap.example/user/person-6.jsonld",
"object": {
"id": "https://ap.example/object/article-8.jsonld",
"type": "Article",
"name": "Article Eight",
"url": "https://html.example/blog/article-8.html"
}
}
]
}
The second object in the items array is a
Create activity with an object
property with a value that includes a url
property with the same value as the HTML URL we are trying to
verify. Since the claim from the HTML and the ActivityPub
JSON-LD support each other, the relation is verified.
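The scan can be sketched as a loop over collection pages. The fetch_json parameter is a hypothetical callable that returns the parsed JSON for a URL, and max_pages is an assumed safety limit. The sketch accepts either orderedItems or items on a page, and it skips members that are bare IDs instead of fetching them.

```python
def object_has_url(obj, page_url):
    """True when obj is an object whose url property names page_url."""
    if not isinstance(obj, dict):
        return False
    value = obj.get("url", [])
    for item in value if isinstance(value, list) else [value]:
        if item == page_url:
            return True
        if isinstance(item, dict) and item.get("href") == page_url:
            return True
    return False


def outbox_mentions(actor, page_url, fetch_json, max_pages=50):
    """Scans the actor's outbox, newest page first, for page_url."""
    outbox = fetch_json(actor["outbox"])
    # Without a "first" page, treat the collection itself as the only page.
    page_ref = outbox.get("first", outbox)
    for _ in range(max_pages):
        if page_ref is None:
            return False
        page = fetch_json(page_ref) if isinstance(page_ref, str) else page_ref
        for activity in page.get("orderedItems", page.get("items", [])):
            if not isinstance(activity, dict):
                continue  # bare id: would need another request
            if activity.get("type") == "Create" and object_has_url(activity.get("object"), page_url):
                return True
            if object_has_url(activity, page_url):
                return True  # the member itself carries the url
        page_ref = page.get("next")
    return False  # gave up after max_pages without a match
```

A False result after reaching max_pages means "not found so far", not "not authored by this actor".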
This method is error-prone and depends on fetching
hundreds or thousands of pages. Not all pages will
include the full representation of the objects in their
items array; some may just include the
id property of each, requiring even more
requests.
Some authors may not include a Create
activity for every object they create on the Web. They
may use an author discovery process to identify the
author, but not include the object in the ActivityPub
representation.
This method, or heuristic, assumes that the claims of authorship are mutually supporting if the URLs of the representations have the same origin. The origin of an URL includes its protocol, domain name, and port number. The method assumes that the same entity (like a person or organization) controls all URLs published with this origin, and therefore would not make claims to contradict itself.
Given an HTML page with the URL
https://mixed.example/profiles/person-3.html,
the consumer could discover the ActivityPub JSON-LD URL
using the Embedded JSON-LD method with the value
https://mixed.example/api/person-3.jsonld.
Because both URLs have the origin
https://mixed.example, the consumer will
assume that the discovery is verified.
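The comparison in this example can be sketched with the Python standard library. This is a minimal sketch, not a complete origin implementation: the helper names `origin` and `same_origin` are hypothetical, and only the default ports for `http` and `https` are filled in.

```python
from urllib.parse import urlsplit

# Default ports, so that https://mixed.example and https://mixed.example:443
# compare as the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True if both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    print(same_origin(
        "https://mixed.example/profiles/person-3.html",
        "https://mixed.example/api/person-3.jsonld",
    ))
```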
Assuming that the same entity controls creation of all URLs on a server is somewhat risky. For HTML creation, especially, some servers divide their URL space into per-user paths, so different paths on one origin may be controlled by different people. Other servers allow user-uploaded data, including JSON and HTML.
Consumers may keep a list of origins, or of other properties of the representation, for which verification is not required. This assumes an externally-established trust relationship.
Given an HTML page with the URL
https://html.example/profiles/person-3.html,
the consumer could discover the ActivityPub JSON-LD URL
using the Embedded JSON-LD method with the value
https://ap.example/api/person-3.jsonld. If
the consumer trusts the publisher of
https://html.example, they can skip
verification of the discovery process, and accept
https://ap.example/api/person-3.jsonld as
the ActivityPub JSON-LD representation's URL.
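A trusted list of this kind can be sketched as a simple set lookup. This is a sketch under stated assumptions: the names `TRUSTED_ORIGINS`, `origin_of`, and `can_skip_verification` are hypothetical, the list is keyed by origin only, and explicit default ports (such as `:443`) are not normalized.

```python
from urllib.parse import urlsplit

# Hypothetical list of origins trusted through an out-of-band relationship.
TRUSTED_ORIGINS = {"https://html.example"}


def origin_of(url):
    """Return the origin of a URL as a string, e.g. 'https://html.example'."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = f":{parts.port}" if parts.port is not None else ""
    return f"{parts.scheme.lower()}://{host}{port}"


def can_skip_verification(html_url, trusted_origins=TRUSTED_ORIGINS):
    """True if the HTML page's origin is on the consumer's trusted list."""
    return origin_of(html_url) in trusted_origins


if __name__ == "__main__":
    print(can_skip_verification("https://html.example/profiles/person-3.html"))
```

When the lookup fails, the consumer falls back to one of the verification methods instead of accepting the discovered URL.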
Establishing trust relationships out-of-band is labor-intensive, and most consumers will only have a small number of trusted domains or other entities.
Publishers that want consumers to be able to discover ActivityPub objects and their authors should consider these methods.
When publishing an Activity Streams 2.0 JSON-LD object for ActivityPub, publishers should consider these best practices.
Use a Link object on its own, or an array of Link objects, as the url property of the ActivityPub object. Explicitly include the mediaType property with a value of "text/html". Avoid using multiple links with the same mediaType, since there's no easy way to distinguish which one is the correct HTML representation of the resource.
For objects, include the attributedTo property, either as an URL or as a JSON object with at least the id property. For activities, include actor, and for public keys, include owner.
Include Link headers with both "alternate" and "author" relations. Add the "application/activity+json" media type explicitly. Don't include multiple links with the same relation and media type, since there's not a clear algorithm for choosing between them.
Implement Webfinger for https: URLs for actor objects, and preferably for all objects. Include members of the links array both for the "alternate" relation, with media type "text/html", and the "author" relation, with media type "application/activity+json" and if possible with media type "text/html" as well. For actors, implement the http://webfinger.net/rel/profile-page relation.
When publishing HTML representations of an ActivityPub resource, include these discovery options:
Add a link element with rel set to "alternate" and type set to "application/activity+json". If possible, also include a link element for the author resource, with rel set to "author" and type set to "application/activity+json". Avoid having multiple links with the same relation and media type, since there's not an easy way to determine which one is the best.
Add anchor (a) elements to the page layout that users can find and click. Add at least one a element with rel equal to "alternate" and type equal to "application/activity+json". Add at least one a element with rel equal to "author" and type equal to "application/activity+json". Don't include multiple a links with the same rel and type values unless they also have the same href value.
Use Link headers for discovery. Add one Link header with rel set to "alternate" and type set to "application/activity+json". Add another with rel set to "author" and type set to "application/activity+json". Avoid having duplicate Link headers with the same rel and type values.
Add a meta element with name fediverse:creator and value set to the Webfinger of the author preceded by an "@" symbol.
These are some of the user stories that motivate this work.
As a user viewing a Web page, I want to Like the contents, so that I can share it with my followers, let the author know I appreciated it, and save it to my liked collection. A browser-based ActivityPub API client could submit a Like activity to the user's ActivityPub server, but it would need to know the ID of the ActivityPub equivalent of the page.
As a user viewing a Web page, I want to Announce the contents, so that I can share it with my followers. A browser-based ActivityPub API client could submit an Announce activity to the user's ActivityPub server, but it would need to know the ID of the ActivityPub equivalent of the page.
As a user viewing a profile page, I want to Follow the actor, so that I can get updates about their activities in my inbox. A browser-based ActivityPub API client could submit a Follow activity to the user's ActivityPub server, but it would need to know the actor ID of the actor whose profile is being viewed.
An ActivityPub API client could submit a Note activity to the user's ActivityPub server, with the profile actor's ID in the to property, but the API client would need to know the actor ID of the actor whose profile is being viewed.
An ActivityPub API client could submit a Note activity to the user's ActivityPub server, with the image's author's actor ID in the to property, but the API client would need to know the actor ID of the author of the image.
Given the HTML URL https://html.example/blog/page-1.html, an ActivityPub API client could discover the related ActivityPub ID https://ap.example/api/page-1.jsonld, retrieve it with machine-readable metadata, and provide affordances for interacting with the object, such as liking, sharing, or replying.
When a link in an object's content is to the actor's profile page, it's necessary to be able to turn that link into an actor ID to allow more inspection of the actor and affordances like following or blocking.
This document uses a consistent format for example URLs:
https://{name}.example/{path}/{type}-{ordinal}{?ext}
Where:
{name} is the domain name of the server. There are three default domain names used:
    ap.example - A server that primarily provides ActivityPub JSON-LD documents.
    html.example - A server that primarily provides HTML documents.
    mixed.example - A server that provides both ActivityPub JSON-LD and HTML documents.
{path} is the path to the object. It should be opaque; none of the paths in this document have semantic meaning unless otherwise specified.
{type} is the content type of the resource. This will usually be the lowercase version of the ActivityPub object type, such as Note, Person, or Image.
{ordinal} is an ordinal number, when multiple objects are being described in the same discussion.
{ext} is an optional "file extension" that indicates the Internet media type of the resource, including:
    .jsonld for JSON-LD objects
    .html for HTML documents
    .png for PNG images, .jpg for JPEG images, etc.
The .jsonld extension is not common practice for ActivityPub id values. It is used in this report to highlight that the URL is for a JSON-LD representation.
The structure used in the examples is merely mnemonic and non-normative. None of the techniques described in this document depend on a particular URL structure, unless otherwise specified.
Unless otherwise specified, the techniques described below can be used with any Activity Streams 2.0 types. The best-defined groups of AS2 types for HTML discovery are actor types:
Person, Application, Service, Group, Organization
and digital content types:
Note, Article, Image, Video, Audio, Document, Page
Other ActivityPub types, such as activity types, are less likely to have their own HTML representations.
Collection types are often better represented
by an object they are closely related to. For example, an
actor's outbox collection is often provided on the
actor's profile page, which is a representation of the actor.
Similarly, the likes or replies of an
Image object are often provided on the object's
page, and don't have an independent HTML representation. That
said, this document does not preclude the possibility of HTML
representations for collection types.