HTML Method
The HTML method extracts feed URLs from HTML content by scanning link elements and anchor tags.
It follows the RSS Board Autodiscovery and WHATWG Feed Autodiscovery specifications.
How It Works
The HTML method scans two types of elements:
Link Elements
Looks for <link> elements that advertise feeds:
<!-- rel="alternate" with feed MIME type -->
<link rel="alternate" type="application/rss+xml" href="/feed.xml" />
<!-- rel="feed" (WHATWG spec) -->
<link rel="feed" href="/feed" />
Anchor Elements
Scans <a> tags for feed links using two strategies:
- URI matching: checks whether href contains common feed paths such as /feed or /rss.xml.
- Label matching: checks whether the link text, title, or aria-label contains words like "RSS", "Feed", or "Subscribe". This catches icon-only links that have no visible text.
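The two strategies above can be sketched roughly as follows. This is a minimal illustration, not Feedscout's actual implementation: the `Anchor` type and the `matchesUri`, `matchesLabel`, and `isFeedAnchor` helpers are hypothetical names, the pattern lists are shortened, and case-insensitive substring matching is an assumption.

```typescript
// Hypothetical shape for a parsed anchor; not a Feedscout type.
type Anchor = {
  href: string
  text: string
  attributes: Record<string, string>
}

// Shortened example lists, assumed for illustration only.
const anchorUris = ['/feed', '/rss.xml']
const anchorLabels = ['rss', 'feed', 'subscribe']
const anchorAttributes = ['title', 'aria-label']

// URI matching: the href contains one of the known feed paths.
const matchesUri = (anchor: Anchor): boolean => {
  const href = anchor.href.toLowerCase()
  return anchorUris.some((uri) => href.includes(uri))
}

// Label matching: the visible text or a labelling attribute contains a feed word.
// Case-insensitive comparison is an assumption of this sketch.
const matchesLabel = (anchor: Anchor): boolean => {
  const candidates = [
    anchor.text,
    ...anchorAttributes.map((name) => anchor.attributes[name] ?? ''),
  ]
  return candidates.some((value) => {
    const lower = value.toLowerCase()
    return anchorLabels.some((label) => lower.includes(label))
  })
}

// An anchor is a feed candidate if either strategy matches.
const isFeedAnchor = (anchor: Anchor): boolean =>
  matchesUri(anchor) || matchesLabel(anchor)
```

Keeping the two checks separate means an icon-only link with an opaque href can still be found through its title or aria-label, while a plainly named /feed.xml link is found even when its text says nothing about feeds.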
<!-- Matched by URI -->
<a href="/feed.xml">XML</a>
<!-- Matched by label (text) -->
<a href="/subscribe">RSS Feed</a>
<!-- Matched by label (title / aria-label) on an icon-only link -->
<a href="/blog/syndication" title="RSS feed"><svg>...</svg></a>
Configuration
Feedscout comes with reasonable defaults, but you can customize how HTML is parsed if needed.
Link Selectors
Control which <link> elements are matched:
import { mimeTypes } from 'feedscout/feeds'
const feeds = await discoverFeeds(url, {
  methods: {
    html: {
      linkSelectors: [
        { rel: 'alternate', types: mimeTypes },
        { rel: 'feed' },
      ],
    },
  },
})
Anchor URIs
Specify URI patterns to match in anchor href attributes:
const feeds = await discoverFeeds(url, {
  methods: {
    html: {
      anchorUris: ['/feed', '/rss', '/atom', '/rss.xml', '/feed.xml'],
    },
  },
})
Anchor Labels
Specify text patterns to match in anchor content:
const feeds = await discoverFeeds(url, {
  methods: {
    html: {
      anchorLabels: ['rss', 'feed', 'atom', 'subscribe'],
    },
  },
})
Anchor Attributes
Specify element attributes to scan for the anchor labels, on the anchor itself and on its descendants. This finds icon-only feed links whose label lives in an attribute rather than visible text — for example a title, an aria-label, or the layer name Framer emits on a feed icon (data-framer-name="RSS Icon"):
const feeds = await discoverFeeds(url, {
  methods: {
    html: {
      anchorAttributes: ['aria-label', 'title', 'data-framer-name'],
    },
  },
})
Ignored URIs
Exclude certain URI patterns from anchor matching:
const feeds = await discoverFeeds(url, {
  methods: {
    html: {
      anchorIgnoredUris: ['wp-json/oembed/', 'wp-json/wp/'],
    },
  },
})
Default Values
You can import the default HTML options:
import { defaultHtmlOptions } from 'feedscout/feeds'
The defaults include comprehensive anchor URIs and common feed-related labels. You can also import individual pieces:
import {
  linkSelectors,
  anchorLabels,
  urisComprehensive,
  ignoredUris,
} from 'feedscout/feeds'
Using Directly
You can use the HTML discovery function directly to get URIs without validation:
import { discoverUrisFromHtml } from 'feedscout/methods'
const uris = discoverUrisFromHtml(htmlContent, {
  baseUrl: 'https://example.com',
  linkSelectors: [{ rel: 'alternate', types: ['application/rss+xml'] }],
  anchorUris: ['/feed'],
  anchorLabels: ['rss'],
  anchorIgnoredUris: [],
})
// [
//   'https://example.com/feed.xml',
//   'https://example.com/rss',
// ]
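The absolute URLs in the output above suggest that relative hrefs are resolved against baseUrl. A rough sketch of that step, combined with ignored-URI filtering and de-duplication, might look like the following. The `resolveCandidates` helper is hypothetical and not part of Feedscout; it relies only on the standard `URL` constructor, and the order of filtering and resolving is an assumption.

```typescript
// Hypothetical helper, not part of Feedscout: turns raw href values into
// absolute, de-duplicated URLs, skipping any that contain an ignored pattern.
const resolveCandidates = (
  hrefs: Array<string>,
  baseUrl: string,
  ignoredUris: Array<string> = [],
): Array<string> => {
  const resolved = new Set<string>()

  for (const href of hrefs) {
    // Skip candidates that contain an ignored pattern.
    if (ignoredUris.some((ignored) => href.includes(ignored))) {
      continue
    }

    try {
      // The URL constructor resolves relative hrefs against the base URL
      // and leaves absolute hrefs unchanged.
      resolved.add(new URL(href, baseUrl).href)
    } catch {
      // Ignore hrefs that cannot be parsed as a URL.
    }
  }

  return Array.from(resolved)
}

const resolvedUris = resolveCandidates(
  ['/feed.xml', '/rss', 'https://example.com/rss', '/wp-json/oembed/1.0/embed'],
  'https://example.com',
  ['wp-json/oembed/'],
)
// [
//   'https://example.com/feed.xml',
//   'https://example.com/rss',
// ]
```

Using a Set keeps the first occurrence of each URL, so a feed advertised both as a relative and as an absolute link is reported once.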