Customize Data Extraction

By default, Feedscout uses Feedsmith to parse and validate discovered feed URLs. You can provide a custom extractor to change how feeds are validated or to extract additional metadata.

How Extractors Work

After discovering potential feed URLs, Feedscout fetches each URL and passes the content to an extractor function. The extractor determines whether the content is a valid feed and, if so, extracts its metadata. See the DiscoverExtractFn type for the interface.
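The flow described above can be pictured roughly as follows. This is a conceptual sketch, not Feedscout's actual implementation: the names `SketchExtractFn`, `SketchFetchFn`, `SketchResult`, and `runExtractor` are hypothetical, and recording a failed fetch as an invalid result is an assumption made for illustration.

```typescript
// Hypothetical shapes for illustration only; the real interface is DiscoverExtractFn.
type SketchResult = { url: string; isValid: boolean; error?: unknown }
type SketchExtractFn = (input: { url: string; content: string }) => Promise<SketchResult>
type SketchFetchFn = (url: string) => Promise<string>

// Fetch each candidate URL and hand the response body to the extractor.
const runExtractor = async (
  candidates: Array<string>,
  fetchFn: SketchFetchFn,
  extractFn: SketchExtractFn,
): Promise<Array<SketchResult>> => {
  const results: Array<SketchResult> = []

  for (const url of candidates) {
    try {
      const content = await fetchFn(url)
      // The extractor alone decides whether the content counts as a feed.
      results.push(await extractFn({ url, content }))
    } catch (error) {
      // Assumption: a candidate that cannot be fetched is recorded as invalid.
      results.push({ url, isValid: false, error })
    }
  }

  return results
}
```

The point of the sketch is the division of labour: discovery produces candidate URLs, fetching produces content, and the extractor is the only piece that interprets that content.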

Use Cases

Custom extractors can be used for:

Adding custom metadata: extract additional fields such as language, images, or items
Custom validation: reject feeds that have no items, are too old, or fail other criteria
Using a different parser: replace the default Feedsmith parser with another library
Blogroll extraction: custom extractors also work with discoverBlogrolls

Example

typescript

import { discoverFeeds } from 'feedscout'
import type { DiscoverExtractFn } from 'feedscout'
import { parseFeed } from 'feedsmith'

type CustomFeedResult = {
  format: string
  title?: string
  itemCount: number
}

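// Parse the fetched content with Feedsmith and return extra fields with the validity flag.
const customExtractor: DiscoverExtractFn<CustomFeedResult> = async ({ url, content }) => {
  try {
    const { format, feed } = parseFeed(content)

    return {
      url,
      isValid: true,
      format,
      title: feed.title,
      itemCount: feed.items?.length ?? 0,
    }
  } catch (error) {
    // Parsing failed, so report the URL as invalid and keep the error.
    return { url, isValid: false, error }
  }
}

// Placeholder site URL; replace it with the page you want to scan.
const feeds = await discoverFeeds('https://example.com', {
  methods: ['html', 'guess'],
  extractFn: customExtractor,
})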

// [{
//   url: 'https://example.com/feed.xml',
//   isValid: true,
//   method: 'guess',
//   format: 'rss',
//   title: 'Example Blog',
//   itemCount: 10,
// }]
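The custom validation use case can be layered on top of an existing extractor. The sketch below rejects feeds with no items by wrapping another extractor. It is self-contained so it runs without Feedscout installed: `FeedResult`, `FeedExtractFn`, `rejectEmptyFeeds`, and `stubExtractor` are hypothetical names, the types only mirror the result shape used in the example above, and the stub counts `<item>` tags as a stand-in for a real parser.

```typescript
// Hypothetical local types mirroring the result shape from the example above.
type FeedResult =
  | { url: string; isValid: true; format: string; title?: string; itemCount: number }
  | { url: string; isValid: false; error?: unknown }

type FeedExtractFn = (input: { url: string; content: string }) => Promise<FeedResult>

// Wrap an extractor so that feeds it accepts are rejected when they have no items.
const rejectEmptyFeeds = (extractFn: FeedExtractFn): FeedExtractFn => {
  return async (input) => {
    const result = await extractFn(input)

    if (result.isValid && result.itemCount === 0) {
      return { url: result.url, isValid: false, error: new Error('Feed has no items') }
    }

    return result
  }
}

// Stand-in extractor for demonstration: counts <item> tags instead of parsing the feed.
const stubExtractor: FeedExtractFn = async ({ url, content }) => {
  const itemCount = (content.match(/<item[\s>]/g) ?? []).length

  return { url, isValid: true, format: 'rss', itemCount }
}

const strictExtractor = rejectEmptyFeeds(stubExtractor)
```

Keeping the validation rule in a wrapper leaves the parsing logic untouched, so the same rule could in principle be applied to any extractor that returns an `itemCount`.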
