Brussels Times scraping instructions
- Primary source: Google News sitemap: https://www.brusselstimes.com/google-news-sitemap.xml.
- Use node scripts/fetch-brussels-times.mjs YYYY-MM-DD; it filters recent Brussels-relevant categories and outputs JSON.
- Avoid scraping article pages unless a summary is genuinely required; sitemap title/category/date is usually enough for the daily digest.
- Language: source is English. Translate summaries into French/Dutch if multilingual output is generated.
- Cache path: docs/cache/press-raw/brusselstimes.json.
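
To make the sitemap-first approach concrete, here is a minimal sketch of reading title, URL, and publication date from Google News sitemap XML and keeping one day's entries. It is not the code in scripts/fetch-brussels-times.mjs. The function names (`firstTagText`, `parseNewsSitemap`, `entriesForDay`) and the sample XML are hypothetical, and the sketch assumes the standard `<url>`, `<loc>`, `<news:title>`, and `<news:publication_date>` elements. Category filtering is left out because how the script derives categories is not stated here.

```javascript
// Hypothetical helper: text of the first <tag>...</tag> inside an XML fragment,
// or null when the tag is absent. HTML entities are left undecoded.
function firstTagText(xml, tag) {
  const match = xml.match(
    new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`)
  );
  if (!match) return null;
  // Unwrap an optional CDATA section and trim surrounding whitespace.
  return match[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, "$1").trim();
}

// Turn sitemap XML into a list of { url, title, published } objects.
// Assumption: each article is wrapped in a plain <url>...</url> element.
function parseNewsSitemap(xml) {
  const entries = [];
  for (const [, body] of xml.matchAll(/<url>([\s\S]*?)<\/url>/g)) {
    entries.push({
      url: firstTagText(body, "loc"),
      title: firstTagText(body, "news:title"),
      published: firstTagText(body, "news:publication_date"),
    });
  }
  return entries;
}

// Keep entries whose timestamp starts with the given YYYY-MM-DD string.
// This compares the date as written in the sitemap; it does not convert time zones.
function entriesForDay(entries, day) {
  return entries.filter(
    (entry) => entry.published !== null && entry.published.startsWith(day)
  );
}

// Made-up sample data for illustration only, not real sitemap content.
const sample = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/brussels/first-story</loc>
    <news:news>
      <news:publication_date>2024-05-01T08:30:00+02:00</news:publication_date>
      <news:title><![CDATA[First story]]></news:title>
    </news:news>
  </url>
  <url>
    <loc>https://example.com/belgium/second-story</loc>
    <news:news>
      <news:publication_date>2024-04-30T21:15:00+02:00</news:publication_date>
      <news:title>Second story</news:title>
    </news:news>
  </url>
</urlset>`;

const digest = entriesForDay(parseNewsSitemap(sample), "2024-05-01");
console.log(JSON.stringify(digest, null, 2));
```

Run against the sample, this keeps only the first entry. A regex-based reader is enough for a flat sitemap like this one; a full XML parser would be the safer choice if the feed's structure turns out to be less regular.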