BRUZZ scraping instructions

Primary list source: https://www.bruzz.be/rss.xml.
Important: the RSS feed carries only the title, URL, date, and a short description of each item. It does not contain the full story.
Workflow:
1. Fetch RSS.
2. Keep only items published on the target date whose URL is under https://www.bruzz.be/actua/.
3. Scrape every selected article page with Lightpanda: /usr/local/bin/lightpanda fetch --dump markdown --obey_robots <article-url>.
4. Use the scraped page text for summaries/excerpts; fall back to the RSS description only if page scraping fails.
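Steps 1 and 2 could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function name filter_items and the returned dict keys are hypothetical, and it assumes the feed is standard RSS 2.0 with <link>, <pubDate>, and <description> elements, and that the "target date" is compared against the date in the feed's own timezone offset.

```python
import xml.etree.ElementTree as ET
from datetime import date
from email.utils import parsedate_to_datetime

# URL prefix taken from the workflow above.
ACTUA_PREFIX = "https://www.bruzz.be/actua/"


def filter_items(rss_xml, target_date, prefix=ACTUA_PREFIX):
    """Return RSS items published on target_date whose link starts with prefix.

    Hypothetical helper: assumes RSS 2.0 element names and compares the
    date in the item's own UTC offset (no timezone conversion is applied).
    """
    root = ET.fromstring(rss_xml)
    selected = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        pub = (item.findtext("pubDate") or "").strip()
        if not link.startswith(prefix) or not pub:
            continue
        try:
            published = parsedate_to_datetime(pub)
        except (TypeError, ValueError):
            # Unparseable date: skip the item rather than guess.
            continue
        if published.date() != target_date:
            continue
        selected.append(
            {
                "title": (item.findtext("title") or "").strip(),
                "url": link,
                "published": published.isoformat(),
                "description": (item.findtext("description") or "").strip(),
            }
        )
    return selected
```

Typical use would be to read the cached feed text and call filter_items(text, date(2026, 5, 14)); the result is a list of dicts that can be written to the article metadata file.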
Do not use https://www.bruzz.be/actua as the main listing page: as of 2026-05-14 it returns a Drupal 404 page.
Keep Brussels news only. In the scraped page text, skip navigation, live radio blocks, ads, and repeated teasers.
Language: source is Dutch. Translate summaries into the target review language when writing the final review.
Cache paths:
- raw RSS: docs/cache/press-raw/bruzz-rss.xml
- article metadata: docs/cache/press-raw/bruzz-articles.json
- full scraped article pages: docs/cache/press-raw/bruzz-pages/*.md
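Steps 3 and 4, together with the page cache path above, might look like the sketch below. The Lightpanda command line is copied from step 3; everything else is an assumption for illustration: the function names, the 60-second timeout, treating a non-zero exit code or empty output as a failed scrape, and in particular the file naming scheme (last URL path segment plus .md), which these notes do not specify.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse

LIGHTPANDA = "/usr/local/bin/lightpanda"
PAGES_DIR = Path("docs/cache/press-raw/bruzz-pages")


def page_cache_path(url, pages_dir=PAGES_DIR):
    """Map an article URL to a cache file.

    Hypothetical naming scheme: the last path segment of the URL.
    """
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1] or "index"
    return pages_dir / f"{slug}.md"


def scrape_article(url, rss_description, run=subprocess.run):
    """Return (text, source), where source is "page" or "rss".

    Falls back to the RSS description when the Lightpanda call cannot be
    started, times out, exits non-zero, or prints nothing (assumed
    failure criteria). `run` is injectable so the logic can be tested
    without the binary.
    """
    cmd = [LIGHTPANDA, "fetch", "--dump", "markdown", "--obey_robots", url]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        return rss_description, "rss"
    text = (result.stdout or "").strip()
    if result.returncode != 0 or not text:
        return rss_description, "rss"
    return text, "page"


def cache_page(url, text, pages_dir=PAGES_DIR):
    """Write scraped markdown to its cache file and return the path."""
    path = page_cache_path(url, pages_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

Recording the source ("page" or "rss") alongside each article in the metadata file makes it easy to see later which summaries were written from the short RSS description only.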