BX1 scraping instructions
- Primary source: https://bx1.be/feed/ (WordPress RSS).
- Use RSS first because it is structured, cheap, and includes article URLs/titles.
- Fallback: /usr/local/bin/lightpanda fetch --dump markdown --obey_robots https://bx1.be/news/
- Keep only Brussels-relevant local news; skip video players, social embeds, programme pages, and duplicate tracking URLs.
- Language: source is usually French. Translate summaries into the target review language when writing the final review.
- Cache paths:
  - raw RSS: docs/cache/press-raw/bx1-rss.xml
  - fallback markdown: docs/cache/press-raw/bx1.md
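
The RSS-first, fallback-second flow above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function names (`strip_tracking`, `parse_rss`, `fetch_bx1`) and the list of tracking-parameter prefixes are assumptions introduced here, and the feed is assumed to be a standard RSS 2.0 document with `<item>`, `<title>`, and `<link>` elements. Only the URLs, the lightpanda command, and the cache paths come from the notes above. The Brussels-relevance filter is an editorial judgment and is deliberately left out.

```python
import subprocess
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

RSS_URL = "https://bx1.be/feed/"
NEWS_URL = "https://bx1.be/news/"
RSS_CACHE = Path("docs/cache/press-raw/bx1-rss.xml")
MD_CACHE = Path("docs/cache/press-raw/bx1.md")
LIGHTPANDA = "/usr/local/bin/lightpanda"

# Assumption: these prefixes cover the tracking parameters seen in practice.
TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")


def strip_tracking(url: str) -> str:
    """Drop tracking query parameters and the fragment so duplicates collapse."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def parse_rss(xml_text: str) -> list[dict]:
    """Return unique {"title", "url"} items from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    seen, items = set(), []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        url = strip_tracking((item.findtext("link") or "").strip())
        if url and url not in seen:
            seen.add(url)
            items.append({"title": title, "url": url})
    return items


def fetch_bx1() -> tuple[str, str]:
    """Try RSS first, then the lightpanda markdown dump. Returns (kind, text)."""
    try:
        with urllib.request.urlopen(RSS_URL, timeout=30) as resp:
            text = resp.read().decode("utf-8", errors="replace")
        parse_rss(text)  # raises ET.ParseError if the feed is not well-formed
        cache, kind = RSS_CACHE, "rss"
    except (OSError, ET.ParseError):
        # Network failure or broken XML: use the documented fallback command.
        result = subprocess.run(
            [LIGHTPANDA, "fetch", "--dump", "markdown", "--obey_robots", NEWS_URL],
            capture_output=True, text=True, check=True,
        )
        text, cache, kind = result.stdout, MD_CACHE, "markdown"
    cache.parent.mkdir(parents=True, exist_ok=True)
    cache.write_text(text, encoding="utf-8")
    return kind, text
```

Calling `fetch_bx1()` writes whichever source succeeded to its cache path; when the result kind is `"rss"`, passing the text to `parse_rss` yields deduplicated title/URL pairs ready for relevance filtering and translation.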