# BX1 scraping instructions

- Primary source: `https://bx1.be/feed/` (WordPress RSS).
- Use RSS first because it is structured, cheap, and includes article URLs/titles.
- Fallback, if the feed is unavailable or returns no items: `/usr/local/bin/lightpanda fetch --dump markdown --obey_robots https://bx1.be/news/`.
- Keep only Brussels-relevant local news; skip video players, social embeds, programme pages, and duplicate URLs that differ only in tracking parameters.
- Language: source is usually French. Translate summaries into the target review language when writing the final review.
- Cache paths:
  - raw RSS: `docs/cache/press-raw/bx1-rss.xml`
  - fallback markdown: `docs/cache/press-raw/bx1.md`
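
The RSS-first step and the duplicate-URL rule above could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function names `canonical_url` and `parse_feed` are hypothetical, and the list of tracking parameters (`utm_*`, `fbclid`, `gclid`) is an assumption that should be checked against what the feed actually emits. It only extracts titles and de-duplicated links; the Brussels-relevance filter is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; adjust after inspecting real feed URLs.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def canonical_url(url: str) -> str:
    """Drop assumed tracking parameters and the fragment from a URL."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def parse_feed(xml_text: str) -> list:
    """Return [{'title': ..., 'url': ...}] for each unique <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    seen = set()
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = canonical_url(item.findtext("link") or "")
        # Skip items without a link and URLs already seen after normalisation.
        if not link or link in seen:
            continue
        seen.add(link)
        items.append({"title": title, "url": link})
    return items
```

Assuming the feed has already been saved to the raw RSS cache path, it could be read with `Path("docs/cache/press-raw/bx1-rss.xml").read_text(encoding="utf-8")` and passed to `parse_feed`; an empty result would be the cue to run the fallback command.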
