Addition of Braunschweiger Zeitung #340
This seems like a NewsMap to me and not a Sitemap.
PS: There also seems to be a sitemap, judging by this link:
https://www.braunschweiger-zeitung.de/sitemaps/archive/sitemap-2017-08-p00.xml.gz
Looks like a monthly pattern with pagination. Could you check for what date range you get something back and whether there is a second page, p01? And afterwards add this as a sitemap?
Interesting, how did you find it? I checked the robots.txt file, and it only mentions the NewsMap. Regarding the Sitemap: it behaves rather oddly. The earliest file I can find is https://www.braunschweiger-zeitung.de/sitemaps/archive/sitemap-2006-11-p00.xml.gz (at least going by the date in the URL), but if you open that sitemap, it covers articles from September 2016. I think September 2016 is the correct lower bound of available dates, though. For dates after that, it seems to work just fine. There is no page 1; one file seems to contain all articles of that month. If I were to add it, how would I go about it? I am having trouble finding an index for the sitemaps.
I got it by searching for "braunschweiger-zeitung.de sitemap" on Google. This snippet generates all available sitemaps. You can add it with the + operator to sources.
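For illustration, a minimal sketch of how such a generator could look. It is not the snippet from the review. It assumes the monthly URL pattern from the link above, September 2016 as the lower bound, and a single page p00 per month, as reported earlier in this thread. The names SITEMAP_TEMPLATE and monthly_sitemap_urls are hypothetical.

```python
from datetime import date
from typing import Iterator, Optional

# Assumption: URL pattern copied from the sitemap link in this thread;
# only page p00 is generated because no p01 was found.
SITEMAP_TEMPLATE = (
    "https://www.braunschweiger-zeitung.de/sitemaps/archive/"
    "sitemap-{year}-{month:02d}-p00.xml.gz"
)


def monthly_sitemap_urls(
    start: date = date(2016, 9, 1), end: Optional[date] = None
) -> Iterator[str]:
    """Yield one archive sitemap URL per month from start to end, inclusive."""
    end = end or date.today()
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield SITEMAP_TEMPLATE.format(year=year, month=month)
        # Advance by one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


# Example with a fixed upper bound so the result is reproducible.
urls = list(monthly_sitemap_urls(end=date(2017, 8, 1)))
```

The resulting list could then be appended to the existing sources with the + operator, as suggested above. How the project wraps each URL into a source object is not shown in this thread, so that step is left out here.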