How to crawl a sitemap with Screaming Frog
Published 1 week ago | Last update 1 week ago
Screaming Frog is a great tool for SEO, helping SEOs fix errors and run audits on their websites. Today we will be looking at one feature of Screaming Frog: crawling from a sitemap.
Screaming Frog is a great tool for all sorts of uses.
It's basically a tamed spider that crawls your website much as Googlebot would, except it hands you the crawl data as a report so you can fix errors.
There are times when letting the Screaming Frog spider crawl your website on its own won't really do the job. Examples of this include:
- You're crawling someone else's website
- You only want to crawl a part of your website
- You're scraping content and only want to visit the URLs where this content is contained
- Your website has orphan pages or subdomains
- You're testing a website before launch so linking/archive pages are not available
- Some other reason entirely
In any of these cases you may find yourself wishing you could crawl the website via the sitemap. Well, you can!
Step 1: Find the sitemap
I'm going to assume everyone is a responsible person and is crawling their own website, so finding sitemaps should be very easy. However, if you are not so responsible, or you don't know where your sitemap is, viewing the robots.txt file of most websites will lead you directly to it. The robots.txt file lives at the root of the domain, and any sitemaps it declares appear on lines starting with "Sitemap:".
You can also use Google. A search similar to this will probably find most sitemaps:
site:domain.com sitemap xml
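If you'd rather not read robots.txt by eye, here is a minimal Python sketch that pulls the "Sitemap:" lines out of a robots.txt body. The function name `find_sitemaps` and the sample file contents are made up for illustration, and the sketch assumes you have already downloaded the file as text.

```python
def find_sitemaps(robots_txt: str) -> list[str]:
    """Return every URL declared on a 'Sitemap:' line of a robots.txt body."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Split on the first colon only, so the "https://" in the URL survives.
        key, sep, value = raw_line.partition(":")
        # The field name is matched case-insensitively; comment lines
        # (starting with "#") never match because their key is not "sitemap".
        if sep and key.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt contents, using the same placeholder domain as above.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Sitemap: https://domain.com/sitemap.xml
sitemap: https://domain.com/news-sitemap.xml
"""

if __name__ == "__main__":
    # Print one sitemap per line, ready to paste into Screaming Frog.
    for url in find_sitemaps(EXAMPLE_ROBOTS):
        print(url)
```

Printing one URL per line is deliberate: that is the shape Screaming Frog expects when you paste in multiple sitemaps.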
Step 2: Tell Screaming Frog where the sitemap is
Now we know where the sitemap is, let's load it into Screaming Frog.
Head to "Configuration" > "Spider" and you will see the spider configuration screen.
Note: If you can't see this, please update your program.
Load your sitemaps into the highlighted box. If you have multiple sitemaps, just put one per line.
Click "OK" when complete.
Step 3: Crawl baby crawl
Now that Screaming Frog knows what to do, enter the domain in the crawl field and click "Start".
Screaming Frog will now crawl the sitemap(s) you loaded in the previous step.
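If you want to sanity check the crawl, it helps to know how many URLs the sitemap actually lists. Below is a rough Python sketch that extracts every `<loc>` entry from a sitemap file, assuming it follows the standard sitemaps.org XML format. The function name `sitemap_locs` and the sample sitemap are hypothetical, and this is not how Screaming Frog itself reads the file.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(xml_text: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    # Both <urlset> files and <sitemapindex> files store their URLs in <loc>.
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Hypothetical sitemap contents, using the placeholder domain from Step 1.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://domain.com/</loc></url>
  <url><loc>https://domain.com/blog/</loc></url>
</urlset>"""

if __name__ == "__main__":
    urls = sitemap_locs(EXAMPLE_SITEMAP)
    print(f"{len(urls)} URLs listed in the sitemap")
    for url in urls:
        print(url)
```

Compare the count printed here with the number of URLs Screaming Frog reports. A big gap in either direction is worth a closer look.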
I'm Hash Brown, and I've done "computer stuff" all my life.