How do I collect only part of a website with WebSite Auditor?
There are two main ways to make WebSite Auditor find and collect only part of a website.
Use robots.txt instructions
- In your robots.txt file, create instructions for SEO-PowerSuite-bot and disallow all pages and categories that you do not want collected into your project.
If you do not know how to create instructions in robots.txt, check this article to learn how to create and modify your robots.txt with WebSite Auditor.
- Open WebSite Auditor and start a new project with expert options enabled.
- At Step 2, make sure to enable SEO-PowerSuite-bot instructions.
- Click Finish, and WebSite Auditor will collect only those pages that are not closed for scanning in your robots.txt file.
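Before starting the project, it can help to check which URLs your robots.txt rules would close for SEO-PowerSuite-bot. The sketch below uses Python's standard urllib.robotparser module for that. The Disallow paths and the example.com URLs are hypothetical, and Python's rule matching is only an approximation of how WebSite Auditor itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the Disallow paths are illustrative only.
ROBOTS_TXT = """\
User-agent: SEO-PowerSuite-bot
Disallow: /blog/
Disallow: /shop/archive/

User-agent: *
Disallow:
"""


def allowed_for_bot(robots_txt: str, url: str,
                    agent: str = "SEO-PowerSuite-bot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() accepts the file as an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs used only to exercise the rules above.
    for url in (
        "https://example.com/blog/post-1",
        "https://example.com/shop/archive/2019",
        "https://example.com/products/widget",
    ):
        status = "collected" if allowed_for_bot(ROBOTS_TXT, url) else "skipped"
        print(f"{url}: {status}")
```

Only the rules in the SEO-PowerSuite-bot group apply to that bot in this sketch; the catch-all group stays open for other crawlers.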
Use filtering options
- Start a new project with expert options and, at Step 2, switch to the Filtering tab.
- If you want to collect only a specific part of your website that has a common word or set of characters in the URL, enter that word in the first filtering option (Collect only pages with the following words in the URL).
- However, if you want to ignore specific pages or categories while collecting the rest of the domain, use the second filtering option (Do not Scan Pages with the following words in the URL).
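The effect of the two filtering options can be sketched in a few lines of Python. This is only an illustration of the idea, assuming plain case-sensitive substring matching on the URL; the function name filter_urls, its parameters, and the sample URLs are hypothetical and do not come from WebSite Auditor.

```python
from typing import Iterable, List, Sequence


def filter_urls(urls: Iterable[str],
                include_words: Sequence[str] = (),
                exclude_words: Sequence[str] = ()) -> List[str]:
    """Keep URLs that match the include list (if any) and none of the exclude list.

    Assumption: a "word" matches when it appears anywhere in the URL as a substring.
    """
    kept = []
    for url in urls:
        # "Collect only pages with the following words in the URL"
        if include_words and not any(word in url for word in include_words):
            continue
        # "Do not Scan Pages with the following words in the URL"
        if any(word in url for word in exclude_words):
            continue
        kept.append(url)
    return kept


# Hypothetical URLs used only for illustration.
URLS = [
    "https://example.com/blog/post-1",
    "https://example.com/blog/tag/news",
    "https://example.com/products/widget",
]

if __name__ == "__main__":
    print(filter_urls(URLS, include_words=["/blog/"]))
    print(filter_urls(URLS, exclude_words=["/tag/", "/products/"]))
```

The sketch only decides which URLs end up in the result, which mirrors the way filtered pages are left out of the final project.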
The difference between the two methods is that, with robots.txt, WebSite Auditor does not visit or scan the disallowed pages at all.
With filters, the software still visits and scans the excluded pages, but they are not added to the final project. The filtering method therefore takes longer and is more resource-demanding, especially when scanning huge domains.