Why do I see 403, 404 and 5xx resources in my project, yet these pages are alive?
If you're running an SEO scan of your website, it can be disheartening to see some of your pages flagged with 4xx and 5xx status codes. However, it's possible that these pages are actually alive and functioning properly. So why does this happen, and how can you avoid it?
The root cause of this issue is typically related to how the SEO software crawls your website. When the software sends requests to your pages, it expects specific responses. A 4xx status code indicates a client-side problem (for example, 404 Not Found or 403 Forbidden), while a 5xx code indicates an issue on the server side. Sometimes, however, these error codes can be triggered incorrectly.
One common reason for this is that the software sends too many requests too quickly. When this happens, the server can become overwhelmed and may start returning errors even though the pages are actually functioning properly.
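One quick way to tell a load-related error from a genuinely dead page is to re-request the flagged URL a few times with pauses in between: if a slow retry succeeds, the page is alive and the earlier error was most likely transient. Here is a minimal sketch of that check; the `fetch` callable and the retry parameters are illustrative, not part of Website Auditor:

```python
import time

def check_with_retries(url, fetch, attempts=3, pause=2.0):
    """Re-request a URL slowly; return the last status code seen.

    `fetch` is any callable that takes a URL and returns an HTTP
    status code (e.g. a thin wrapper around urllib or requests).
    A 200 on any attempt means the page is alive, so the earlier
    4xx/5xx was probably a transient, load-related error.
    """
    status = None
    for _ in range(attempts):
        status = fetch(url)
        if status == 200:
            return status          # page is alive
        time.sleep(pause)          # give the server time to recover
    return status                  # fails every time: likely a real error
```

If the page still returns a 4xx/5xx on every slow attempt, the error is probably genuine rather than a crawl-speed artifact.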
By default, Website Auditor sends 5 simultaneous tasks, which 99% of websites handle without trouble. But if your website is hosted on a less powerful server, this can result in 4xx/5xx errors.
To avoid this problem, there are a few steps you can take:
- Reduce the number of simultaneous tasks
Open Preferences -> Misc Global Settings -> set Number of Simultaneous Tasks to 1 or 2 threads:
- Set pauses between queries
Switch to Preferences -> Crawler Settings -> Speed tab and enable pauses between queries. The recommended default limit is 10 requests per 1 second, but you can experiment with this setting and adjust the number of requests depending on the capacity of your server.