Why do I see 403, 404 and 5xx resources in my project, yet these pages are alive?
If you're running an SEO scan on your website, it can be disheartening to see some 4xx and 5xx pages listed as problematic. However, it's possible that these pages are actually alive and functioning properly. So why does this happen, and how can you avoid it?
The root cause of this issue is typically related to how the SEO software crawls your website. When the software sends requests to your pages, it expects specific responses. A 4xx error indicates a client-side problem (for example, 404 Not Found when a page cannot be located), while a 5xx error indicates an issue on the server side. However, these error codes can sometimes be triggered incorrectly.
One common reason for this is that the software sends too many requests too quickly. When this happens, the server can become overwhelmed and may start returning errors even though the pages are actually functioning properly.
By default, WebSite Auditor runs 5 simultaneous tasks, which 99% of websites handle without trouble. But if your website is hosted on a server that is not very powerful, this can result in 4xx/5xx errors.
To avoid this problem, there are a few steps you can take:

- Reduce the number of simultaneous tasks

Open Preferences -> Misc Global Settings and set Number of Simultaneous Tasks to 1 or 2 threads.

- Set pauses between queries

Switch to Preferences -> Crawler Settings -> Speed tab and enable pauses between queries. The recommended default limit is 10 requests per second, but you can experiment with this setting and change the number of requests depending on the capacity of your server.

- Enable JavaScript

There are some scenarios where JavaScript execution could cause a 4xx error. For example, if a JavaScript file is missing or has been moved to a different location on the server, the browser receives a 404 (Not Found) error when it requests that file. Depending on how the JavaScript is written, this can cause other elements on the page to fail as well.

To avoid that, enable JavaScript execution in WebSite Auditor: Preferences -> Crawler Settings -> JavaScript tab.
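To see why pausing between queries helps, here is a minimal Python sketch of a polite crawler that throttles itself to a fixed number of requests per second. This is an illustration only, not WebSite Auditor's actual implementation; the `RateLimiter` class, the `crawl` helper, and the 10-requests-per-second default are names and values chosen for this example to mirror the settings described above.

```python
import time
import urllib.error
import urllib.request


class RateLimiter:
    """Enforces a minimum delay between consecutive requests."""

    def __init__(self, max_per_second: float):
        self.min_interval = 1.0 / max_per_second
        self.last_call = 0.0

    def wait(self):
        # Sleep just long enough so calls never exceed the allowed rate.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()


def crawl(urls, max_per_second=10):
    """Fetch each URL sequentially, pausing so the server is hit at most
    max_per_second times per second. Returns a {url: status_code} map."""
    limiter = RateLimiter(max_per_second)
    statuses = {}
    for url in urls:
        limiter.wait()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                statuses[url] = resp.status
        except urllib.error.HTTPError as e:
            statuses[url] = e.code  # 4xx/5xx responses still carry a code
    return statuses
```

Lowering `max_per_second` (or the Number of Simultaneous Tasks setting in the program) trades crawl speed for a lighter load on the server, which is usually what an underpowered host needs.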
Comments
What if I get error 403 forbidden for external website links in my posts? When I checked them manually, they are accessible. What can I do to get them off the reports in Website Auditor? Thanks
Aldin Lapinig, very often external resources obey the same rules as internal resources on your website: they may block you due to excessive querying or if JavaScript is disabled. I recommend following the same steps as described in this article.
As a second option, you may check the real status code that these external resources return. A resource may be open and visible in your browser, yet the response code may still be 403 Forbidden. I recommend checking the actual response status code in your browser via the Developer Tools.
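Another way to see the real status code outside the browser is a short script. The sketch below uses Python's standard library; the `real_status` helper is a name invented for this example. One common cause of the 403 discrepancy is that some servers reject requests that lack a browser-like User-Agent header, which is why the function lets you set one.

```python
import urllib.error
import urllib.request


def real_status(url: str, user_agent: str = "Mozilla/5.0") -> int:
    """Return the HTTP status code the server actually sends for `url`."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # server responded, but with an error status (e.g. 403)
```

If `real_status(url)` returns 200 with a browser-like User-Agent but a crawler still sees 403, the server is most likely filtering requests by client headers or request rate rather than the page being genuinely broken.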