Is your feature request related to a problem? Please describe.
By default, Apify ignores URL fragments when computing URL uniqueness. This means http://www.example.com#foo and http://www.example.com#bar are considered equal. These URLs are skipped once http://www.example.com is crawled. This makes sense for many websites because URL fragments often link to sections within the same HTML page.
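As a rough illustration of the behavior described above, here is a minimal sketch of fragment-insensitive de-duplication. It is not Apify's actual implementation; `uniqueKeyIgnoringFragment` and `dedupe` are hypothetical names used only for this example, and the sketch assumes a runtime with the standard global `URL` class.

```typescript
// Hypothetical helper: approximates a uniqueness key that ignores the fragment.
// This is an illustration only, not Apify's real key computation.
function uniqueKeyIgnoringFragment(rawUrl: string): string {
    const url = new URL(rawUrl);
    url.hash = ''; // drop "#foo", "#bar", and so on
    return url.href;
}

// Keeps the first URL seen for each uniqueness key and skips the rest.
function dedupe(urls: string[]): string[] {
    const seen = new Set<string>();
    const kept: string[] = [];
    for (const candidate of urls) {
        const key = uniqueKeyIgnoringFragment(candidate);
        if (!seen.has(key)) {
            seen.add(key);
            kept.push(candidate);
        }
    }
    return kept;
}

const queued = dedupe([
    'http://www.example.com',
    'http://www.example.com#foo',
    'http://www.example.com#bar',
]);

// Only the first URL survives; the two fragment variants share its key.
console.log(queued);
```

Under this model, a site that serves a different page per fragment ends up with a single queued URL, which matches the symptom reported here.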
We found an internal website whose URL fragment links load different HTML pages. Running ai-scan --crawl against it produced logs showing that many links were discovered, but the crawler exited without crawling them.
Specifying explicit input URLs doesn't work around the problem. We still strip URL fragments from input URLs.
Describe the solution you'd like
Clients of the service and the accessibility-insights-scan package should be able to control whether Apify includes URL fragments in its uniqueness check. We may be able to leverage the keepUrlFragment argument in Apify.
Clients who use URL fragments to link to sections of a page (like we do in https://accessibilityinsights.io/) would leave the option off, to avoid scanning the same UI more than once.
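One way the opt-in could look, sketched under assumptions: only the `keepUrlFragment` name comes from this issue. The `ScanOptions` and `CrawlRequest` shapes and the `toCrawlRequests`, `uniqueKey`, and `countUnique` helpers are hypothetical, and the uniqueness check is modeled locally rather than delegated to Apify.

```typescript
// Hypothetical option surface for the scan package or service client.
interface ScanOptions {
    keepUrlFragment?: boolean; // off by default, matching today's behavior
}

// Hypothetical request shape carrying the flag through to the crawler.
interface CrawlRequest {
    url: string;
    keepUrlFragment: boolean;
}

// Applies the client's choice to every input or discovered URL.
function toCrawlRequests(urls: string[], options: ScanOptions = {}): CrawlRequest[] {
    const keepUrlFragment = options.keepUrlFragment ?? false;
    return urls.map((url) => ({ url, keepUrlFragment }));
}

// Local model of the uniqueness check: the fragment counts only when opted in.
function uniqueKey(request: CrawlRequest): string {
    const url = new URL(request.url);
    if (!request.keepUrlFragment) {
        url.hash = '';
    }
    return url.href;
}

function countUnique(requests: CrawlRequest[]): number {
    return new Set(requests.map(uniqueKey)).size;
}

const inputUrls = ['http://www.example.com#foo', 'http://www.example.com#bar'];

// Default: both URLs collapse into one page to scan.
const defaultCount = countUnique(toCrawlRequests(inputUrls));

// Opt-in: each fragment is treated as its own page.
const optInCount = countUnique(toCrawlRequests(inputUrls, { keepUrlFragment: true }));

console.log({ defaultCount, optInCount });
```

Keeping the default at `false` means existing clients, including sites that use fragments only as in-page anchors, see no change unless they opt in.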
This issue has been marked as ready for team triage; we will triage it in our weekly review and update the issue. Thank you for contributing to Accessibility Insights!