+++
title = "Anubis is a joke"
date = 2025-04-16
description = "an easily bypassable one, and not actually protecting your site (against anything other than really low effort scrapes)"
+++

Over the past few months, a lot of people have turned to Anubis by Xe Iaso to try to protect
their sites, primarily Git forges and alternative frontends, against AI scraping.

Anubis is a new proof-of-work (PoW) captcha "solution" that (allegedly) keeps scrapers out by
slowing down your browsing and forcing you to enable JavaScript to pass a challenge before you can
view the site. Once it's wasted a few seconds of your time and made you reevaluate the worth of
whatever you were visiting, the stupid anime girl (previously AI generated) it shows you gives a
smile and you're on your way. The challenge only works on Chromium and its Google-funded controlled
opposition, Firefox. Basilisk does seem to work, though with broken CSS. It doesn't even work on
Safari (allegedly; I don't own an iToy to test this with), and no other browser works with it
(until you read the next section).

There's one small problem with Anubis though. By default (and no installation I've checked changes
this), Anubis will only present a challenge to User-Agents containing "Mozilla" and to some obvious
scraper agents, at the time of writing. You can check this in /data/botPolicies.json.
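
For reference, the rules in that file look roughly like the sketch below. This is paraphrased from
memory, so treat the exact field names as an assumption and check the real file yourself;
"some-known-scraper" is just a placeholder. The point is that the only thing standing between a
scraper and your site is a User-Agent regex, and anything that matches no rule sails straight
through.

```json
{
  "bots": [
    {
      "name": "generic-browser",
      "user_agent_regex": "Mozilla",
      "action": "CHALLENGE"
    },
    {
      "name": "some-known-scraper",
      "user_agent_regex": "SomeScraperBot",
      "action": "DENY"
    }
  ]
}
```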

This means that all one of those evil scrapers Anubis is supposed to protect against has to do to
bypass Anubis is not use one of those User-Agents. It also means that you can completely bypass it
too, since I know it's been annoying a lot of people lately. You can curl a site running the
default config (which is most of them) and it won't give you an Anubis challenge; it'll just show
you the site in its original form. No special options, no custom User-Agent, just
curl http://domain.name and it'll let you through. This applies to your normal browser as well:
just give it a user agent that doesn't contain "Mozilla" or any of the other terms in the file and
you won't have any problems.
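
If you'd rather see it than take my word for it, here's a minimal Python sketch of the same trick.
example.com is a stand-in for whatever Anubis-fronted site you're poking at, and the "does the body
mention Anubis" check is only a crude heuristic for spotting the challenge page, not anything
official:

```python
import urllib.error
import urllib.request

# example.com is a placeholder; point this at an Anubis-fronted site you're testing.
URL = "https://example.com/"

def fetch(url: str, user_agent: str) -> None:
    """Request the page with a given User-Agent and report whether the challenge shows up."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # The challenge interstitial may come back with a non-2xx status; keep its body anyway.
        status, body = err.code, err.read()
    # Crude heuristic: the challenge page mentions Anubis, the real site probably doesn't.
    challenged = b"anubis" in body.lower()
    print(f"{user_agent!r}: HTTP {status}, looks like a challenge page: {challenged}")

# A browser-style UA (contains "Mozilla") should be served the JavaScript challenge.
fetch(URL, "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0")

# Anything without "Mozilla" or the other flagged keywords should get the site directly.
fetch(URL, "definitely-not-a-browser/1.0")
```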

I was expecting a much more involved workaround for dealing with this piece of shit, but no, all
you have to do is give it a UA that doesn't contain certain keywords.