A blocklist and a passlist can be configured in /etc/searxng/limiter.toml::
[botdetection.ip_lists]
pass_ip = [
'51.15.252.168', # IPv4 of check.searx.space
]
block_ip = [
'93.184.216.34', # IPv4 of example.org
]
Closes: https://github.com/searxng/searxng/issues/2127
Closes: https://github.com/searxng/searxng/pull/2129
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In my tests I see bots rotating IPs (with endless IP lists). If such a bot has
100 IPs and has three attempts (SUSPICIOUS_IP_MAX = 3) then it can successfully
send up to 300 requests in one day while rotating the IP. To block the bots for
a longer period of time the SUSPICIOUS_IP_WINDOW, as the time period in which an
IP is observed, must be increased.
For normal WEB-browsers this is no problem, because the SUSPICIOUS_IP_WINDOW is
deleted as soon as the CSS with the token is loaded.
SUSPICIOUS_IP_WINDOW = 3600 * 24 * 30
Time (sec) before sliding window for one suspicious IP expires.
SUSPICIOUS_IP_MAX = 3
Maximum requests from one suspicious IP in the :py:obj:`SUSPICIOUS_IP_WINDOW`."""
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
For correct determination of the IP to the request the function
botdetection.get_real_ip() is implemented. This fonction is used in the
ip_limit and link_token method of the botdetection and it is used in the
self_info plugin.
A documentation about the X-Forwarded-For header has been added.
[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566211059
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- counting requests in LONG_WINDOW and BURST_WINDOW is not needed when the
request is validated by the link_token method [1]
- renew a ping-key on validation [2], this is needed for infinite scrolling,
where no new token (CSS) is loaded. / this does not fix the BURST_MAX issue in
the vanilla limiter
- normalize the counter names of the ip_limit method to 'ip_limit.*'
- just integrate the ip_limit method straight forward in the limiter plugin /
non intermediate code --> ip_limit now returns None or a werkzeug.Response
object that can be passed by the plugin to the flask application / non
intermediate code that returns a tuple
[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566113277
[2] https://github.com/searxng/searxng/pull/2357#discussion_r1208542206
[3] https://github.com/searxng/searxng/pull/2357#issuecomment-1566125979
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
To intercept bots that get their IPs from a range of IPs, there is a
``SUSPICIOUS_IP_WINDOW``. In this window the suspicious IPs are stored for a
longer time. IPs stored in this sliding window have a maximum of
``SUSPICIOUS_IP_MAX`` accesses before they are blocked. As soon as the IP makes
a request that is not suspicious, the sliding window for this IP is droped.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
To activate the ``link_token`` method in the ``ip_limit`` method add the
following to your ``/etc/searxng/limiter.toml``::
[botdetection.ip_limit]
link_token = true
Related: https://github.com/searxng/searxng/pull/2357#issuecomment-1554116941
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.
This patch does not contain functional change, except it fixes issue #2455
----
Aktivate limiter in the settings.yml and simulate a bot request by::
curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
-H 'Accept: text/html'
-H 'User-Agent: xyz' \
-H 'Accept-Encoding: gzip' \
'http://127.0.0.1:8888/search?q=foo'
In the LOG:
DEBUG searx.botdetection.link_token : missing ping for this request: .....
Since ``BURST_MAX_SUSPICIOUS = 2`` you can repeat the query above two time
before you get a "Too Many Requests" response.
Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>