Bangs with a `*` suffix (e.g. `!!d*`) overwrite Bangs with the same
prefix (e.g. `!!d`) [1]. This can be avoided when a non-printable character is
used to tag a LEAF_KEY.
[1] https://github.com/searxng/searxng/pull/740#issuecomment-1010411888
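A minimal sketch of the idea, not the actual SearXNG code, assuming the bangs are stored in a character trie where the value of a bang sits under a reserved leaf key. The helper names `build_trie` and `lookup`, the example URLs and the choice of `chr(16)` as the non-printable character are assumptions made for illustration.

```python
# Assumption: a non-printable character that cannot occur in a typed bang.
LEAF_KEY = chr(16)


def build_trie(bangs):
    """Build a character trie; each bang's value is stored under LEAF_KEY."""
    trie = {}
    for bang, url in bangs.items():
        node = trie
        for char in bang:
            node = node.setdefault(char, {})
        # The leaf tag cannot clash with a printable character such as '*'.
        node[LEAF_KEY] = url
    return trie


def lookup(trie, bang):
    """Return the value stored for exactly this bang, or None."""
    node = trie
    for char in bang:
        node = node.get(char)
        if node is None:
            return None
    return node.get(LEAF_KEY)


bangs = {
    "d": "https://d.example/?q={q}",       # hypothetical target
    "d*": "https://dstar.example/?q={q}",  # hypothetical target
}
trie = build_trie(bangs)
```

With a printable leaf tag such as `*`, the entry for `d*` would share a key with the leaf of `d`. With a non-printable tag both entries coexist.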
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
There is an issue with redis v4.1.0 [1]; for the interim, let's remove this
Python dependency.
[1] https://github.com/searxng/searxng/issues/741
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
An ambiguous bang like `!!d` raises an exception in function get_bang_url(). A
bang is only unique when the bang_definition from get_bang_definition_and_ac() is
a string; for an ambiguous bang the returned bang_definition is a dictionary.
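A hedged sketch of the guard described above, not the actual code of get_bang_url(): the URL is only built when the definition is a string. The helpers `get_definition` and `bang_url`, the trie layout, the `{q}` placeholder and the example URLs are hypothetical.

```python
LEAF_KEY = chr(16)  # assumption: non-printable tag marking a leaf


def get_definition(trie, bang):
    """Return a string for a unique bang, a dict for an ambiguous one."""
    node = trie
    for char in bang:
        if char not in node:
            return None
        node = node[char]
    # Unique only when nothing but the leaf hangs below this prefix.
    if set(node) == {LEAF_KEY}:
        return node[LEAF_KEY]
    return node


def bang_url(trie, bang, query):
    """Build the redirect URL, or return None for unknown/ambiguous bangs."""
    definition = get_definition(trie, bang)
    if not isinstance(definition, str):
        # A dict (ambiguous) or None (unknown): do not try to format it.
        return None
    return definition.replace("{q}", query)


trie = {
    "d": {
        LEAF_KEY: "https://d.example/?q={q}",
        "d": {LEAF_KEY: "https://dd.example/?q={q}"},
    }
}
```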
Reported-by: user prg at #searxng:matrix.org on 2022/01/11
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In case of a CAPTCHA, raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails, raise a SearxEngineResponseException and suspend for 7
days.
[1] https://github.com/searxng/searxng/pull/695
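The two failure paths above can be sketched as follows. To keep the example self-contained it uses stand-in exception classes (`EngineResponseError`, `EngineCaptchaError`) instead of the SearXNG exceptions, and the function `check_response` with its parameters is hypothetical.

```python
from typing import Optional

SUSPEND_SECONDS = 7 * 24 * 3600  # 7 days


class EngineResponseError(Exception):
    """Stand-in for a response error that carries a suspend time."""

    def __init__(self, message: str, suspended_time: int):
        super().__init__(message)
        self.suspended_time = suspended_time


class EngineCaptchaError(EngineResponseError):
    """Stand-in for the CAPTCHA-specific error."""


def check_response(captcha_detected: bool, sc_code: Optional[str]) -> str:
    """Raise with a 7 day suspend time on CAPTCHA or a missing sc code."""
    if captcha_detected:
        raise EngineCaptchaError("CAPTCHA detected", SUSPEND_SECONDS)
    if not sc_code:
        raise EngineResponseError("could not get the sc code", SUSPEND_SECONDS)
    return sc_code
```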
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:
1. some arguments have been removed and a new `sc` argument has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed
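The three changes above amount to a plain GET URL. A minimal sketch, assuming the host `https://www.startpage.com` and a query argument named `query`; both are assumptions, only the `sp/search` path and the `sc` argument come from the list above. The function name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: host name; the path sp/search is taken from the list above.
SEARCH_URL = "https://www.startpage.com/sp/search"


def build_search_url(query: str, sc_code: str) -> str:
    """Build a GET URL that carries the query and the new `sc` argument."""
    args = {
        "query": query,  # assumption: name of the query argument
        "sc": sc_code,
    }
    return SEARCH_URL + "?" + urlencode(args)
```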
Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
api.openverse.engineering is a little picky and wants to have a trailing slash
in the path:
/v1/images? --> /v1/images/?
otherwise it redirects, here is the debug log:
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images?&page=1&page_size=20&format=json&q=foo "HTTP/2 301 Moved Permanently" (text/html; charset=utf-8)
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images/?&page=1&page_size=20&format=json&q=foo "HTTP/2 200 OK" (application/json)
WARNING searx.engines.openverse : ErrorContext('searx/search/processors/online.py', 105, 'count_error(', None, '1 redirects, maximum: 0', ('200', 'OK', 'api.openverse.engineering')) True
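A small sketch of building the request URL with the trailing slash already in the path, so that no redirect is triggered. The arguments mirror those in the debug log above; the function name `build_request_url` and its defaults are assumptions.

```python
from urllib.parse import urlencode

# Note the trailing slash after "images".
BASE_URL = "https://api.openverse.engineering/v1/images/"


def build_request_url(query: str, page: int = 1, page_size: int = 20) -> str:
    """Return the search URL; the path ends with '/' before the '?'."""
    args = {
        "page": page,
        "page_size": page_size,
        "format": "json",
        "q": query,
    }
    return BASE_URL + "?" + urlencode(args)
```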
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The implementation of the etools engine is poor. No date-range support, no
language support and it is broken by a CAPTCHA.
etools is a metasearch engine; the major search engines it supports (Google,
Bing, Wikipedia, Yahoo) are already available in SearXNG.
While etools does support several engines we currently don't support directly,
support for them should be added directly to SearXNG if there is demand.
In practice: in SearXNG the worse etools results will be mixed with good results
from other engines we have (as long as there is no CAPTCHA).
In the best case, what we gain from etools is, e.g., results from de.ask.com for
a query in German; in all other cases, worse results bubble up in SearXNG's
result list.
[1] https://github.com/searxng/searxng/issues/696#issuecomment-1005855499
Closes: https://github.com/searxng/searxng/issues/696
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The previous implementation used two hash sets and a list.
That's not necessary: a single hash map suffices.
It's also less error-prone, because the previous data structure
allowed a setting to be enabled and disabled at the same time.
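A minimal sketch of the single-map idea, not the actual implementation: each choice maps to one boolean, so it cannot be enabled and disabled at the same time. The class name `SwitchableSetting` and its methods are hypothetical.

```python
from typing import Dict, List


class SwitchableSetting:
    """Keep the state of every choice in one dict: name -> enabled."""

    def __init__(self, choices: Dict[str, bool]):
        self.choices = dict(choices)

    def set_enabled(self, name: str, enabled: bool) -> None:
        """Switch a known choice on or off."""
        if name not in self.choices:
            raise KeyError(name)
        self.choices[name] = enabled

    @property
    def enabled(self) -> List[str]:
        return [name for name, on in self.choices.items() if on]

    @property
    def disabled(self) -> List[str]:
        return [name for name, on in self.choices.items() if not on]


plugins = SwitchableSetting({"tracker_remover": True, "vim_hotkeys": False})
plugins.set_enabled("vim_hotkeys", True)
```

Because the enabled and disabled views are both derived from the same map, they are disjoint by construction.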