- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model
Reosurces::
import fasttext # --> +10 MB
fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB
Suggested-by: @dalf
- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
py3-numpy from Alpines package manager (apk)
- Alpine Linux on docker-images (musl libc) do not support fasttext-wheel (gnu
libc) --> patch Dockerfile and build from fastetxt:
sed -i s/fasttext-wheel/fasttext/ requirements.txt
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
settings.yml:
* The default URL was unix:///usr/local/searxng-redis/run/redis.sock?db=0
* The default URL is now "false"
The default URL makes the log difficult to deal with:
if the admin didn't install a Redis instance, the logs record a false error.
It worked before because SearXNG initialized the Redis connection when the limiter started.
In this commit, SearXNG initializes Redis in searx/webapp.py
so various components can use Redis without taking care of the initialization step.
no_result_for_http_status contains a list of HTTP status.
These HTTP status are seen an empty result list.
In other cases an exception is thrown as usual.
Previously raise_for_httperror were ignoring all HTTP error,
which make defective engines invisible in the stats.
By using new property `qwant_categ:` the category of qwant is no longer bound to
the category of SearXNG.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Neeva is "the world's first ad-free, private search engine" and uses data from Apple, Bing, Yelp and "others".
They claim to crawl "hundreds of millions" of URLs a day (https://twitter.com/Neeva/status/1536447373903335426).
This implements the Deepl Translation engine. It works nearly like lingva but
directly to the deepl API. This api only needs a to-lang, from-lang is a fake
by now.
There is a free option to use [1].
[1] https://www.deepl.com/pro-api?cta=header-pro-api for registering a free account.
yep.com is still in beta, the api.yep.com does not have paging support. There
is only a 'limit' argument with a maximum of 100 results.
yep.com seems fast; there is nor need for a timeout of 12 sec.
The API returns JSON nevertheless what the HTTP header is, the "show more"
button on yep.com's web site does not set a special HTTP Accept header.
FYI: The index does not support languages, the WEB UI does not offer a language
selection of the results and the entire index seems in English.
Closes: https://github.com/searxng/searxng/issues/1619
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Most engines that support languages (and regions) use the Accept-Language from
the WEB browser to build a response that fits to the language (and region).
- add new engine option: send_accept_language_header
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The engine name is not only a *name* its also a identifier that is used in
logs, HTTP headers and more. Unicode characters in the name of an engine could
cause various issues.
Closes: https://github.com/searxng/searxng/issues/1544
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Emojipedia is an emoji reference website which documents the meaning and
common usage of emoji characters in the Unicode Standard. It is owned by Zedge
since 2021. Emojipedia is a voting member of The Unicode Consortium.[1]
Cherry picked from @james-still [2[3] and slightly modified to fit SearXNG's
quality gates.
[1] https://en.wikipedia.org/wiki/Emojipedia
[2] 2fc01eb20f
[3] https://github.com/searx/searx/pull/3278
Add a new setting: general.donation_url
By default the value is https://docs.searxng.org/donate.html
When the value is false, the link is hidden
When the value is true, the link goes to the infopage donation,
the administrator can create a custom page.
Since PR 932 [1][2] static files can't be delivered by HTTP server any longer.
This patch makes the hash paramter in the URL of static files:
/static/themes/simple/css/searxng.min.css?5fde34a74bc438c7b56ec8c6501e131cc9914bd8
optional. By default the hash parameter is disabled.
HINT:
Instances that do not deliver static files by their HTTP server and have a
long expire time [3] should enable this option.
----
This is only a interim solution, on the long run:
make static.build.commit
creates files including the file name:
css/searxng-5fde34a74bc438c7b56ec8c6501e131cc9914bd8.min.css
and a mapping.json with this content[4]
[1] https://github.com/searxng/searxng/issues/964
[2] https://github.com/searxng/searxng/pull/932#issuecomment-1067039518
[3] 5583336440
[4] https://github.com/searxng/searxng/pull/932#issuecomment-1067216426
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Two different threads ( = two different user queries) can call the request
function in a row and then the response function. The namespace will be same
since this is the same engine.
To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
can replace filtron:
* rate limite the number of request per IP and per (IP, User-Agent)
* block some bots
use Redis
data stored in Redis never contains the IP addresses, only HMAC using the secret_key
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>