- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model
Reosurces::
import fasttext # --> +10 MB
fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB
Suggested-by: @dalf
- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
py3-numpy from Alpines package manager (apk)
- Alpine Linux on docker-images (musl libc) do not support fasttext-wheel (gnu
libc) --> patch Dockerfile and build from fastetxt:
sed -i s/fasttext-wheel/fasttext/ requirements.txt
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Currentty, when oa_doi_rewrite find a DOI in the result URL, it replace the URL.
In this commit, the plugin adds the key "doi" to the result,
so the paper.html can show it.
Only raise "suspicious Accept-Encoding" when both "gzip" and "deflate" are missing from Accept-Encoding.
Prevent Browsers which only implement one compression solution from being blocked by the limiter plugin.
Example Browser which is currently blocked: Lynx Browser (https://lynx.invisible-island.net)
This is a rewrite of the hostname_replace.py that:
- don't stop to replace URL in fields ('data_src', 'audio_src') if there isn't a
'parsed_url',
- adds a comment about keep or remove a result from the result list
- adds a loop over ['data_src', 'audio_src'] instead of doubling code lines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Embedded HTML breaks SearXNG architecture. To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
To test you need to redirect embeded videos (e.g.) from youtube to a invidios
instance. Search for videos using engine `!youtube lebowski`. The result URLs
and the embeded videos should link to the invidios instance.
Here is an example of such a `hostname_replace` configuration::
hostname_replace:
# youtube --> Invidious
'(.*\.)?youtube-nocookie\.com': 'invidio.xamh.de'
'(.*\.)?youtube\.com$': 'invidio.xamh.de'
'(.*\.)?invidious\.snopyta\.org$': 'invidio.xamh.de'
'(.*\.)?vid\.puffyan\.us': 'invidio.xamh.de'
'(.*\.)?invidious\.kavin\.rocks$': 'invidio.xamh.de'
'(.*\.)?inv\.riverside\.rocks$': 'invidio.xamh.de'
Closes: https://github.com/searxng/searxng/issues/873
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
can replace filtron:
* rate limite the number of request per IP and per (IP, User-Agent)
* block some bots
use Redis
data stored in Redis never contains the IP addresses, only HMAC using the secret_key
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
Disable the python code formatting from python-black, where the readability of
code suffers by formatting.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Instead of a hard-coded `oadoi.org` default, use the default value from
`settings.yml`.
Fix an issue in the themes: The replacement 'current_doi_resolver' contains the
doi_resolver_url, not the name of the DOI resolver. Compare return value of::
searx.plugins.oa_doi_rewrite.get_doi_resolver(...)
Fix a typo in `get_doi_resolver(..)`: suggested by @kvch:
*L32 should set doi_resolver not doi_resolvers*
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
use
from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url # NOQA
so it is possible to easily remove all unused import using autoflake:
autoflake --in-place --recursive --remove-all-unused-imports searx tests