recoll is a local search engine based on Xapian:
http://www.lesbonscomptes.com/recoll/
By itself recoll does not offer web or API access,
this can be achieved using recoll-webui:
https://framagit.org/medoc92/recollwebui.git
This engine uses a custom 'files' result template
set `base_url` to the location where recoll-webui can be reached
set `dl_prefix` to a location where the file hierarchy as indexed by recoll can be reached
set `search_dir` to the part of the indexed file hierarchy to be searched, use an empty string to search the entire search domain
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines migth have
their own cache.
Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy as True.
Requires Tor and updating packages.
To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for Tor's (or whatever proxy used) extra
time.
A new "base" engine called command is introduced. It is the foundation for all command line engines for now.
You can use this engine to create your own command line engine.
Add some engines (commented out to make sure no one enables anything accidentally):
* git grep: This engine lets you grep in the searx repo.
* locate: If locate is installed and initialized, you can search on the FS.
* find: You can find files with a specific name from where you started searx.
* pattern search in files: This engine utilizes the command fgrep.
* regex search in files: This engine runs `grep` to find a file based on its contents.
Sending queries through POST, while better for privacy, breaks functionality
with certain extensions (e.g. Firefox containers). Since Firefox does
not send cookies when requesting `/opensearch.xml`, users cannot easily
switch to GET on the client side unless they make a custom search
engine. This commit allows admins to modify the default method on their
side so they can set it to GET if needed.
- enabling HTTPS for sci-hub.tw by default
- making sci-hub the default DOI resolver as it has the largest collection of scientific articles.
- replaced doai.io with dissem.in, as it redirects to this new domain.
Co-authored-by: Aurora of Earth <auroraofearth@ya.ru>