recoll is a local search engine based on Xapian:
http://www.lesbonscomptes.com/recoll/
By itself, recoll does not offer web or API access;
this can be achieved using recoll-webui:
https://framagit.org/medoc92/recollwebui.git
This engine uses a custom 'files' result template.
Set `base_url` to the location where recoll-webui can be reached.
Set `dl_prefix` to a location where the file hierarchy as indexed by recoll can be reached.
Set `search_dir` to the part of the indexed file hierarchy to be searched; use an empty string to search the entire search domain.
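A minimal sketch of how the three settings could fit together. The host names, the `json` endpoint and the `query`/`page`/`dir` parameter names are assumptions made for illustration, not taken from recoll-webui:

```python
from urllib.parse import quote, urlencode

# Hypothetical values; adjust them to the local recoll-webui deployment.
base_url = "https://recoll.example.org/"
dl_prefix = "https://download.example.org/"
search_dir = ""  # empty string: search the entire indexed hierarchy


def build_search_url(query, page=1):
    """Build a query URL below base_url (endpoint and parameter names are assumed)."""
    params = {"query": query, "page": page, "dir": search_dir}
    return base_url + "json?" + urlencode(params)


def build_download_url(relative_path):
    """Map a path inside the indexed hierarchy to a URL below dl_prefix."""
    return dl_prefix + quote(relative_path.lstrip("/"))
```

Keeping `base_url` and `dl_prefix` separate allows the search UI and the indexed files to be served from different hosts.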
This makes it easier to separately handle search and index requests
from a web server or from a reverse proxy.
If a request to index contains a query, a permanent redirect HTTP response
is returned. This should give some level of backwards compatibility
for users that have set a searx instance in their browser's search bar.
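The redirect rule can be sketched without any web framework. The function name, the `q` parameter, the `/search` path and the 308 status code are assumptions chosen for illustration; the text above only states that a permanent redirect is returned:

```python
from urllib.parse import urlencode


def route_index(args):
    """Return (status, headers) for a request to the index page.

    args is a dict of request parameters. If it carries a query, answer
    with a permanent redirect to the search endpoint, otherwise render
    the index page as usual.
    """
    if args.get("q"):
        # 308 keeps the request method; 301 would be the other
        # permanent redirect status.
        return 308, {"Location": "/search?" + urlencode(args)}
    return 200, {}
```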
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines might have
their own cache.
Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy to True.
Requires Tor and updating packages.
To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for the extra time needed by Tor (or
whatever proxy is used).
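As a rough illustration of the intent, the extra time is added once on top of each engine's own timeout. The function below is a hypothetical helper, not the actual searx code:

```python
def effective_timeout(engine_timeout, using_tor_proxy, extra_proxy_timeout=0.0):
    """Return the timeout in seconds to use for one engine request.

    extra_proxy_timeout only applies when requests go through the proxy.
    """
    if using_tor_proxy:
        return engine_timeout + extra_proxy_timeout
    return engine_timeout
```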
When the user adds searx as a search engine, the browser loads the /opensearch.xml URL without the cookies.
Without the query parameters, the user preferences are ignored (method and autocomplete).
In addition, opensearch.xml is modified to support automatic updates,
see https://developer.mozilla.org/en-US/docs/Web/OpenSearch
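A small sketch of carrying the two preferences in the URL instead of in cookies. The helper name and the exact parameter names are assumptions derived from the preferences named above:

```python
from urllib.parse import urlencode


def opensearch_url(base_url, method, autocomplete):
    """Build an /opensearch.xml URL that carries the user preferences."""
    params = {"method": method, "autocomplete": autocomplete}
    return base_url.rstrip("/") + "/opensearch.xml?" + urlencode(params)
```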
A new "base" engine called command is introduced. It is the foundation for all command line engines for now.
You can use this engine to create your own command line engine.
Add some engines (commented out to make sure no one enables anything accidentally):
* git grep: This engine lets you grep in the searx repo.
* locate: If locate is installed and initialized, you can search on the FS.
* find: You can find files with a specific name from where you started searx.
* pattern search in files: This engine utilizes the command fgrep.
* regex search in files: This engine runs `grep` to find a file based on its contents.
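The general idea behind a command line engine can be sketched as follows. This is not the engine's actual code: the `{{QUERY}}` placeholder, the function name and the timeout default are assumptions for illustration.

```python
import subprocess
import sys

QUERY_PLACEHOLDER = "{{QUERY}}"  # assumed placeholder name


def run_command_engine(command, query, timeout=4.0):
    """Run a command with the query substituted in and return its output lines.

    The command is passed as an argument list, so no shell is involved
    and the query is never interpreted by a shell.
    """
    argv = [part.replace(QUERY_PLACEHOLDER, query) for part in command]
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
    # Each non-empty output line becomes one result.
    return [line for line in completed.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    # Portable demo: use the Python interpreter itself as the "command".
    demo = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", QUERY_PLACEHOLDER]
    print(run_command_engine(demo, "searx"))
```

Since these engines execute local programs with user input, keeping them commented out by default is the safer choice.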
Sending query params over GET seems to be the only way to be able to
enable autocomplete in the browser. This commit adds the necessary URL
formatting to opensearch.xml. In order to identify queries coming from
the URL bar (rather than an AJAX request), which requires a different
JSON format and MIME type, the request headers are checked for
"X-Requested-With: XMLHttpRequest", which jQuery adds to its requests.
Inline styles are blocked by default with Content Security Policy (CSP). Move
the remaining inline styles to CSS and correct the HTML template of the oscar
preference page.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
A *brand* of searx is a fork which might have its own design and some special
functions which might be reasonable in a special context.
In this sense, the fork might have its own documentation but not its own issue
tracker. The *upstream* of a brand is always https://github.com/asciimoo from
where the brand-fork pulls the master branch regularly. A fork which has its
own issue tracker is a spin-off and out of the scope of the searx project
itself. The conclusion is:
- hard code ISSUE_URL (in the Makefile)
- always refer to DOCS_URL
- links in the about page refer to the *upstream* (searx project)
except DOCS_URL
- "fork me on github" ribbons refer to the *upstream*
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
We have some variables in the build environment which are also needed in the
templating process. These variables are relevant if one creates a fork with
its own branding. We treat these variables under the term 'brands'.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
When results are fetched from any programming related documentation site
(like git-scm.com, docs.python.org etc), content in Info box is shown as
raw HTML code.
This change addresses the issue by using the "safe" filter provided by
Django. See
- https://docs.djangoproject.com/en/3.0/ref/templates/builtins/#safe
- Searx issue tracker (issue #1649), for more information.
Resolves: #1649
On narrow devices such as phones and tablets, the info box is placed at
the bottom of the page.
This change addresses the issue by rearranging the column grid for
narrow devices and moving the sidebar to the top of the page. See
- https://getbootstrap.com/docs/3.3/css/#grid-column-ordering.
- and Searx issue tracker (issue #1777), for more information.
Effect: Along with the Info box, the Suggestion and Link boxes also move
to the top of the page.
Resolves: #1777
Adding a CR in some files but not in others is a good starting point for a
DOS+Unix mess we all have already seen in many projects.
Patch fixes all files matching (even those coming from grunt's build)::

    find ./searx -exec file {} \; | grep CR

BTW: Same with mixing TAB and SPACE indent styles in one and the same file. So
if sources are touched here in this patch, that is also fixed.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Add image format and source information to display - needs changes to engines to actually display something.
Displays result.source (website from which the image was taken) and result.img_format (image type and size).
Result is styled with result-format and result-source classes. See PR #1566 for an example of an engine which has the necessary changes.
Strip <span class="highlight">...</span> in the oscar image template.
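For illustration, the effect of stripping the highlight markup can be sketched with a regular expression. The actual change lives in the template; the helper below is hypothetical:

```python
import re

# Matches the highlight wrapper and captures the text inside it.
HIGHLIGHT_SPAN = re.compile(r'<span class="highlight">(.*?)</span>', re.DOTALL)


def strip_highlight(text):
    """Remove the highlight wrapper but keep the highlighted text."""
    return HIGHLIGHT_SPAN.sub(r"\1", text)
```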
The new URL parameter "timeout_limit" sets the timeout limit in seconds.
Example: "timeout_limit=1.5" means the timeout limit is 1.5 seconds.
In addition, the query can start with <[number] to set the timeout limit.
For a number between 0 and 99, the unit is the second:
Example: "<30 searx" means the timeout limit is 30 seconds.
For a number of 100 or above, the unit is the millisecond:
Example: "<850 searx" means the timeout limit is 850 milliseconds.
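The prefix rule can be sketched as below. This is a simplified illustration rather than the actual parser; the function name is hypothetical, and treating exactly 100 as milliseconds is an assumption:

```python
def parse_timeout_prefix(query):
    """Split a leading <[number] off the query.

    Returns (timeout_in_seconds or None, remaining_query).
    """
    parts = query.split(None, 1)
    if not parts or not parts[0].startswith("<"):
        return None, query
    number = parts[0][1:]
    if not number.isdecimal():
        return None, query
    value = int(number)
    # Below 100 the unit is the second; from 100 on (assumed to include
    # 100 itself) the unit is the millisecond.
    timeout = float(value) if value < 100 else value / 1000.0
    rest = parts[1] if len(parts) > 1 else ""
    return timeout, rest
```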
In addition, there is a new optional setting: outgoing.max_request_timeout.
If not set, the user timeout can't go above the searx configuration (as before: the max timeout of the selected engines for a query).
If the value is set, the user can set a timeout between 0 and max_request_timeout using
<[number] or the timeout_limit query parameter.
Related to #1077
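The interaction between the user timeout and outgoing.max_request_timeout can be sketched like this. The helper and its argument names are hypothetical, and the fallback when the user requests nothing is an assumption:

```python
def effective_timeout_limit(user_timeout, engine_max_timeout, max_request_timeout=None):
    """Clamp the timeout requested by the user, in seconds.

    Without max_request_timeout the ceiling is the max timeout of the
    selected engines; with it, the ceiling is max_request_timeout.
    """
    if user_timeout is None:
        # Assumed: no user request means the engines' own timeout applies.
        return engine_max_timeout
    ceiling = engine_max_timeout if max_request_timeout is None else max_request_timeout
    return max(0.0, min(user_timeout, ceiling))
```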
Updated version of PR #1413 from @isj-privacore
- npm package update
- apply #1226
- implement vim help dialog
- display cookies and search URL with preferences
- allow to enable / disable Open Access DOI rewrite
- add a clear text button on the left of the search button
- implement #1011: the HTML page title is not set when using POST
- remove searx/static/themes/simple/img/loader.gif
- use full width when there are only images as results
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.
Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
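A much simplified sketch of the matching idea follows. It is not the function added to utils: the name, the lookup order and the fallback value are assumptions for illustration.

```python
def match_language_sketch(locale_code, supported, aliases=None, fallback="en-US"):
    """Pick the engine language code that best fits a user-given code."""
    aliases = aliases or {}
    # 1. An alias translates a standard code into the engine's custom code.
    if locale_code in aliases:
        return aliases[locale_code]
    # 2. Exact match.
    if locale_code in supported:
        return locale_code
    # 3. Retry with the language part only ("fr-CA" -> "fr").
    language = locale_code.split("-")[0]
    if language in aliases:
        return aliases[language]
    if language in supported:
        return language
    # 4. Any supported code that shares the language part.
    for code in supported:
        if code.split("-")[0] == language:
            return code
    return fallback
```

For example, with `supported = ["en-US", "fr-FR"]` the code `fr-CA` would resolve to `fr-FR` in this sketch.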
The `filename` parameter of the `url_for` function must not have a leading `/`; otherwise the resulting URL contains a double slash `//`, which throws off searx 0.12.0 with Apache 2.4.25 on Debian and results in missing favicons.
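The effect can be reproduced with plain string handling. The helper below only mimics how a static prefix and a file name are joined; it is a hypothetical stand-in, not Flask's `url_for`, and the `/static/` prefix is an assumption:

```python
def static_url(filename, static_prefix="/static/"):
    """Join a static prefix and a file name the naive way."""
    return static_prefix + filename


def static_url_normalized(filename, static_prefix="/static/"):
    """Same join, but tolerant of a leading slash in filename."""
    return static_prefix + filename.lstrip("/")


if __name__ == "__main__":
    print(static_url("/img/favicon.png"))             # double slash in the path
    print(static_url("img/favicon.png"))              # intended form
    print(static_url_normalized("/img/favicon.png"))  # defensive variant
```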