Commit Graph

367 Commits

Author SHA1 Message Date
frob 80163de195
Merge 3d139086c1 into 0f9694c90b 2024-11-24 02:01:36 +00:00
Markus Heiser b176323e89 [fix] calculator: use locale from UI (not from selected language)
Closes: https://github.com/searxng/searxng/issues/3956
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-28 15:53:57 +01:00
Markus Heiser a3921b5ed7 [mod] add test to check compat.py module
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-26 15:59:42 +02:00
Grant Lanham 3e87354f0e [fix] float operations in calculator plugin
This patch adds an additional *isinstance* check within the ast parser to check
for float along with int, fixing the underlying issue.

Co-Authored: Markus Heiser <markus.heiser@darmarit.de>
2024-10-15 08:10:52 +02:00
Grant Lanham d448def1a6 [refactor] unit tests (continued) - plugins
This commit includes some refactoring in unit tests.  As we test more plugins,
it seems unweildy to include every test class in the test_plugins.py file.  This
patch split apart all of the test plugins to their own respective files,
including the new test_plugin_calculator.py file.
2024-10-15 08:10:52 +02:00
0xhtml 8b6a3f3e11 [enh] engine: mojeek - add language support
Improve region and language detection / all locale

Testing has shown the following behaviour for the different
default and empty values of Mojeeks parameters:

| param    | idx | value  | behaviour                 |
| -------- | --- | ------ | ------------------------- |
| region   |  0  | ''     | detect region based on IP |
| region   |  1  | 'none' | all regions               |
| language |  0  | ''     | all languages             |
2024-10-15 06:37:01 +02:00
Markus Heiser 7ab577a1fb [mod] Revision of the favicon solution
All favicons implementations have been documented and moved to the Python
package:

    searx.favicons

There is a configuration (based on Pydantic) for the favicons and all its
components:

    searx.favicons.config

A solution for caching favicons has been implemented:

    searx.favicon.cache

If the favicon is already in the cache, the returned URL is a data URL [1]
(something like `data:image/png;base64,...`).  By generating a data url from
the FaviconCache, additional HTTP roundtripps via the favicon_proxy are saved:

    favicons.proxy.favicon_url

The favicon proxy service now sets a HTTP header "Cache-Control: max-age=...":

    favicons.proxy.favicon_proxy

The resolvers now also provide the mime type (data, mime):

    searx.favicon.resolvers

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-05 08:18:28 +02:00
Brock Vojkovic e17d7632d0 [feat] add favicons to result urls 2024-10-05 08:18:28 +02:00
Grant Lanham 44a06190bb [refactor] unit tests to utilize paramaterized and break down monolithic tests
- for tests which perform the same arrange/act/assert pattern but with different
  data, the data portion has been moved to the ``paramaterized.expand`` fields

- for monolithic tests which performed multiple arrange/act/asserts,
  they have been broken up into different unit tests.

- when possible, change generic assert statements to more concise
  asserts (i.e. ``assertIsNone``)

This work ultimately is focused on creating smaller and more concise tests.
While paramaterized may make adding new configurations for existing tests
easier, that is just a beneficial side effect.  The main benefit is that smaller
tests are easier to reason about, meaning they are easier to debug when they
start failing.  This improves the developer experience in debugging what went
wrong when refactoring the project.

Total number of tests went from 192 -> 259; or, broke apart larger tests into 69
more concise ones.
2024-10-03 13:20:32 +02:00
Grant Lanham 2a29e16d25 [feat] implement mariadb engine 2024-10-03 13:04:06 +02:00
Grant Lanham 14241e7dac Add paramaterized with example of refactor
reduce test name size

fix imports
2024-09-22 08:03:02 +02:00
Alexander Sulfrian e86c96974d [fix] self_info: request.user_agent is not a str
The user_agent attribute of the Flask request object is an instance of
the werkzeug.user_agent.UserAgent class.

This will fix the following error of the self_info plugin:

> ERROR:searx.plugins.self_info: Exception while calling post_search
> Traceback (most recent call last):
>   File "searx/plugins/__init__.py", line 203, in call
>     ret = getattr(plugin, plugin_type)(*args, **kwargs)
>   File "searx/plugins/self_info.py", line 31, in post_search
>     search.result_container.answers['user-agent'] = {'answer': gettext('Your user-agent is: ') + ua}
> TypeError: can only concatenate str (not "UserAgent") to str
2024-08-30 11:29:34 +02:00
Grant Lanham 5276219b9d Fix tineye engine url, datetime parsing, and minor refactor
Changes made to tineye engine:
1. Importing logging if TYPE_CHECKING is enabled
2. Remove unecessary try-catch around json parsing the response, as this
masked the original error and had no immediate benefit
3. Improve error handling explicitely for status code 422 and 400
upfront, deferring json_parsing only for these status codes and
successful status codes
4. Unit test all new applicable changes to ensure compatability
2024-08-21 08:41:53 +02:00
Markus Heiser 5be55e3309 [fix] unit tests: fix load / unload engines & fix messages
- https://github.com/searxng/searxng/pull/3746#issuecomment-2300965005
- https://github.com/searxng/searxng/issues/2988#issuecomment-2226929084

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-08-21 08:28:13 +02:00
Markus Heiser 2039060b64 [mod] revision of the settings_loader
The intention of this PR is to modernize the settings_loader implementations.
The concept is old (remember, this is partly from 2014), back then we only had
one config file, meanwhile we have had a folder with config files for a very
long time.  Callers can now load a YAML configuration from this folder as
follows ::

    settings_loader.get_yaml_cfg('my-config.yml')

- BTW this is a fix of #3557.

- Further the `existing_filename_or_none` construct dates back to times when
  there was not yet a `pathlib.Path` in all Python versions we supported in the
  past.

- Typehints have been added wherever appropriate

At the same time, this patch should also be downward compatible and not
introduce a new environment variable. The localization of the folder with the
configurations is further based on:

    SEARXNG_SETTINGS_PATH (wich defaults to /etc/searxng/settings.yml)

Which means, the default config folder is `/etc/searxng/`.

ATTENTION: intended functional changes!

 If SEARXNG_SETTINGS_PATH was set and pointed to a not existing file, the
 previous implementation silently loaded the default configuration.  This
 behavior has been changed: if the file or folder does not exist, an
 EnvironmentError exception will be thrown in future.

Closes: https://github.com/searxng/searxng/issues/3557
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 18:10:06 +02:00
Markus Heiser a3500c1efc [fix] tear down TEST_ENGINES after TestBang is proceeded
Engines are loaded into global name `searx.engines.engines` other applications
such as statistics or the histogram use this global variable to search for
values in their own memories, which can lead to key errors as described in

- https://github.com/searxng/searxng/issues/2988

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Closes: https://github.com/searxng/searxng/issues/2988
2024-07-13 17:13:41 +02:00
Markus Heiser d80fcbc635 [fix] unit test_xpath.py: name 'logger' is not defined
Depending on the order in which the unit tests are executed, the python modules
of the engines are initialized (monkey patched) or not. As the order of the
tests is not static, random errors may occur.

To avaoid random `NameError: name 'logger' is not defined` in the unit tests of
the xpath engine, a logger is monkey patched into the xpath py-module.

```
make test.unit
TEST      tests/unit
......EE...................
======================================================================
ERROR: test_response (tests.unit.engines.test_xpath.TestXpathEngine.test_response)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/unit/engines/test_xpath.py", line 60, in test_response
    self.assertEqual(xpath.response(response), [])
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "./searx/engines/xpath.py", line 309, in response
    logger.debug("found %s results", len(results))
    ^^^^^^
NameError: name 'logger' is not defined

======================================================================
ERROR: test_response_results_xpath (tests.unit.engines.test_xpath.TestXpathEngine.test_response_results_xpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/unit/engines/test_xpath.py", line 102, in test_response_results_xpath
    self.assertEqual(xpath.response(response), [])
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "./searx/engines/xpath.py", line 309, in response
    logger.debug("found %s results", len(results))
    ^^^^^^
NameError: name 'logger' is not defined
```

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-28 08:54:19 +02:00
frob 3d139086c1
Merge branch 'searxng:master' into elasticsearch-custom-query 2024-06-23 14:24:06 +02:00
Grant Lanham 9a9ca307fe [fix] implement tests and remove usage of gen_useragent in engines 2024-06-23 11:51:41 +02:00
Richard Lyons 1f908a6222 [fix] engine unit tests.
Enables unit tests in the engines directory by adding __init__.py, and fixups
for the enabled tests.
2024-06-23 09:24:05 +02:00
Richard Lyons 4c80c2458a Check exception context for actual error message. 2024-06-22 14:02:28 +02:00
Richard Lyons 7897ad1d5a Add tests for special characters [ '"]. 2024-06-22 13:30:49 +02:00
Richard Lyons bc5068fece Fix elasticsearch custom_query. 2024-06-21 22:28:31 +02:00
Markus Heiser 542f7d0d7b [mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTION
In the past, some files were tested with the standard profile, others with a
profile in which most of the messages were switched off ... some files were not
checked at all.

- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two

  1. ./searx/engines -> lint engines with additional builtins
  2. ./searx ./searxng_extra ./tests -> lint all other python files

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11 14:55:38 +01:00
Markus Heiser a7b51f023e [black] upgrade black 22.12.0 --> 24.2.0
The issue discussed in [1] has been solved since [2] has been merged into black
/ now we can upgrade without touching 69 files as it was needed with black
23.1.0 [3].

[1] https://github.com/searxng/searxng/pull/2159#issuecomment-1425723977
[2] https://github.com/psf/black/pull/4060
[3] https://github.com/searxng/searxng/pull/2159/files

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-09 08:15:50 +01:00
Markus Heiser ab8e5383fb [mod] remove X-XSS-Protection headers
Deprecated header not used by browsers nowadays[1]:

"""In modern browsers, X-XSS-Protection has been deprecated in favor of the
Content-Security-Policy to disable the use of inline JavaScript. Its use can
introduce XSS vulnerabilities in otherwise safe websites. This should not be
used unless you need to support older web browsers that don’t yet support CSP.
It is thus recommended to set the header as X-XSS-Protection: 0."""[2]

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection
[2] https://infosec.mozilla.org/guidelines/web_security#x-xss-protection

Closes: https://github.com/searxng/searxng/issues/3171
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-01-31 17:23:41 +01:00
allixx e4cf0a7d4f [fix] do highlight replacement at once
Highlights all search queries in search result in one go.

Fixes the case where search query contains word from highlight HTML code,
which causes broken HTML to appear in search results.

Closes #3057
2024-01-29 13:15:37 +01:00
Markus Heiser fd814aac86 [mod] isolation of botdetection from the limiter
This patch was inspired by the discussion around PR-2882 [2].  The goals of this
patch are:

1. Convert plugin searx.plugin.limiter to normal code [1]
2. isolation of botdetection from the limiter [2]
3. searx/{tools => botdetection}/config.py and drop searx.tools
4. in URL /config, 'limiter.enabled' is true only if the limiter is really
   enabled (Redis is available).

This patch moves all the code that belongs to botdetection into namespace
searx.botdetection and code that belongs to limiter is placed in namespace
searx.limiter.

Tthe limiter used to be a plugin at some point botdetection was added, it was
not a plugin.  The modularization of these two components was long overdue.
With the clear modularization, the documentation could then also be organized
according to the architecture.

[1] https://github.com/searxng/searxng/pull/2882
[2] https://github.com/searxng/searxng/pull/2882#issuecomment-1741716891

To test:

- check the app works without the limiter, check `/config`
- check the app works with the limiter and with the token, check `/config`
- make docs.live .. and read
  - http://0.0.0.0:8000/admin/searx.limiter.html
  - http://0.0.0.0:8000/src/searx.botdetection.html#botdetection

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-01 06:44:56 +01:00
Markus Heiser ef56e1d684 [fix] HTMLParser: undocumented not implemented method
In python versions <py3.10 there is an issue with an undocumented method
HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed
in Python 3.5.

To be compatible to higher versions (>=py3.10) an error method is implemented
which throws an AssertionError exception like the higher Python versions do [3].

[1] https://github.com/python/cpython/issues/76025
[2] https://bugs.python.org/issue31844
[3] https://github.com/python/cpython/pull/8562

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-22 10:35:02 +02:00
Emilien Devos 33e722f83b better error message when no results found 2023-09-19 11:10:48 +02:00
Bnyro dcee823345 [feat] implement feeling lucky feature 2023-09-19 09:40:57 +02:00
jazzzooo 223b3487c3 [fix] spelling 2023-09-18 16:20:27 +02:00
Markus Heiser 733b795d53 [fix] make flask_babel.gettext() work in engine modules (L10n & threads)
incident:
  flask_babel.gettext() does not work in the engine modules.

cause:
  the request() and response() functions of the engine modules run in the
  processor, whose search() method runs in a thread and in the threads the
  context of the Flask app does not exist. The context of the Flask app is
  needed by the gettext() function for the L10n.

Solution:
  copy context of the Flask app into the threads. [1]

special case:
  We cannot equip the search() method of the processors with the decorator [1],
  because the decorator requires a context (Flask app) that does not yet exist
  at the time of the initialization of the processors (the initialization of the
  processors is part of the initialization of the Flask app).

[1] https://flask.palletsprojects.com/en/2.3.x/api/#flask.copy_current_request_context

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-09 13:27:43 +02:00
Markus Heiser fa1ef9a07b [mod] move some code from webapp module to webutils module (no functional change)
Over the years the webapp module became more and more a mess.  To improve the
modulaization a little this patch moves some implementations from the webapp
module to webutils module.

HINT: this patch brings non functional change

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-19 19:49:44 +02:00
Markus Heiser 80aaef6c95
Merge pull request #2357 / limiter -> botdetection
The monolithic implementation of the limiter was divided into methods and
implemented in the Python package searx.botdetection.  Detailed documentation on
the methods has been added.

The methods are divided into two groups:

1. Probe HTTP headers

- Method http_accept
- Method http_accept_encoding
- Method http_accept_language
- Method http_connection
- Method http_user_agent

2. Rate limit:

- Method ip_limit
- Method link_token (new)

The (reduced) implementation of the limiter is now in the module
searx.botdetection.limiter.  The first group was transferred unchanged to this
module.  The ip_limit contains the sliding windows implemented by the limiter so
far.

This merge also fixes some long outstandig issue:

- limiter does not evaluate the Accept-Language correct [1]
- limiter needs a IPv6 prefix to block networks instead of IPs [2]

Without additional configuration the limiter works as before (apart from the
bugfixes).  For the commissioning of additional methods (link_toke), a
configuration must be made in an additional configuration file.  Without this
configuration, the limiter runs as before (zero configuration).

The ip_limit Method implements the sliding windows of the vanilla limiter,
additionally the link_token method can be used in this method.  The link_token
method can be used to investigate whether a request is suspicious. To activate
the link_token method in the ip_limit method add the following to your
/etc/searxng/limiter.toml::

    [botdetection.ip_limit]
    link_token = true


[1] https://github.com/searxng/searxng/issues/2455
[2] https://github.com/searxng/searxng/issues/2477
2023-06-03 06:00:15 +02:00
Markus Heiser 2149e88bdd [mod] template preferences: split into elements (no functional change)
HINT: this patch has no functional change / it is the preparation for following
      changes and bugfixes

Over the years, the preferences template became an unmanageable beast.  To make
the source code more readable the monolith is splitted into elements.  The
splitting into elements also has the advantage that a new template can make use
of them.

The reversed checkbox is a quirk that is only used in the prefereces and must be
eliminated in the long term.  For this the macro 'checkbox_onoff_reversed' was
added to the preferences.html template.  The 'checkbox' macro is also a quirk of
the preferences.html we don't want to use in other templates (it is an
input-checkbox in a HTML form that was misused for status display).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-02 19:05:43 +02:00
Markus Heiser 38431d2e14 [fix] correct determination of the IP for the request
For correct determination of the IP to the request the function
botdetection.get_real_ip() is implemented.  This fonction is used in the
ip_limit and link_token method of the botdetection and it is used in the
self_info plugin.

A documentation about the X-Forwarded-For header has been added.

[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566211059

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser 16f0db4493 [mod] replace utils.match_language by locales.match_locale
This patch replaces the *full of magic* ``utils.match_language`` function by a
``locales.match_locale``.  The ``locales.match_locale`` function is based on the
``locales.build_engine_locales`` introduced in 9ae409a0 [1].

In the past SearXNG did only support a search by a language but not in a region.
This has been changed a long time ago and regions have been added to SearXNG
core but not to the engines.  The ``utils.match_language`` was the function to
handle the different aspects of language/regions in SearXNG core and the
supported *languages* in the engine.  The ``utils.match_language`` did it with
some magic and works good for most use cases but fails in some edge case.

To replace the concurrence of languages and regions in the SearXNG core the
``locales.build_engine_locales`` was introduced in 9ae409a0 [1].  With the last
patches all engines has been migrated to a ``fetch_traits`` and a
language/region concept that is based on ``locales.build_engine_locales``.

To summarize: there is no longer a need for the ``locales.match_language``.

[1] https://github.com/searxng/searxng/pull/1652

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Alexandre Flament 6748e8e2d5 Add "Auto-detected" as a language.
When the user choose "Auto-detected", the choice remains on the following queries.
The detected language is displayed.

For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.

This replace the autodetect_search_language plugin.
2023-02-17 15:17:36 +00:00
Alexandre Flament 6d72ef3cbe
Merge pull request #2109 from ahmad-alkadri/fix/highlight-full-word
Standalone words highlighting for query result in non-CJK characters
2023-01-17 23:24:04 +01:00
ahmad-alkadri 99b5272d9a A little fix and modified the testing for content highlight 2023-01-15 16:51:31 +01:00
Léon Tiekötter 0cedb1c6d8 Add search.suspended_times settings
Make suspended_time changeable in settings.yml
Allow different values to be set for different exceptions.

Co-authored-by: Alexandre Flament <alex@al-f.net>
2023-01-15 09:00:32 +00:00
ArtikusHG 1f8f8c1e91 Replace langdetect with fasttext 2022-12-16 21:07:39 +02:00
Alexandre Flament 269326063a Fix: don't crash when engine or name is missing in settings.yml
SearXNG crashes if the engine or name fields are missing.
With this commit, the app displays an error in the log and keeps loading.

Close #1951
2022-12-04 23:43:59 +01:00
Mohamed Elashri 8d5653e60d
Merge branch 'searxng:master' into master 2022-09-30 23:06:54 +00:00
Markus Heiser ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Mohamed Elashri 5832c70680
correct sci-hub links/ add `.ru` and remove other 3rd party domains. 2022-09-24 11:03:57 -04:00
Alexandre FLAMENT 4adc9920e9 Remove usage of SEARX environment variables 2022-08-28 17:12:57 +00:00
Markus Heiser 3b0f9c07b2 [fix] improve OpenSearch description
Some HTTP-Clients do have issues with the ``opensearch.xml`` from SearXNG
(related [1][2]) while other OpenSearch descriptions[3] (e.g. from qwant) work
flawles.

Inspired by the OpenSearch description from qwant and with informations from the
specification[4] the ``opensearch.xml`` has been *improved*.

- convert `<Url>` methods from lower case to upper case (`POST`|`GET`)
- add `<moz:SearchForm>` and `xmlns:moz="http://www.mozilla.org/2006/browser/search/"`
- add `<Query role="example" searchTerms="SearXNG" />`  [4]

  OpenSearch description documents should include at least one Query element of
  `role="example"` that is expected to return search results. Search clients may
  use this example query to validate that the search engine is working properly.

- modified `<LongName>` to SearXNG
- modified `<Description>` the word 'hackable' scares uninitiated users and was removed
- add the `type="image/png"` to `<Image>`

Test can be done by::

    make run

Visit http://127.0.0.1:8888/ and add the search engine to your WEB-Browser /
test with different WEB-Browser from desktop and Smartphones (are there any iOS
user here, please test on Safari and Chrome).

[1] https://app.element.io/#/room/#searxng:matrix.org/$xN_abdKhNqUlgXRBrb_9F3pqOxnSzGQ1TG0s0G9hQVw
[2] https://github.com/searxng/searxng/issues/431
[3] https://developer.mozilla.org/en-US/docs/Web/OpenSearch
[4] https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md#the-query-element

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-11 19:04:36 +02:00
Alexandre Flament a1e8af0796 bing.py: resolve bing.com/ck/a redirections
add a new function searx.network.multi_requests to send multiple HTTP requests at once
2022-07-08 22:02:21 +02:00