Commit Graph

1916 Commits

Author SHA1 Message Date
Aadniz
52ffc4c7f4 [fix] qwant engine: order query parameters to prevent 403 forbidden (#5410) 2025-11-03 22:53:50 +01:00
Aadniz
43065c5026 [fix] deviantart engine: pagination match change (#5384)
Pagination currently does not work for deviantart, resulting in the same page
being shown when going to the next page in SearXNG.
2025-10-28 06:21:40 +01:00
Aadniz
ea4a55fa57 [fix] qwant engine: set header Accept-Language to bypass bot detection (#5382)
Set HTTP header Accept-Language [1] for the Qwant engine.

Qwant does not seem to work on any SearXNG instance right now, and this is a fix
for this issue.

During testing, setting the Accept-Language header seemed to bypass bot
detection more often (tested with roughly 20 searches).

[1] https://docs.searxng.org/dev/engines/enginelib.html#searx.enginelib.Engine.send_accept_language_header
2025-10-27 08:33:07 +01:00
Aadniz
d514dea5cc [fix] deviantart engine: does not return any results (#5383) 2025-10-27 08:02:01 +01:00
Aadniz
22e1d30017 [fix] startpage engine: properly display CAPTCHA if redirect page is seen (#5380)
Fixes an issue where the startpage engine would display a parsing error
(`json.decoder.JSONDecodeError`) when a CAPTCHA redirect page was returned.

The fix checks whether the response has a `Location` header and, if it starts
with `https://www.startpage.com/sp/captcha`, raises a CAPTCHA exception
before trying to parse the data.
2025-10-26 11:32:45 +01:00
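
A minimal sketch of the check described in 22e1d30017, not the engine's actual code. `CaptchaError` and `check_captcha` are hypothetical names, and the response headers are modeled as a plain dict:

```python
class CaptchaError(Exception):
    """Hypothetical stand-in for the CAPTCHA exception the engine raises."""


CAPTCHA_PREFIX = "https://www.startpage.com/sp/captcha"


def check_captcha(headers: dict) -> None:
    """Raise before any JSON parsing if the response redirects to the CAPTCHA page."""
    location = headers.get("Location", "")
    if location.startswith(CAPTCHA_PREFIX):
        raise CaptchaError(f"CAPTCHA redirect: {location}")
```

Calling the check first means a redirect page never reaches the JSON parser, so the user sees a CAPTCHA error rather than a `JSONDecodeError`.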
Aadniz
4ca75a0450 [fix] engine qwant - return forbidden instead of showing parse error (#5377) 2025-10-25 13:43:37 +02:00
Markus Heiser
9371658531 [mod] typification of SearXNG: add new result type File
This PR adds a new result type: File

    Python class: searx/result_types/file.py
    Jinja template: searx/templates/simple/result_templates/file.html
    CSS (less): client/simple/src/less/result_types/file.less

Class 'File' (singular) replaces template 'files.html' (plural).  The renaming
was carried out because a result contains only one file (singular).  It should
not be confused with the category 'files', in which multiple results can exist.

As mentioned in issue [1], the class '.category-files' was removed from the CSS
and the styles were moved to result_types/file.less (where they are now based on
the template and no longer on the category).

[1] https://github.com/searxng/searxng/issues/5198

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-20 10:18:33 +02:00
Markus Heiser
ee6d4f322f [mod] engine: reuters - REST-API for Reuter's thumbnail, height:80
The size of the full-size images from ``thumbnail.url`` is usually several
MB. By reducing the full-size image to 80 pixels, the data size for a thumb is
reduced from MB to a few KB.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-18 14:43:35 +02:00
Bnyro
3725aef6f3 [fix] reuters: crash on empty results pages & date parsing
1. On empty result list, return empty EngineResults (#5330)

2. Use ``dateutil.parser`` to avoid ``ValueError``:

    ERROR   searx.engines.reuters : exception : Invalid isoformat string: '2022-06-08T16:07:54Z'
      File "searx/engines/reuters.py", line 91, in response
        publishedDate=datetime.fromisoformat(result["display_time"]),
    ValueError: Invalid isoformat string: '2022-06-08T16:07:54Z'

Closes: https://github.com/searxng/searxng/issues/5330
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-18 14:43:35 +02:00
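
Commit 3725aef6f3 switches to ``dateutil.parser``. The sketch below is a stdlib-only alternative rather than what the commit does, shown only to illustrate why the trailing ``Z`` matters: Python versions before 3.11 reject it in ``datetime.fromisoformat``. The function name `parse_display_time` is hypothetical:

```python
from datetime import datetime


def parse_display_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that may end in 'Z' (UTC)."""
    # Older Python versions only accept a numeric UTC offset here,
    # so rewrite the 'Z' suffix before parsing.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)
```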
Markus Heiser
e840e3f960 [fix] engine mullvadleta - ignore HTTP 403 & 429 response
It doesn't matter if you're using Mullvad's VPN and a proper browser, you'll
still get blocked for specific searches [1] with a 403 or 429 HTTP status code.
Mullvad only blocks the search request and doesn't prevent you from doing more
searches.

The logic should handle the blocked requests (403, 429), but not put the engine
on a cooldown.

[1] https://leta.mullvad.net/search?q=site%3Afoo+bar&engine=brave

Closes: https://github.com/searxng/searxng/issues/5328
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-18 09:05:54 +02:00
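
A rough sketch of the logic described in e840e3f960, assuming (as the message implies) that an unhandled error status is what would put the engine on cooldown. `handle_response` and `BLOCKED_STATUS` are hypothetical names, not SearXNG API:

```python
BLOCKED_STATUS = {403, 429}


def handle_response(status_code: int, parse):
    """Return no results for a blocked search instead of raising an error."""
    if status_code in BLOCKED_STATUS:
        # Only this one search is blocked; later searches may still succeed.
        return []
    if status_code >= 400:
        raise RuntimeError(f"HTTP {status_code}")
    return parse()
```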
Bnyro
1d138c5968 [mod] bing engine: follow redirects (#5324)
Apparently, in China, Bing redirects from `www.bing.com` to `cn.bing.com`.
So in order to make Bing work for Chinese users by default, we have to follow that redirect.

related: https://github.com/searxng/searxng/issues/5243
2025-10-17 15:43:49 +02:00
Tommaso Colella
c34bb61284 [feat] engines: add Azure resources engine (#5235)
Adds a new engine `searx/engines/azure.py` to search cloud resources on Azure.

A lot of enterprise users have to deal with Azure Public Cloud.  This helps them
easily search for cloud resources without logging in to the Portal first.

How to test this PR locally?

You should create an App Registration on Azure Entra Id with Reader access on
the resources you want to search for.  You should create a Secret for the App
Registration.  After that, you should set up appropriate values in the
`settings.yml` file [1]::

   - name: azure
     engine: azure
     ...
     azure_tenant_id: "your_tenant_id"
     azure_client_id: "your_client_id"
     azure_client_secret: "your_client_secret"
     azure_token_expiration_seconds: 5000

[1] https://github.com/searxng/searxng/pull/5235#issuecomment-3397664928

Co-authored-by: Bnyro <bnyro@tutanota.com>
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-13 16:33:08 +02:00
Bnyro
8baefcc21e [fix] pinterest: crash when there's no link & show image resolution + uploader name (#5314)
closes #5231
2025-10-13 07:43:36 +02:00
Markus Heiser
954f0f62b4 [fix] startpage engine - SafeSearch works in reverse (#5290)
The name of the option is *disable_family_filter*, so we have to reverse the
meaning of the ascending safe-search filter level.

Closes: https://github.com/searxng/searxng/issues/5287

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-09 16:06:46 +02:00
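
To illustrate the inversion described in 954f0f62b4: SearXNG's safe-search level ascends (0 = off, 1 = moderate, 2 = strict), while *disable_family_filter* is switched on to turn filtering off. The function below is a hypothetical sketch, and the 0/1 encoding of the option value is an assumption, not taken from the engine:

```python
def disable_family_filter(safesearch: int) -> int:
    """Map an ascending safe-search level to the inverted option value."""
    # Assumed encoding: 1 disables the family filter, 0 leaves it enabled.
    return 1 if safesearch == 0 else 0
```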
Markus Heiser
d8d5de4d47 [fix] google scholar - detect CAPTCHA (HTTP redirects) (#5268)
In the case of .. response, for example, an HTTP 302 is returned by Google
Scholar::

    Our systems have detected unusual traffic from your computer
    network. Please try again later.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-06 10:12:38 +02:00
Markus Heiser
c6f1ea12b1 [fix] engine - cppreference has no longer a search function (#5273)
cppreference has replaced its search (``mwiki/index.php?title=``) with a DDG
search.

The engine was first introduced in SearXNG with PR-3274 [1], and even back then
the mediawiki proved to be incompatible, which is why the API could not be used
at the time. Now there isn't even a dedicated search function anymore. I think
the cppreference project suffers from a lack of maintenance.

[1] https://github.com/searxng/searxng/pull/3247

Closes: https://github.com/searxng/searxng/issues/5271

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-03 08:40:24 +02:00
Markus Heiser
748b521ac6 [fix] searx/results.py - TypeError: object of type 'NoneType' has no len()
In some engines, under certain circumstances, the content field can also have
the value ``None``; in these cases, a length check results in an exception::

    File "/usr/local/searxng/searx/results.py", line 360, in merge_two_main_results
        if len(other.content) > len(origin.content):
           ^^^^^^^^^^^^^^^^^^
    TypeError: object of type 'NoneType' has no len()

[1] https://github.com/searxng/searxng/issues/5250#issuecomment-3352863488

Reported-by: @scross01 [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-01 07:13:10 +02:00
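
One way to guard the length comparison from 748b521ac6 is to treat ``None`` as an empty string. This is a sketch of the idea, not the code in ``searx/results.py``; `longer_content` is a hypothetical helper:

```python
def longer_content(origin_content, other_content):
    """Return the longer of two content fields, treating None as empty."""
    origin_content = origin_content or ""
    other_content = other_content or ""
    if len(other_content) > len(origin_content):
        return other_content
    return origin_content
```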
Markus Heiser
e16b6cb148 [fix] JSON format: serialization of the result-types
The ``JSONEncoder`` (``format="json"``) must perform a conversion to the
built-in types for the ``msgspec.Struct``::

    if isinstance(o, msgspec.Struct):
        return msgspec.to_builtins(o)

The result types are already of type ``msgspec.Struct``, so they can be
converted into built-in types.

The field types (in the result type) that were not yet of type ``msgspec.Struct``
have been converted to::

    searx.weather.GeoLocation@dataclass -> msgspec.Struct
    searx.weather.DateTime              -> msgspec.Struct
    searx.weather.Temperature           -> msgspec.Struct
    searx.weather.PressureUnits         -> msgspec.Struct
    searx.weather.WindSpeed             -> msgspec.Struct
    searx.weather.RelativeHumidity      -> msgspec.Struct
    searx.weather.Compass               -> msgspec.Struct

BTW: Wherever it seemed sensible, the typing was also modernized in the modified
files.

Closes: https://github.com/searxng/searxng/issues/5250
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-10-01 07:13:10 +02:00
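
The ``default`` hook quoted in e16b6cb148 follows the usual ``json.JSONEncoder`` pattern. So that it runs without third-party packages, the sketch below uses a stdlib dataclass as a stand-in for ``msgspec.Struct``, and ``dataclasses.asdict`` in place of ``msgspec.to_builtins``. `Temperature` and `ResultEncoder` are illustrative names, not the SearXNG classes:

```python
import dataclasses
import json


@dataclasses.dataclass
class Temperature:
    """Hypothetical stand-in for a result field type."""
    value: float
    unit: str


class ResultEncoder(json.JSONEncoder):
    def default(self, o):
        # Convert structured objects to built-in types the encoder understands.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        return super().default(o)


encoded = json.dumps({"temp": Temperature(21.5, "C")}, cls=ResultEncoder)
```

Any field type the hook cannot convert still falls through to ``super().default``, which raises ``TypeError``. This is why the commit converts the remaining weather types as well.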
Markus Heiser
4f4de3fc87 [fix] openstreetmap: fix CURRENCIES.iso4217_to_name
This patch is a leftover from PR-5204 [1].

[1] https://github.com/searxng/searxng/pull/5204

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-28 07:32:41 +02:00
Markus Heiser
81cbe0befe [upd] pypi: Bump black from 24.3.0 to 25.9.0 (#5251)
In 25.1.0 [2] an old bug has been fixed: "Docstring formatting does not apply to
module docstrings" [3].

[1] https://github.com/psf/black/blob/main/CHANGES.md#2590
[2] https://github.com/psf/black/blob/main/CHANGES.md#2510
[3] https://github.com/psf/black/issues/4094

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-26 12:35:57 +02:00
Markus Heiser
d2b4bff856 [mod] demo engines: smaller improvement
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
1520a8d545 [mod] ADS engine: revision of the engine (Paper result)
Revision of the Astrophysics Data System (ADS) engine / use of the result type
Paper as well as other typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
f8f7adce6b [mod] Z-Library engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

The engine has been placed on inactive because no service is currently
available, or at least not known in the SearXNG community [1]

[1] https://github.com/searxng/searxng/issues/3610

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
4c42704c80 [mod] Springer Nature engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
4b4bf0ecaf [mod] Semantic Scholar engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
bb22bb1831 [mod] PubMed engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
96e63df8ca [mod] Open Library engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
0691e50e13 [mod] OpenAlex engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
599d9488c5 [mod] Google Scholar engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
078c9fcb68 [mod] Crossref engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
3ec6d65f9b [mod] CORE engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
22e73727c0 [mod] Anna's Archive engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
6c3fb9e42b [mod] arXiv engine: revision of the engine (Paper result)
Revision of the engine / use of the result type Paper as well as other
typifications.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-20 10:56:46 +02:00
Markus Heiser
09fddfde24 [mod] demo engines: smaller corrections and improvements
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-18 19:40:03 +02:00
Markus Heiser
8f8343dc0d [mod] addition of various type hints / engine processors
Continuation of #5147 .. typification of the engine processors.

BTW:

- removed obsolete engine property https_support
- fixed & improved currency_convert
- engine instances can now implement an engine.setup method

[#5147] https://github.com/searxng/searxng/pull/5147

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-18 19:40:03 +02:00
Markus Heiser
a9b088d832 [feat] engines yacy & piped: enable individual configuration of URLs (#5195)
With this change it is possible with individual engines (yacy & piped)
to configure individual URLs.

Related:

- https://github.com/searxng/searxng/issues/4869#issuecomment-327335928
- https://github.com/searxng/searxng/pull/3472/files#r1595586019
- https://github.com/searxng/searxng/issues/3428#issuecomment-2102142530

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-10 12:57:36 +02:00
Austin-Olacsi
905b13aa7e [feat] naver engine: add video embeds 2025-09-09 17:04:21 +02:00
Markus Heiser
f24d85bc4b [mod] drop: from __future__ import annotations
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-03 13:37:36 +02:00
Markus Heiser
57b9673efb [mod] addition of various type hints / tbc
- pyright configuration [1]_
- stub files: types-lxml [2]_
- addition of various type hints
- enable use of new type system features on older Python versions [3]_
- ``.tool-versions`` - set python to lowest version we support (3.10.18) [4]_:
  Older versions typically lack some typing features found in newer Python
  versions.  Therefore, for local type checking (before commit), it is necessary
  to use the older Python interpreter.

.. [1] https://docs.basedpyright.com/v1.20.0/configuration/config-files/
.. [2] https://pypi.org/project/types-lxml/
.. [3] https://typing-extensions.readthedocs.io/en/latest/#
.. [4] https://mise.jdx.dev/configuration.html#tool-versions

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Format: reST
2025-09-03 13:37:36 +02:00
Butui Hu
09500459fe [fix] engine chinaso - parse_images ImageInfo key error (#5175)
Signed-off-by: Butui Hu <hot123tea123@gmail.com>
2025-09-03 05:59:18 +02:00
Bnyro
b93cc2f9f8 [feat] engines: add repology.org engine for linux packages (#5103)
Repology_ monitors a huge number of package repositories and other sources,
comparing package versions across them and gathering other information.

Repology_ shows you in which repositories a given project is packaged, which
version is the latest and which needs updating, who maintains the package, and
other related information.

.. _Repology: https://repology.org/docs/about

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
Format: reST
2025-09-01 16:33:31 +02:00
Butui Hu
932fb22c80 [fix] chinaso: add random uid to cookie (#5173)
Signed-off-by: Butui Hu <hot123tea123@gmail.com>
2025-09-01 15:34:17 +02:00
Markus Heiser
9ac9c8c4f5 [mod] typification of SearXNG: add new result type Code
This patch adds a new result type: Code

- Python class:   searx/result_types/code.py
- Jinja template: searx/templates/simple/result_templates/code.html
- CSS (less):     client/simple/src/less/result_types/code.less

Signed-off-by: Markus Heiser <markus.heiser@darmarIT.de>
2025-09-01 14:51:15 +02:00
Bnyro
f971774773 [fix] annas archive: engine broken due to site HTML changes
Apparently the layout of https://annas-archive.org has changed, making changes necessary.

The issue has been reported in #5146, see there for more details.

- closes #5146
2025-08-28 19:24:37 +02:00
muthukumaran R
a0ff173799 [feat] engines: add OpenAlex Works engine (#5102)
- Adds a new engine `searx/engines/openalex.py` that integrates the OpenAlex
  Works API to return scientific paper results using the `paper.html` template.
- Uses the official API (no auth required); supports OpenAlex polite pool via `mailto`.
- Adds developer docs at `docs/dev/engines/online/openalex.rst`.

OpenAlex API reference: https://docs.openalex.org/how-to-use-the-api/api-overview
2025-08-24 14:17:30 +02:00
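
For a0ff173799, a request URL for the OpenAlex Works API with the optional `mailto` parameter (which opts into the polite pool) might be built roughly like this. `build_url` is a hypothetical helper and the exact parameters the engine sends are an assumption:

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def build_url(query: str, page: int = 1, mailto: Optional[str] = None) -> str:
    """Build a Works search URL; add mailto only when it is configured."""
    args = {"search": query, "page": page}
    if mailto:
        args["mailto"] = mailto
    return f"{BASE_URL}?{urlencode(args)}"
```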
Bnyro
0369682690 [fix] selfhst icons: icon list url invalid, set to active
- the previous CDN icon list url no longer works
- a list of all icons is mirrored to the JSDelivr CDN however
- there's no reason to set the engine to inactive now that we use public CDNs
2025-08-20 14:27:17 +02:00
Filip Mikina
6b57705e50 [feat] engines: add GitHub Code Search engine (#5074)
This patch adds GitHub Code Search [1] engine to allow querying the codebases.

Template code.html is changed to allow passthrough of strip and highlighting
options.

Engine Searchcode is adjusted to pass filename and not rely on hardcoded
extensions.

The GitHub code search API does not return the exact code line indices, so this
implementation assigns the code arbitrary numbers starting from 1
(effectively relabeling the code).

The API allows unauthenticated calls, and the engine settings default to
that, although such calls are heavily rate limited.

The 'text' lexer is the default pygments lexer when parsing fails.

[1] https://docs.github.com/en/rest/search/search?apiVersion=2022-11-28#search-code

Co-authored-by: Markus Heiser <markus.heiser@darmarIT.de>
2025-08-20 07:35:31 +02:00
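
The relabeling mentioned in 6b57705e50 can be pictured as numbering the lines of a returned fragment from 1. This is a sketch only; `relabel_lines` is a hypothetical name and the engine's real data structures may differ:

```python
def relabel_lines(fragment: str, start: int = 1) -> list:
    """Pair each line of a code fragment with a number counted from `start`."""
    # The numbers are arbitrary labels, not positions in the original file.
    return list(enumerate(fragment.splitlines(), start=start))
```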
Markus Heiser
25647c20d1 [mod] switching from pyright to basedpyright (plus first rules)
pyrightconfig.json :

  for the paths searx, searxng_extra and tests, individual rules were
  defined (for example, in tests fewer / different rules are needed than in the
  searx package)

searx/engines/__builtins__.pyi :

  Declares the builtin types that are added to the global namespace of a module
  by the intended monkey patching of the engine modules. This replaces the
  previous filtering of the stdout using grep.

test.pyright_modified (utils/lib_sxng_test.sh) :

  static type check of locally modified files not yet committed

make test :

  prerequisite 'test.pyright' has been replaced by 'test.pyright_modified'

searx/engines/__init__.py, searx/enginelib/__init__.py :

  First, minimal typifications that were considered necessary.
2025-08-19 12:04:35 +02:00
Ishbir Singh
b606103352 [fix] reuters: published date not parsed correctly in some cases
Fixes the publishedDate format in the reuters engine to encompass ISO 8601 times both with and without milliseconds.
Why is this change important?

Previously, the engine would sometimes fail saying:

    2025-08-12 21:13:23,091 ERROR:searx.engines.reuters: exception : time data '2024-04-15T19:08:30.833Z' does not match format '%Y-%m-%dT%H:%M:%SZ'
    Traceback (most recent call last):
    ...
      File "/usr/local/searxng/searx/engines/reuters.py", line 87, in response
        publishedDate=datetime.strptime(result["display_time"], "%Y-%m-%dT%H:%M:%SZ"),
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...

Note that most queries seem to work with Reuters, but there are some results that have the additional milliseconds and fail. Regardless, the change is backwards compatible as both the formats (with and without the ms) should now parse correctly.
2025-08-16 15:50:38 +00:00
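
One way to accept both timestamp shapes described in b606103352 is to try a format with fractional seconds first and fall back to the one without. This is a sketch of that idea, not necessarily how the commit implements it; `parse_published` and `FORMATS` are hypothetical names:

```python
from datetime import datetime

# Tried in order: with fractional seconds, then without.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_published(value: str) -> datetime:
    """Parse a timestamp with or without milliseconds."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported timestamp: {value!r}")
```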
Zhijie He
6b1516d6ad [fix] baidu captcha detection (#5111)
Add Baidu CAPTCHA detection to reduce `JSONDecodeError` errors.

Baidu will redirect to `wappass.baidu.com` and return a captcha challenge.
The current behavior is to fetch the data from `wappass.baidu.com` and then
fail with a `json.decoder.JSONDecodeError`.
2025-08-12 15:18:46 +02:00
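
The detection described in 6b1516d6ad could look roughly like the following, assuming the final URL after redirects is available to the engine. `BaiduCaptchaError` and `raise_on_captcha` are hypothetical names, not the engine's own:

```python
from urllib.parse import urlparse


class BaiduCaptchaError(Exception):
    """Hypothetical stand-in for the CAPTCHA exception the engine raises."""


def raise_on_captcha(final_url: str) -> None:
    """Raise if the request ended up on Baidu's captcha host."""
    if urlparse(final_url).hostname == "wappass.baidu.com":
        raise BaiduCaptchaError(f"captcha challenge at {final_url}")
```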