Commit Graph

1616 Commits

Author SHA1 Message Date
M Asenov faa32d5773 fixed xpath selector for appropriate results 2022-08-21 20:08:00 +01:00
Alexandre Flament 5ed40af3ba
Merge pull request #1661 from liimee/eng-tw
Add twitter engine
2022-08-21 15:21:18 +02:00
Markus Heiser 77a0f33819 [fix] engine duden - don't raise exception on empty result list
Duden expects a word in German, so with query "amazing" the site finds nothing
and respons a 404:

    httpx.HTTPStatusError: Client error '404 Not Found' for url\
      'https://www.duden.de/suchen/dudenonline/amazing'

[1] https://github.com/searxng/searxng/issues/1543#issuecomment-1193317054

Suggested-by: @allendema [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-20 08:41:03 +02:00
ta 05851978cf add explanation of token 2022-08-17 19:45:42 +07:00
ta c8acd4a3b6 add profile image to user results 2022-08-17 14:30:59 +07:00
ta b6fd7cd571 add thumbnail to results if available 2022-08-17 14:25:22 +07:00
Markus Heiser 27385e7898 [mod] qwant - add safesearch option
Closes: https://github.com/searxng/searxng/issues/1640
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser 6579d6d558 [fix] qwant - API error::locale must be one ..
The request function should not request a language (aka locale) that is not
supported by qwant. Select a locale like zh-TW ends in qwant's API error:

  ERROR searx.engines.qwant news: exception : \
  API error::locale must be one of the following values: \
    en_gb, en_ie, en_us, en_ca, en_my, en_au, en_nz, de_de, de_ch, de_at, fr_fr, \
    fr_be, fr_ch, fr_ca, fr_ad, fc_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx, \
    es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_pt, pt_ad, \
    nl_be, nl_nl

The existing searx.utils.match_language function is unsuitable for this purpose,
it is replaced by function searx.locales.get_engine_locale that is based on the
methods from the babel package.

The quant's _fetch_supported_languages function has been revised to filter out
languages 8aka locales) not supported by qwant.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser 75bb8c45d0 [mod] decouple qwant's categories from SearXNG's categories
By using new property `qwant_categ:` the category of qwant is no longer bound to
the category of SearXNG.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:26:54 +02:00
ta 96ea355a1f add twitter engine 2022-08-14 08:39:41 +07:00
Markus Heiser eb02cc77c5 [fix] google - simplify XPath selectors to fetch more results
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-10 18:55:31 +02:00
Émilien Devos b9f16a77db output format protobuf to HTML for google mobile 2022-08-10 09:36:06 +00:00
Thomas Renard d4acbcfe63 [mod] add deepl translation engine
This implements the Deepl Translation engine. It works nearly like lingva but
directly to the deepl API.  This api only needs a to-lang, from-lang is a fake
by now.

There is a free option to use [1].

[1] https://www.deepl.com/pro-api?cta=header-pro-api for registering a free account.
2022-08-10 09:14:36 +02:00
Brock Vojković 24210fb10b
Revert PR #1633
This reverts the changes made to the Google results XPath in PR #1633.
2022-08-10 03:41:39 +02:00
Léon Tiekötter 94b3656b4a [fix] google engine: results XPath
Seems google rolls out changes first on the `google.com` domain and later on the
"language" domains.  By example: yesterday [1] `google.com` did not work but
`google.de` and `google.fr` did work, today they do not work any longer and this
fix is needed on all domains.

Closes: https://github.com/searxng/searxng/issues/1628
[1] https://github.com/searxng/searxng/issues/1628#issuecomment-1208191816
2022-08-09 06:23:59 +02:00
liimee 8c318562e2
add description and wikidata ID to wttr.in engine 2022-08-07 14:57:10 +07:00
ta 8aa018db95 add wttr.in engine 2022-08-07 13:04:18 +07:00
Markus Heiser 8df1f0c47e [mod] add 'Accept-Language' HTTP header to online processores
Most engines that support languages (and regions) use the Accept-Language from
the WEB browser to build a response that fits to the language (and region).

- add new engine option: send_accept_language_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-01 17:01:59 +02:00
Alexandre Flament 2babf59adc [fix] pyright repported errors
The errors make pyright usage useless since a new error won't be seen [1].

[1] https://github.com/searxng/searxng/pull/1569

```
  searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
    "Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
    Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
  searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
    Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
  searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
    Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
  searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
    Type "str" cannot be assigned to type "BytesPath"
      "str" is incompatible with "bytes"
      "str" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['themes']" cannot be assigned to type "BytesPath"
      "Literal['themes']" is incompatible with "bytes"
      "Literal['themes']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "str | Any | None" cannot be assigned to type "BytesPath"
      Type "str" cannot be assigned to type "BytesPath"
        "str" is incompatible with "bytes"
        "str" is incompatible with protocol "PathLike[bytes]"
          "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['img']" cannot be assigned to type "BytesPath"
      "Literal['img']" is incompatible with "bytes"
      "Literal['img']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
  searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
  searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
  searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
  searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
  searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
    Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
  searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
  searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
2022-07-30 18:04:44 +02:00
Markus Heiser c72d70d45c Revert "Quick fix for google engine for EU countries"
This reverts commit 747cf1a246.
2022-07-26 06:39:44 +02:00
Léon Tiekötter 950f036c03
[fix] google engine: results XPath 2022-07-26 00:24:15 +02:00
Émilien Devos 747cf1a246
Quick fix for google engine for EU countries
This revert part of the commit of 5fb2071cb2
2022-07-25 20:48:50 +00:00
Markus Heiser 0be0e63117 [fix] demo_online.py - fixed typo
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25 20:04:00 +02:00
Emilien Devos 5fb2071cb2 [fix] google & youtube - set EU consent cookie
This change the previous bypass method for Google consent using
``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``.

The youtube_noapi and google have a similar API, at least for the consent[1].

Get CONSENT cookie from google reguest::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

  Google change the way for consent about YouTube cookies agreement in EU
  countries. Instead of showing a popup in the website, YouTube redirects the
  user to a new webpage at consent.youtube.com domain ...  Fix for this is to
  put a cookie CONSENT with YES+ value for every YouTube request

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
2022-07-25 13:27:06 +02:00
Markus Heiser 4231a5770b [fix] sjp engine - convert enginename to a latin1 compliance name
The engine name is not only a *name* its also a identifier that is used in
logs, HTTP headers and more.  Unicode characters in the name of an engine could
cause various issues.

Closes: https://github.com/searxng/searxng/issues/1544
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-24 21:10:55 +02:00
james-still 2516e21c58 [fix] emojipedia - update XPath to be relative 2022-07-24 19:14:26 +02:00
Markus Heiser 1540891561 [fix] engine tineye: handle 422 response of not supported img format
Closes: https://github.com/searxng/searxng/issues/1449
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-23 16:00:58 +02:00
Markus Heiser 4e05197444
Merge pull request #1475 from return42/Emojipedia
[mod] Add engine for Emojipedia
2022-07-15 09:30:40 +02:00
Jay 10edcbe3c2 [mod] Add engine for Emojipedia
Emojipedia is an emoji reference website which documents the meaning and
common usage of emoji characters in the Unicode Standard.  It is owned by Zedge
since 2021. Emojipedia is a voting member of The Unicode Consortium.[1]

Cherry picked from @james-still [2[3] and slightly modified to fit SearXNG's
quality gates.

[1] https://en.wikipedia.org/wiki/Emojipedia
[2] 2fc01eb20f
[3] https://github.com/searx/searx/pull/3278
2022-07-15 09:26:44 +02:00
Alexandre Flament 44f2eb50a5
Merge pull request #1219 from dalf/follow_bing_redirect
bing.py: remove redirection links
2022-07-10 18:06:22 +02:00
Emilien Devos 6face215b8 bypass google consent with ucbcb=1 2022-07-09 21:33:24 +00:00
Alexandre Flament a1e8af0796 bing.py: resolve bing.com/ck/a redirections
add a new function searx.network.multi_requests to send multiple HTTP requests at once
2022-07-08 22:02:21 +02:00
Markus Heiser 970a69012b [fix] engine z-zlibrary https URL
before this patch:

    DEBUG   searx.engines.z-library : using base_url: https:https://de1lib.org

with this patch URL is fixed to:

    DEBUG   searx.engines.z-library : using base_url: https://de1lib.org

Closes: https://github.com/searxng/searxng/issues/1435
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-05 22:27:55 +02:00
ta 14756a2674 [mod] Adds Lingva translate engine
Add the lingva engine (which grabs data from google translate).  Results from
Lingva are added to the infobox results.
2022-07-04 19:06:45 +02:00
Markus Heiser 5831c15b49 [fix] engines/openstreetmap.py typo: user_langage --> user_language
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-02 16:51:25 +02:00
Alexandre Flament 6716c6b0c3 openstreetmap engine: return the localized named.
For example: display "Tokyo" instead of "東京都" when the language is English.
2022-07-02 16:51:25 +02:00
ta 8883aed132 [fix] google play apps engine: implement engines/google_play_apps.py 2022-06-18 16:02:39 +02:00
Alexandre Flament 5bcbec9b06 Fix: use sys.modules.copy() to avoid RuntimeError
use sys.modules.copy() to avoid "RuntimeError: dictionary changed size during iteration"
see https://github.com/python/cpython/issues/89516
and https://docs.python.org/3.10/library/sys.html#sys.modules

close https://github.com/searxng/searxng/issues/1342
2022-06-18 07:39:46 +02:00
Alexandre Flament 2455f1d06a
Merge pull request #1308 from allendema/add-yep-com-json
[enh] Add yep.com via json_engine
2022-06-12 11:09:04 +02:00
Allen fd9a13a3e5 [enh] Initial no paging support for Yep.com
Upstream example query:
https://yep.com/web?q=test

https://yep.com/about
2022-06-11 14:17:44 +02:00
Alexandre Flament cd2dd5dd55 Wikidata engine: ignore dummy entities
Close #641
2022-06-11 11:09:21 +02:00
Alexandre Flament d068b67a71 Wikidata engine: minor change of the SPARQL request
The engine can be slow especially when the query won't return any answer.
See https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI#Find_articles_in_Wikipedia_speaking_about_cheese_and_see_which_Wikibase_items_they_correspond_to

Related to #1290
2022-06-11 10:50:11 +02:00
Markus Heiser 2de007138c [fix] prepare for pylint 2.14.0
Remove issue reported by Pylint 2.14.0:

- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
  for max(), min() and sum(). [2]

.pylintrc:
  - <option name>-hint has been removed since long, Pylint 2.14.0 raises an
    error on invalid options
  - bad-continuation and bad-whitespace have been removed [3]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-03 15:41:52 +02:00
Allen 43dc9eb7d6 [enh] Initial Petalsearch Images support
Upstream example query:

  https://petalsearch.com/search?query=test&channel=image&ps=50&pn=1&region=de-de&ss_mode=off&ss_type=normal

Depending on locale it will internally use some/all results from other
engines. See:

  https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/#general-indexing-search-engines
2022-06-02 14:32:37 +02:00
Émilien Devos 06cb15cbf7
Reflect the real world parameter from settings.yml 2022-05-10 20:44:35 +00:00
Markus Heiser 4326009d00 [format.python] based on bugfix in 9ed626130 2022-05-07 18:23:10 +02:00
capric98 8c7e6cc983 [fix] FutureWarning from lxml
Just in case if content is None, the original code will skip extract_text(), and
just append the None value to 'content'. So just add allow_none=True, and this
will return None without raising a ValueError in extract_text().
2022-04-22 16:09:36 +02:00
Alexandre Flament bbf13a4657
Merge pull request #1101 from allendema/pass-cookies-from-settings
[enh] Allow passing headers/cookies from settings.yml
2022-04-17 11:37:07 +02:00
Allen dae8a08089
[fix[ Update only cookies/headers 2022-04-17 11:29:23 +02:00
Allen 67fb6fba84
[lint] Remove whitespace
From GH GUI
2022-04-17 10:42:25 +02:00
Allen 15862ebc35
[mod] Pass desired ebay domain in settings
https://www.ebay.de
https://www.ebay.com
htttps://www.ebay.es

etc
2022-04-16 19:10:35 +02:00
Allen 155333f625
[enh] Allow passing headers/cookies from settings.yml
Example:

   - engine: xpath
   - search_url: example.org
   - headers: {'example_header': 'example_header'}
   - cookies: {'safesearch': 'off'}
2022-04-16 17:42:04 +02:00
Alexandre Flament c474616642
Merge pull request #1071 from return42/fix-lang-dailymotion
[fix] dailymotion engine: filter by language & country
2022-04-16 11:54:49 +02:00
Alexandre Flament 1a82e79b50 dailymotion: send valid value for the language parameter 2022-04-16 09:27:34 +02:00
Markus Heiser 3bb62823ec [fix] dailymotion engine: filter by language & country
- fix the issue of fetching more the 7000 *languages*
- improve the request function and filter by language & country
- implement time_range_support & safesearch
- add more fields to the response from dailymotion (allow_embed, length)
- better clean up of HTML tags in the 'content' field.

This is more or less a complete rework based on the '/videos' API from [1].
This patch cleans up the language list in SearXNG that has been polluted by the
ISO-639-3 2 and 3 letter codes from dailymotion languages which have never been
used.

[1] https://developers.dailymotion.com/tools/

Closes: https://github.com/searxng/searxng/issues/1065
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-16 09:27:34 +02:00
Jabster28 9eb1b04f48
change "Wolfram|Alpha" to "Wolfram Alpha" in search results 2022-04-12 10:37:33 +01:00
Alexandre Flament 592cea0e5e
Merge pull request #1030 from austinhuang0131/master
(feat) add jisho.org
2022-04-09 18:57:20 +02:00
Alexandre Flament 74c7aee9ec jisho : code refactoring 2022-04-09 18:01:57 +02:00
Austin Huang 19fa0095a0
(fix) satisfy the linter, and btw reduce timeout 2022-04-01 09:23:24 -04:00
Austin Huang a399248f56
update jisho.py according to suggestions 2022-04-01 09:18:19 -04:00
Alexandre FLAMENT f00cdb5e51 bing engine: _fetch_supported_languages: don't use the language code as a country
ref #1029
2022-03-31 20:03:34 +00:00
Austin Huang 934ae4e086
(feat) add jisho.org
Closes #1016
2022-03-31 14:45:39 -04:00
Alexandre Flament 378b29be2f fix startpage: update XPath in _fetch_supported_languages 2022-03-19 14:16:37 +01:00
Markus Heiser 53b5a804e2 [fix] engine mediathekviewweb: replace http links by https
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-07 19:49:16 +01:00
Markus Heiser 20f4538e13 [fix] engine: Semantic Scholar (Science) // rework & fix
Closes: https://github.com/searxng/searxng/issues/939
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-05 11:53:41 +01:00
Markus Heiser 8d937179ab
Merge pull request #913 from return42/add-artwork
[mod] add artwork to mixcloud & soundcloud engines
2022-02-21 22:24:40 +01:00
Markus Heiser b08b81b434 [mod] bandcamp & genius: in result set img_src instead thumbnail
Suggested-by: @dalf https://github.com/searxng/searxng/pull/900#issuecomment-1046009057
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser bded1ee280 [fix] genius: add player an avoid exceptional programming
Add player:

- The players are just playing 30sec from the title.  Some of the player will be
  blocked because of a cross-origin request and some players will link to apple
  when you press the play button.

Avoid exceptions and (and BTW improve results)

-  ERROR   searx.engines.genius          : list index out of range

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser 36aee70c24
Merge pull request #910 from tiekoetter/fix-909
[fix] google images engine: Fix 'scrap_img_by_id' function
2022-02-20 18:29:50 +01:00
Markus Heiser 2921d3cd17 [mod] add artwork to mixcloud & soundcloud engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 21:59:12 +01:00
Markus Heiser 4a28b593c2 [fix] google images engine: Fix 'scrap_img_by_id' function
The 'scrap_img_by_id' function didn't return any longer anything useful.  This
fix allows the google images engine to present the full source image instead of
only the thumbnail.

The function scrap_img_by_id() is rpelaced by a fully rewrite to parse image
URLs by a regular expression. The new function parse_urls_img_from_js(dom)
returns a mapping of data-id to image URL.

Closes: https://github.com/searxng/searxng/issues/909
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 14:33:56 +01:00
Alexandre Flament ace5401632
Merge pull request #900 from return42/fix-883
[fix] bandcamp: fix itemtype (album|track) and exceptions
2022-02-19 13:42:53 +01:00
Markus Heiser 943a7fdcb5 [mod] mediathekviewweb engine: add iframe_src and use videos template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 00:50:54 +01:00
Markus Heiser 05c105b837 [fix] bandcamp: fix itemtype (album|track) and exceptions
BTW: polish implementation and show tracklist for albums

Closes: https://github.com/searxng/searxng/issues/883
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 22:44:43 +01:00
Markus Heiser 7352c6bc79 [mod] templates: rename field for <iframe> URL to iframe_src
Rename result field data_src to iframe_src

Suggested-by: @dalf https://github.com/searxng/searxng/pull/882#issuecomment-1037997402
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 19:00:49 +01:00
Markus Heiser 98cab4cf75 [mod] result_templates/default.html replace embedded HTML by data_src audio_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Markus Heiser 46e131fdad [mod] result_templates/videos.html: replace embedded HTML by data_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src', an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Émilien Devos 7d3e8118b0
Update the XPath for fetching the Google results 2022-02-09 14:34:14 +01:00
Markus Heiser 906a0a99cd [fix] openstreatmap: load thumbnail from uploads.wikimedia.org
Openstreatmap images are now loaded from uploads.wikimedia.org instead of
commons.wikimedia.org to prevent redirects.

With `image_proxy` enabled images from commons.wikimedia.org cant be loaded
since they are redirected.  We already discussed this issue [875] and
@tiekoetter fixed this issue in PR [878].

Related-to:
- [875] https://github.com/searxng/searxng/issues/875
- [878] https://github.com/searxng/searxng/pull/878
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-07 13:05:52 +01:00
Markus Heiser a967e59590 [pylint] searx/engines/wikidata.py (no functional change)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-07 10:15:32 +01:00
Léon Tiekötter 1c151ae92b
[fix] wikidata: URL decoding and file extension handling
Add '.png' to the second img_src_name if it has the extension '.svg'.
Use urllib.parse.unquote for URL decoding.
2022-02-07 00:21:02 +01:00
Markus Heiser a13c5d70c7 [fix] wikidata engine: select image with higher (not lower) priority
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-06 23:35:55 +01:00
Léon Tiekötter a50f32bcfc
wikidata: load thumbnail instead of full image 2022-02-06 23:25:50 +01:00
Léon Tiekötter 560a14e77b
[fix] wikidata info box images
Wikidata info box images are now loaded from uploads.wikimedia.org instead of commons.wikimedia.org to prevent redirects

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-06 22:16:06 +01:00
Markus Heiser b35ef9789b [pylint] engines/invidious.py
Fix remarks from pylint and remove usless comments

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 15:42:06 +01:00
Markus Heiser e2ec6b4211 [fix] invidious engine: store random base_url in param
Two different threads ( = two different user queries) can call the request
function in a row and then the response function.  The namespace will be same
since this is the same engine.

To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 15:42:06 +01:00
Markus Heiser ddc2102a07 [fix] solidtorrents engine: store random bas_url in param
Two different threads ( = two different user queries) can call the request
function in a row and then the response function.  The namespace will be same
since this is the same engine.

To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:55:21 +01:00
Markus Heiser d6061b7c8a [mod] solidtorrents engine: add metadata & torrentfile
BTW: define min_len in eval_xpath_list of 'stats' list

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872910744
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:42 +01:00
Markus Heiser f9c4868142 [fix] solidtorrents engine: use get_torrent_size from searx.utils
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872858489
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:42 +01:00
Markus Heiser d92b3d96fd [fix] solidtorrents engine: JSON API no longer exists
The API endpoint, we where using does not exist anymore.  This patch is a
rewrite that parses the HTML page.

Related: https://github.com/paulgoio/searxng/issues/17
Closes: https://github.com/searxng/searxng/issues/858

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:37 +01:00
Markus Heiser 50a56532c4 [pylint] engines/currency_convert.py
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-01 08:02:42 +01:00
Markus Heiser 15320b5eec [fix] engines description - currency_convert.py
Currency engine has DuckDuckGo metadata

In the engine selector of the preferences window, the currency search engine has
the same metadata and wikidata url as duckduckgo, I'd assume there should be a
difference of some sort there clarifying what source the currency uses or, if
it's a duckduckgo service, at least clarifying that it's a currency service by
duck duck go.

Closes: https://github.com/searxng/searxng/issues/787
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-31 23:17:28 +01:00
Markus Heiser 60e7fee47a
Merge pull request #475 from return42/tineye
[enh] engine - add Tineye reverse image search
2022-01-31 08:51:35 +01:00
Alexandre Flament ebd3013a1a [mod] tineye engine: minor changes
* remove "disable: false" in settings.yml
* use the json() method from httpx.Response (faster character encoding detection)
2022-01-30 20:49:22 +01:00
Léon Tiekötter a6673a1a94 [fix] 1x engine
1x changed the XML result layout.
2022-01-30 19:48:40 +01:00
Markus Heiser a6b879f19c [mod] tineye engine: set engine_type to 'online_url_search'
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-30 16:30:52 +01:00
Alexandre Flament 116802852d [fix] ina engine
based on a45408e8e2
2022-01-28 22:33:41 +01:00
Markus Heiser b7f74fbe42 [mod] tineye - add some documentation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-28 09:06:44 +01:00
Allen 880555e263 [enh] engine - add Tineye reverse image search
Other optional parameter ..

`&sort=crawl_date`
    can be appended to search_string to sort results by date.

`&domain=example.org`
    can be implemented to search_string to get results from just one domain.

Public instances could get relatively fast timed-out for 3600s.

--

Merged from @allendema's commit [1] and slightly modfied / see [2].

Related-to: [1] 455b2b4460
Related-to: [2] https://github.com/searx/searx/pull/3040
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-28 09:06:44 +01:00
Léon Tiekötter 0cbf73a1f4
Allow 'using_tor_proxy' to be set for each engine individually
Check 'using_tor_proxy' for each engine individually instead of checking globally

[fix] searx.network: update _rdns test to the last httpx version

Co-authored-by: Alexandre Flament <alex@al-f.net>
2022-01-27 22:37:02 +01:00
Markus Heiser 1a0760c10a [fix] googel engine - "some results are invalids: invalid content"
Fix google issues listet in the `/stats?engine=google` and message::

    some results are invalids: invalid content

The log is::

    DEBUG   searx                         : result: invalid content: {'url': 'https://de.wikipedia.org/wiki/Foo', 'title': 'Foo - Wikipedia', 'content': None, 'engine': 'google'}
    WARNING searx.engines.google          : ErrorContext('searx/search/processors/abstract.py', 111, 'result_container.extend(self.engine_name, search_results)', None, 'some results are invalids: invalid content', ()) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-18 13:23:35 +01:00
Markus Heiser f0102a95c9 [fix] google engine: remove adds and fix mobile_ui selector
1. Fix issue reported in comment [1]
2. Fix XPath selector for the response of google's mobile UI, reported in
   comment [2]

[1] https://github.com/searxng/searxng/pull/777#issuecomment-1015121322
[2] https://github.com/searxng/searxng/pull/777#issuecomment-1015236238

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-18 11:05:45 +01:00
Émilien Devos 6670063e0d
Update XPath for Google engine 2022-01-17 21:49:57 +00:00
Alexandre Flament e07417848f
Merge pull request #695 from return42/fix-sp
[fix] startpage engine / modified API
2022-01-16 20:27:36 +01:00
Alexandre Flament f9271d595f [fix] startpage: workaround to use the startpage network
workaround for the issue #762
2022-01-15 22:56:34 +01:00
Markus Heiser bf593af423 [mod] engine mysql_server: make port configurable
Cherry piked from https://github.com/searx/searx/commit/82ac634070

Suggested-by: https://github.com/searx/searx/issues/3117
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-11 23:47:40 +01:00
Markus Heiser df238e944c [mod] starpage engine: add comment about Startpage's FFox add-on
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-10 11:22:38 +01:00
Markus Heiser 21e884f369 [fix] startpage engine: fetch CAPTCHA & issues related to PR-695
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.

[1] https://github.com/searxng/searxng/pull/695

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-10 11:22:38 +01:00
Markus Heiser 2f4e567e90 [fix] Get an actual `sc` argument from startpage's home page.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-10 11:22:38 +01:00
Markus Heiser 1cbcddb3f7 [pylint] Startpage engine
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-10 11:22:38 +01:00
Markus Heiser f1f5e69c42 [fix] startpage engine - avoid captcha
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:

1. some arguments has been removed and a new `sc` has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed

Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-10 11:22:12 +01:00
Martin Fischer 576e19dad1 [fix] add default for "about" engine property
Fixes #732.
2022-01-10 08:40:06 +01:00
Markus Heiser 4fc5e5299c [fix] ccengine engine - avoid unwanted redirects
api.openverse.engineering is a little picky and wants to have a trailing slash
in the path:

    /v1/images? -->/ v1/images/?

otherwise it redirects, here is the debug log:

    DEBUG   searx.network.openverse       : HTTP Request: GET https://api.openverse.engineering/v1/images?&page=1&page_size=20&format=json&q=foo "HTTP/2 301 Moved Permanently" (text/html; charset=utf-8)
    DEBUG   searx.network.openverse       : HTTP Request: GET https://api.openverse.engineering/v1/images/?&page=1&page_size=20&format=json&q=foo "HTTP/2 200 OK" (application/json)
    WARNING searx.engines.openverse       : ErrorContext('searx/search/processors/online.py', 105, 'count_error(', None, '1 redirects, maximum: 0', ('200', 'OK', 'api.openverse.engineering')) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-07 14:14:31 +01:00
Léon Tiekötter 37baf46ece [fix] Rename ccengine engine to openverse
The CC engine was merged with WordPress and renamed to Openverse

Source: https://wordpress.org/news/2021/05/welcome-to-openverse/
2022-01-07 13:06:05 +01:00
Léon Tiekötter 4be6deb0a1 [fix] ccengine engine
Change domain to api.openverse.engineering
2022-01-07 13:01:37 +01:00
Markus Heiser ced656606f
Merge pull request #709 from return42/drop-etools
[fix] drop etools engine module
2022-01-07 11:18:47 +01:00
Markus Heiser 5dd3442f83 [fix] drop etools engine module
The implementation of the etools engine is poor.  No date-range support, no
language support and it is broken by a CAPTCHA.

etools is a metasearch engine, the major search engines it supports (google,
bing, wikipedia, Yahoo) are already available in SeaarXNG.

While etools does support several engines we currently don't support directly,
support for them should be added directly to SearXNG if there is demand.

In practice: in SearXNG the worse etools results will be mixed with good results
from other engines we have (as long as there is no captcha).

At best case, what we win with etools is in e.g. results from de.ask.com in a
query from a german request .. in all other cases worse results are bubble up in
SearXNG's result list.

[1] https://github.com/searxng/searxng/issues/696#issuecomment-1005855499

Closes: https://github.com/searxng/searxng/issues/696
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-07 10:41:09 +01:00
Martin Fischer e12525a1fa
Merge pull request #708 from not-my-profile/pref-refactor
Refactor `preferences`
2022-01-07 09:45:23 +01:00
Léon Tiekötter 3ab826de22 Drop microsoft academic engine
Microsoft academic was discontinued on 2021-12-31.

Source: https://www.microsoft.com/en-us/research/project/academic/articles/microsoft-academic-to-expand-horizons-with-community-driven-approach/
2022-01-07 01:35:13 +01:00
Martin Fischer bb06758a7b [refactor] add type hints & remove Setting._post_init
Previously the Setting classes used a horrible _post_init
hack that prevented proper type checking.
2022-01-06 14:21:14 +01:00
Alexandre Flament aedd6279b3
Merge pull request #634 from not-my-profile/powered-by
Introduce `categories_as_tabs` & group engines in tabs
2022-01-06 09:22:02 +01:00
Alexandre Flament d3ecadd3f8
Merge pull request #679 from dalf/brand-searxng
searxng.org: update setup.py & settings.yml
2022-01-05 19:07:53 +01:00
Martin Fischer d01e8aa8cc [mod] introduce searx.engines.Engine for type hinting 2022-01-05 11:03:44 +01:00
Martin Fischer 1e195f5b95 [mod] move group_engines_in_tab to searx.webutils 2022-01-05 11:03:44 +01:00
Martin Fischer 5d74bf3820 [enh] move dictionaries, Erowid & IMDb out of general category
The general category is the category that is searched by default.
From a privacy standpoint it doesn't make sense to send all general
queries to specialized search engines that cannot deal with those
queries anyway.
2022-01-05 11:03:44 +01:00
Martin Fischer ab90e2ac49 [enh] show categories not in any tab category in "Other" preferences tab
Previously we didn't have a good place to put search engines that don't
fit into any of the tab categories. This commit automatically puts
search engines that don't belong to any tab category in an "other"
category, that is only displayed in the user preferences (and not above
search results).
2022-01-05 11:03:44 +01:00
Martin Fischer b02f762687 [enh] add more categories 2022-01-05 11:00:11 +01:00
Martin Fischer 8e9ad1ccc2 [enh] introduce categories_as_tabs
Previously all categories were displayed as search engine tabs.
This commit changes that so that only the categories listed under
categories_as_tabs in settings.yml are displayed.

This lets us introduce more categories without cluttering up the UI.
Categories not displayed as tabs  can still be searched with !bangs.
2022-01-03 07:01:49 +01:00
Martin Fischer df34b1ddcf [enh] settings.yml: allow granular overwrites for about 2022-01-03 07:01:49 +01:00
Alexandre Flament d83aa2b0d2
Merge pull request #613 from return42/pylint-bing-images
[pylint] Bing (Images) engine
2022-01-02 22:00:55 +01:00
Alexandre Flament 76cbfbbdda reference docs.searxng.org 2022-01-02 21:18:29 +01:00
Markus Heiser 61ce0c2244 [fix] bing engines: fetch_supported_languages
The Request to and the Response from https://www.bing.com/account/general has
been changed.

[1] https://github.com/searxng/searxng/pull/672#discussion_r777104919

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-01 17:31:38 +01:00
Markus Heiser dc4f1f705d [pylint] Bing (Images) engine
Fix remarks from pylint and remove obsolete try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-28 14:43:39 +01:00
Markus Heiser 6d7a38a912 [pylint] Bing (Videos) engine
Fix remarks from pylint and remove obsolete try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-28 14:33:05 +01:00
Markus Heiser d84226bf63 [fix] issues reported by pylint
Fix pylint issues from commit (3d96a983)

    [format.python] initial formatting of the python code

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 10:16:20 +01:00
Markus Heiser 3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Markus Heiser fcdc2c2cd2 [format.python] disable py code formatting for some hunks of code
Disable the python code formatting from python-black, where the readability of
code suffers by formatting.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:16:03 +01:00
Martin Fischer e28c6bda35 [doc] introduce about.language and sort engines by it 2021-12-21 09:58:51 +01:00
Markus Heiser 7a215e07e7
Merge pull request #611 from return42/fix-bing
[fix] bing engine: fix paging support, show inital page.
2021-12-20 10:08:52 +01:00
Markus Heiser 2af50c2588 [pylint] Reddit engine
Add Reddit engine to pylint process

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 17:59:47 +01:00
Markus Heiser 6b85607274 [fix] bing engine: fix paging support, show inital page.
Follow up queries for the pages needed to be fixed.

- Split search-term in one for initial query and one for following queries.
- Set some headers in HTTP requests, bing needs for paging support.
- IMO //div[@class="sa_cc"] does no longer match in a bing response.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 13:50:38 +01:00
Markus Heiser b2177e5916 [pylint] Bing (Web) engine
Fix remarks from pylint and improved code-style.  In preparation for a bug-fix
of the Bing (Web) engine I add this engine to the pylint-list.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 13:40:36 +01:00
Markus Heiser f41734a543 [fix] engine bing-news: replace the http:// by https://
BTW: add bing_news to the pylint process

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-17 13:25:50 +01:00
Markus Heiser 8cc7c880ae
Merge pull request #587 from dalf/fix-gigablast
[fix] gigablast engine
2021-12-12 15:58:13 +01:00
Markus Heiser b5c9cc4ff3
Merge pull request #586 from dalf/remove-yggtorrent
[del] remove yggtorrent
2021-12-07 07:00:47 +01:00
Alexandre Flament 1a6207574e [fix] gigablast engine
fetch extra params after 3000 seconds
2021-12-06 22:55:15 +01:00
Alexandre Flament fbc2a6ab4b [del] remove yggtorrent
yggtorrent is behind cloudflare now
close #580
2021-12-06 21:59:51 +01:00
Alexandre Flament 037cb7dd3d [fix] imdb: don't crash when there is no result 2021-12-06 21:49:18 +01:00
Markus Heiser 6e06618e0c [fix] google-videos engine: ignore news articles
In the video search, google also sometimes includes news.  E.g. in the DE
language when you search for `!gov paris`, google adds an article from a german
newspaper (FAZ), I assume these are sponsored link (not tagged advertisement?)

Those links do not have an image / this patch ignores *video links* wqithout an
image ID.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26 17:11:20 +01:00
Markus Heiser 1ce09df9aa [fix] google video engine - rework of the HTML parser
The google video response has been changed slightly, a rework of the parser was
needed.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26 01:14:17 +01:00
Markus Heiser 488ace1da9 [fix] google engine - suggestion
BTW: google no longer offers *spelling suggestions*

Closes: https://github.com/searxng/searxng/issues/442
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-25 19:42:03 +01:00
Markus Heiser 5b28c9109f [fix] google images: @href index 0 not found
Sometimes there is no href in the `<a ..>` tag of a *link_node* [1].

[1] https://github.com/searxng/searxng/issues/532

Reported-by: @TheEssem
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-21 09:55:59 +01:00
Markus Heiser 4c82ac7670 [drop] engine digg - https://digg.com/api is no longer available
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-19 15:00:22 +01:00
Tom e1d60051ca
[fix] Qwant search query string
Search string: "!qwant time"
Resulting request URL: https://api.qwant.com/v3/search/web?q=q=time&count=10&offset=0&device=desktop&safesearch=1&locale=en_US
Notice the double "q="

Resulting request URL after fix: https://api.qwant.com/v3/search/web?q=time&count=10&offset=0&device=desktop&safesearch=1&locale=en_US
2021-11-17 18:13:54 +01:00
MrPaulBlack 41494d9f47 [fix] make reddit only in social media category avail.
fix https://github.com/searxng/searxng/issues/470
2021-11-01 20:37:17 +01:00
Alexandre Flament 64b29ad838 [mod] microsoft academic: increase timeout to 6 seconds
also avoid a crash when there is no result

close #433
2021-10-26 12:26:43 +02:00
Markus Heiser 713814547a [fix] yahoo engine - don't lump all search suggestions together
Closes: https://github.com/searxng/searxng/issues/421
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-21 07:51:05 +00:00
Markus Heiser f63ffbb22b [fix] engine - yahoo: rewrite and fix issues
Languages are supported by mapping the language to a domain.  If domain is not
found in :py:obj:`lang2domain` URL ``<lang>.search.yahoo.com`` is used.

BTW: fix issue reported at https://github.com/searx/searx/issues/3020

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-16 20:05:26 +00:00
Markus Heiser 38a157b56f [pylint] engines: yahoo fix several issues reported from pylint
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-16 20:05:26 +00:00
MrPaulBlack 00b0394e19 [fix] language param for qwant 2021-10-14 16:11:44 +00:00
Noémi Ványi 4cc1ee8565 [fix] qwant engine - only get results from categories
Reported-by: https://github.com/searx/searx/issues/3014
Cherry-picked: https://github.com/searx/searx/commit/3bcca43
2021-10-12 18:42:50 +00:00
Paolo Basso 64df011e2f [mod] engines - add zlibrary engine 2021-10-11 14:58:44 +00:00
Markus Heiser 3abbe6d25b [fix] engine torznab - categories, before join convert int to str
BTW add init() function and replace SearxEngineAPIException by ValueError.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-07 15:27:55 +00:00
Markus Heiser 9fb77065bd [fix] engine torznab - marginal issues reported from linters
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-07 15:27:55 +00:00
Paolo Basso d803df8d89 [mod] engines - add torznab WebAPI 2021-10-07 15:27:55 +00:00
Markus Heiser 19e41c137e [mod] set 'engine.supported_languages' from the origin python module
The key of the dictionary 'searx.data.ENGINES_LANGUAGES' is the *engine name*
configured in settings.xml.  When multiple engines are configured to use the
same origin engine (e.g. `engine: google`)::

    - name: google
      engine: google
      use_mobile_ui: false
      ...

    - name: google italian
      engine: google
      use_mobile_ui: false
      language: it
      ...

    - name: google mobile ui
      engine: google
      shortcut: gomui
      use_mobile_ui: true

There exists no entry for ENGINES_LANGUAGES[engine.name] (e.g. `name: google
mobile ui` or `name: google italian`).  This issue can be solved by recreate the
ENGINES_LANGUAGES::

    make data.languages

But this is nothing an SearXNG admin would like to do when just configuring
additional engines, since this just doubles entries in ENGINES_LANGUAGES and
BTW: `make data.languages` has various external requirements which might be not
installed or not available, on a production host.

With this patch, if engine.name fails, ENGINES_LANGUAGES[engine.engine] is used
to get the engine.supported_languages (e.g. `google` for the engine named
`google mobile`).

For an engine, when there is `language: ...` in the YAML settings, the engine
supports only one language, in this case engine.supported_languages should
contains this value defined in settings.yml (e.g. `it` for the engine named
`google italian`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Closes: https://github.com/searxng/searxng/issues/384
2021-10-07 08:45:02 +02:00
Alexandre Flament 8a897b86f1 [mod] engines - IMDB: add thumbnails 2021-10-05 09:10:02 +02:00
Paul Alcock 823d44ed0a [mod] engines - add IMDB / Internet Movie Database
Merged from @Guilvareux's commit [1] and slightly modfied / see [2].

[1] https://github.com/searx/searx/pull/2980/commits/f2f90071
[2] https://github.com/searx/searx/pull/2980
2021-10-03 11:44:25 +02:00
Markus Heiser a5b7ed9550 [mod] engine duckduckgo - update supported_languages_url
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-01 20:01:41 +02:00
Markus Heiser 4c9b8b29ee [mod] engine duckduckgo - use DuckDuckGo-Lite
Implement a scrapper for DuckDuckGo-Lite [1].  The existing DuckDuckGo [2]
engine does not support paging.  DuckDuckgo-Lite is much faster, less verbose
and does have a paging option (reversed engineered from the input form of [1]).

[1] https://lite.duckduckgo.com/lite
[2] https://duckduckgo.com/

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-01 20:01:41 +02:00
Markus Heiser ecb3912bd0 [fix] engine stackexchange - decode HTML entities in title & content
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-29 08:08:18 +02:00
Markus Heiser b62851559b [mod] replace old stackoverflow engine by Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-28 19:12:37 +02:00
Markus Heiser 55fee1e45d [mod] engines - add Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-28 19:01:04 +02:00
Alexandre Flament b046322c7b
Merge pull request #333 from dalf/enh-engine-descriptions
RFC: /preferences: display engine descriptions
2021-09-25 11:29:25 +02:00
Alexandre Flament ab569c1e12 [fix] openstreetmap engine: optmizer SPARQL query
add
hint:Query hint:optimizer "None".
to the SPARQL query to keep the response time small.

It tells the optimizer to follow the path from ?item to the different property values
instead of the other way around.
See https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimization#Property_paths
2021-09-25 11:16:22 +02:00
Alexandre Flament 8961131497 [fix] fix the about section of some engines 2021-09-24 20:20:30 +02:00
Alexandre Flament 6f11b61cd5 [fix] openstreetmap engine: map "all" language to English 2021-09-24 20:12:18 +02:00
Markus Heiser 443bf35e09 [pylint] fix global-variable-not-assigned issues
If there is no write access, there is no need for global.  Remove global
statement if there is no assignment.

global-variable-not-assigned:
  Using global for names but no assignment is done Used when a variable is
  defined through the "global" statement but no assignment to this variable is
  done.

In Pylint 2.11 the global-variable-not-assigned checker now catches global
variables that are never reassigned in a local scope and catches (reassigned)
functions [1][2]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.11.html
[2] https://github.com/PyCQA/pylint/issues/1375

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-17 10:14:27 +02:00
Alexandre Flament 602cbc2c99
Merge pull request #297 from dalf/engine-logger-enh
debug mode: more readable logging
2021-09-14 07:06:28 +02:00
Alexandre Flament f8793fbda0 [fix] logger per engine: make .logger is always initialized
the openstreetmap engine imports code from the wikidata engine.
before this commit, specific code make sure to copy the logger variable to the wikidata engine.

with this commit searx.engines.load_engine makes sure the .logger is initialized.
The implementation scans sys.modules for module name starting with searx.engines.
2021-09-13 08:47:59 +02:00
Alexandre Flament 0e42db9da1 [mod] xpath engine: remove logging of the requested URL 2021-09-11 10:13:16 +02:00
Markus Heiser f0059b80ed [pylint] engines: drop no longer needed 'missing-function-docstring'
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 13:26:59 +02:00
Markus Heiser 82847df300 [fix] add 'categories' to PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES
androp no longer needed (see line 591 in 7b235a1)::

    # pylint: disable=undefined-variable

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 10:29:38 +02:00
Markus Heiser cd033b5416 [fix] drop useless pylint: disable=undefined-variable
Since 7b235a1 (see line 591) it is no longer needed to disable
'undefined-variable' for names defined in::

   PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 10:26:15 +02:00
Alexandre Flament ea60c03827 [fix] fix openstreetmap engine
close #298

This is a workaround: inside engine code, any call to function in another engine can crash
since the logger won't be initialized except if it is done explicitly.
2021-09-06 22:44:22 +02:00
Markus Heiser aecfb2300d [mod] one logger per engine - drop obsolete logger.getChild
Remove the no longer needed `logger = logger.getChild(...)` from engines.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 18:05:46 +02:00
Markus Heiser 7b235a1c36 [mod] one logger per engine
Suggested-by: @dalf in https://github.com/searxng/searxng/issues/98#issuecomment-849013518
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 17:47:28 +02:00
Markus Heiser 9ff881f937 [fix] remove minimum length of content for XPath engine
Instead of raising an exception and therefore hiding all results of the engine.

It make sense to remove that requirement in order to allow the implementation of
search engines that do not always have a description.  In fact some search
engines that in 99% of the case have a description like Brave Search or Mojeek
crash completely if they for some reason included a result with no description.

To test this patch try Mojeek:

    !mjk xyz

before and after the patch.

Suggested-by: 0xhtml in https://github.com/searx/searx/discussions/2933
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-04 12:41:23 +02:00
Allen a5a0a4e106 [fix] Correct engine name in for Rumble 2021-09-04 10:22:26 +02:00
Allen 49bbd250d9 [fix] Update about section of Invidious
Another website and new documentation
2021-09-04 10:22:07 +02:00
Markus Heiser b83c14cf6b [pylint] Pylint 2.10 - fix use-list-literal & use-dict-literal
Pylint 2.10 added new default checks [1]:

use-list-literal
  Emitted when list() is called with no arguments instead of using []

use-dict-literal
  Emitted when dict() is called with no arguments instead of using {}

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31 10:40:29 +02:00
Noémi Ványi 3d5e6e0abb [enh] google: add filter=0 to Google engine for more results
backport from searx ( 23b3b56a06ef831af0a1b30a12c26ebd50e329bb )
2021-08-21 17:46:16 +02:00
Samuel Dudik 7a7ef9cea6 [fix] Seznam engine - some XPath selectors has been changed
Merged from https://github.com/dudik/searx/commit/5a4207759

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-27 07:13:41 +02:00
Alexandre Flament 48fe83b901
Merge pull request #221 from dalf/fix-peertube_fetch_supported_languages
[fix] peertube: update _fetch_supported_languages
2021-07-25 10:30:53 +02:00
Markus Heiser fe67f1478f [fix] qwant engine - prevent API locale exception on lang 'all'
Has been reported in [1], error message::

    Error
        Error: searx.exceptions.SearxEngineAPIException
        Percentage: 0
        Parameters: ('API error::locale must be a string,locale must be one of
        the following values: en_gb, en_ie, en_us, en_ca, en_in, en_my, en_au,
        en_nz, cy_gb, gd_gb, de_de, de_ch, de_at, fr_fr, br_fr, fr_be, fr_ch,
        fr_ca, fr_ad, fc_ca, ec_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx,
        es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_br,
        pt_pt, pt_ad, nl_be, nl_nl, pl_pl, zh_hk, zh_cn, fi_fi, bg_bg, et_ee,
        hu_hu, da_dk, nb_no, sv_se, ko_kr, th_th, cs_cz, ro_ro, el_gr',)
        File name: searx/engines/qwant.py:114
        Function: response
        Code: raise SearxEngineAPIException('API error::' + msg)

[1] https://github.com/searxng/searxng/issues/222

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-24 14:48:27 +02:00
Markus Heiser ca57c7421b [fix] qwant engine - prevent exception on date/time value is None
Has been reported in [1], error messages::

  Error
       Error: ValueError
       Percentage: 0
       Parameters: ()
       File name: searx/engines/qwant.py:159
       Function: response
       Code: pub_date = datetime.fromtimestamp(item['date'], None)

    Error
        Error: TypeError
        Percentage: 0
        Parameters: ('an integer is required (got type NoneType)',)
        File name: searx/engines/qwant.py:196
        Function: response
       Code: pub_date = datetime.fromtimestamp(item['date'])

Fix timedelta from seconds to milliseconds [1], error message::

    Error
        Error: TypeError
        Percentage: 0
        Parameters: ('unsupported type for timedelta seconds component: NoneType',)
        File name: searx/engines/qwant.py:195
        Function: response
        Code: length = timedelta(seconds=item['duration'])

[1] https://github.com/searxng/searxng/issues/222

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-24 14:48:14 +02:00
Alexandre Flament b0a12924a0 [fix] peertube: update _fetch_supported_languages
update the regex to match the changes in peertube source code
fix "make data.languages"
2021-07-23 12:03:16 +02:00
Alexandre Flament f523fd3ea7
Merge pull request #211 from MarcAbonce/onions_v3_fix_searxng
Update onion engines to v3
2021-07-16 17:25:37 +02:00
Alexandre Flament d47b8e36cf
Merge pull request #207 from return42/mongodb
[enh] add mongodb offline engine
2021-07-16 16:15:01 +02:00
Alexandre Flament 0d65a81b1c [mod] qwant engine: fix typos / minor change
minor modification of commit 628b5703f3
(no functionnal change)
2021-07-16 15:32:12 +02:00
Marc Abonce Seguin 1b05ea6a6b update onion engines to v3
remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
2021-07-16 01:36:34 -07:00
Markus Heiser 0a9cd08bf1 [enh] add mongodb offline engine
Cherry-Pick: https://github.com/searx/searx/commit/198aad43
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-15 21:35:33 +02:00
Markus Heiser 628b5703f3 [mod] improve video results of the qwant engine
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-15 20:10:37 +02:00
Alexandre Flament f376b4ed3e
Merge pull request #205 from unixfox/patch-2
Add missing parameter for mobile UI search
2021-07-15 17:19:12 +02:00
Émilien Devos 6c9f276571
Add missing parameter for mobile UI search 2021-07-15 13:00:32 +00:00
Markus Heiser ef6e1bd6b9 [fix] Qwant engines - implement API v3 and add 'quant videos'
The implementation uses the Qwant API (https://api.qwant.com/v3). The API is
undocumented but can be reverse engineered by reading the network log of
https://www.qwant.com/ queries.

This implementation is used by different qwant engines in the settings.yml::

  - name: qwant
    categories: general
    ...
  - name: qwant news
    categories: news
    ...
  - name: qwant images
    categories: images
    ...
  - name: qwant videos
    categories: videos
    ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-14 09:47:32 +02:00
Markus Heiser 513c73a309 [drop] engine torrentz: torrentz2.eu and torrentz2.is are offline
[1] https://torrentfreak.com/torrentz2-eu-domain-suspended-by-registry-on-public-prosecutors-order-200628/

Suggested-by: @rasos https://github.com/searx/searx/issues/1875#issuecomment-877755872
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-11 13:24:33 +02:00
Émilien Devos d9d9bd720d
Fix google images
Proposed fix in https://github.com/searx/searx/pull/2115#issuecomment-876716010
2021-07-10 14:09:29 +00:00
Markus Heiser 0ef6aa5126 [docs] add documentation from the sources of the google engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 18:25:52 +02:00
Markus Heiser 05e90f2e57 [fix] google answers: normalize space of the answers.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 16:50:25 +02:00
Markus Heiser f096d68ec6 [mod] google engine: reduce mobile UI parameters to what is needed
Reverse engineering shows that not all of the parameters used by google's mobile
UI (aka "more results" button) are needed [1].

[1] https://github.com/searxng/searxng/pull/160#issuecomment-865013625

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 16:50:23 +02:00
Alexandre Flament 7a5c36408a [mod] google: add "use_mobile_ui" parameter to use mobile endpoint.
disable by default, it has to be enabled in settings.yml

related to  #159
2021-06-21 14:52:04 +02:00
Markus Heiser 9328c66e93 [fix] google news - send CONSENT Cookie to not be redirected
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!) which requires consent to tracking.  To get the consent
from the user, google-news requests are redirected to confirm and get a CONSENT
Cookie from https://consent.google.de/s?continue=...

This patch adds a CONSENT Cookie to the google-news request to avoid
redirection.

The behavior of the CONTENTS cookies over all google engines seems similar but
the pattern is not yet fully clear to me, here are some random samples from my
analysis ..

Using common google search from different domains::

    google.com:        CONSENT=YES+cb.{{date}}-14-p0.de+FX+816
    google.de:         CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    google.fr:         CONSENT=YES+srp.gws-{{date}}-0-RC2.fr+FX+826

When searching about videos (google-videos)::

    google.es:         CONSENT=YES+srp.gws-{{date}}-0-RC2.es+FX+076
    google.de:         CONSENT=YES+srp.gws-{{date}}-0-RC2.de+FX+171

Google news has only one domain for all languages::

    news.google.com:   CONSENT=YES+cb.{{date}}-14-p0.de+FX+816

Using google-scholar search from different domains::

    scholar.google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    scholar.google.fr: does not use such a cookie / did not ask the user
    scholar.google.es: does not use such a cookie / did not ask the user

Interim summary:

  Pattern is unclear and I won't apply the CONSENT cookie to all google engines.
  More experience is need before we generalize the CONSENT cookies over all
  google engines.

Related:

- e9a6ab401 [fix] youtube - send CONSENT Cookie to not be redirected
- https://github.com/benbusby/whoogle-search/issues/311
- https://github.com/benbusby/whoogle-search/issues/243

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18 13:21:20 +02:00
Markus Heiser dd7b53d369 [fix] google-news engine - KeyError: 'hl in request
Since we added

- 1c67b6aec [enh] google engine: supports "default language"

there is a KeyError: 'hl in request,error pattern::

    ERROR:searx.searx.search.processor.online:engine google news : exception : 'hl'
    Traceback (most recent call last):
      File "searx/search/processors/online.py", line 144, in search
        search_results = self._search_basic(query, params)
      File "searx/search/processors/online.py", line 118, in _search_basic
        self.engine.request(query, params)
      File "searx/engines/google_news.py", line 97, in request
        if lang_info['hl'] == 'en':
      KeyError: 'hl'

Closes: https://github.com/searxng/searxng/issues/154
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18 11:34:11 +02:00
Markus Heiser 343570f7fb [pylint] searx/engines/duckduckgo_definitions.py
BTW: normalize indentations

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-14 09:22:29 +02:00
Markus Heiser 2ac3e5b20b [fix] log messages from: google- images, news, scholar, videos
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language']
- remove obsolete query_url log messages which is already logged by
  httpx._client:HTTP request

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11 16:31:50 +02:00
Markus Heiser 1ac3961336 [mod] google - get_lang_info add documentataion & comments
BTW: remove obsolete log messages from google engine

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11 16:06:36 +02:00
Alexandre Flament 1c67b6aece [enh] google engine: supports "default language"
Same behaviour behaviour than Whoogle [1].  Only the google engine with the
"Default language" choice "(all)"" is changed by this patch.

When searching for a locate place, the result are in the expect language,
without missing results [2]:

  > When a language is not specified, the language interpretation is left up to
  > Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle.  With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add a ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other
google engines.  With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
2021-06-10 10:22:01 +02:00
Markus Heiser bf10b4a857 [fix] openstreetmap - fix some minor whitespace & indentation issues
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-09 18:08:23 +02:00
Alexandre Flament c75425655f [enh] openstreetmap / map template: improve results
implements ideas described in #69

* update the engine
* use wikidata
* update map.html template
2021-06-09 18:08:23 +02:00
Markus Heiser 5c5db719d2
Merge pull request #97 from return42/drop-searx-admin
[docs] reorder blog articles
2021-06-08 10:56:18 +00:00
Alexandre Flament 8194db4e21 [fix] peertube fetch supported languages
close #127
2021-06-04 16:17:20 +02:00
Markus Heiser f122cb0e27 [fix] typo: online_dictionnary --> online_dictionary
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-04 15:05:58 +02:00
Markus Heiser 79cc82a4db [docs] add engine "Demo Online Engine"
This engine just exists for documentation purpose.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-04 15:05:58 +02:00
Markus Heiser 1c8cf1d3a8 [docs] add engine "Demo Offline Engine"
This engine just exists for documentation purpose.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-04 15:04:38 +02:00
Alexandre Flament 7457f3fe40
Merge pull request #124 from return42/searx-merge
merge redis offline engine from searx
2021-06-02 12:35:33 +02:00
Markus Heiser 39c18274c6 [fix] enigine redis - avoid error when the engine is loaded
Should be _redis_client to avoid an error when the engine is loaded.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/124#pullrequestreview-673885664
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-02 09:54:58 +02:00
Alexandre Flament 8375974dff [fix] sys.exit(1) when there is duplicate engine name 2021-06-01 16:37:20 +02:00
Markus Heiser 8908937046 [mod] searx.engines.load_engine return None instead of sys.exit(1)
Loading an engine should not exit the application (*). Instead
of exit, return None.

(*) RuntimeError still exit the application: syntax error, etc...

BTW: add documentation and normalize indentation (no functional change)

Suggested-by: @dalf https://github.com/searxng/searxng/pull/116#issuecomment-851865627
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-01 16:35:17 +02:00
Alexandre Flament 70a9208972 [mod] searx.engines.__init__: refactoring 2021-06-01 16:32:40 +02:00
Adam Tauber e4b6558339 [enh] add redis offline engine / https://redis.io/
Slightly modified merge of commit [97269be6], [01a8a5814a] and [c8d2b5eb] from
searx.

[97269be6] https://github.com/searx/searx/commit/97269be6
[01a8a581] https://github.com/searx/searx/commit/01a8a581
[c8d2b5eb] https://github.com/searx/searx/commit/c8d2b5eb

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-01 11:51:25 +02:00
Alexandre Flament 4b07df62e5 [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
Kyle Anthony Williams d6a2d4f969 [enh] add engine - Docker Hub
Slightly modified merge of commit [1cb1d3ac] from searx [PR 2543]:

      This adds Docker Hub .. as a search engine .. the engine's favicon was
      downloaded from the Docker Hub website with wget and converted to a PNG
      with ImageMagick .. It supports the parsing of URLs, titles, content,
      published dates, and thumbnails of Docker images.

[1cb1d3ac] https://github.com/searx/searx/pull/2543/commits/1cb1d3ac
[PR 2543] https://github.com/searx/searx/pull/2543

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-30 15:18:36 +02:00
Alexandre Flament 1113f7e616 [mod] the bittorent search engines are available only in the files category
related to #101
2021-05-29 16:14:19 +02:00
Noémi Ványi 87a01a1736 [enh] add MySQL engine
Slightly modified merge of [c00a33fe] from searx.

[c00a33fe] c00a33feee

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 17:36:46 +02:00
Noémi Ványi 324aa96062 [enh] add PostgreSQL engine
Slightly modified merge of [22079ff] from searx.

[22079ff] 22079ffdef

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 17:34:44 +02:00
Markus Heiser 32b5a0ef7b
Merge pull request #93 from return42/genius-misc
Some minor Genius improvements
2021-05-27 14:23:22 +00:00
Markus Heiser 25b5797a0c
Merge pull request #103 from searxng/add-sqlite-engine2
[enh] add offline engine for sqlite database
2021-05-27 14:06:42 +00:00
Alexandre Flament 2ea34a3c36 [enh] add offline engine for sqlite database
To test & demonstrate this implementation download:

  https://liste.mediathekview.de/filmliste-v2.db.bz2

and unpack into searx/data/filmliste-v2.db, in your settings.yml define a sqlite
engine named "demo"::

    - name : demo
      engine : sqlite
      shortcut: demo
      categories: general
      result_template: default.html
      database : searx/data/filmliste-v2.db
      query_str :  >-
        SELECT title || ' (' || time(duration, 'unixepoch') || ')' AS title,
               COALESCE( NULLIF(url_video_hd,''), NULLIF(url_video_sd,''), url_video) AS url,
               description AS content
          FROM film
         WHERE title LIKE :wildcard OR description LIKE :wildcard
         ORDER BY duration DESC
      disabled : False

Query to test: "!demo concert"

This is a rewrite of the implementation from commit [1]

[1] searx/searx@8e90a21

Suggested-by: @virtadpt searx/searx#2808
2021-05-27 14:27:11 +02:00
Markus Heiser dc21cb5d4b [fix] unsplash engine - 'searx:result: invalid title:'
- Use result 'alt_description' as title, if not given use
  default title 'unknown'.
- Use result 'description' from unsplash as 'content'

Fix error::

    DEBUG:searx:result: invalid title: {..., 'title': None, 'content': '', 'engine': 'unsplash'}

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-25 17:26:58 +02:00
Markus Heiser a88e3e4fea [pylint] searx/engines/unsplash.py, add logger & norm indentation
- fix messages from pylint
- add logger and log request URL
- normalized various indentation

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-25 16:45:32 +02:00
Markus Heiser f963759ccc [fix] engine genius should not use the video template
Remove 'template' from result.  Engine genius should
not use the video template.  BTW: fix indentations

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 16:31:14 +02:00
Markus Heiser 3a71d4b175 [pylint] searx/engines/genius.py, add logger & normalized indentation
- pylint searx/engines/genius.py
- add logger and log ignored exceptions
- normalized various indentation

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 16:19:06 +02:00
Markus Heiser 84a943f867 [enh] XPath engine - add time safe-search support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 22:26:18 +02:00
Markus Heiser 6bfe3fd033 [enh] XPath engine - add time range support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 16:49:30 +02:00
Markus Heiser 1933577c8e [enh] XPath engine - add ISO 639-1 {lang} replacement to search-URL
BTW: remove obsolte params['query'] and not needed paging condition.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 15:05:36 +02:00
Markus Heiser 8cd544b2a6 [doc] add documentation about the XPath engine
- pylint searx/engines/xpath.py
- fix indentation of some long lines
- add logging
- add doc-strings

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 11:48:21 +02:00
Markus Heiser ffcebf5e12 [enh] xpath engine - add request parameter 'soft_max_redirects'
Make 'soft_max_redirects' configurable per Xpath engine::

    - name : <engine-name>
      engine : xpath
      soft_max_redirects: 1
      ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-17 15:04:55 +02:00
Alexandre Flament 8c1a65d32f [mod] multithreading only in searx.search.* packages
it prepares the new architecture change,
everything about multithreading in moved in the searx.search.* packages

previously the call to the "init" function of the engines was done in searx.engines:
* the network was not set (request not sent using the defined proxy)
* it requires to monkey patch the code to avoid HTTP requests during the tests
2021-05-05 13:12:42 +02:00
Marc Abonce Seguin 448bfe6005 fix Qwant's fetch_languages function 2021-05-02 17:46:40 -07:00
Michael Ilsaas 0c43cf89ca [fix] URL to solidtorrent result page
Reported-by: https://github.com/searx/searx/pull/2786
2021-04-29 10:40:47 +02:00
Markus Heiser dc29f1d826 [pylint] tag PYLINT_FILES by comment `# lint: pylint`
These py files are linted by `test.pylint`, all other files are linted by
`test.pep8`.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-26 20:18:20 +02:00
Markus Heiser 28b25185c5 [brand] searxng -- fix links to issue tracker & WEB-GUI
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-25 14:25:08 +02:00
Markus Heiser 8efabd3ab7 [mod] core.ac.uk engine
- add to list of pylint scripts
- add debug log messages
- move API key int `settings.yml`
- improved readability
- add some metadata to results

Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-04-24 09:00:53 +02:00
spongebob33 7528e38c8a add core.ac.uk engine 2021-04-24 08:55:45 +02:00
Alexandre Flament d01741c9a2
Merge pull request #15 from return42/add-springer
Add a search engine for Springer Nature
2021-04-22 13:23:31 +02:00
Pierre Chevalier a80bf1ba97 [enh] Add Springer Nature engine
Springer Nature is a global publisher dedicated to providing service to research
community [1] with official API [2].

To test this PR, first get your API key following this page:

   https://dev.springernature.com/signup

In searx/engines/springer.py at line 24, add this API key.  I left my own key,
commented out in the line aboce.  Feel free to use it, if needed.

[1] https://www.springernature.com/
[2] https://dev.springernature.com/
2021-04-22 12:35:25 +02:00
habsinn 41a2e3785e [enh] add engine using API from "The Art Institute of Chicago" 2021-04-22 12:25:43 +02:00
Markus Heiser e9a6ab4015 [fix] youtube - send CONSENT Cookie to not be redirected
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!) which requires consent to tracking.  To get the consent
from the user, youtube requests are redirected to confirm and get a CONSENT
Cookie from https://consent.youtube.com

This patch adds a CONSENT Cookie to the youtube request to avoid redirection.

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Reported-by: https://github.com/searx/searx/issues/2774
2021-04-22 12:09:09 +02:00
Alexandre Flament c6d5605d27
Merge pull request #7 from searxng/metrics
Metrics
2021-04-22 08:34:17 +02:00
Alexandre Flament b7848e3422 [fix] searxng fix: sjp engine 2021-04-21 16:31:29 +02:00
Alexandre Flament 7acd7ffc02 [enh] rewrite and enhance metrics 2021-04-21 16:24:46 +02:00
Alexandre Flament aae7830d14 [mod] refactoring: processors
Report to the user suspended engines.

searx.search.processor.abstract:
* manages suspend time (per network).
* reports suspended time to the ResultContainer (method extend_container_if_suspended)
* adds the results to the ResultContainer (method extend_container)
* handles exceptions (method handle_exception)
2021-04-21 16:24:46 +02:00
Alexandre Flament 48720e20a8 Merge remote-tracking branch 'searx/master' 2021-04-19 09:35:12 +02:00
Noémi Ványi 8362257b9a
Merge pull request #2736 from plague-doctor/sjp
Add new engine: SJP - Słownik języka polskiego
2021-04-16 17:30:14 +02:00
Noémi Ványi e56323d3c8
Merge pull request #2759 from ypid/fix/typo
Fix grammar mistake in debug log output
2021-04-16 17:26:45 +02:00
Plague Doctor d275d7a35e Code refactoring. 2021-04-16 12:23:27 +10:00
Markus Heiser 062d589f86 [fix] xpath expressions to grap all items from bandcamp's response
I also found some items missing a thumbnail and I used text_extract for content
and title, to remove unneeded whitespaces.

BTW: added bandcamp's favicon

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-15 08:52:11 +02:00
Kyle Anthony Williams 4d3c399ee9 [feat] add bandcamp engine 2021-04-15 08:52:11 +02:00
Alexandre Flament d14994dc73 [httpx] replace searx.poolrequests by searx.network
settings.yml:

* outgoing.networks:
   * can contains network definition
   * propertiers: enable_http, verify, http2, max_connections, max_keepalive_connections,
     keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
   * retries: 0 by default, number of times searx retries to send the HTTP request (using different IP & proxy each time)
   * local_addresses can be "192.168.0.1/24" (it supports IPv6)
   * support_ipv4 & support_ipv6: both True by default
     see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
   * either a full network description
   * either reference an existing network

* all HTTP requests of engine use the same HTTP configuration (it was not the case before, see proxy configuration in master)
2021-04-12 17:25:56 +02:00
Robin Schneider dfc66ff0f0
Fix grammar mistake in debug log output 2021-04-11 22:12:53 +02:00
Alexandre Flament eaa694fb7d [enh] replace requests by httpx 2021-04-10 15:38:33 +02:00
Plague Doctor 599ff39ddf Fix conflicts 2021-04-09 06:54:03 +10:00
Plague Doctor 6631f11305 Add new engine: SJP 2021-04-08 10:21:54 +10:00
Plague Doctor 7035bed4ee Add new engine: Wordnik.com 2021-04-08 09:58:00 +10:00
Noémi Ványi 07f5edce3d Add Meilisearch engine
Website: https://www.meilisearch.com/
2021-04-06 21:57:05 +02:00
Alexandre Flament 725a69616b
Merge pull request #2681 from dalf/fix-wikipedia-title
[fix] wikipedia: remove HTML from the title
2021-03-27 17:43:36 +01:00
Noémi Ványi 9bb312c505 Remove duplicated key from dict in Semantic Scholar 2021-03-27 16:58:32 +01:00
Noémi Ványi f596f5767b fix Semantic Scholar engine 2021-03-27 16:54:01 +01:00
Adam Tauber 28286cf3f2 [fix] update seznam engine to be compatible with the new website 2021-03-27 15:29:04 +01:00
Alexandre Flament fcfcf662ff [fix] wikipedia: remove HTML from the title
fr.wikipedia.org (and it seems not other wikipedia websites),
adds HTML to api_result['displayTitle'].
(Search for '!wp :fr Braid' for example)

The commit uses api_result['title']
2021-03-25 08:31:39 +01:00
Adam Tauber 0ba71c3644 [fix] make ina engine compatible with the new response json 2021-03-25 01:20:41 +01:00
Adam Tauber 5f450fda74 [enh] add year filter to duckduckgo 2021-03-25 00:25:36 +01:00
Adam Tauber fd737dc9d8 [fix] remove debug code 2021-03-24 23:54:39 +01:00
Alexandre Flament 38c210d746
[mod] soundcloud: faster initialization
The get_cliend_id() function:
* fetches https://soundcloud.com
* then fetches each referenced javascript URL to get the client id.

This commit fetches the javascript URLs in the reverse order: the client id is in the last javascript URL.
2021-03-21 09:29:53 +01:00
Adam Tauber 4c631ac6d0 [fix] remove debug code 2021-03-15 21:47:27 +01:00
Noémi Ványi 8158d8654a fix Microsoft Academic engine 2021-03-15 20:21:28 +01:00
Adam Tauber f97b4ff7b6 [fix] update youtube_noapi paging 2021-03-15 17:22:31 +01:00
Adam Tauber dd34ac396c
Merge pull request #2652 from kvch/solr-engine
Add Apache Solr engine
2021-03-15 15:39:39 +01:00
Alexandre Flament 1664258061
Merge pull request #2655 from return42/fix-imports
[fix] remove unused import from yahoo-news engine
2021-03-15 08:38:34 +01:00
Markus Heiser 6e1f1085ef [fix] remove unused import from yahoo-news engine
Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-14 15:13:57 +01:00
Markus Heiser 3703ebb22a [drop] Acgsou engine - www.acgsou.com no longer exists
- https://www.acgsou.com/ acgsou.com is redirected to 36dm.club
- @rinpatch do not plan on maintaining the engine [1]

[1] https://github.com/searx/searx/pull/1283#issuecomment-798783585

Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-14 11:49:18 +01:00
Noémi Ványi ff527e2681 Add Solr engine 2021-03-13 21:18:09 +01:00
Alexandre Flament 92dd5e245e
Merge pull request #2626 from mikeri/solidtorrents
Add Solid Torrents engine
2021-03-12 19:45:22 +01:00
Alexandre Flament a1a492baed
Merge pull request #2641 from dalf/disable_http_by_default
[mod] by default allow only HTTPS, not HTTP
2021-03-12 19:21:46 +01:00
Markus Heiser 96422e5c9f [fix] APKMirror engine - update xpath selectors and fix img_src
BTW: make the code slightly more readable

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-03-09 08:34:57 +01:00
Markus Heiser d2faea423a [fix] rewrite Yahoo-News engine
Many things have been changed since last review of this engine.  This patch fix
xpath selectors, implements suggestion and is a complete review / rewrite of the
engine.

Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-08 11:43:34 +01:00
Alexandre Flament 99e0651cea [mod] by default allow only HTTPS, not HTTP
Related to https://github.com/searx/searx/pull/2373
2021-03-08 11:35:08 +01:00
Michael Ilsaas 5549d58de3 Add Solid Torrents engine 2021-03-07 18:14:30 +01:00
Adam Tauber 44f4a9d49a [enh] add ability to send engine data to subsequent requests 2021-03-06 12:12:35 +01:00