4394 Commits

Author SHA1 Message Date
Markus Heiser e9afc4f8ce [mod] Startpage: reverse engineered & upgrade to data_type: traits_v1
One reason for the frequently seen Startpage CAPTCHA is that SearXNG sends
incomplete requests to startpage.com.  This patch is a completely new
implementation of the ``request()`` function, reverse engineered from
Startpage's search form.  The new implementation:

- uses traits of data_type: traits_v1 and drops the deprecated data_type: supported_languages
- adds time-range support
- adds safe-search support
- fixes searxng/searxng/issues 1884
- fixes searxng/searxng/issues 1081 --> improvements to avoid CAPTCHA

In preparation for more categories (News, Images, Videos ..) from Startpage, the
variable ``startpage_categ`` was set up.  The default value is ``web`` and other
categories from Startpage are not yet implemented.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 858aa3e604 [mod] wikipedia & wikidata: upgrade to data_type: traits_v1
BTW this fixes an issue in wikipedia: SearXNG's locales zh-TW and zh-HK now
use the language `zh-classical` from wikipedia (and not `zh`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser e0a6ca96cc [doc] add a description of bing engines (web, news, video, images)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 15eaf0f15f [mod] bing_news: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser ff80e7637e [mod] bing_images: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser bc21d28298 [mod] bing_videos: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser d0f465e6fa [mod] bing: add time_range support & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c9cd376186 [mod] replace searx.languages by searx.sxng_locales
With the language and region tags from the EngineTraitsMap, the handling of
SearXNG's language and region tags has been normalized and is no longer
a *mystery*.  The "languages" became "locales" that are supported by babel,
which allows update_engine_traits.py to be simplified a lot.

Other parts of the code can be simplified as well, but this should (and can)
only be done once no engine uses the deprecated
EngineTraits.supported_languages interface anymore.

This commit replaces searx.languages by searx.sxng_locales and renames some
identifiers from "language" to "locale" (e.g. language_codes --> sxng_locales).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 7daf4f95ef [mod] Wikipedia: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Wikipedia engines.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser f78f908383 [mod] Google: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Google engines.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser dba8977b09 [mod] DuckDuckGo: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the DuckDuckGo engines.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser ef143729a0 [mod] yahoo: fetch engine traits (data_type: traits_v1)
Implements a fetch_traits function for the Yahoo engine.

.. note::

   Includes migration of the request method from 'supported_languages' to
   'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c1ae2ef57c [mod] qwant: fetch engine traits (data_type: traits_v1)
Implements a fetch_traits function for the Qwant engines.

.. note::

   Includes migration of the request method from 'supported_languages' to
   'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser fc0c775030 [mod] Dailymotion: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Dailymotion engine.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 61383edb27 [mod] Startpage: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Startpage engine.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser d3aa690a7a [mod] bing: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Bing engines.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser a7fe22770a [mod] Peertube: re-engineered & upgrade to data_type: traits_v1
- fetch_traits(): Fetch languages from peertube's search-index source code.

  [mod] Includes migration of the request method from 'supported_languages'
        to 'traits' (EngineTraits) object.
  [fix] old supported_languages_url is no longer valid since the sources
        have been moved to a different path.

- fixed code to pass pylint
- request(): complete re-implementation based on the API docs [1]
- response(): complete re-implementation, adds several fields missed before
- add source code documentation

[1] https://docs.joinpeertube.org/api-rest-reference.html#tag/Search/operation/searchVideos

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 6e5f22e558 [mod] replace engines_languages.json by engines_traits.json
Implementations of the *traits* of the engines.

An engine's traits are fetched from the origin engine and stored in a JSON file
in the *data folder*.  Most often traits are language and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.

To load the traits from the persisted data::

    searx.enginelib.traits.EngineTraitsMap.from_data()

For new traits new properties can be added to the class::

    searx.enginelib.traits.EngineTraits
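
The mapping idea behind the traits can be sketched in a few lines.  This is
only an illustration: ``TraitsSketch``, ``dump_traits`` and ``load_traits`` as
well as the engine name and the tag values are hypothetical and are not the
``searx.enginelib.traits`` implementation or the layout of engines_traits.json.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TraitsSketch:
    """Hypothetical stand-in, NOT searx.enginelib.traits.EngineTraits."""

    # SearXNG language tag -> code used by the origin engine (made-up values)
    languages: dict = field(default_factory=dict)
    # SearXNG region tag -> code used by the origin engine (made-up values)
    regions: dict = field(default_factory=dict)

    def get_region(self, sxng_tag, default=None):
        return self.regions.get(sxng_tag, default)

    def get_language(self, sxng_tag, default=None):
        # Fall back from a region tag such as "de-AT" to its language part "de".
        lang = sxng_tag.split("-")[0]
        return self.languages.get(sxng_tag, self.languages.get(lang, default))


def dump_traits(traits_by_engine):
    """Serialize a {engine name: TraitsSketch} mapping to a JSON string."""
    data = {
        name: {"languages": t.languages, "regions": t.regions}
        for name, t in traits_by_engine.items()
    }
    return json.dumps(data, sort_keys=True)


def load_traits(text):
    """Rebuild the {engine name: TraitsSketch} mapping from a JSON string."""
    return {name: TraitsSketch(**data) for name, data in json.loads(text).items()}


if __name__ == "__main__":
    example = TraitsSketch(
        languages={"en": "lang_en", "de": "lang_de"},
        regions={"de-AT": "AT"},
    )
    restored = load_traits(dump_traits({"example": example}))
    print(restored["example"].get_language("de-AT"))
    print(restored["example"].get_region("fr-FR", "all"))
```

In this sketch a request function would ask the traits object for the engine's
own code and pass a default for tags the engine does not know.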

.. hint::

   The implementation is backward compatible with the deprecated
   *supported_languages method* from the vintage implementation.

   The vintage code is tagged as *deprecated* and can be removed once all
   engines have been ported to the *traits method*.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
searxng-bot 9f3a57c901 [translations] update from Weblate
abfec8f4 - 2023-03-23 - return42 <markus.heiser@darmarit.de>
f02ea21c - 2023-03-23 - return42 <markus.heiser@darmarit.de>
3fc6c653 - 2023-03-20 - chenghui-lee <chlee9926@gmail.com>
342bbf46 - 2023-03-20 - return42 <markus.heiser@darmarit.de>
2023-03-24 07:07:52 +00:00
Solirs ac169a0f75 Pass black formatting test 2023-03-21 00:41:36 +01:00
Solirs e26bce33d4 WIKIDATA: Add description for results 2023-03-21 00:14:54 +01:00
Markus Heiser b61b845951
Merge pull request #2266 from return42/shuffle-cipher
[mod] Shuffle httpx's default ciphers of a SSL context randomly.
2023-03-20 12:28:05 +01:00
Markus Heiser 94430e104c
Merge pull request #2238 from return42/fix-2027
[fix] fix threshold in replace_auto_language
2023-03-19 15:30:37 +01:00
Markus Heiser f2962a2f4a
Merge pull request #2239 from return42/fix-eslintrc
[fix] remove duplicate key in simple theme ESLint configuration
2023-03-19 15:30:12 +01:00
Markus Heiser 8fa54ffddf [mod] Shuffle httpx's default ciphers of a SSL context randomly.
From the analysis of @9Ninety [1] we know that DDG (and maybe other engines; I
have Startpage in mind) uses some kind of TLS fingerprinting to block bots.

This patch shuffles the default ciphers from httpx to avoid a cipher profile
that is known to be httpx's (and blocked by DDG).

[1] https://github.com/searxng/searxng/issues/2246#issuecomment-1467895556

----

From `What Is TLS Fingerprint and How to Bypass It`_

> When implementing TLS fingerprinting, servers can't operate based on a
> locked-in whitelist database of fingerprints.  New fingerprints appear
> when web clients or TLS libraries release new versions. So, they have to
> live off a blocklist database instead.
> ...
> It's safe to leave the first three as is but shuffle the remaining ciphers
> and you can bypass the TLS fingerprint check.

.. _What Is TLS Fingerprint and How to Bypass It:
   https://www.zenrows.com/blog/what-is-tls-fingerprint#how-to-bypass-tls-fingerprinting
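
The idea from the quote can be sketched with the standard library alone.  This
is a minimal sketch under assumptions, not the code of this patch: the helper
names ``shuffle_ciphers`` and ``context_with_shuffled_ciphers`` are hypothetical
and the cipher list is taken from Python's default SSL context instead of
httpx's defaults.

```python
import random
import ssl


def shuffle_ciphers(cipher_names, keep=3, rng=random):
    """Keep the first `keep` names in place and shuffle the remaining ones."""
    head, tail = list(cipher_names[:keep]), list(cipher_names[keep:])
    rng.shuffle(tail)
    return head + tail


def context_with_shuffled_ciphers():
    """Build a default SSL context whose TLS 1.2 (and lower) ciphers are shuffled."""
    ctx = ssl.create_default_context()
    # Assumption of this sketch: start from the context's own cipher list.
    # TLS 1.3 suites are skipped because set_ciphers() does not control them.
    names = [c["name"] for c in ctx.get_ciphers() if c["protocol"] != "TLSv1.3"]
    ctx.set_ciphers(":".join(shuffle_ciphers(names)))
    return ctx


if __name__ == "__main__":
    ctx = context_with_shuffled_ciphers()
    print([c["name"] for c in ctx.get_ciphers()][:6])
```

Whether OpenSSL keeps the requested order for every suite depends on the
OpenSSL build, so the printed order should be checked rather than assumed.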

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Closes: https://github.com/searxng/searxng/issues/2246
2023-03-19 13:40:31 +01:00
Markus Heiser 677903c355
Merge pull request #2257 from Solirs/fix_bad_escape
re.escape() the query in highlight_content to prevent a server side error.
2023-03-17 08:54:54 +01:00
Solirs fbb0e9d275 [fix] server side error: escape backslashes in the query highlight_content
Backslash escapes in the replacement string are processed by re.sub [1], so
backslashes coming from the query have to be escaped [2].

[1] https://docs.python.org/3/library/re.html#re.sub
[2] https://docs.python.org/3/library/re.html#re.escape
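
The failure mode can be reproduced in isolation.  The two helpers below are
hypothetical and are not SearXNG's ``highlight_content``; they only show why a
backslash in the query breaks ``re.sub`` when the query ends up in the
replacement string, and one way to avoid it.

```python
import re


def naive_highlight(content, query):
    # The query is used verbatim in the replacement template, so a sequence
    # like "\p" is parsed as an (invalid) escape and re.sub raises re.error.
    return re.sub(re.escape(query), f"<b>{query}</b>", content)


def safe_highlight(content, query):
    # re.escape() makes the query a literal pattern; in the replacement only
    # the backslashes need doubling so they are emitted literally.
    replacement = f"<b>{query}</b>".replace("\\", r"\\")
    return re.sub(re.escape(query), replacement, content)


if __name__ == "__main__":
    text = r"see C:\path here"
    try:
        naive_highlight(text, r"C:\path")
    except re.error as exc:
        print("naive version failed:", exc)
    print(safe_highlight(text, r"C:\path"))
```

Passing a function as the replacement would avoid template processing
altogether; the string variant is shown because it stays closest to the
description above.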

closes:
- https://github.com/searxng/searxng/issues/2256
- https://github.com/searxng/searxng/issues/2250
2023-03-17 08:46:00 +01:00
searxng-bot 86c3757872 [translations] update from Weblate
32926a19 - 2023-03-15 - return42 <markus.heiser@darmarit.de>
7aabc876 - 2023-03-16 - Linerly <linerly@protonmail.com>
c0ed00f5 - 2023-03-14 - SonoAX <giovanniilgiovo@gmail.com>
6cf287f6 - 2023-03-13 - RhysJones <proladrhys123@outlook.com>
8c4c5f83 - 2023-03-12 - Cavemanly <k.adel.2m@protonmail.com>
dffe61fa - 2023-03-10 - return42 <markus.heiser@darmarit.de>
c7736cac - 2023-03-10 - BalkanMadman <zurabid2016@gmail.com>
e831b8e3 - 2023-03-10 - BalkanMadman <zurabid2016@gmail.com>
ef3c60af - 2023-03-10 - return42 <markus.heiser@darmarit.de>
c046a677 - 2023-03-07 - BalkanMadman <zurabid2016@gmail.com>
142041d6 - 2023-03-05 - return42 <markus.heiser@darmarit.de>
119b51df - 2023-03-05 - return42 <markus.heiser@darmarit.de>
2023-03-17 07:07:53 +00:00
Alexandre Flament 3e9cddc606
rollback test 2023-03-15 19:55:20 +01:00
Alexandre Flament 41ed0ef0c7
test 2023-03-15 19:53:53 +01:00
Markus Heiser 097d092a7f
Merge pull request #2224 from searxng/update_data_update_currencies.py
Update searx.data - update_currencies.py
2023-03-15 18:36:29 +01:00
Markus Heiser 85ef1af343
Merge pull request #2222 from searxng/update_data_update_wikidata_units.py
Update searx.data - update_wikidata_units.py
2023-03-15 18:35:36 +01:00
Markus Heiser a7f1649190 [fix] remove duplicate key in simple theme ESLint configuration
Partial merge of [PR-1736]

[PR-1736] https://github.com/searxng/searxng/pull/1736

Suggested-by: @FunctionalHacker in [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-05 09:12:13 +01:00
Markus Heiser 150a90c84e [fix] fix threshold in replace_auto_language
[1] https://github.com/searxng/searxng/pull/2027#pullrequestreview-1322157677
[2] https://github.com/searxng/searxng/pull/1969#issuecomment-1345354529

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-05 08:29:58 +01:00
searxng-bot 1f36fc3a45 [translations] update from Weblate
0d8ebfe1 - 2023-03-02 - AHOHNMYC <lqwh2h2cwa@protonmail.com>
1358dd6d - 2023-03-03 - mystery-z <07juwonc@kakao.com>
4d8c13db - 2023-03-01 - lhostfree951 <freeehost9191@gmail.com>
1ae581b6 - 2023-02-28 - tygyh <jonis9898@hotmail.com>
0003698f - 2023-02-28 - ewm <gnu.ewm@protonmail.com>
31c79617 - 2023-02-28 - gjveld <gjveld@gmail.com>
9015ec73 - 2023-02-28 - gallegonovato <fran-carro@hotmail.es>
03619a68 - 2023-02-25 - BalkanMadman <zurabid2016@gmail.com>
fa90585b - 2023-02-25 - BalkanMadman <zurabid2016@gmail.com>
c902c5e5 - 2023-02-26 - tentsbet <remendne@pentrens.jp>
2023-03-03 07:08:08 +00:00
Alexandre Flament 714e83d5ea
Merge pull request #2220 from Solirs/gentoo_engine_timeout
Increase timeout for gentoo wiki engine
2023-03-01 17:57:21 +01:00
Alexandre Flament 1632f18631
Merge pull request #2227 from searxng/update_data_update_engine_descriptions.py
Update searx.data - update_engine_descriptions.py
2023-03-01 17:52:57 +01:00
Alexandre Flament 5bbbb14b62
Merge pull request #2226 from searxng/update_data_update_ahmia_blacklist.py
Update searx.data - update_ahmia_blacklist.py
2023-03-01 17:52:02 +01:00
dalf 5042d94dea Update searx.data - update_engine_descriptions.py 2023-03-01 01:48:22 +00:00
dalf e30a45812f Update searx.data - update_ahmia_blacklist.py 2023-03-01 01:38:07 +00:00
dalf 935415bfcf Update searx.data - update_currencies.py 2023-03-01 01:37:46 +00:00
dalf ccd00518fd Update searx.data - update_firefox_version.py 2023-03-01 01:37:42 +00:00
dalf d7f10909fa Update searx.data - update_wikidata_units.py 2023-03-01 01:37:25 +00:00
Solirs 35fbb3578b Increase timeout for gentoo wiki engine 2023-02-28 13:54:44 +01:00
Alexandre Flament d669da81fb
Merge pull request #2027 from dalf/fix_2018
Add "auto" as a language.
2023-02-20 12:17:38 +01:00
searxng-bot 297e463e49 [translations] update from Weblate
8ff0fa33 - 2023-02-19 - return42 <markus.heiser@darmarit.de>
2023-02-19 11:46:59 +00:00
Markus Heiser 0b1444b61e [doc] improved docs of implementations for automatic speech recognition
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-19 10:09:52 +00:00
Markus Heiser 363203c579
Merge pull request #2201 from return42/fix-2190
[doc] slight improvements to the doc of the settings (base_url)
2023-02-18 18:21:14 +01:00
Alexandre Flament 6748e8e2d5 Add "Auto-detected" as a language.
When the user chooses "Auto-detected", the choice persists for the following queries.
The detected language is displayed.

For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.

This replaces the autodetect_search_language plugin.
2023-02-17 15:17:36 +00:00
Markus Heiser bb83036f48 [fix] typo in searx/plugins/tor_check.py
Related: https://github.com/searxng/searxng/pull/2189

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-17 13:09:14 +01:00