Commit Graph

1712 Commits

Author SHA1 Message Date
Austin-Olacsi cbf1e90979 add get_embeded_stream_url to searx.utils 2024-10-03 07:10:53 +02:00
0xhtml 0a0fb450b5 [enh] engine: stract - add language/region support 2024-09-29 14:29:22 +02:00
Grant Lanham 6a3375be37 [fix] use get accessor to pull desc from bing_images 2024-09-26 07:26:51 +02:00
Zhijie He 6be56aee11 add Cloudflare AI Gateway engine
add Cloudflare AI Gateway engine

add settings for Cloudflare AI Gateway engine

set utf8 encode for data, fix non english char cause 500 error

format json data

fixed indentation and config format error

fix line-length limitation in CI

reformatted code for CI

reformatted code for CI

limit system prompts to less 120 chars

cleanup unused variable & format code
2024-09-23 07:02:10 +02:00
Grant Lanham 0b832f19bf [fix] Removes ``/>`` ending tags for void HTML elements
Removes ``/>`` ending tags for void elements [1] and replaces them with ``>``.
Part of the larger cleanup to cleanup invalid HTML throughout the codebase [2].

[1] https://html.spec.whatwg.org/multipage/syntax.html#void-elements
[2] https://github.com/searxng/searxng/issues/3793
2024-09-15 15:19:51 +02:00
Markus d3a795c7e7 [fix] engine: qwant - detect captchaUrl and raise SearxEngineCaptchaException
So far a CAPTCHA was not recognized in the response of the qwant engine and a
SearxEngineAPIException was raised by mistake.  With this patch a CAPTCHA
redirect is recognized and the correct SearxEngineCaptchaException is raised.

Closes: https://github.com/searxng/searxng/issues/3806
Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 14:45:23 +02:00
Markus cdb4927b8b [fix] fetch_traits: brave, google, annas_archive & radio_browser
This patch fixes a bug reported by CI "Fetch traits" [1] (brave) and improves
other fetch traits functions (google, annas_archive & radio_browser).

brave:

    File "/home/runner/work/searxng/searxng/searx/engines/brave.py", line 434, in fetch_traits
      sxng_tag = region_tag(babel.Locale.parse(ui_lang, sep='-'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/runner/work/searxng/searxng/searx/locales.py", line 155, in region_tag
    Error:     raise ValueError('%s missed a territory')

google:

  change ERROR message about unknow UI language to INFO message

radio_browser:

  country_list contains duplicates that differ only in upper/lower case

annas_archive:

  for better diff; sort the persistence of the traits

[1] https://github.com/searxng/searxng/actions/runs/10606312371/job/29433352518#step:6:41

Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 12:48:35 +02:00
Bnyro 84e2f9d46a [feat] gitlab: implement dedicated module
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-15 08:04:21 +02:00
Lucas Schwiderski f05566d925 [fix] json_engine: Fix result fields being mixed up
Fixes #3810.
2024-09-12 10:47:08 +02:00
0xhtml c45870dd71 [fix] yep engine: remove links to other engines
Yep includes links to search for the same query on Google and other
search engines as a result in the search result. This fix skips these
results.
2024-09-12 00:04:04 +02:00
Markus Heiser 9eda4044be [fix] bilibili engine - ValueError in duration & HTML in title
- ValueError in duration: issue reported in #3799
- HTML in title: related to #3770

[#3799] https://github.com/searxng/searxng/issues/3799
[#3770] https://github.com/searxng/searxng/pull/3770

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-06 07:13:47 +02:00
Markus 21bfb4996e [fix] engine yahoo: HTML tags are included in result titles
- https://github.com/searxng/searxng/issues/3790

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-03 22:26:59 +02:00
Alexander Sulfrian 6a7b1a1a57 [fix] Do not show DDG user-agent from zero click
We do not want to show the user-agent information from the duckduckgo
zero click info. This is the user-agent used by searxng and not the
user-agent used by the user.

This was already done for the IP address in:
0fb3f0e4ae
2024-08-30 09:02:37 +02:00
Austin-Olacsi e45b771ffa [feat] engine: implementation of yandex (web, images)
It's set to inactive in settings.yml because of CAPTCHA.  You need to remove
that from the settings.yml to get in use.

Closes: https://github.com/searxng/searxng/issues/961
2024-08-21 12:08:35 +02:00
Grant Lanham 5276219b9d Fix tineye engine url, datetime parsing, and minor refactor
Changes made to tineye engine:
1. Importing logging if TYPE_CHECKING is enabled
2. Remove unecessary try-catch around json parsing the response, as this
masked the original error and had no immediate benefit
3. Improve error handling explicitely for status code 422 and 400
upfront, deferring json_parsing only for these status codes and
successful status codes
4. Unit test all new applicable changes to ensure compatability
2024-08-21 08:41:53 +02:00
0xhtml 0cfed94b08 [fix] engine google: use extract_text everywhere 2024-08-08 09:59:45 +02:00
0xhtml 7f9ce3b96e [fix] engine google: strip bubble text from answers
Google underlines words inside of answers that can be clicked to show
additional definitions. These definitions inside the answer were not
correctly handled and ended up in the middle of the answer text. With
this fix, the extra definitions are stripped from the answer shown by
the frontend.
2024-08-08 09:59:45 +02:00
Markus Heiser edfd0e2fe5 [fix] brave fetch_traits: Brave added Chinese (zh-hant) to UI
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-29 10:28:53 +02:00
Markus Heiser ee959ed9fc [fix] engine geizhals: if there are no offers, there is no best price
Fault pattern: if there are no offers, then an exception has been thrown:

    IndexError: list index out of range

This patch makes the addition of “best price” dependent on whether one exists.

Closes: #3685
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-28 19:00:51 +02:00
Bnyro 304ddd8114 [feat] videos template: support for view count 2024-07-27 11:49:58 +02:00
Bnyro 84abab0808 [feat] engine: implementation of geizhals.de 2024-07-27 11:46:25 +02:00
Sylvain Cau b9ddd59c5b [enh] Add API Key support for discourse.org forums 2024-07-27 09:21:40 +02:00
Markus Heiser 657dcb973a [fix] engine yacy: update list of base URLs
https://search.lomig.me
  Poor results / tested `!yacy :en hello` and got zero results

https://yacy.ecosys.eu
  Slow response (> 6sec for trivial search terms)

https://search.webproject.link
  Dead instance / URL offline

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-20 09:59:43 +02:00
Grant Lanham 9a4fa7cc4f Update mullvad_leta.py to account for img_elem
A recent update from Mullvad Leta introduced the img_elem. This update
broke the existing logic. Now, by checking the length of the dom_result
to see if it was included in the return results, we can handle the logic
accordingly.
2024-07-15 06:58:39 +02:00
Bnyro e4da22ee51 [feat] engine: implementation of alpine linux packages
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 17:57:58 +02:00
Grant Lanham ef103ba80a Implement google/brave switch in Mullvad Leta
cleanup

Import annontations
2024-07-07 08:08:11 +02:00
Bnyro 4eaa0dd275 [fix] gentoo: use mediawiki engine 2024-07-03 10:24:03 +02:00
Thomas Renard 39aaac40d6 [mod] libretranslate: add direct link to translation (engine) 2024-06-30 16:18:33 +02:00
Markus Heiser 0f9926b89a [fix] brave fetch_traits: layout of the settings page has changed
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 15:08:18 +02:00
Markus Heiser 39ffec87b7 [fix] engine zlibrary: handle seized domain
The domains of zlibrary instances are known to be seized from time to time.
This leads to problems when, for example, the automated tasks try to update the
engine traits (aka fetch_traits). The search function should also generate a
suitable error message (currently either SSL errors or empty result lists are
returned). [1]

[1] https://github.com/searxng/searxng/issues/3610
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 14:40:19 +02:00
Markus Heiser b8fa4d6195 [fix] bing news results return invalid images
Closes: https://github.com/searxng/searxng/issues/3502
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 11:12:41 +02:00
Grant Lanham 9a9ca307fe [fix] implement tests and remove usage of gen_useragent in engines 2024-06-23 11:51:41 +02:00
Richard Lyons f195d98bfb Fix search_url building. 2024-06-20 06:30:00 +02:00
Allen 13eec44b65 [fix] \!goi irrelevant results AND display more results 2024-06-16 16:45:03 +02:00
Bnyro e9f8412a6e [perf] torrents.html, files.html: don't parse and re-format filesize 2024-06-15 15:42:29 +02:00
Bnyro df15c21b35 [feat] mozhi: fix crash, support synonyms and definition 2024-06-15 11:33:09 +02:00
Bnyro 1fe13d0ba4 [refactor] duckduckgo: use extr helper function in get_vqd 2024-06-15 11:24:05 +02:00
Bnyro 46c5309888 [feat] mojeek: implement dedicated module 2024-06-07 11:31:05 +02:00
allendema_searxng_pi ee146dbc07 [enh] Add engine for discourse forums 2024-06-07 10:16:09 +02:00
Allen 0fa81fc782 [enh] add re-usable func to filter text 2024-05-29 17:56:17 +02:00
Jeff Alyanak 0fb3f0e4ae [fix] do not show DDG IP from zero click
The zero click result from DuckDuckGo for IP should not be displayed.  It will
return the IP of the searxng server, not the user's IP, and looks a bit strange
when the `self_info` plugin is enabled as two different IPs get returned.
2024-05-29 11:23:26 +02:00
Markus Heiser a20dfbbcbd [fix] engine startpage: fetch_traits() / if lang name unknown by babel
Workflow "Update data - update_engine_traits.py" fails last night [1].
This issue has already been reported by @allendema [2].

[1] https://github.com/searxng/searxng/actions/runs/9278028691/job/25528337485#step:6:168
[2] https://github.com/searxng/searxng/pull/3504/files#r1613559565

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-29 07:52:18 +02:00
Austin-Olacsi 9bb75a6644 [feat] engine: implementation of findthatmeme 2024-05-28 18:18:13 +02:00
Markus Heiser c19bffde4d [fix] issues reported by pylint-3.2.2
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-28 18:10:04 +02:00
Daniel Kukula 87165ac532 [mod] engine hex: add sort_criteria & page_size to configuration 2024-05-28 11:55:59 +02:00
allendema_searxng_pi 68365c8c1d [enh] add instant answers from ddg 2024-05-24 10:44:17 +02:00
Daniel Kukula a49232ee29 [feat] engine: implementation of cargo search (crates.io) 2024-05-17 16:37:39 +02:00
Markus Heiser 916739d6b4 [mod] simple theme: drop img_src from default results
The use of img_src AND thumbnail in the default results makes no sense (only a
thumbnail is needed).  In the current state this is rather confusing, because
img_src is displayed like a thumbnail (small) and thumbnail is displayed like an
image (large).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-16 07:30:38 +02:00
Bnyro 0f2f52f0b5 [fix] google: don't display that keyword is missing in content field 2024-05-15 16:03:35 +02:00
Markus Heiser 949a73103f [mod] hex engine: normalize (some of) the linked terms
The names of the links are rather tags than real names, and they sometimes vary
greatly in their spelling:

- GitHub: github, Github
- Source code: Repository, SCM, Project Source Code
- Documentation: docs, Documentation

It was standardized to terms such as 'Source code' and 'Documentation', as
translations already exist for these terms.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-15 12:50:35 +02:00