Partial reverse engineering of the Google engines, including improved language
and region handling based on the engine.traits_v1 data.
Whenever possible, the implementations of the Google engines make use of
the async REST APIs. get_lang_info() has been generalized to a
get_google_info() function; in particular, the region handling has been
improved by adding the cr parameter.
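A minimal sketch of the idea behind the region handling, not the actual
get_google_info() implementation: a locale tag such as de-AT is split into a
language and a region, and the region is passed on through the cr parameter.
The helper name build_google_params, the hl and lr parameters and all value
formats are assumptions made for illustration; only cr is named above.

```python
def build_google_params(locale_tag, default_region="US"):
    """Split a locale tag like 'de-AT' and derive query parameters.

    Hypothetical helper: parameter names other than 'cr' and all value
    formats are assumptions, not taken from the SearXNG source.
    """
    lang, _, region = locale_tag.partition("-")
    region = (region or default_region).upper()
    return {
        "hl": f"{lang}-{region}",   # interface language (assumed format)
        "lr": f"lang_{lang}",       # restrict results to a language (assumed)
        "cr": f"country{region}",   # restrict results to a region
    }


params = build_google_params("de-AT")
```

A tag without a region, such as fr, falls back to default_region in this
sketch; the real engine may resolve the region differently.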
searx/data/engine_traits.json
Add data type "traits_v1" generated by the fetch_traits() functions from:
- Google (WEB),
- Google images,
- Google news,
- Google scholar and
- Google videos
and remove data from obsolete data type "supported_languages".
A traits.custom type that maps region codes to *supported_domains* is fetched
from https://www.google.com/supported_domains
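How such a mapping could be derived is sketched below, assuming the response
is a plain list with one domain per line (e.g. .google.de). The function name
parse_supported_domains and the sample text are made up, and taking the region
code from the last domain label is an assumption of this sketch.

```python
def parse_supported_domains(text):
    """Map upper-case region codes to Google host names.

    'text' is assumed to hold one domain per line, e.g. '.google.de'.
    """
    domains = {}
    for line in text.split():
        domain = line.strip()
        if not domain or domain == ".google.com":
            continue  # the generic domain carries no region code
        region = domain.rsplit(".", 1)[-1].upper()
        domains[region] = "www" + domain
    return domains


# Made-up sample instead of a network request.
sample = ".google.com\n.google.de\n.google.co.uk\n"
domains = parse_supported_domains(sample)
```

Note that the last label is not always a region code in the usual sense
(.co.uk yields UK here), so a real implementation may need extra mapping.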
searx/autocomplete.py:
Reverse-engineered autocomplete from Google WEB. Supports Google's languages and
subdomains. The old API suggestqueries.google.com/complete has been replaced
by the async REST API: https://{subdomain}/complete/search?{args}
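A sketch of how a URL of this shape can be assembled. The function name
autocomplete_url and the argument names q and hl are assumptions for
illustration; the arguments actually sent by the autocompleter are not listed
in this message.

```python
from urllib.parse import urlencode


def autocomplete_url(subdomain, query, lang):
    """Build a URL following https://{subdomain}/complete/search?{args}.

    The argument names 'q' and 'hl' are assumptions for illustration.
    """
    args = urlencode({"q": query, "hl": lang})
    return f"https://{subdomain}/complete/search?{args}"


url = autocomplete_url("www.google.de", "searx ng", "de")
```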
searx/engines/google.py
Reverse engineering and extensive testing ..
- fetch_traits(): Fetch languages & regions from Google properties.
- always use the async REST API (formerly known as 'use_mobile_ui')
- use *supported_domains* from traits
- improved the result list by fetching './/div[@data-content-feature]'
and parsing the type of the various *content features* --> thumbnails are
added
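The content-feature handling can be pictured roughly as follows. This is a
sketch only: it uses the standard library's ElementTree instead of the XPath
tooling the engine uses, the markup is made up and much simpler than a real
result page, and the rule "an image is a thumbnail, anything else is text" is
an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup; real result pages are HTML and look different.
SAMPLE = """
<div class="result">
  <div data-content-feature="1"><img src="thumb.jpg"/></div>
  <div data-content-feature="2">A short description.</div>
  <div>unrelated</div>
</div>
"""


def parse_content_features(result):
    """Return (content, thumbnail) collected from the content-feature nodes."""
    content, thumbnail = [], None
    for node in result.findall(".//div[@data-content-feature]"):
        img = node.find(".//img")
        if img is not None:
            thumbnail = img.get("src")   # image feature: treat as thumbnail
        else:
            text = "".join(node.itertext()).strip()
            if text:
                content.append(text)     # text feature: part of the snippet
    return " ".join(content), thumbnail


content, thumbnail = parse_content_features(ET.fromstring(SAMPLE.strip()))
```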
searx/engines/google_images.py
Reverse engineering and extensive testing ..
- fetch_traits(): Fetch languages & regions from Google properties.
- use *supported_domains* from traits
- freshness_date is added to the result if it exists
- issue 1864: result list has been improved a lot (due to the new cr parameter)
searx/engines/google_news.py
Reverse engineering and extensive testing ..
- fetch_traits(): Fetch languages & regions from Google properties.
*supported_domains* is not needed but a ceid list has been added.
- different region handling compared to Google WEB
- fixed for various languages & regions (due to the new ceid parameter) /
avoid CONSENT page
- Google News no longer supports time range
- result list has been fixed: XPath of pub_date and pub_origin
searx/engines/google_videos.py
- fetch_traits(): Fetch languages & regions from Google properties.
- use *supported_domains* from traits
- add paging support
- implement an async request ('asearch': 'arc' & 'async':
'use_ac:true,_fmt:html')
- simplified code (thanks to '_fmt:html' request)
- issue 1359: fixed xpath of video length data
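A sketch of an async request URL with paging. The asearch and async values
are the ones listed above; the function name video_search_url, the /search
path, the q and tbm arguments and the start offset (ten results per page) are
assumptions and may differ from the engine code.

```python
from urllib.parse import urlencode


def video_search_url(subdomain, query, pageno=1):
    """Build an async video search URL (sketch).

    'asearch' and 'async' come from the commit message; the '/search'
    path, 'q', 'tbm' and the 'start' offset are assumptions.
    """
    args = {
        "q": query,
        "tbm": "vid",
        "start": 10 * (pageno - 1),   # assumed: ten results per page
        "asearch": "arc",
        "async": "use_ac:true,_fmt:html",
    }
    return f"https://{subdomain}/search?{urlencode(args)}"


url = video_search_url("www.google.com", "searxng", pageno=2)
```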
searx/engines/google_scholar.py
- fetch_traits(): Fetch languages & regions from Google properties.
- use *supported_domains* from traits
- request(): include patents & citations
- response(): fixed CAPTCHA detection (Scholar has its own CAPTCHA manager)
- hardening XPath to iterate over results
- fixed XPath of pub_type (has been changed from gs_ct1 to gs_cgt2 class)
- issue 1769 fixed: new request implementation is no longer incompatible
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
$ make nvm.install
INFO: install (update) NVM at /800GBPCIex4/share/SearXNG/.nvm
INFO: already cloned at: /800GBPCIex4/share/SearXNG/.nvm
|| Fetching origin
INFO: checkout v0.39.1
|| HEAD is now at 9600617 v0.39.1
make: *** [Makefile:96: nvm.install] Error 1
Without this fix we need to set the VERBOSE environment variable to avoid the 'Error 1':
$ VERBOSE=0 make nvm.install
BTW: fix an issue if there are any leftovers in ${NVM_DIR} from previous
installations
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
There's already precedent for not using GNU-specific sha256sum long options, as
seen in searxng/utils/lib_go.sh, so update lib.sh to not use them either.
A nice side effect is that the sha256sum usage no longer depends on whether
you're using BSD sha256sum or GNU sha256sum, which makes this work under FreeBSD.
settings.yml:
* The default URL was unix:///usr/local/searxng-redis/run/redis.sock?db=0
* The default URL is now "false"
The old default URL made the logs difficult to deal with:
if the admin didn't install a Redis instance, the logs recorded a false error.
It worked before because SearXNG initialized the Redis connection when the limiter started.
In this commit, SearXNG initializes Redis in searx/webapp.py
so various components can use Redis without taking care of the initialization step.
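The intended behaviour can be sketched as follows, assuming the settings are
available as a parsed dict with a redis.url entry. The names initialize_redis
and connect are hypothetical and do not reflect the code in searx/webapp.py.

```python
def initialize_redis(settings, connect):
    """Connect to Redis once at startup, or skip when no URL is configured.

    'settings' mirrors a parsed settings.yml; 'connect' is any callable
    that takes a URL and returns a client. Both are hypothetical.
    """
    url = settings.get("redis", {}).get("url")
    if not url:
        # url: false -> no Redis configured, so no connection is attempted
        # and no false error ends up in the log
        return None
    return connect(url)


# Usage with a stand-in for a real client factory:
client = initialize_redis(
    {"redis": {"url": False}},
    connect=lambda url: ("client", url),
)
```

With the new default, client is None and components that want Redis can
check for that instead of each handling the initialization themselves.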
Now that ./utils/searxng.sh is implemented, the old installation procedures
from filtron, morty and searx can be removed.
For users who want to upgrade, the procedures for removing old installations
are retained.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Git v2.35.2 closes a security issue: root can no longer use a git repo that is
owned by someone else. The error message is::
fatal: unsafe repository ('/share/darmarit.org/cache/searxng' is owned by someone else)
The fix is to run the `git diff --name-only` not as root in a sudo command.
[1] https://github.blog/2022-04-12-git-security-vulnerability-announced/
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
[1] https://docs.fedoraproject.org/en-US/releases/eol/
[2] https://docs.fedoraproject.org/en-US/releases/f35/
Tested by::
# build the container ..
$ sudo -H ./utils/lxc.sh build searx-fedora35
# open a shell in the container
$ sudo -H ./utils/lxc.sh cmd searx-fedora35 bash
[root@searx-fedora35 SearXNG]#
# install a complete SearXNG suite ..
[root@searx-fedora35 SearXNG]# ./utils/searx.sh install all
...
# install apache to export the SearXNG instance by HTTP
[root@searx-fedora35 SearXNG]# ./utils/searx.sh apache install
...
INFO: got 200 from http://10.174.184.94/searx
To build the wheel, `python3-devel` needs to be added to SEARX_PACKAGES_fedora::
|searx| × Building wheel for setproctitle (pyproject.toml) did not run successfully.
|searx| │ exit code: 1
...
|searx| In file included from src/spt.h:15,
|searx| from src/setproctitle.c:14:
|searx| src/spt_python.h:16:10: fatal error: Python.h: No such file or directory
|searx| 16 | #include <Python.h>
|searx| | ^~~~~~~~~~
|searx| compilation terminated.
|searx| error: command '/usr/bin/gcc' failed with exit code 1
|searx| [end of output]
...
|searx| ERROR: Failed building wheel for setproctitle
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
A script to build & install a simple & isolated redis service, dedicated to
SearXNG and connected via Unix socket.
$ ./manage redis.help
redis.:
devpkg : install essential packages to compile redis
build : build redis binaries at /800GBPCIex4/share/SearXNG/dist/redis/6.2.6/amd64
install : create user (searxng-redis) and install systemd service (searxng-redis)
remove : delete user (searxng-redis) and remove service (searxng-redis)
shell : start bash interpreter from user searxng-redis
src : clone redis source code to <path> and checkput 6.2.6
useradd : create user (searxng-redis) at /usr/local/searxng-redis
userdel : delete user (searxng-redis)
addgrp : add <user> to group (searxng-redis)
rmgrp : remove <user> from group (searxng-redis)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- use single quotes in STATIC_BUILT_PATHS to avoid bash globbing
- don't try to commit if no files have been changed
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
* move `searx/static/themes/simple/img/searxng.svg` to `src/brand/searxng.svg`
* README.rst can use it without a reference to a theme.
* the simple theme can create `searx/static/themes/simple/img/searxng.png` using
the svg2png task
Suggested-by: @dalf https://github.com/searxng/searxng/pull/561#issuecomment-981747902
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
If the fetched branch has been rebased, a 'git pull' will fail. To get the
fetched branch into the working tree, a git reset is needed.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The Node.js installation in the NVM environment can be used by IDEs and other
developer tasks. The required developer packages are added to the file
./.nvm_packages and will be installed when Node.js is installed. Initially we
start with:
- eslint
Having a dedicated developer environment provided by nvm makes it easy to
integrate Node.js packages into various IDEs. One example is shown in the
.dir-locals.el, which is used by Emacs.
[1] https://github.com/nvm-sh/nvm#default-global-packages-from-file-while-installing
[2] https://eslint.org
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Issue::
$ make clean node.env
...
CLEAN [NVM] drop .nvm/
...
INFO: install Node.js by NVM
...
Now using node v16.13.0 (npm v8.1.0)
...
INSTALL searx/static/themes/oscar/package.json
npm ERR! code ENOENT
npm ERR! syscall open
# Here now comes the issue, caused by the missing 'popd' ..
npm ERR! path SearXNG/.nvm/searx/static/themes/oscar/package.json
npm ERR! errno -2
npm ERR! enoent ENOENT: no such file or directory, open 'SearXNG/.nvm/searx/static/themes/oscar/package.json'
ERROR: node.env exit with error (254)
make: *** [Makefile:99: node.env] Error 254
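The failure above comes from a directory change that is never undone, so npm
later resolves paths relative to .nvm/. The pattern of the fix is sketched
here in Python rather than in the shell code that was actually changed; the
helper name pushd is chosen for illustration only.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def pushd(path):
    """Change into 'path' and always return, like pushd followed by popd."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)   # the 'popd' that was missing in the shell code


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with pushd(tmp):
        inside = os.getcwd()   # work done here runs inside 'tmp'
after = os.getcwd()            # back where we started
```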
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Issue was::
$ LANG=C make nvm.clean
INFO: NVM is not installed
make: *** [Makefile:99: nvm.clean] Error 42
Now::
$ LANG=C make nvm.clean
CLEAN [NVM] not installed
BTW: change info_msg to build_msg
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>