<!DOCTYPE html>
<html lang="en" data-content_root="../../../">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Local Search APIs &#8212; SearXNG Documentation (2024.11.8+2fbf15ecc)</title>
<link rel="stylesheet" type="text/css" href="../../../_static/pygments.css?v=4f649999" />
<link rel="stylesheet" type="text/css" href="../../../_static/searxng.css?v=52e4ff28" />
<link rel="stylesheet" type="text/css" href="../../../_static/autodoc_pydantic.css" />
<script src="../../../_static/documentation_options.js?v=c97bae97"></script>
<script src="../../../_static/doctools.js?v=9a2dae69"></script>
<script src="../../../_static/sphinx_highlight.js?v=dc90522c"></script>
<script data-project="searxng" data-version="2024.11.8+2fbf15ecc" src="../../../_static/describe_version.js?v=fa7f30d0"></script>
<link rel="index" title="Index" href="../../../genindex.html" />
<link rel="search" title="Search" href="../../../search.html" />
<link rel="next" title="SQL Engines" href="sql-engines.html" />
<link rel="prev" title="NoSQL databases" href="nosql-engines.html" />
</head><body>
<div class="related" role="navigation" aria-label="Related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../../../genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="../../../py-modindex.html" title="Python Module Index"
>modules</a> |</li>
<li class="right" >
<a href="sql-engines.html" title="SQL Engines"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="nosql-engines.html" title="NoSQL databases"
accesskey="P">previous</a> |</li>
<li class="nav-item nav-item-0"><a href="../../../index.html">SearXNG Documentation (2024.11.8+2fbf15ecc)</a> &#187;</li>
<li class="nav-item nav-item-1"><a href="../../index.html" >Developer documentation</a> &#187;</li>
<li class="nav-item nav-item-2"><a href="../index.html" accesskey="U">Engine Implementations</a> &#187;</li>
<li class="nav-item nav-item-this"><a href="">Local Search APIs</a></li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<section id="local-search-apis">
<h1>Local Search APIs<a class="headerlink" href="#local-search-apis" title="Link to this heading"></a></h1>
<aside class="sidebar">
<p class="sidebar-title">further reading</p>
<ul class="simple">
<li><p><a class="reference external" href="https://docs.meilisearch.com/learn/what_is_meilisearch/comparison_to_alternatives.html">Comparison to alternatives</a></p></li>
</ul>
</aside>
<nav class="contents local" id="contents">
<ul class="simple">
<li><p><a class="reference internal" href="#module-searx.engines.meilisearch" id="id6">MeiliSearch</a></p></li>
<li><p><a class="reference internal" href="#module-searx.engines.elasticsearch" id="id7">Elasticsearch</a></p></li>
<li><p><a class="reference internal" href="#module-searx.engines.solr" id="id8">Solr</a></p></li>
</ul>
</nav>
<aside class="sidebar">
<p class="sidebar-title">info</p>
<p>Initially sponsored by the <a class="reference external" href="https://nlnet.nl/discovery">Search and Discovery Fund</a> of the <a class="reference external" href="https://nlnet.nl/">NLnet Foundation</a>.</p>
</aside>
<p>Administrators might find themselves wanting to integrate locally running search
engines. The following ones are supported for now:</p>
<ul class="simple">
<li><p><a class="reference external" href="https://www.elastic.co/elasticsearch/">Elasticsearch</a></p></li>
<li><p><a class="reference external" href="https://www.meilisearch.com">Meilisearch</a></p></li>
<li><p><a class="reference external" href="https://solr.apache.org">Solr</a></p></li>
</ul>
<p>Each of these search engines supports full-text search. All of the engines
above are included in <code class="docutils literal notranslate"><span class="pre">settings.yml</span></code> but commented out, because you have to
set <code class="docutils literal notranslate"><span class="pre">base_url</span></code> for each of them.</p>
<p>Please note that if you are not using HTTPS to access these engines, you have to
enable HTTP requests by setting <code class="docutils literal notranslate"><span class="pre">enable_http</span></code> to <code class="docutils literal notranslate"><span class="pre">True</span></code>.</p>
<p>Furthermore, if you do not want to expose these engines on a public instance,
you can still add them and limit the access by setting <code class="docutils literal notranslate"><span class="pre">tokens</span></code> as described
in section <a class="reference internal" href="../../../admin/settings/settings_engine.html#private-engines"><span class="std std-ref">Private Engines (tokens)</span></a>.</p>
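<p>As a rough sketch of the two notes above, the entry below reaches an engine
over HTTPS, so <code class="docutils literal notranslate"><span class="pre">enable_http</span></code> is left out, and restricts it with
<code class="docutils literal notranslate"><span class="pre">tokens</span></code>. The host name <code class="docutils literal notranslate"><span class="pre">search.example.org</span></code> and the token value are
placeholders chosen for illustration, not values from a real setup.</p>
<div class="highlight-yaml notranslate"><div class="highlight"><pre>- name: elasticsearch
  shortcut: es
  engine: elasticsearch
  # hypothetical HTTPS endpoint, so enable_http is not needed
  base_url: https://search.example.org:9200
  index: my-index
  query_type: match
  # placeholder token; see the Private Engines section for how tokens are used
  tokens: [&#39;my-secret-token&#39;]
</pre></div>
</div>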
<section id="module-searx.engines.meilisearch">
<span id="meilisearch"></span><span id="engine-meilisearch"></span><h2><a class="toc-backref" href="#id6" role="doc-backlink">MeiliSearch</a><a class="headerlink" href="#module-searx.engines.meilisearch" title="Link to this heading"></a></h2>
<aside class="sidebar">
<p class="sidebar-title">info</p>
<ul class="simple">
<li><p><a class="extlink-origin reference external" href="https://github.com/searxng/searxng/blob/master/searx/engines/meilisearch.py">meilisearch.py</a></p></li>
<li><p><a class="reference external" href="https://www.meilisearch.com">MeiliSearch</a></p></li>
<li><p><a class="reference external" href="https://docs.meilisearch.com/">MeiliSearch Documentation</a></p></li>
<li><p><a class="reference external" href="https://docs.meilisearch.com/learn/getting_started/installation.html">Install MeiliSearch</a></p></li>
</ul>
</aside>
<p><a class="reference external" href="https://www.meilisearch.com">MeiliSearch</a> is aimed at individuals and small companies. It is designed for
small-scale data collections (fewer than 10 million documents). For example, it is
well suited to storing web pages you have visited and searching their contents later.</p>
<p>The engine supports faceted search, so you can search in a subset of documents
of the collection. Furthermore, you can search in <a class="reference external" href="https://www.meilisearch.com">MeiliSearch</a> instances that
require authentication by setting <code class="docutils literal notranslate"><span class="pre">auth_token</span></code>.</p>
<section id="example">
<h3>Example<a class="headerlink" href="#example" title="Link to this heading"></a></h3>
<p>Here is a simple example to query a Meilisearch instance:</p>
<div class="highlight-yaml notranslate"><div class="highlight"><pre><span></span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">meilisearch</span>
<span class="w">  </span><span class="nt">engine</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">meilisearch</span>
<span class="w">  </span><span class="nt">shortcut</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">mes</span>
<span class="w">  </span><span class="nt">base_url</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">http://localhost:7700</span>
<span class="w">  </span><span class="nt">index</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">my-index</span>
<span class="w">  </span><span class="nt">enable_http</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span>
</pre></div>
</div>
</section>
</section>
<section id="module-searx.engines.elasticsearch">
<span id="elasticsearch"></span><span id="engine-elasticsearch"></span><h2><a class="toc-backref" href="#id7" role="doc-backlink">Elasticsearch</a><a class="headerlink" href="#module-searx.engines.elasticsearch" title="Link to this heading"></a></h2>
<aside class="sidebar">
<p class="sidebar-title">info</p>
<ul class="simple">
<li><p><a class="extlink-origin reference external" href="https://github.com/searxng/searxng/blob/master/searx/engines/elasticsearch.py">elasticsearch.py</a></p></li>
<li><p><a class="reference external" href="https://www.elastic.co/elasticsearch/">Elasticsearch</a></p></li>
<li><p><a class="reference external" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html">Elasticsearch Guide</a></p></li>
<li><p><a class="reference external" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html">Install Elasticsearch</a></p></li>
</ul>
</aside>
<p><a class="reference external" href="https://www.elastic.co/elasticsearch/">Elasticsearch</a> supports numerous ways to query the data it is storing. At the
moment the engine supports the most popular search methods (<code class="docutils literal notranslate"><span class="pre">query_type</span></code>):</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">match</span></code>,</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">simple_query_string</span></code>,</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">term</span></code> and</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">terms</span></code>.</p></li>
</ul>
<p>If none of the methods fit your use case, you can select <code class="docutils literal notranslate"><span class="pre">custom</span></code> query type
and provide the JSON payload to submit to Elasticsearch in
<code class="docutils literal notranslate"><span class="pre">custom_query_json</span></code>.</p>
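<p>A minimal sketch of such a <code class="docutils literal notranslate"><span class="pre">custom</span></code> setup might look like the entry below. The
payload is an ordinary Elasticsearch <code class="docutils literal notranslate"><span class="pre">match_all</span></code> query, picked only as an
illustration. How the engine combines the user&#39;s search terms with this payload
is not described on this page, so treat the payload as an assumption and check
<code class="docutils literal notranslate"><span class="pre">elasticsearch.py</span></code> before relying on it.</p>
<div class="highlight-yaml notranslate"><div class="highlight"><pre>- name: elasticsearch
  shortcut: es
  engine: elasticsearch
  base_url: http://localhost:9200
  index: my-index
  query_type: custom
  # illustrative payload (assumption): any valid Elasticsearch query body
  custom_query_json: &#39;{&quot;query&quot;: {&quot;match_all&quot;: {}}}&#39;
  enable_http: true
</pre></div>
</div>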
<section id="id3">
<h3>Example<a class="headerlink" href="#id3" title="Link to this heading"></a></h3>
<p>The following is an example configuration for an <a class="reference external" href="https://www.elastic.co/elasticsearch/">Elasticsearch</a> instance with
authentication, configured to read from the <code class="docutils literal notranslate"><span class="pre">my-index</span></code> index.</p>
<div class="highlight-yaml notranslate"><div class="highlight"><pre><span></span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">elasticsearch</span>
<span class="w">  </span><span class="nt">shortcut</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">es</span>
<span class="w">  </span><span class="nt">engine</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">elasticsearch</span>
<span class="w">  </span><span class="nt">base_url</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">http://localhost:9200</span>
<span class="w">  </span><span class="nt">username</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">elastic</span>
<span class="w">  </span><span class="nt">password</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">changeme</span>
<span class="w">  </span><span class="nt">index</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">my-index</span>
<span class="w">  </span><span class="nt">query_type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">match</span>
<span class="w">  </span><span class="c1"># custom_query_json: &#39;{ ... }&#39;</span>
<span class="w">  </span><span class="nt">enable_http</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span>
</pre></div>
</div>
</section>
</section>
<section id="module-searx.engines.solr">
<span id="solr"></span><span id="engine-solr"></span><h2><a class="toc-backref" href="#id8" role="doc-backlink">Solr</a><a class="headerlink" href="#module-searx.engines.solr" title="Link to this heading"></a></h2>
<aside class="sidebar">
<p class="sidebar-title">info</p>
<ul class="simple">
<li><p><a class="extlink-origin reference external" href="https://github.com/searxng/searxng/blob/master/searx/engines/solr.py">solr.py</a></p></li>
<li><p><a class="reference external" href="https://solr.apache.org">Solr</a></p></li>
<li><p><a class="reference external" href="https://solr.apache.org/resources.html">Solr Resources</a></p></li>
<li><p><a class="reference external" href="https://solr.apache.org/guide/installing-solr.html">Install Solr</a></p></li>
</ul>
</aside>
<p><a class="reference external" href="https://solr.apache.org">Solr</a> is a popular search engine based on Lucene, just like <a class="reference external" href="https://www.elastic.co/elasticsearch/">Elasticsearch</a>. But
instead of searching in indices, you can search in collections.</p>
<section id="id5">
<h3>Example<a class="headerlink" href="#id5" title="Link to this heading"></a></h3>
<p>This is an example configuration for searching in the collection
<code class="docutils literal notranslate"><span class="pre">my-collection</span></code> and getting the results in ascending order.</p>
<div class="highlight-yaml notranslate"><div class="highlight"><pre><span></span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">solr</span>
<span class="w">  </span><span class="nt">engine</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">solr</span>
<span class="w">  </span><span class="nt">shortcut</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">slr</span>
<span class="w">  </span><span class="nt">base_url</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">http://localhost:8983</span>
<span class="w">  </span><span class="nt">collection</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">my-collection</span>
<span class="w">  </span><span class="nt">sort</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">asc</span>
<span class="w">  </span><span class="nt">enable_http</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span>
</pre></div>
</div>
</section>
</section>
</section>
<div class="clearer"></div>
</div>
</div>
</div>
<span id="sidebar-top"></span>
<div class="sphinxsidebar" role="navigation" aria-label="Main">
<div class="sphinxsidebarwrapper">
<p class="logo"><a href="../../../index.html">
<img class="logo" src="../../../_static/searxng-wordmark.svg" alt="Logo of SearXNG"/>
</a></p>
<h3><a href="../../../index.html">Table of Contents</a></h3>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../../user/index.html">User information</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../own-instance.html">Why use a private instance?</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../admin/index.html">Administrator documentation</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../../index.html">Developer documentation</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../../quickstart.html">Development Quickstart</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../rtm_asdf.html">Runtime Management</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../contribution_guide.html">How to contribute</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="../index.html">Engine Implementations</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="../enginelib.html">Engine Library</a></li>
<li class="toctree-l3"><a class="reference internal" href="../engines.html">SearXNGs engines loader</a></li>
<li class="toctree-l3"><a class="reference internal" href="../engine_overview.html">Engine Overview</a></li>
<li class="toctree-l3 current"><a class="reference internal" href="../index.html#engine-types">Engine Types</a><ul class="current">
<li class="toctree-l4"><a class="reference internal" href="../index.html#online-engines">Online Engines</a></li>
<li class="toctree-l4 current"><a class="reference internal" href="../index.html#offline-engines">Offline Engines</a><ul class="current">
<li class="toctree-l5"><a class="reference internal" href="../offline_concept.html">Offline Concept</a></li>
<li class="toctree-l5"><a class="reference internal" href="../demo/demo_offline.html">Demo Offline Engine</a></li>
<li class="toctree-l5"><a class="reference internal" href="command-line-engines.html">Command Line Engines</a></li>
<li class="toctree-l5"><a class="reference internal" href="nosql-engines.html">NoSQL databases</a></li>
<li class="toctree-l5 current"><a class="current reference internal" href="#">Local Search APIs</a></li>
<li class="toctree-l5"><a class="reference internal" href="sql-engines.html">SQL Engines</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../index.html#online-url-search">Online URL Search</a></li>
<li class="toctree-l4"><a class="reference internal" href="../index.html#online-currency">Online Currency</a></li>
<li class="toctree-l4"><a class="reference internal" href="../index.html#online-dictionary">Online Dictionary</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../search_api.html">Search API</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../plugins.html">Plugins</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../translation.html">Translation</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../lxcdev.html">Developing in Linux Containers</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../makefile.html">Makefile &amp; <code class="docutils literal notranslate"><span class="pre">./manage</span></code></a></li>
<li class="toctree-l2"><a class="reference internal" href="../../reST.html">reST primer</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../searxng_extra/index.html">Tooling box <code class="docutils literal notranslate"><span class="pre">searxng_extra</span></code></a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../../utils/index.html">DevOps tooling box</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../src/index.html">Source-Code</a></li>
</ul>
<h3>Project Links</h3>
<ul>
<li><a href="https://github.com/searxng/searxng/tree/master">Source</a>
<li><a href="https://github.com/searxng/searxng/wiki">Wiki</a>
<li><a href="https://searx.space">Public instances</a>
<li><a href="https://github.com/searxng/searxng/issues">Issue Tracker</a>
</ul><h3>Navigation</h3>
<ul>
<li><a href="../../../index.html">Overview</a>
<ul>
<li><a href="../../index.html">Developer documentation</a>
<ul>
<li><a href="../index.html">Engine Implementations</a>
<ul>
<li>Previous: <a href="nosql-engines.html" title="previous chapter">NoSQL databases</a>
<li>Next: <a href="sql-engines.html" title="next chapter">SQL Engines</a></ul>
</li></ul>
</li>
</ul>
</li>
</ul>
<search id="searchbox" style="display: none" role="search">
<h3 id="searchlabel">Quick search</h3>
<div class="searchformwrapper">
<form class="search" action="../../../search.html" method="get">
<input type="text" name="q" aria-labelledby="searchlabel" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false"/>
<input type="submit" value="Go" />
</form>
</div>
</search>
<script>document.getElementById('searchbox').style.display = "block"</script>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="../../../_sources/dev/engines/offline/search-indexer-engines.rst.txt"
rel="nofollow">Show Source</a></li>
</ul>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer" role="contentinfo">
&#169; Copyright SearXNG team.
</div>
</body>
</html>