Setting up Hermes Search with a FastAPI Backend

After getting my ThinkCentre (“blackbox”) stable, I wanted to set up a search tool that lets my local LLMs fetch real-time web results.
This post documents the working config I used to connect Hermes 2/3 to a custom FastAPI tool, backed by SearxNG.
🔧 What We Used:
- Hermes model via Ollama (Docker install)
- Open WebUI (connected to Ollama)
- SearxNG (for anonymous web search)
- FastAPI (for a custom /search tool)
- Hermes tool schema (Python)
🌐 Step 1: Setting Up SearxNG
SearxNG was installed separately in Docker.
It listens locally at http://blackbox:8888.
Hardly any customization was needed beyond disabling unwanted engines in its settings.yml, plus making sure JSON output is allowed for the API (SearxNG refuses requests for formats it hasn't been told to serve).
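For reference, the relevant part of settings.yml might look roughly like this; which engines you disable is taste (bing below is just an example), but the json entry under search.formats is the one thing our tool in Step 2 actually depends on:

```yaml
use_default_settings: true

search:
  formats:
    - html
    - json   # required for the FastAPI tool's format=json requests

engines:
  - name: bing     # example; disable whichever engines you don't want
    disabled: true
```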
🛠️ Step 2: Building the FastAPI Search Tool
We wrote a simple Python script using FastAPI and httpx.
search_tool.py:
```python
from fastapi import FastAPI, Query
import httpx

app = FastAPI()

SEARXNG_URL = "http://blackbox:8888"

@app.get("/search")
async def search(query: str = Query(...)):
    # Forward the query to SearxNG and ask for JSON output
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"{SEARXNG_URL}/search",
            params={"q": query, "format": "json"},
        )
        response.raise_for_status()
        data = response.json()
    # Limit to the top 5 hits, formatted as "title - url"
    results = []
    for item in data.get("results", [])[:5]:
        results.append(f"{item['title']} - {item['url']}")
    return {"results": results}
```
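One failure mode worth guarding against: if JSON output isn't enabled on the SearxNG instance (see Step 1), the request comes back as a 403 and raise_for_status() turns that into an unhelpful 500 for the model. A minimal sketch of a more defensive variant, using a hypothetical searxng_search helper name of my own:

```python
import httpx
from fastapi import HTTPException

async def searxng_search(query: str) -> dict:
    """Query SearxNG, translating failures into a clear upstream error."""
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(
                f"{SEARXNG_URL}/search",
                params={"q": query, "format": "json"},
            )
            response.raise_for_status()
        except httpx.HTTPStatusError as exc:
            # A 403 here usually means "json" is missing from search.formats
            raise HTTPException(status_code=502, detail=f"SearxNG error: {exc}")
        except httpx.RequestError as exc:
            raise HTTPException(status_code=502, detail=f"SearxNG unreachable: {exc}")
        return response.json()
```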
📂 Step 3: Environment Setup
We needed a .env file to configure the tool in Open WebUI.
.env for Open WebUI:
```
OPENAI_API_KEY=ollama
OLLAMA_BASE_URL=http://ollama:11434
WEBUI_TOOLS_CUSTOM_TOOLS_DIR=/app/extensions/tools
```
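For completeness, here's roughly how that wires together when Open WebUI runs in Docker. The image tag and port mapping follow the project's standard run command, but the host-side ./tools path is just an assumption for illustration:

```bash
# Hypothetical invocation; adjust paths and ports to your setup.
# .env supplies the three variables above; ./tools (assumed host path)
# is mounted where WEBUI_TOOLS_CUSTOM_TOOLS_DIR points inside the container.
docker run -d --name open-webui \
  --env-file .env \
  -v ./tools:/app/extensions/tools \
  -p 3000:8080 \
  ghcr.io/open-webui/open-webui:main
```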
🖥️ Step 4: Testing Locally
After launching the FastAPI server with:
```bash
uvicorn search_tool:app --host 0.0.0.0 --port 5005
```
we could hit http://blackbox:5005/search?query=hermes+ai in a browser or with curl. (Note the parameter is named query, matching the function argument in search_tool.py. FastAPI also auto-generates interactive docs at /docs and the OpenAPI spec at /openapi.json, which Step 5 relies on.)
Example output:
```json
{
  "results": [
    "Hermes AI: A New Wave in LLMs - https://example.com/hermes-ai-news",
    "How Hermes Compares to GPT-4 - https://example.com/hermes-vs-gpt4",
    "Running Hermes Locally - https://example.com/run-hermes-guide",
    "Hermes Search Tool Setup - https://example.com/setup-hermes-search",
    "Hermes Model Details - https://example.com/hermes-model-info"
  ]
}
```
🤖 Step 5: Registering the Tool in Open WebUI
In Open WebUI, we added our tool like this:
tool_manifest.json:
```json
{
  "name": "search",
  "description": "Searches the web for up-to-date information.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to look up."
      }
    },
    "required": ["query"]
  },
  "api": {
    "type": "openapi",
    "url": "http://blackbox:5005/openapi.json"
  }
}
```
Now Hermes can call the tool when it needs current information!
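The "Hermes tool schema (Python)" item from the parts list is about how this same tool gets described to the model itself. A minimal sketch, assuming the OpenAI-style function signature format the Hermes 2 Pro / Hermes 3 models were trained on; in practice Open WebUI handles this wiring for you, so this is purely illustrative:

```python
import json

# OpenAI-style function signature, the format Hermes models call tools with.
search_tool_schema = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Searches the web for up-to-date information.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query to look up.",
                }
            },
            "required": ["query"],
        },
    },
}

# Hermes expects tool signatures inside <tools> tags in the system prompt;
# it then answers with a JSON object inside <tool_call> tags.
system_prompt = (
    "You are a function calling AI model. You may call one or more functions "
    "to assist with the user query. Here are the available tools: "
    f"<tools>{json.dumps(search_tool_schema)}</tools>"
)
```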
📜 Final Notes:
- This was all run inside our local network.
- No need to expose anything to the Internet.
- Tailscale lets me access the search tool remotely if needed.
- Future improvements: rate limiting, fallback search engines, caching (a rough caching sketch follows this list).
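On that caching point, a minimal sketch of an in-process TTL cache the /search handler could consult before calling SearxNG. The 300-second TTL and the helper names are my own assumptions, not anything Open WebUI requires:

```python
import time

CACHE_TTL_SECONDS = 300  # assumed TTL; tune to taste
_cache: dict[str, tuple[float, list[str]]] = {}

def cache_get(query: str) -> list[str] | None:
    """Return cached results if the entry is still fresh, else None."""
    entry = _cache.get(query)
    if entry is not None and time.monotonic() - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    return None

def cache_put(query: str, results: list[str]) -> None:
    """Store results alongside the time they were fetched."""
    _cache[query] = (time.monotonic(), results)
```

In search(), you'd check cache_get(query) first and call cache_put(query, results) after a successful fetch; anything fancier (LRU bounds, a cache shared across workers) can come later.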
🖤 "blackbox" specs
For context, our Hermes server is running on a Lenovo ThinkCentre M710q:
- CPU: Intel i5-7500T (4 cores, 8 threads)
- RAM: 16GB DDR4
- Storage:
  - Samsung 980 NVMe 500GB (primary drive)
  - Crucial 1TB SATA SSD (mounted at /mnt/Crucial1TB)
- Other:
  - Docker installed
  - Tailscale configured for remote access
  - qBittorrent-nox running as a service
  - Open WebUI for local LLM frontend
✨ Next Project
Maybe adding embedding-based retrieval, so Hermes can search not only the web but also my private documents.
TL;DR
This search tool setup lets Hermes feel way "smarter" without sacrificing the privacy and control of running everything locally.
