STOP ALL:
sudo pkill -f "python3 rag_api_server.py"; sudo pkill -f "ollama serve"; sudo systemctl stop apache2; sudo systemctl stop mysql;
CHECK STATUS:
sudo ps aux | grep -E 'apache2|mysqld|ollama|rag_api_server.py' | grep -v grep
stevee@hplaptop:~$ curl http://127.0.0.1:11434
Ollama is running
Reason: This command uses ps aux to list all running processes on your system. It then filters that list for common process names associated with your services (Apache, MySQL, Ollama, and your Flask server script) and excludes the grep command itself. If a line appears in the output, that process is still running. If nothing is returned, the process is truly stopped. This is more definitive for crashed/stuck processes than just checking port listening status.
stevee@hplaptop:~$ sudo systemctl is-active ollama
[sudo] password for stevee:
active
stevee@hplaptop:~$ sudo systemctl is-active apache2
active
stevee@hplaptop:~$ sudo systemctl is-active mysql.service
active
START ALL:
cd /var/www/my-llm-project/ ; source llm_venv/bin/activate ;
sudo service apache2 start &>/dev/null && echo "Apache2: $(sudo systemctl is-active apache2 || echo 'failed')" ;
sudo service mysql start &>/dev/null && echo "MySQL/MariaDB: $(sudo systemctl is-active mysql || echo 'failed')" ;
sudo systemctl start ollama &>/dev/null && echo "Ollama: $(sudo systemctl is-active ollama || echo 'failed')" ;
sudo lsof -i :11434 2>/dev/null | grep "LISTEN" ; echo "Starting Flask server (this will occupy the terminal)..." ;
/var/www/my-llm-project/llm_venv/bin/python semantic_search_api.py
Note: running under sudo changes the user's PATH and permissions, so the full path to the venv's python is required when launching the .py files.
OUTPUT:
[sudo] password for stevee:
Apache2: active
MySQL/MariaDB: active
Ollama: active
Starting Flask server (this will occupy the terminal)...
Loaded 148 embeddings and post data.
Gemini model initialized successfully using 'gemini-1.5-flash-latest'.
* Serving Flask app 'semantic_search_api'
* Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:5000
* Running on http://stevepedwards.today:5000
Press CTRL+C to quit
* Restarting with stat
Loaded 148 embeddings and post data.
Gemini model initialized successfully using 'gemini-1.5-flash-latest'.
* Debugger is active!
* Debugger PIN: 203-532-454
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnS-D HTTP/1.1" 200 -
Client connected via WebSocket!
127.0.0.1 - - [06/Jul/2025 12:25:51] "POST /socket.io/?EIO=4&transport=polling&t=PVWnTPq&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnTPr&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnTPy&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
Here's how this pattern works in your startup script:

sudo systemctl is-active apache2 && echo 'active' || (sudo service apache2 start && echo 'restarting/started') || echo 'failed to start'

- `sudo systemctl is-active apache2`: checks whether Apache2 is active. If it succeeds (Apache is active), it returns a zero exit status.
- `&& echo 'active'`: if `is-active` succeeds, then `echo 'active'` runs, and the rest of the line is skipped because the condition (Apache active) has been met.
- `|| (sudo service apache2 start && echo 'restarting/started')`: if `is-active` fails (Apache is not active), this part executes. It attempts to start Apache, and if that succeeds, prints 'restarting/started'.
- `|| echo 'failed to start'`: if both `is-active` and the start command fail, this last part executes, printing 'failed to start'.
So, || provides a way to define alternative actions or fallback options if a preceding command doesn't succeed.
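The same check-then-start fallback can be expressed in Python with `subprocess` — a minimal sketch for comparison, assuming `systemctl` is on the PATH and the script has sudo rights:

```python
import subprocess

def ensure_service(name: str) -> str:
    """Mirror `is-active && echo 'active' || (start && ...) || echo 'failed'`."""
    # `systemctl is-active --quiet` exits 0 when the unit is active.
    if subprocess.run(["systemctl", "is-active", "--quiet", name]).returncode == 0:
        return "active"
    # Fallback branch: try to start the service.
    if subprocess.run(["sudo", "systemctl", "start", name]).returncode == 0:
        return "restarting/started"
    return "failed to start"

for svc in ("apache2", "mysql", "ollama"):
    print(f"{svc}: {ensure_service(svc)}")
```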
Your project seamlessly integrates a local Large Language Model (LLM) with your WordPress website, enabling voice-activated semantic search and AI-powered summarization of your site's content.
General Function
The system allows users to speak (or type) a query into a search bar on your WordPress site. Instead of a traditional keyword search, it uses a local LLM to understand the meaning of the query. This semantic understanding allows it to find the most relevant posts and pages on your site, even if they don't contain the exact keywords. It then displays these results as clickable links, and offers an option to generate an AI-powered summary of the content using an online Gemini model. Real-time feedback on backend processes is shown in a terminal-like display on the webpage.
Stages of Construction and Script Roles
The project's functionality is built across several interconnected stages, involving WordPress (PHP, MySQL), a Python Flask API, a local Ollama LLM, and browser-side JavaScript and CSS.
Stage 1: Content Extraction & Embedding Generation

This is the foundational step where your WordPress content is prepared for semantic understanding.
- Trigger: A new Post or Page is added/updated on your WordPress site.
- Need for Update: The LLM system needs to be aware of this new content to include it in future semantic searches.
- Script: `generate_embeddings.py`
  - Role: This Python script is responsible for:
    - Connecting to WordPress's MySQL database: it queries your `wp_xubg_posts` table to fetch the `ID`, `post_title`, `post_content`, `post_type`, and `post_name` (slug) of all published posts and pages.
    - Cleaning content: it removes HTML tags and WordPress shortcodes from the raw `post_content` to get clean text suitable for the LLM. It also decodes HTML entities.
    - Saving full post texts: for each cleaned post, it saves the full text into an individual `.txt` file in the `full_posts_text/` directory (e.g., `12345.txt`). Before doing so, it clears the entire `full_posts_text/` directory to ensure only current, published posts are stored, preventing stale data buildup.
    - Generating embeddings: it sends each cleaned post's text to your locally running Ollama server, specifically using the `nomic-embed-text` model. Ollama returns a numerical vector (embedding) that represents the semantic meaning of the text (see the sketch after this list).
    - Storing embeddings and metadata: it compiles all this information (Post ID, Title, Slug, Content Snippet, Word Count, and the generated embedding) into a single JSON file: `wordpress_embeddings.json`.
- Why it runs: to create or refresh the "knowledge base" (the embeddings) that your semantic search API will use. This must be done whenever WordPress content changes to keep the search accurate.
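The embedding step itself is compact. Below is a minimal sketch of that core loop, assuming Ollama's standard `/api/embeddings` endpoint on port 11434; the MySQL query, HTML cleaning, and `full_posts_text/` handling described above are elided, and the JSON field names are illustrative guesses rather than the script's actual schema:

```python
import json
import requests

OLLAMA_URL = "http://127.0.0.1:11434/api/embeddings"

def embed(text: str) -> list[float]:
    """Ask the local Ollama server for a nomic-embed-text embedding."""
    resp = requests.post(OLLAMA_URL, json={"model": "nomic-embed-text", "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

# Stand-in for rows fetched from wp_xubg_posts and stripped of HTML/shortcodes.
cleaned_posts = [
    {"ID": 12345, "post_title": "Example Post", "post_name": "example-post",
     "text": "Clean post text after tags and shortcodes are removed."},
]

records = []
for post in cleaned_posts:
    records.append({
        "id": post["ID"],
        "title": post["post_title"],
        "slug": post["post_name"],
        "snippet": post["text"][:200],
        "word_count": len(post["text"].split()),
        "embedding": embed(post["text"]),
    })

with open("wordpress_embeddings.json", "w") as f:
    json.dump(records, f)
```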
Stage 2: Backend API Server (Semantic Search & Summarization)

This stage provides the intelligent services that the WordPress frontend communicates with.
- Trigger: Your Flask server needs to be running in the background. It needs to be restarted whenever `wordpress_embeddings.json` is updated or `semantic_search_api.py` itself is changed.
- Script: `semantic_search_api.py`
  - Role: This Python Flask application acts as the bridge between your WordPress frontend and the LLM services.
    - Loads knowledge base: on startup, it reads the `wordpress_embeddings.json` file into memory, storing all the post metadata and their corresponding embeddings.
    - Initializes Ollama connection: sets up the connection to your local Ollama server to handle embedding requests for search queries.
    - Initializes Gemini connection: configures the connection to the Google Gemini API, specifically using the `models/gemini-1.5-flash-latest` model for summarization.
    - Semantic search endpoint (`/semantic_search`): when a search query is received from the frontend (see the sketch after this list):
      - It sends the user's query to Ollama (using `nomic-embed-text`) to generate an embedding for the query.
      - It compares this query embedding to all the stored embeddings from your WordPress posts using cosine similarity.
      - It identifies the top 3 most semantically similar posts from your site.
      - It returns the metadata (Title, Slug, Word Count, Score, Content Snippet) of these top 3 results to the frontend.
    - Summarization endpoint (`/summarize_article`): when the "Summarise Article" button is clicked for a specific post:
      - It reads the full content of that post from the `.txt` file in `full_posts_text/`.
      - It sends this full content as a prompt to the online Google Gemini API (`gemini-1.5-flash-latest`) to generate a concise summary.
      - It returns the generated summary back to the frontend.
    - Real-time communication: uses Socket.IO to send real-time progress updates from the backend Python scripts directly to the terminal display on your WordPress page.
- Why it runs: it continuously listens for requests from your WordPress site and provides the semantic search and summarization intelligence.
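A hedged sketch of the search endpoint's core logic — embedding the query, scoring every post with cosine similarity, and returning the top 3. The route name and behaviour follow the description above; the JSON field names and file layout are assumptions:

```python
import json

import numpy as np
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# Knowledge base loaded once at startup, as described above.
with open("wordpress_embeddings.json") as f:
    POSTS = json.load(f)
MATRIX = np.array([p["embedding"] for p in POSTS])

def embed(text: str) -> np.ndarray:
    """Embed the query with the same nomic-embed-text model used for the posts."""
    resp = requests.post("http://127.0.0.1:11434/api/embeddings",
                         json={"model": "nomic-embed-text", "prompt": text})
    resp.raise_for_status()
    return np.array(resp.json()["embedding"])

@app.route("/semantic_search", methods=["POST"])
def semantic_search():
    query_vec = embed(request.get_json()["query"])
    # Cosine similarity: dot product divided by the product of vector norms.
    scores = (MATRIX @ query_vec) / (
        np.linalg.norm(MATRIX, axis=1) * np.linalg.norm(query_vec))
    top3 = np.argsort(scores)[::-1][:3]  # indices of the 3 best matches
    return jsonify([{"title": POSTS[i]["title"], "slug": POSTS[i]["slug"],
                     "word_count": POSTS[i]["word_count"],
                     "score": float(scores[i]), "snippet": POSTS[i]["snippet"]}
                    for i in top3])
```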
Stage 3: Frontend User Interface & Interaction

This stage handles how users interact with the system on your WordPress site.
- Trigger: User visits your WordPress page.
- Scripts: `my_voice_search.php` (WordPress plugin PHP)
  - Role: This WordPress plugin handles the display of the search interface.
    - Enqueues assets: it loads your `style.css` (for styling), `socket.io.min.js` (for real-time updates), and `speech-recognition.js` (for logic) into your WordPress page.
    - Registers shortcodes: it defines two shortcodes. The first renders the HTML for the voice search input box, microphone button, and a feedback message area; crucially, it also renders the "Backend Process Output" terminal display directly below the mic input, which shows real-time messages from the Flask server. The second renders the HTML container (`<div id="voice-search-results">`) where the semantic search results (clickable links, snippets, summarize buttons) are dynamically inserted by JavaScript; it is placed separately on your WordPress page.
  - Why it runs: it's a WordPress plugin, so it's loaded by WordPress to display the user interface elements on your designated page.
- `speech-recognition.js` (JavaScript)
  - Role: This JavaScript file brings the frontend to life:
    - Web Speech API integration: it uses the browser's built-in Web Speech API to capture audio from the user's microphone and convert it into text.
    - Sends queries to Flask: when a voice input is transcribed (or text is typed into the search box and Enter is pressed), it sends the query via an AJAX POST request to `semantic_search_api.py`'s `/semantic_search` endpoint.
    - Displays search results: when it receives the JSON response from Flask (containing the top 3 semantic results, including Post Title, URL Slug, Word Count, and Snippet), it dynamically generates HTML elements for each result and inserts them into the `<div id="voice-search-results">` on your WordPress page. It correctly formats "Source: [Post Title]" and "Words: [Word Count]".
    - Handles summarization requests: it listens for clicks on the "Summarise Article" buttons. When clicked, it sends a request to Flask's `/summarize_article` endpoint with that post's ID and, upon receiving the summary, updates the post's snippet text directly on the page.
    - Manages the real-time terminal: it uses Socket.IO to connect to your Flask server's WebSocket, receiving `progress` and `server_message` events and appending them to `#real-time-output` (the terminal display), automatically scrolling to the bottom (see the server-side sketch after this list).
  - Why it runs: it's the client-side logic that handles user interaction, communicates with the backend, and updates the webpage dynamically.
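On the server side, those terminal messages can be pushed with Flask-SocketIO. A minimal sketch using the `progress` event name mentioned above; the payload shape is an assumption:

```python
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

@socketio.on("connect")
def on_connect():
    print("Client connected via WebSocket!")
    # The browser-side handler appends each message to #real-time-output.
    socketio.emit("progress", {"message": "Connected to Flask backend."})

def report(step: str) -> None:
    """Called from backend code to stream a progress line to the page."""
    socketio.emit("progress", {"message": step})

if __name__ == "__main__":
    socketio.run(app, host="0.0.0.0", port=5000)
```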
- `style.css` (CSS)
  - Role: provides all the visual styling for the search bar, mic button, feedback area, search results, summarization snippets, and the real-time terminal display. It uses `!important` declarations where necessary so its styles override conflicting rules from your WordPress "Dark Theme", ensuring visibility and a consistent appearance.
  - Why it runs: loaded by `my_voice_search.php` to define the appearance of the plugin's elements.
Order of Operations for System Update & Use:
- Local Machine / WSL2 Startup:
- You run your combined startup command:
sudo service apache2 start &>/dev/null; sudo service mysql start &>/dev/null; sudo systemctl start ollama &>/dev/null; sleep 3; cd /var/www/my-llm-project/ && source llm_venv/bin/activate && echo "Apache2: $(...)" && echo "MySQL/MariaDB: $(...)" && echo "Ollama: $(...)" && echo "Starting Flask server..." && python3 semantic_search_api.py
- This command first ensures the Apache2, MySQL, and Ollama services are running, giving you immediate feedback on their status.
- Then it activates your `llm_venv` and starts `semantic_search_api.py` (the Flask server) in the foreground. This Flask server loads your `wordpress_embeddings.json` and initializes the Gemini model.
- Adding/Updating WordPress Content (Needs Update of LLM Knowledge Base):
- You add a new Post/Page or modify existing ones in WordPress.
- You run `update_agent.py`: cd /var/www/my-llm-project; source llm_venv/bin/activate; python3 update_agent.py
- This script will automatically (see the sketch after this list):
  - Run `generate_embeddings.py`, which connects to MySQL, clears `full_posts_text/`, extracts all current WordPress content, generates new embeddings using Ollama, and saves them to `wordpress_embeddings.json`.
  - Kill any running Flask process.
  - Start a new Flask server (using `semantic_search_api.py`), ensuring it loads the newly generated `wordpress_embeddings.json`.
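A minimal sketch of what `update_agent.py` might look like, based on the three steps above; the paths and the `pkill` pattern are assumptions:

```python
import subprocess
import time

PROJECT = "/var/www/my-llm-project"
PYTHON = f"{PROJECT}/llm_venv/bin/python"  # full path, since sudo resets PATH

# 1. Regenerate embeddings from the current WordPress content.
subprocess.run([PYTHON, f"{PROJECT}/generate_embeddings.py"], check=True)

# 2. Kill any running Flask process (exit code 1 just means none was found).
subprocess.run(["pkill", "-f", "semantic_search_api.py"])
time.sleep(2)  # give the port time to free up

# 3. Start a fresh Flask server so it loads the new wordpress_embeddings.json.
subprocess.Popen([PYTHON, f"{PROJECT}/semantic_search_api.py"], cwd=PROJECT)
print("Flask server restarted with fresh embeddings.")
```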
- User Interacts on WordPress Site:
- The user visits your WordPress page.
- `my_voice_search.php` loads the necessary CSS and JavaScript.
- The first shortcode renders the input field, mic button, and the "Backend Process Output" terminal.
- The second shortcode renders the "Found Semantic Results:" container, which initially reads "Your semantic search results will appear here after you speak or type a query."
- The `speech-recognition.js` script takes over:
  - Connects to Flask via Socket.IO for real-time terminal updates.
  - Enables voice input via the mic button.
  - Sends transcribed queries to the Flask `semantic_search_api.py`.
  - Receives and displays the top 3 semantic search results from Flask, including the Post Title, Word Count, and Score.
  - Manages the "Summarise Article" button clicks, sending requests to Flask and displaying the Gemini-generated summaries (a sketch of the Gemini call follows this list).
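For the summaries themselves, the Flask side calls Gemini. A hedged sketch using the `google-generativeai` package and the model name shown in the logs above; the prompt wording and file layout are assumptions:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash-latest")

def summarise_post(post_id: int) -> str:
    """Read the stored full text for a post and ask Gemini for a summary."""
    with open(f"full_posts_text/{post_id}.txt") as f:
        full_text = f.read()
    response = model.generate_content(
        f"Summarise the following article concisely:\n\n{full_text}")
    return response.text

print(summarise_post(12345))  # hypothetical post ID
```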
This comprehensive setup provides a powerful, voice-activated, and AI-enhanced search experience for your WordPress content!