STOP ALL:
sudo pkill -f "python3 rag_api_server.py"; sudo pkill -f "ollama serve"; sudo systemctl stop apache2; sudo systemctl stop mysql;
CHECK STATUS:
sudo ps aux | grep -E 'apache2|mysqld|ollama|rag_api_server.py' | grep -v grep
stevee@hplaptop:~$ curl http://127.0.0.1:11434
Ollama is running
Reason: This command uses ps aux to list all running processes on your system. It then filters that list for common process names associated with your services (Apache, MySQL, Ollama, and your Flask server script) and excludes the grep command itself. If a line appears in the output, that process is still running. If nothing is returned, the process is truly stopped. This is more definitive for crashed/stuck processes than just checking port listening status.
stevee@hplaptop:~$ sudo systemctl is-active ollama
[sudo] password for stevee:
active
stevee@hplaptop:~$ sudo systemctl is-active apache2
active
stevee@hplaptop:~$ sudo systemctl is-active mysql.service
active
START ALL:
cd /var/www/my-llm-project/ ; source llm_venv/bin/activate ;
sudo service apache2 start &>/dev/null && echo "Apache2: $(sudo systemctl is-active apache2 || echo 'failed')" ;
sudo service mysql start &>/dev/null && echo "MySQL/MariaDB: $(sudo systemctl is-active mysql || echo 'failed')" ;
sudo systemctl start ollama &>/dev/null && echo "Ollama: $(sudo systemctl is-active ollama || echo 'failed')" ;
sudo lsof -i :11434 2>/dev/null | grep "LISTEN" ; echo "Starting Flask server (this will occupy the terminal)..." ;
/var/www/my-llm-project/llm_venv/bin/python semantic_search_api.py
Note: running under sudo changes the user's PATH and permissions, so the full path to the venv's python is required when launching the .py files.
OUTPUT:
[sudo] password for stevee:
Apache2: active
MySQL/MariaDB: active
Ollama: active
Starting Flask server (this will occupy the terminal)...
Loaded 148 embeddings and post data.
Gemini model initialized successfully using 'gemini-1.5-flash-latest'.
* Serving Flask app 'semantic_search_api'
* Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:5000
* Running on http://stevepedwards.today:5000
Press CTRL+C to quit
* Restarting with stat
Loaded 148 embeddings and post data.
Gemini model initialized successfully using 'gemini-1.5-flash-latest'.
* Debugger is active!
* Debugger PIN: 203-532-454
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnS-D HTTP/1.1" 200 -
Client connected via WebSocket!
127.0.0.1 - - [06/Jul/2025 12:25:51] "POST /socket.io/?EIO=4&transport=polling&t=PVWnTPq&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnTPr&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
127.0.0.1 - - [06/Jul/2025 12:25:51] "GET /socket.io/?EIO=4&transport=polling&t=PVWnTPy&sid=Qou_ABbX6pI_eGBwAAAA HTTP/1.1" 200 -
Here's how this pattern works in your startup script:

sudo systemctl is-active apache2 && echo 'active' || (sudo service apache2 start && echo 'restarting/started') || echo 'failed to start'

- `sudo systemctl is-active apache2`: checks whether Apache2 is active. If it succeeds (Apache is active), it returns a zero exit status.
- `&& echo 'active'`: if `is-active` succeeds, then `echo 'active'` runs, and the rest of the line is skipped because the condition (Apache active) has been met.
- `|| (sudo service apache2 start && echo 'restarting/started')`: if `is-active` fails (Apache is not active), this part executes. It attempts to start Apache, and if that succeeds, prints 'restarting/started'.
- `|| echo 'failed to start'`: if both `is-active` and the start command fail, this last part executes, printing 'failed to start'.
So, || provides a way to define alternative actions or fallback options if a preceding command doesn't succeed.
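The same check-then-start fallback can be expressed in Python with `subprocess` — a minimal sketch for comparison, assuming `systemctl` is on the PATH and the script has sudo rights:

```python
import subprocess

def ensure_service(name: str) -> str:
    """Mirror `is-active && echo 'active' || (start && ...) || echo 'failed'`."""
    # `systemctl is-active --quiet` exits 0 when the unit is active.
    if subprocess.run(["systemctl", "is-active", "--quiet", name]).returncode == 0:
        return "active"
    # Fallback branch: try to start the service.
    if subprocess.run(["sudo", "systemctl", "start", name]).returncode == 0:
        return "restarting/started"
    return "failed to start"

for svc in ("apache2", "mysql", "ollama"):
    print(f"{svc}: {ensure_service(svc)}")
```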
Your project seamlessly integrates a local Large Language Model (LLM) with your WordPress website, enabling voice-activated semantic search and AI-powered summarization of your site's content.
General Function
The system allows users to speak (or type) a query into a search bar on your WordPress site. Instead of a traditional keyword search, it uses a local LLM to understand the meaning of the query. This semantic understanding allows it to find the most relevant posts and pages on your site, even if they don't contain the exact keywords. It then displays these results as clickable links, and offers an option to generate an AI-powered summary of the content using an online Gemini model. Real-time feedback on backend processes is shown in a terminal-like display on the webpage.
Stages of Construction and Script Roles
The project's functionality is built across several interconnected stages, involving WordPress (PHP, MySQL), a Python Flask API, a local Ollama LLM, and browser-side JavaScript and CSS.
Stage 1: Content Extraction & Embedding Generation

This is the foundational step where your WordPress content is prepared for semantic understanding.
- Trigger: A new Post or Page is added/updated on your WordPress site.
- Need for Update: The LLM system needs to be aware of this new content to include it in future semantic searches.
- Script: `generate_embeddings.py`
  - Role: This Python script is responsible for:
    - Connecting to WordPress's MySQL database: it queries your `wp_xubg_posts` table to fetch the `ID`, `post_title`, `post_content`, `post_type`, and `post_name` (slug) of all published posts and pages.
    - Cleaning content: it removes HTML tags and WordPress shortcodes from the raw `post_content` to get clean text suitable for the LLM. It also decodes HTML entities.
    - Saving full post texts: for each cleaned post, it saves the full text into an individual `.txt` file in the `full_posts_text/` directory (e.g., `12345.txt`). Before doing so, it clears the entire `full_posts_text/` directory to ensure only current, published posts are stored, preventing stale data buildup.
    - Generating embeddings: it sends each cleaned post's text to your locally running Ollama server, specifically using the `nomic-embed-text` model. Ollama returns a numerical vector (embedding) that represents the semantic meaning of the text (see the sketch after this list).
    - Storing embeddings and metadata: it compiles all this information (Post ID, Title, Slug, Content Snippet, Word Count, and the generated embedding) into a single JSON file: `wordpress_embeddings.json`.
- Why it runs: to create or refresh the "knowledge base" (the embeddings) that your semantic search API will use. This must be done whenever WordPress content changes to keep the search accurate.
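The embedding step itself is compact. Below is a minimal sketch of that core loop, assuming Ollama's standard `/api/embeddings` endpoint on port 11434; the MySQL query, HTML cleaning, and `full_posts_text/` handling described above are elided, and the JSON field names are illustrative guesses rather than the script's actual schema:

```python
import json
import requests

OLLAMA_URL = "http://127.0.0.1:11434/api/embeddings"

def embed(text: str) -> list[float]:
    """Ask the local Ollama server for a nomic-embed-text embedding."""
    resp = requests.post(OLLAMA_URL, json={"model": "nomic-embed-text", "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

# Stand-in for rows fetched from wp_xubg_posts and stripped of HTML/shortcodes.
cleaned_posts = [
    {"ID": 12345, "post_title": "Example Post", "post_name": "example-post",
     "text": "Clean post text after tags and shortcodes are removed."},
]

records = []
for post in cleaned_posts:
    records.append({
        "id": post["ID"],
        "title": post["post_title"],
        "slug": post["post_name"],
        "snippet": post["text"][:200],
        "word_count": len(post["text"].split()),
        "embedding": embed(post["text"]),
    })

with open("wordpress_embeddings.json", "w") as f:
    json.dump(records, f)
```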
Stage 2: Backend API Server (Semantic Search & Summarization)

This stage provides the intelligent services that the WordPress frontend communicates with.
- Trigger: Your Flask server needs to be running in the background. It needs to be restarted whenever `wordpress_embeddings.json` is updated or `semantic_search_api.py` itself is changed.
- Script: `semantic_search_api.py`
  - Role: This Python Flask application acts as the bridge between your WordPress frontend and the LLM services.
    - Loads knowledge base: on startup, it reads the `wordpress_embeddings.json` file into memory, storing all the post metadata and their corresponding embeddings.
    - Initializes Ollama connection: sets up the connection to your local Ollama server to handle embedding requests for search queries.
    - Initializes Gemini connection: configures the connection to the Google Gemini API, specifically using the `models/gemini-1.5-flash-latest` model for summarization.
    - Semantic search endpoint (`/semantic_search`): when a search query is received from the frontend (see the sketch after this list):
      - It sends the user's query to Ollama (using `nomic-embed-text`) to generate an embedding for the query.
      - It compares this query embedding to all the stored embeddings from your WordPress posts using cosine similarity.
      - It identifies the top 3 most semantically similar posts from your site.
      - It returns the metadata (Title, Slug, Word Count, Score, Content Snippet) of these top 3 results to the frontend.
    - Summarization endpoint (`/summarize_article`): when the "Summarise Article" button is clicked for a specific post:
      - It reads the full content of that post from the `.txt` file in `full_posts_text/`.
      - It sends this full content as a prompt to the online Google Gemini API (`gemini-1.5-flash-latest`) to generate a concise summary.
      - It returns the generated summary back to the frontend.
    - Real-time communication: uses Socket.IO to send real-time progress updates from the backend Python scripts directly to the terminal display on your WordPress page.
- Why it runs: it continuously listens for requests from your WordPress site and provides the semantic search and summarization intelligence.
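A hedged sketch of the search endpoint's core logic — embedding the query, scoring every post with cosine similarity, and returning the top 3. The route name and behaviour follow the description above; the JSON field names and file layout are assumptions:

```python
import json

import numpy as np
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# Knowledge base loaded once at startup, as described above.
with open("wordpress_embeddings.json") as f:
    POSTS = json.load(f)
MATRIX = np.array([p["embedding"] for p in POSTS])

def embed(text: str) -> np.ndarray:
    """Embed the query with the same nomic-embed-text model used for the posts."""
    resp = requests.post("http://127.0.0.1:11434/api/embeddings",
                         json={"model": "nomic-embed-text", "prompt": text})
    resp.raise_for_status()
    return np.array(resp.json()["embedding"])

@app.route("/semantic_search", methods=["POST"])
def semantic_search():
    query_vec = embed(request.get_json()["query"])
    # Cosine similarity: dot product divided by the product of vector norms.
    scores = (MATRIX @ query_vec) / (
        np.linalg.norm(MATRIX, axis=1) * np.linalg.norm(query_vec))
    top3 = np.argsort(scores)[::-1][:3]  # indices of the 3 best matches
    return jsonify([{"title": POSTS[i]["title"], "slug": POSTS[i]["slug"],
                     "word_count": POSTS[i]["word_count"],
                     "score": float(scores[i]), "snippet": POSTS[i]["snippet"]}
                    for i in top3])
```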
Stage 3: Frontend User Interface & Interaction

This stage handles how users interact with the system on your WordPress site.
- Trigger: User visits your WordPress page.
- Scripts: `my_voice_search.php` (WordPress plugin PHP)
  - Role: This WordPress plugin handles the display of the search interface.
    - Enqueues assets: it loads your `style.css` (for styling), `socket.io.min.js` (for real-time updates), and `speech-recognition.js` (for logic) into your WordPress page.
    - Registers shortcodes: it defines two shortcodes. The first renders the HTML for the voice search input box, microphone button, and a feedback message area; crucially, it also renders the "Backend Process Output" terminal display directly below the mic input, which shows real-time messages from the Flask server. The second renders the HTML container (`<div id="voice-search-results">`) where the semantic search results (clickable links, snippets, summarize buttons) are dynamically inserted by JavaScript; it is placed separately on your WordPress page.
  - Why it runs: it's a WordPress plugin, so it's loaded by WordPress to display the user interface elements on your designated page.
- `speech-recognition.js` (JavaScript)
  - Role: This JavaScript file brings the frontend to life:
    - Web Speech API integration: it uses the browser's built-in Web Speech API to capture audio from the user's microphone and convert it into text.
    - Sends queries to Flask: when a voice input is transcribed (or text is typed into the search box and Enter is pressed), it sends the query via an AJAX POST request to `semantic_search_api.py`'s `/semantic_search` endpoint.
    - Displays search results: when it receives the JSON response from Flask (containing the top 3 semantic results, including Post Title, URL Slug, Word Count, and Snippet), it dynamically generates HTML elements for each result and inserts them into the `<div id="voice-search-results">` on your WordPress page. It correctly formats "Source: [Post Title]" and "Words: [Word Count]".
    - Handles summarization requests: it listens for clicks on the "Summarise Article" buttons. When clicked, it sends a request to Flask's `/summarize_article` endpoint with that post's ID and, upon receiving the summary, updates the post's snippet text directly on the page.
    - Manages the real-time terminal: it uses Socket.IO to connect to your Flask server's WebSocket, receiving `progress` and `server_message` events and appending them to `#real-time-output` (the terminal display), automatically scrolling to the bottom (see the server-side sketch after this list).
  - Why it runs: it's the client-side logic that handles user interaction, communicates with the backend, and updates the webpage dynamically.
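On the server side, those terminal messages can be pushed with Flask-SocketIO. A minimal sketch using the `progress` event name mentioned above; the payload shape is an assumption:

```python
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

@socketio.on("connect")
def on_connect():
    print("Client connected via WebSocket!")
    # The browser-side handler appends each message to #real-time-output.
    socketio.emit("progress", {"message": "Connected to Flask backend."})

def report(step: str) -> None:
    """Called from backend code to stream a progress line to the page."""
    socketio.emit("progress", {"message": step})

if __name__ == "__main__":
    socketio.run(app, host="0.0.0.0", port=5000)
```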
- `style.css` (CSS)
  - Role: provides all the visual styling for the search bar, mic button, feedback area, search results, summarization snippets, and the real-time terminal display. It uses `!important` declarations where necessary so its styles override conflicting rules from your WordPress "Dark Theme", ensuring visibility and a consistent appearance.
  - Why it runs: loaded by `my_voice_search.php` to define the appearance of the plugin's elements.
Order of Operations for System Update & Use:
- Local Machine / WSL2 Startup:
- You run your combined startup command:
sudo service apache2 start &>/dev/null; sudo service mysql start &>/dev/null; sudo systemctl start ollama &>/dev/null; sleep 3; cd /var/www/my-llm-project/ && source llm_venv/bin/activate && echo "Apache2: $(...)" && echo "MySQL/MariaDB: $(...)" && echo "Ollama: $(...)" && echo "Starting Flask server..." && python3 semantic_search_api.py
- This command first ensures the Apache2, MySQL, and Ollama services are running, giving you immediate feedback on their status.
- Then it activates your `llm_venv` and starts `semantic_search_api.py` (the Flask server) in the foreground. This Flask server loads your `wordpress_embeddings.json` and initializes the Gemini model.
- Adding/Updating WordPress Content (Needs Update of LLM Knowledge Base):
- You add a new Post/Page or modify existing ones in WordPress.
- You run `update_agent.py`: cd /var/www/my-llm-project; source llm_venv/bin/activate; python3 update_agent.py
- This script will automatically (see the sketch after this list):
  - Run `generate_embeddings.py`, which connects to MySQL, clears `full_posts_text/`, extracts all current WordPress content, generates new embeddings using Ollama, and saves them to `wordpress_embeddings.json`.
  - Kill any running Flask process.
  - Start a new Flask server (using `semantic_search_api.py`), ensuring it loads the newly generated `wordpress_embeddings.json`.
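A minimal sketch of what `update_agent.py` might look like, based on the three steps above; the paths and the `pkill` pattern are assumptions:

```python
import subprocess
import time

PROJECT = "/var/www/my-llm-project"
PYTHON = f"{PROJECT}/llm_venv/bin/python"  # full path, since sudo resets PATH

# 1. Regenerate embeddings from the current WordPress content.
subprocess.run([PYTHON, f"{PROJECT}/generate_embeddings.py"], check=True)

# 2. Kill any running Flask process (exit code 1 just means none was found).
subprocess.run(["pkill", "-f", "semantic_search_api.py"])
time.sleep(2)  # give the port time to free up

# 3. Start a fresh Flask server so it loads the new wordpress_embeddings.json.
subprocess.Popen([PYTHON, f"{PROJECT}/semantic_search_api.py"], cwd=PROJECT)
print("Flask server restarted with fresh embeddings.")
```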
- User Interacts on WordPress Site:
- The user visits your WordPress page.
- `my_voice_search.php` loads the necessary CSS and JavaScript.
- The first shortcode renders the input field, mic button, and the "Backend Process Output" terminal.
- The second shortcode renders the "Found Semantic Results:" container, which initially reads "Your semantic search results will appear here after you speak or type a query."
- The `speech-recognition.js` script takes over:
  - Connects to Flask via Socket.IO for real-time terminal updates.
  - Enables voice input via the mic button.
  - Sends transcribed queries to the Flask `semantic_search_api.py`.
  - Receives and displays the top 3 semantic search results from Flask, including the Post Title, Word Count, and Score.
  - Manages the "Summarise Article" button clicks, sending requests to Flask and displaying the Gemini-generated summaries (a sketch of the Gemini call follows this list).
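For the summaries themselves, the Flask side calls Gemini. A hedged sketch using the `google-generativeai` package and the model name shown in the logs above; the prompt wording and file layout are assumptions:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash-latest")

def summarise_post(post_id: int) -> str:
    """Read the stored full text for a post and ask Gemini for a summary."""
    with open(f"full_posts_text/{post_id}.txt") as f:
        full_text = f.read()
    response = model.generate_content(
        f"Summarise the following article concisely:\n\n{full_text}")
    return response.text

print(summarise_post(12345))  # hypothetical post ID
```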
This comprehensive setup provides a powerful, voice-activated, and AI-enhanced search experience for your WordPress content!