2024-09-16 09:36:58 -04:00


A Comprehensive Look at the Middleware Backend Server and Discord Bot Integration

Source Code

https://git.ssh.surf/snxraven/rayai

Introduction

In this article, we will provide a comprehensive breakdown of the setup and configuration of a home-hosted AI infrastructure. This system is composed of a middleware backend server and a Discord bot, designed to handle chat interactions, manage conversations, integrate plugins, and execute core service management. The infrastructure is built using Express.js and includes a range of plugins for enhanced functionality such as IP detection, content scraping, and handling Minecraft server requests. The server integrates with Discord, allowing users to interact with the system through specified channels.

This article will detail each component and explain the workflow.

Overview

The home-hosted AI infrastructure is composed of a backend server built using Express.js and a Discord bot powered by Discord.js. The server handles various chat-related APIs, processes HTTP requests, and manages plugins for tasks like IP detection, web scraping, and random user generation. The Discord bot interacts with the backend, forwarding user messages and executing commands such as resetting conversations and restarting core services.

Key Components:

  • Express.js: Powers the web server for handling requests and responses.
  • Axios: Sends HTTP requests to external APIs.
  • Cheerio: Scrapes and parses HTML content.
  • Llama-Tokenizer: Tokenizes text for conversation management.
  • Google-It: Searches Google and scrapes results for user queries.
  • Discord.js: Powers the bot for interacting with Discord channels.

Environment Configuration

The system is configured using environment variables, which are loaded using the dotenv package. These variables manage sensitive data such as API keys and tokens, as well as server paths and configuration settings.

Example Environment Variables:

  • PROMPT: Initial system prompt for conversation history.
  • ABUSE_KEY: API key for IP abuse detection using AbuseIPDB.
  • API_KEY: API key for My-MC.link services.
  • PATH_KEY: Path key for accessing My-MC.link services.
  • THE_TOKEN: Discord bot token for authentication.
  • CHANNEL_IDS: List of Discord channel IDs where the bot is active.
  • ROOT_IP: IP address of the backend server.
  • ROOT_PORT: Port for backend server communication.
  • MAX_CONTENT_LENGTH: Maximum length for scraped web page content.

Example .env File:

PROMPT="Welcome to the AI system"
ABUSE_KEY="your-abuse-ipdb-key"
API_KEY="your-my-mc-link-api-key"
PATH_KEY="your-my-mc-link-path-key"
THE_TOKEN="your-discord-bot-token"
CHANNEL_IDS="channel1,channel2"
ROOT_IP="127.0.0.1"
ROOT_PORT=3000
MAX_CONTENT_LENGTH=2000
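Since the server depends on these variables, it is worth failing fast at startup if any are missing. A minimal sketch of such a check (the `requireEnv` helper is illustrative, not part of the rayai codebase):

```javascript
// Fail fast at startup if a required environment variable is missing or blank.
// requireEnv is an illustrative helper, not part of the original project.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name] || env[name].trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  // Return only the validated variables as a plain config object.
  return names.reduce((acc, name) => ({ ...acc, [name]: env[name] }), {});
}

// Example: validate a few of the variables the server depends on.
const config = requireEnv(['PROMPT', 'THE_TOKEN', 'ROOT_IP', 'ROOT_PORT'], {
  PROMPT: 'Welcome to the AI system',
  THE_TOKEN: 'token',
  ROOT_IP: '127.0.0.1',
  ROOT_PORT: '3000',
});
```

Calling this once before `app.listen()` turns a cryptic mid-request failure into a clear startup error.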

Server Setup

The backend server is built using Express.js and is designed to handle multiple API endpoints related to chat interactions, conversation management, and core service control. The server makes use of middlewares for handling CORS, parsing JSON requests, and tracking conversation history based on client IP addresses.

Middleware Configuration:

  1. CORS: Allows requests from all origins and specific headers.
  2. Body-Parser: Parses incoming JSON request bodies.
  3. Conversation History: Middleware that tracks conversation history based on client IP addresses.

app.use(cors({ origin: '*', allowedHeaders: ['Content-Type', 'x-forwarded-for-id', 'x-forwarded-for-name'] }));
app.use(bodyParser.json());

The server tracks each conversation using the client's IP address, initializing a new conversation history if one doesn't already exist.

app.use((req, res, next) => {
   const ip = req.headers['x-forwarded-for-id'] || req.headers['x-forwarded-for'] || req.ip;
   if (!conversationHistory[ip]) {
      conversationHistory[ip] = [{ role: 'system', content: process.env.PROMPT }];
   }
   next();
});

Helper Functions

The server relies on several utility functions for timestamp generation, token counting, and web scraping. These functions help manage conversation history, enforce token limits, and gather content from external websites.

getTimestamp

Generates a timestamp for logging purposes, in the format MM/DD/YYYY [HH:MM:SS AM/PM].

const getTimestamp = () => {
   const now = new Date();
   const date = now.toLocaleDateString('en-US');
   const time = now.toLocaleTimeString('en-US');
   return `${date} [${time}]`;
};

countLlamaTokens

Counts the number of tokens in the conversation using the Llama-Tokenizer library.

function countLlamaTokens(messages) {
   let totalTokens = 0;
   for (const message of messages) {
      totalTokens += llamaTokenizer.encode(message.content).length;
   }
   return totalTokens;
}
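The token count is what lets the server enforce a context limit: when the history grows too large, the oldest exchanges can be dropped while the system prompt is preserved. A sketch of that pruning step (the stand-in counter below approximates four characters per token; the real server would pass `countLlamaTokens` instead):

```javascript
// Crude stand-in token counter (~4 characters per token) so this sketch is
// self-contained; the real server would use countLlamaTokens instead.
function countTokens(messages) {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Drop the oldest non-system messages until the history fits the budget.
// The system prompt at index 0 is always kept.
function pruneHistory(history, maxTokens, count = countTokens) {
  const pruned = [...history];
  while (count(pruned) > maxTokens && pruned.length > 1) {
    pruned.splice(1, 1); // remove the oldest message after the system prompt
  }
  return pruned;
}
```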

scrapeWebPage

Uses Cheerio to scrape and parse web pages, extracting the title, meta description, and body content.

async function scrapeWebPage(url, length = 2000) {
   // The global fetch API requires Node 18+; on older versions use node-fetch.
   const res = await fetch(url);
   const html = await res.text();
   const $ = cheerio.load(html);
   const title = $('head title').text().trim();
   // Fall back to an empty string when the page has no meta description.
   const description = $('meta[name="description"]').attr('content') || '';
   const content = $('body').text().trim();

   return { title, description, content: content.substring(0, length) };
}

API Endpoints

The backend server exposes several API endpoints for handling chat interactions, restarting core services, resetting conversation history, and fetching conversation history.

/api/v1/chat

Handles chat requests, logs user messages, updates the conversation history, and interacts with the Llama API to generate responses.

app.post('/api/v1/chat', async (req, res) => {
   // req.clientIp is assumed to be populated by an IP-resolution middleware
   // (e.g. request-ip) and to match the key used by the history middleware above.
   const ip = req.clientIp;
   const userMessage = req.body.message;
   conversationHistory[ip].push({ role: 'user', content: userMessage });
   try {
      const response = await llamaAPIRequest(conversationHistory[ip]);
      res.json(response);
   } catch (err) {
      res.status(500).json({ message: 'Error generating response', error: err.message });
   }
});
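The `llamaAPIRequest` helper used above is not shown in the article, so here is a hedged sketch of what it might look like. The endpoint shape assumes an OpenAI-compatible llama.cpp server reachable at `ROOT_IP:ROOT_PORT`; the URL path, parameters, and defaults are illustrative, not confirmed from the source:

```javascript
// Build the request body for an OpenAI-compatible chat completions endpoint.
// Defaults (max_tokens, temperature) are illustrative assumptions.
function buildChatPayload(messages, options = {}) {
  return {
    messages,
    max_tokens: options.maxTokens ?? 512,
    temperature: options.temperature ?? 0.7,
    stream: false,
  };
}

// Sketch of llamaAPIRequest: POST the conversation and return the assistant
// message. Assumes a llama.cpp-style /v1/chat/completions endpoint.
async function llamaAPIRequest(messages) {
  const url = `http://${process.env.ROOT_IP}:${process.env.ROOT_PORT}/v1/chat/completions`;
  const response = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatPayload(messages)),
  });
  if (!response.ok) throw new Error(`Llama API returned ${response.status}`);
  const data = await response.json();
  return data.choices[0].message; // { role: 'assistant', content: '...' }
}
```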

/api/v1/restart-core

Restarts the core service using a Docker command and returns the result of the operation.

app.post('/api/v1/restart-core', (req, res) => {
   cmd('docker restart llama-gpu-server')
      .then(out => res.json(out.stdout))
      .catch(err => res.status(500).json({ message: "Error restarting core" }));
});

/api/v1/reset-conversation

Resets the conversation history for the client based on their IP address.

app.post('/api/v1/reset-conversation', (req, res) => {
   conversationHistory[req.clientIp] = [{ role: 'system', content: process.env.PROMPT }];
   res.json({ message: "Conversation history reset" });
});

Plugin Handling

The system uses various plugins to enhance functionality. Each plugin performs a specific task such as IP address detection, web scraping, or fetching random user data.

IP Plugin:

Checks IP addresses against the AbuseIPDB to detect malicious activity.

URL Plugin:

Scrapes content from URLs using Cheerio and adds it to the conversation history.

New Person Plugin:

Fetches random user data from an external API to simulate interaction with new individuals.

What Servers Plugin:

Fetches Minecraft server information from the My-MC.link platform.
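To make the plugin mechanism concrete, here is a sketch of what the IP plugin's lookup might look like. The request shape follows AbuseIPDB's public v2 API (`/api/v2/check` with a `Key` header); the 50% threshold and summary wording are my own illustrative choices, not taken from the source:

```javascript
// Interpret an AbuseIPDB report. The 50% confidence threshold is illustrative.
function summarizeAbuseReport(report, threshold = 50) {
  const score = report.abuseConfidenceScore ?? 0;
  return {
    malicious: score >= threshold,
    summary: `IP ${report.ipAddress} has an abuse confidence score of ${score}%`,
  };
}

// Sketch of the lookup itself, using AbuseIPDB's v2 check endpoint.
async function checkIp(ip) {
  const url = `https://api.abuseipdb.com/api/v2/check?ipAddress=${encodeURIComponent(ip)}`;
  const response = await fetch(url, {
    headers: { Key: process.env.ABUSE_KEY, Accept: 'application/json' },
  });
  const { data } = await response.json();
  return summarizeAbuseReport(data);
}
```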

Error Handling and Logging

The server includes robust error handling and logging mechanisms that provide detailed information on request processing times, potential errors, and system status.

app.use((err, req, res, next) => {
   console.error(`${getTimestamp()} Error: ${err.message}`);
   res.status(500).json({ message: "An error occurred", error: err.message });
});
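The article also mentions logging of request processing times. One way to implement that is a timing middleware registered before the routes; this is a sketch under my own naming, not code from the repository:

```javascript
// Illustrative request-timing middleware: logs method, URL, and duration once
// the response finishes. The logger is injectable so the behavior is testable.
function requestTimer(logger = console.log) {
  return (req, res, next) => {
    const start = process.hrtime.bigint();
    res.on('finish', () => {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      logger(`${req.method} ${req.url} completed in ${ms.toFixed(1)} ms`);
    });
    next();
  };
}

// Registered with: app.use(requestTimer());
```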

Discord Bot Integration

The Discord bot integrates with the backend to allow users to interact with the AI system via specific Discord channels. The bot is built using Discord.js and listens for messages, processes commands, and forwards user input to the backend server for processing.

Client Initialization

The bot initializes a Discord client with the appropriate intents to listen for guild messages and message content.

const client = new Client({ 
   intents: [GatewayIntentBits.Guilds, GatewayIntentBits.GuildMessages, GatewayIntentBits.MessageContent] 
});

Command Handling

The bot listens for specific commands such as !reset to reset the conversation and !restartCore to restart the core service.

client.on('messageCreate', async message => {
   if (message.content === '!reset') {
      await resetConversation(message);
   } else if (message.content === '!restartCore') {
      await restartCore(message);
   } else {
      await handleUserMessage(message);
   }
});
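Since the bot is meant to be active only in the channels listed in `CHANNEL_IDS`, a guard at the top of the `messageCreate` handler keeps it from responding elsewhere. A sketch of that check (the helper name is mine):

```javascript
// Return true only when the channel ID appears in the comma-separated
// CHANNEL_IDS environment variable. Tolerates whitespace and a missing value.
function isAllowedChannel(channelId, channelIdsEnv) {
  const allowed = (channelIdsEnv || '')
    .split(',')
    .map((id) => id.trim())
    .filter(Boolean);
  return allowed.includes(channelId);
}

// Usage inside the handler (illustrative):
// if (!isAllowedChannel(message.channel.id, process.env.CHANNEL_IDS)) return;
```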

Handling Long Messages

If the response from the backend exceeds Discord's message length limit of 2,000 characters, the bot splits the message into smaller chunks and sends them sequentially.

async function sendLongMessage(message, responseText) {
   const limit = 2000; // Discord caps standard messages at 2,000 characters
   if (responseText.length > limit) {
      const chunks = splitIntoChunks(responseText, limit);
      for (let chunk of chunks) await message.channel.send(chunk);
   } else {
      await message.channel.send(responseText);
   }
}
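The `splitIntoChunks` helper is referenced but not shown in the article; a possible implementation, preferring newline boundaries so code blocks and paragraphs are less likely to be cut mid-line:

```javascript
// Break text into pieces no longer than `limit` characters, cutting at the
// last newline before the limit when possible, otherwise cutting hard.
function splitIntoChunks(text, limit) {
  const chunks = [];
  let rest = text;
  while (rest.length > limit) {
    let cut = rest.lastIndexOf('\n', limit);
    if (cut <= 0) cut = limit; // no usable newline: hard cut at the limit
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Rejoining the chunks reproduces the original text, so nothing is lost in transit.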

How to Run the Server and Bot

  1. Clone the Repository: Clone the server and bot code to your project directory.
  2. Install Dependencies:
    npm install express body-parser cmd-promise cors cheerio llama-tokenizer-js google-it discord.js axios dotenv
    
  3. Configure Environment Variables: Create a .env file in the root directory and populate it with the required environment variables.
  4. Run the Server:
    node backend-server.js
    
  5. Run the Discord Bot:
    node discord-bot.js
    

Browser Chat Client Integration for the x64.world AI

Available at: https://chat.x64.world/

Introduction

This section of the article explains the functionality of the x64.world browser chat client, which provides an interactive front-end for users to communicate with the AI backend. The client uses a web-based interface where users can type messages, receive AI-generated responses, and view real-time system statistics like CPU and GPU usage. It is built using HTML, JavaScript, and integrates with external APIs to fetch and display system metrics. This guide will break down each component of the chat client and explain its role in delivering a seamless user experience.

Overview

The x64.world chat client is designed to provide a simple and interactive interface where users can chat with an AI system powered by a custom backend. It displays system statistics, handles chat inputs, and shows responses from the AI. The client uses Bootstrap for styling, highlight.js for syntax highlighting, and Marked.js for rendering Markdown content.

The JavaScript behind the chat client is responsible for sending user messages to the backend, updating the user interface with new messages, fetching system metrics, and handling themes.

UI Design

The chat client's user interface consists of several key sections:

  • Header: Displays system information, including OS version, kernel, CPU, GPU, and memory details.
  • Stats Section: Displays real-time statistics such as CPU usage, system RAM, GPU chip and VRAM usage, power draw, and network usage.
  • Chat Section: Displays user and AI messages, with Markdown formatting and syntax highlighting for code blocks. It includes an input field for sending messages, and buttons to reset conversation, toggle themes, and ask random questions.

HTML Structure:

<div id="chat-container">
    <div id="header-title">x64.World AI</div>
    <div id="subheader">...</div> <!-- System Info Section -->
    <div id="stats">...</div> <!-- Stats Display Section -->
    <div id="chat">
        <div id="messages"></div> <!-- Chat Messages Section -->
        <div id="loading"></div> <!-- Loading Indicator -->
        <div class="alert-container">...</div> <!-- Alerts and Input Section -->
    </div>
</div>

Key Features:

  • Bootstrap-based styling for a responsive and clean UI.
  • Dark and light modes for user preference, with dynamic switching.
  • Real-time system metrics updated every two seconds, including CPU, RAM, GPU, and network usage.
  • Markdown and code block rendering using marked.js and highlight.js for a developer-friendly chat interface.

Message Handling

The chat client sends messages to the AI backend, displays the response, and supports Markdown rendering and code syntax highlighting. Users can send messages by typing into the input box and pressing enter.

Sending Messages:

When a message is typed, it is encoded using the he library to prevent HTML injection, and the message is sent to the backend API. The AI's response is fetched and displayed in the chat.

async function sendMessage() {
    const messageInput = document.getElementById('messageInput');
    let message = messageInput.value.trim();

    if (message === '') return;

    message = he.encode(message); // Prevent HTML injection
    lastUserMessageElement = displayMessage(message, 'user');
    messageInput.value = '';
    toggleLoading(true);

    try {
        const response = await fetch('https://infer.x64.world/chat', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({ message })
        });

        if (response.ok) {
            const data = await response.json();
            displayMessage(data.content, 'assistant');
        } else {
            handleErrorResponse(response.status);
        }
    } catch (error) {
        displayMessage('Error: ' + error.message, 'assistant');
    } finally {
        toggleLoading(false);
    }
}
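The `handleErrorResponse` helper called above is not shown in the article. A plausible sketch maps common HTTP status codes to user-facing messages before handing them to `displayMessage`; the exact codes and wording here are assumptions:

```javascript
// Map an HTTP status code to a user-facing message. Wording is illustrative.
function errorMessageFor(status) {
  switch (status) {
    case 429:
      return 'Rate limit exceeded. Please wait a moment before sending another message.';
    case 503:
      return 'The AI backend is busy or restarting. Please try again shortly.';
    default:
      return `Error communicating with the server (HTTP ${status}).`;
  }
}

// Sketch of handleErrorResponse: surface the message in the chat window.
function handleErrorResponse(status) {
  displayMessage(errorMessageFor(status), 'assistant');
}
```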

Message Rendering:

User and AI messages are displayed as div elements with appropriate classes to distinguish between user and assistant messages. Markdown content is rendered using Marked.js, and code blocks are highlighted using highlight.js.

function displayMessage(content, sender) {
    const messages = document.getElementById('messages');
    const messageElement = document.createElement('div');
    messageElement.classList.add('message', sender);

    const decodedContent = he.decode(content);
    const htmlContent = marked(decodedContent); // Convert markdown to HTML
    messageElement.innerHTML = htmlContent;

    messages.appendChild(messageElement);
    messages.scrollTop = messages.scrollHeight;

    document.querySelectorAll('pre code').forEach((block) => {
        hljs.highlightBlock(block); // Syntax highlighting
        if (sender === 'assistant') {
            addCopyButton(block); // Add copy button for code blocks
        }
    });

    return messageElement;
}

System Metrics Display

The chat client fetches real-time system metrics every two seconds and displays CPU, GPU, RAM, power draw, and network usage in the stats section.

Stats Fetching:

Metrics are fetched using the Netdata API, which provides system resource usage details such as CPU and RAM. The stats are displayed in a user-friendly format with real-time updates.

async function fetchData(chart) {
    const response = await fetch(`${NETDATA_URL}/api/v1/data?chart=${chart}&format=json`, {
        cache: 'no-store',
        headers: { 'Cache-Control': 'no-cache' }
    });
    return await response.json();
}

async function updateMetrics() {
    await updateCpuUsage();
    await updateMemoryUsage();
    await updateNetworkUsage();
    setTimeout(updateMetrics, 2000); // Update every 2 seconds
}

Updating Metrics:

Each metric is calculated and displayed dynamically using JavaScript. The updateMetrics() function is called repeatedly to ensure the stats stay up to date.

async function updateCpuUsage() {
    const data = await fetchData('system.cpu');
    const totalCPU = data.data[0].slice(1).reduce((acc, curr) => acc + curr, 0);
    document.getElementById('cpuPercentage').textContent = totalCPU.toFixed(2);
}

Netdata Integration

Netdata is used for real-time monitoring of system resources. The chat client fetches data from the Netdata API and displays information such as CPU and network usage.

Integration Example:

The chat client fetches data from the Netdata server to display the current CPU usage and available memory, updating every two seconds. Note that Netdata's mem.available chart reports memory that is still available, not memory consumed:

async function updateMemoryUsage() {
    const data = await fetchData('mem.available');
    // mem.available reports available memory in MiB, not memory consumed.
    const availableMemoryMB = data.data[0][1];
    const availableMemoryGB = availableMemoryMB / 1024;
    document.getElementById('memoryUsage').textContent = `${availableMemoryGB.toFixed(2)} GB`;
}
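The network-usage update follows the same pattern. As a hedged sketch: Netdata's system.net chart reports throughput in kilobits/s (received positive, sent negative in the default layout), but treat those units as an assumption and verify them against your own Netdata instance:

```javascript
// Format one Netdata system.net row ([timestamp, received, sent]) as Mbps.
// Units assumed to be kilobits/s; confirm against your Netdata dashboard.
function formatNetworkRow(row) {
  const receivedMbps = Math.abs(row[1]) / 1000;
  const sentMbps = Math.abs(row[2]) / 1000;
  return `${receivedMbps.toFixed(2)} Mbps down / ${sentMbps.toFixed(2)} Mbps up`;
}

// Sketch of the updater, mirroring updateMemoryUsage above.
async function updateNetworkUsage() {
  const data = await fetchData('system.net');
  document.getElementById('networkUsage').textContent = formatNetworkRow(data.data[0]);
}
```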

GPU Stats Fetching

The client includes functionality to fetch and display GPU stats such as utilization, memory usage, and power draw. The data is fetched from an endpoint that provides NVIDIA SMI details.

async function fetchGpuStats() {
    try {
        const response = await fetch('https://smi.x64.world/nvidia-smi');
        const data = await response.json();
        updateGpuStats(data);
    } catch (error) {
        console.error('Error fetching GPU stats:', error);
    }
}

function updateGpuStats(data) {
    const gpu = data.nvidia_smi_log.gpu;
    document.getElementById('gpuUtilization').innerText = gpu.utilization.gpu_util;
    document.getElementById('gpuMemoryUtilization').innerText = gpu.utilization.memory_util;
    document.getElementById('powerDraw').innerText = gpu.gpu_power_readings.power_draw;
}
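One detail worth noting: nvidia-smi's XML-derived JSON reports readings as strings such as "30.50 W" or "45 %", so the values above are displayed verbatim. If you need them as numbers (for thresholds or charts), a small parser helps; this helper is my own addition, not part of the original client:

```javascript
// Extract the numeric portion of an nvidia-smi reading like "30.50 W" or
// "45 %". Returns NaN for non-string or non-numeric input.
function parseSmiValue(value) {
  if (typeof value !== 'string') return NaN;
  const match = value.match(/-?\d+(\.\d+)?/);
  return match ? parseFloat(match[0]) : NaN;
}
```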

Error Handling and Alerts

The client includes alert mechanisms for showing success or error messages to users. This is particularly useful when handling operations like resetting the conversation, restarting the core, or sending a message.

Example Error Handling:

When an error occurs, the client shows an alert to the user. This is done using Bootstrap's alert system, which displays error or success messages.

function displayAlert(type, message) {
    const alertElement = document.getElementById(`${type}-alert`);
    alertElement.textContent = message;
    alertElement.style.display = 'block';
    setTimeout(() => {
        alertElement.style.display = 'none';
    }, 3000);
}

How to Run the Chat Client

  1. Clone the Project: Ensure you have the HTML, CSS, and JavaScript files for the chat client.
  2. Set Up Backend: The client relies on a backend API (https://infer.x64.world) for chat interactions and system metrics. Ensure the backend is running.
  3. Serve the HTML File: You can use a simple HTTP server like live-server or http-server to serve the HTML page:
    live-server
    
  4. Access the Chat Client: Open the URL provided by the server (usually http://localhost:8080) to access the chat client.

My Findings

This home-hosted AI infrastructure combines a robust middleware backend server with a Discord bot, allowing for a dynamic and interactive experience. By leveraging various plugins and APIs, the system can handle complex chat interactions, manage conversation history, and provide additional functionality such as IP detection and content scraping. With the ability to integrate into Discord, this infrastructure offers a seamless way to interact with users in real-time, making it a powerful tool for AI-driven chat systems.

The x64.world chat client provides a clean and interactive interface for users to communicate with the AI backend while also monitoring real-time system statistics. It leverages a variety of libraries and APIs to deliver a seamless experience, from Markdown message rendering to GPU stats fetching. This browser client, along with its backend, forms the core of an efficient and scalable AI-powered chat system, accessible from any web browser.