Adding client side + new readme

2024-08-09 04:24:40 -04:00
parent 7e75ea80a6
commit e35919c004
3 changed files with 999 additions and 307 deletions
--- a/README.md
+++ b/README.md
@@ -1,16 +1,6 @@
-# AI NGINX Log Analysis 
+# AI Log Monitoring System
 ## Overview
-This project is an Express.js-based server designed to interact with models via llama-cpp-python[web] emulation server to analyze NGINX logs for potential security threats. The server processes incoming requests, analyzes the content, and provides responses that include alerts or general insights. The server can also scrape web pages and manage conversation histories per IP address.
+This repository contains two interdependent Node.js applications, `ai_log.js` and `ai_log_backend.js`. These applications collaborate to monitor NGINX logs, detect potential security threats, and manage conversation history for AI-based log analysis. This README provides a comprehensive guide for setting up, configuring, and operating these applications, along with detailed explanations of the underlying mechanisms and customizable features.
 ## Features
 - **NGINX Log Analysis**: Analyzes web traffic logs to identify potential security threats and generate appropriate responses.
 - **Conversation History**: Maintains a history of interactions for each client IP, allowing for context-aware responses.
 - **Web Scraping**: Scrapes web pages to extract and format relevant information.
 - **Token Management**: Limits the number of tokens used in the conversation to ensure responses are within the model's limits.
 - **Core Service Management**: Provides endpoints to restart the core GPT service and reset conversation histories.
 - **Cross-Origin Resource Sharing (CORS)**: Enabled for all routes, allowing for flexible API usage.
 ## Screenshots
@@ -24,81 +14,451 @@ Here are some screenshots of the application in action:
 ![Screenshot 4](screenshots/s4.png)
 ## Table of Contents
 - [Introduction](#introduction)
 - [Prerequisites](#prerequisites)
 - [Installation](#installation)
 - [Configuration](#configuration)
  - [Environment Variables](#environment-variables)
  - [Directory Structure](#directory-structure)
  - [Ignored IPs and Subnets](#ignored-ips-and-subnets)
 - [Usage](#usage)
  - [Running `ai_log.js`](#running-ai_logjs)
  - [Running `ai_log_backend.js`](#running-ai_log_backendjs)
 - [How It Works](#how-it-works)
  - [Log Monitoring and Buffering](#log-monitoring-and-buffering)
  - [Sending Logs to Backend](#sending-logs-to-backend)
  - [AI-Based Log Analysis](#ai-based-log-analysis)
  - [Token Management and Conversation History](#token-management-and-conversation-history)
  - [Security Alert Handling](#security-alert-handling)
  - [Discord Integration](#discord-integration)
 - [API Endpoints](#api-endpoints)
  - [POST /api/v1/chat](#post-apiv1chat)
  - [GET /api/v1/conversation-history](#get-apiv1conversation-history)
  - [POST /api/v1/restart-core](#post-apiv1restart-core)
  - [POST /api/v1/reset-conversation](#post-apiv1reset-conversation)
 - [Logging](#logging)
  - [Log Levels](#log-levels)
  - [Log Structure](#log-structure)
  - [Debugging](#debugging)
 - [Security Considerations](#security-considerations)
  - [IP Whitelisting](#ip-whitelisting)
  - [Rate Limiting and Banning](#rate-limiting-and-banning)
  - [Data Privacy](#data-privacy)
 - [Performance Optimization](#performance-optimization)
  - [Managing Token Limits](#managing-token-limits)
  - [Efficient Log Parsing](#efficient-log-parsing)
 - [Troubleshooting](#troubleshooting)
  - [Common Issues](#common-issues)
  - [Restarting Services](#restarting-services)
  - [Log Analysis](#log-analysis)
 - [Customization](#customization)
  - [Modifying the AI Prompt](#modifying-the-ai-prompt)
  - [Adjusting Buffer Limits](#adjusting-buffer-limits)
  - [Extending Log Parsing Capabilities](#extending-log-parsing-capabilities)
 - [Contributing](#contributing)
 - [License](#license)
 ## Introduction
 The AI Log Monitoring System is a powerful and extensible solution designed to enhance security monitoring by integrating AI-based analysis into traditional NGINX log management. By continuously tailing log files and using an AI model to analyze log entries in real time, this system automates the detection of security threats, sends alerts to designated channels (like Discord), and manages conversation history for ongoing log analysis.
 The solution is composed of two primary scripts:
 - `ai_log.js`: Handles log monitoring, buffering, and sending logs to a backend for AI processing.
 - `ai_log_backend.js`: Manages AI-based analysis, conversation history, and exposes an API for interacting with the AI service.
 ## Prerequisites
 Before setting up and running the AI Log Monitoring System, ensure you have the following components installed and configured on your machine:
 - **Node.js**: Version 14.x or higher is required to run the JavaScript code.
 - **npm**: Version 6.x or higher is needed to manage project dependencies.
 - **Docker**: Required for running the AI model, particularly if using a containerized GPT model for processing.
 - **NGINX**: The web server generating logs that the system will monitor.
 - **Discord**: A Discord webhook URL is necessary for sending security alerts.
 Ensure you have administrative privileges on the machine to install and configure these dependencies.
 ## Installation
-**Note**: A `llama-cpp-python` OpenAI Emulation Server is required alongside the backend server. The server in this code is configured to run on `127.0.0.1:8003`.
+Clone the repository to your local machine:
-1. Clone the repository:
+```bash
-   ```bash
+git clone git@git.ssh.surf:snxraven/ai-nginx-log-security.git
-   git clone git@git.ssh.surf:snxraven/ai-nginx-log-security.git
+cd ai-nginx-log-security
-   cd ai-nginx-log-security
+```
   ```
-2. Install dependencies:
+Install the required Node.js dependencies for both applications:
   ```bash
   npm install
   ```
-3. Create a `.env` file in the project root with the following content:
+```bash
-   ```bash
+npm install
-   MAX_CONTENT_LENGTH=2000
+```
   ```
-4. Start the server:
+## Configuration
   ```bash
  node ai_log.js
   ```
-## Endpoints
+### Environment Variables
-### 1. `/api/v1/chat`
+Environment variables play a critical role in configuring the behavior of the AI Log Monitoring System. They allow you to customize log directories, adjust token limits, and set up necessary credentials. Create a `.env` file in the root of your project and populate it with the following variables:
 - **Method**: `POST`
 - **Description**: Processes a user message, analyzes it for security threats, and returns a response.
 - **Request Body**:
  - `message`: The message to be analyzed.
 - **Response**: JSON object containing the response from the GPT model.
-### 2. `/api/v1/conversation-history`
+```bash
- **Method**: `GET`
+# General Configuration
- **Description**: Retrieves the conversation history for the requesting client's IP.
+DEBUG=true # Enable detailed logging output for debugging
- **Response**: JSON object containing the conversation history.
+WEBHOOKURL=https://discord.com/api/webhooks/your-webhook-id # Discord webhook URL for sending alerts
 MAX_CONTENT_LENGTH=2000 # Maximum length for content scraped from web pages
-### 3. `/api/v1/restart-core`
+# AI Service Configuration
- **Method**: `POST`
+TIMEOUT_DURATION=100000 # Maximum duration (in ms) for API requests to complete
- **Description**: Restarts the core GPT service running in a Docker container.
+MAX_TOKENS=8000 # Maximum tokens allowed in a conversation before trimming
- **Response**: JSON object confirming the restart or detailing any errors.
+TOLERANCE=100 # Extra tokens allowed before forcing a trim of conversation history
 ```
-### 4. `/api/v1/reset-conversation`
+**Explanation of Environment Variables**:
 - **Method**: `POST`
 - **Description**: Resets the conversation history for the requesting client's IP.
 - **Response**: JSON object confirming the reset.
-## Environment Variables
+- **DEBUG**: Enables verbose logging, including debug-level messages. Useful during development and troubleshooting.
 - **WEBHOOKURL**: The Discord webhook URL where alerts will be sent. Treat it as a secret, since anyone who has the URL can post to the channel.
 - **MAX_CONTENT_LENGTH**: Limits the length of content extracted from web pages to avoid overloading the system or sending excessively long messages.
 - **TIMEOUT_DURATION**: Sets a timeout for requests to the AI service, preventing the system from hanging indefinitely.
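 - **MAX_TOKENS**: Caps the total number of tokens in a conversation with the AI. Tokens are the subword units produced by the model's tokenizer, so the count generally differs from the word count. Once the cap is exceeded, older history is trimmed so that requests stay within the model's context window.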
 - **TOLERANCE**: A buffer to avoid hitting the MAX_TOKENS limit exactly, ensuring smoother operation by providing a cushion before trimming conversation history.
- `MAX_CONTENT_LENGTH`: The maximum length for the content extracted from web pages during scraping (default: 2000 characters).
+### Directory Structure
 The system monitors log files stored in a specific directory. Ensure this directory exists and is correctly set in the script:
 ```bash
 mkdir -p /dockerData/logs
 ```
 If your NGINX logs are stored in a different directory, update the `LOG_DIRECTORY` constant in `ai_log.js` accordingly:
 ```javascript
 const LOG_DIRECTORY = '/your/custom/path/to/logs';
 ```
 ### Ignored IPs and Subnets
 The system allows you to specify IP addresses and subnets that should be ignored during log monitoring. This is particularly useful for filtering out trusted sources or known harmless traffic (e.g., public DNS servers).
 - **Ignored IPs**: Directly listed IP addresses that should be skipped during processing.
 - **Ignored Subnets**: Subnets specified in CIDR notation that represent ranges of IP addresses to be ignored.
 To customize the ignored IPs and subnets, modify the `ignoredIPs` and `ignoredSubnets` arrays in `ai_log.js`:
 ```javascript
 const ignoredIPs = ['1.1.1.1', '1.0.0.1', '8.8.8.8', '8.8.4.4'];
 const ignoredSubnets = [
  '173.245.48.0/20', '103.21.244.0/22', '103.22.200.0/22',
  // Add more subnets as needed
 ];
 ```
 This ensures that traffic from these sources is not flagged or acted upon, reducing false positives and focusing the AI on more critical threats.
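To illustrate the filtering described above, here is a minimal, dependency-free sketch of an IPv4 check against both lists. In `ai_log.js` the subnet test is delegated to the `ip-cidr` package; the helpers `ipToInt`, `inSubnet`, and `shouldIgnore` below are hypothetical stand-ins, and the shortened lists are for illustration only.

```javascript
// Hypothetical helpers for illustration only; handles IPv4 addresses only.
const ignoredIPs = ['1.1.1.1', '8.8.8.8'];
const ignoredSubnets = ['173.245.48.0/20', '103.21.244.0/22'];

// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
const ipToInt = (ip) =>
  ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);

// Return true when `ip` falls inside the CIDR block `subnet` (e.g. '10.0.0.0/8').
const inSubnet = (ip, subnet) => {
  const [base, bits] = subnet.split('/');
  const prefix = Number(bits);
  // A shift by 32 is a no-op in JavaScript, so /0 is handled explicitly.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
};

// An IP is skipped when it is listed directly or falls inside any ignored subnet.
const shouldIgnore = (ip) =>
  ignoredIPs.includes(ip) || ignoredSubnets.some((subnet) => inSubnet(ip, subnet));

console.log(shouldIgnore('173.245.50.7')); // inside 173.245.48.0/20
console.log(shouldIgnore('203.0.113.9')); // not in either list
```

Checking the exact-match list first keeps the common case cheap; the subnet comparison only runs for addresses that are not listed directly.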
 ## Usage
-1. Send a POST request to `/api/v1/chat` with a message to analyze web traffic logs.
+### Running `ai_log.js`
-2. Use the `/api/v1/conversation-history` endpoint to fetch the chat history.
+
-3. Restart the core service using the `/api/v1/restart-core` endpoint if needed.
+The `ai_log.js` script is responsible for continuously monitoring NGINX logs, buffering log entries, and sending them to the backend for analysis. It also handles real-time actions, such as banning IP addresses and sending alerts.
-4. Reset conversation history using the `/api/v1/reset-conversation` endpoint.
+
 To start the log monitoring process, execute the following command:
 ```bash
 node ai_log.js
 ```
 The script will immediately begin reading logs from the specified directory and processing them according to the rules defined in the script. The logs will be buffered and periodically sent to the backend for AI-based analysis.
 ### Running `ai_log_backend.js`
 The `ai_log_backend.js` script sets up an Express server that interfaces with the AI model to analyze log data. It also manages conversation history and provides endpoints for interacting with the system.
 To start the backend server:
 ```bash
 node ai_log_backend.js
 ```
 By default, the server listens on `http://localhost:3001`. It handles incoming log data, processes it with the AI model, and returns actionable insights, including potential security alerts.
 ## How It Works
 ### Log Monitoring and Buffering
 The `ai_log.js` script uses the `Tail` class from the `tail` package to monitor NGINX log files in real time. As new lines are appended to the logs, the script reads and buffers them. The buffer size is configurable, allowing the system to batch-process logs before sending them to the backend.
 **Key Features**:
 - **Real-Time Monitoring**: Continuously monitors specified log files for new entries.
 - **Buffering**: Collects log entries in a buffer to reduce the frequency of network requests to the backend.
 - **Ignored Entries**: Filters out log entries from specified IPs and subnets, as well as entries matching certain patterns.
 ### Sending Logs to Backend
 When the log buffer reaches a predefined size or a set time interval elapses, the buffered logs are sent to the backend for AI processing. The backend analyzes the logs to detect potential security threats, generate alerts, and manage conversation history.
 **Process Overview**:
 1. **Buffer Accumulation**: Logs are collected in a buffer until a threshold is met.
 2. **Buffer Flush**: The buffer is sent to the backend in one request, optimizing network usage.
 3. **Backend Analysis**: The AI model analyzes the logs for suspicious activity or patterns.
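The accumulate-then-flush steps above could be sketched roughly as follows. This is not the code in `ai_log.js`: `createLogBuffer` and the `sendToBackend` callback are hypothetical names, and the retry-on-failure behaviour is an assumption made for the sketch.

```javascript
// Sketch only: `createLogBuffer` and `sendToBackend` are hypothetical names.
const createLogBuffer = ({ limit, intervalMs, sendToBackend }) => {
  let buffer = [];
  let sending = false; // guards against overlapping flushes

  const flush = async () => {
    if (sending || buffer.length === 0) return;
    sending = true;
    const batch = buffer;
    buffer = []; // lines arriving during the request go into a fresh array
    try {
      await sendToBackend(batch.join('\n'));
    } catch (err) {
      buffer = batch.concat(buffer); // assumption: keep the batch so it is retried later
    } finally {
      sending = false;
    }
  };

  // Time-based flush, so a quiet log still reaches the backend eventually.
  const timer = setInterval(flush, intervalMs);
  timer.unref(); // Node.js: do not keep the process alive just for this timer

  return {
    // Add a line; flush immediately once the size threshold is reached.
    push: (line) => {
      buffer.push(line);
      return buffer.length >= limit ? flush() : Promise.resolve();
    },
    flush,
    stop: () => clearInterval(timer),
    size: () => buffer.length,
  };
};
```

With the defaults in `ai_log.js` this would be created with `limit: 15` and `intervalMs: 10 * 60 * 1000`, and each tailed line would be passed to `push`.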
 ### AI-Based Log Analysis
 The backend server (`ai_log_backend.js`) leverages an AI model (e.g., GPT) to analyze the logs and detect potential security threats. The AI operates based on a custom prompt that instructs it on how to interpret the logs, which IPs to ignore, and what actions to take.
 **AI Model Usage**:
 - **Custom Prompt**: The AI is guided by a detailed prompt that defines its behavior and decision-making process.
 - **Log Parsing**: The AI processes log lines to identify malicious patterns, potential attacks, or other security concerns.
 - **Actionable Insights**: Based on the analysis, the AI generates alerts, suggests actions (like banning IPs), or provides general observations.
 ### Token Management and Conversation History
 To ensure efficient operation and prevent resource exhaustion, the system carefully manages the number of tokens used in AI conversations. Token management involves trimming older parts of the conversation history to stay within predefined limits.
 **Token Management Strategies**:
 - **Counting Tokens**: The system counts tokens for each message in the conversation history.
 - **Trimming History**: If the token count exceeds the maximum allowed, the oldest messages are removed.
 - **Tolerance Buffer**: A small buffer is maintained to avoid hitting the exact token limit, ensuring smoother performance.
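The trimming strategy above might look roughly like this. The sketch makes several assumptions: `countTokens` is a crude stand-in (about four characters per token) for a real tokenizer, the trim threshold is taken to be `MAX_TOKENS - TOLERANCE`, and the first history entry is assumed to be the system prompt, which is always kept. `trimHistory` is a hypothetical name.

```javascript
// Sketch only. `countTokens` is a rough approximation (about 4 characters per token);
// a tokenizer matching the model in use would replace it.
const countTokens = (text) => Math.ceil(text.length / 4);

const MAX_TOKENS = 8000;
const TOLERANCE = 100;

// Drop the oldest non-system messages until the history fits the budget.
// Assumption: the budget is MAX_TOKENS minus TOLERANCE.
const trimHistory = (history, maxTokens = MAX_TOKENS, tolerance = TOLERANCE) => {
  const budget = maxTokens - tolerance;
  const trimmed = [...history]; // work on a copy; the caller's array is untouched
  let total = trimmed.reduce((sum, msg) => sum + countTokens(msg.content), 0);
  // Index 0 is assumed to hold the system prompt, which is never removed.
  while (total > budget && trimmed.length > 1) {
    const [removed] = trimmed.splice(1, 1);
    total -= countTokens(removed.content);
  }
  return trimmed;
};
```

Removing from index 1 rather than index 0 keeps the instructions that define the AI's behaviour in place while the oldest log exchanges are discarded.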
 ### Security Alert Handling
 When the AI detects a potential security threat, it generates an alert. These alerts are processed by the backend and can trigger actions like banning an IP address or sending a notification to a Discord channel.
 **Alert Workflow**:
 1. **Detection**: The AI identifies suspicious activity or a suspicious pattern in the logs.
 2. **Alert Generation**: The AI creates an alert message, formatted for clarity and readability.
 3. **IP Banning**: If an IP is identified as malicious, the system can execute a ban command to block it.
 4. **Discord Notification**: Alerts are sent to a designated Discord channel for real-time monitoring and action.
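As a rough illustration of the workflow above, the sketch below classifies an AI reply and pulls out the IPs marked for banning. It assumes the prompt asks the model to tag alerts with `[ALERT]`, observations with `GENERAL`, and to wrap IPs to ban in Discord spoiler markers (`||IP||`); `parseAiResponse` is a hypothetical helper, not a function from the repository.

```javascript
// Sketch only: `parseAiResponse` is a hypothetical helper name.
const parseAiResponse = (content) => {
  const type = content.includes('[ALERT]')
    ? 'alert'
    : content.includes('GENERAL')
      ? 'general'
      : 'none';
  // IPv4 addresses wrapped in spoiler markers, e.g. ||203.0.113.7||
  const matches = content.matchAll(/\|\|(\d{1,3}(?:\.\d{1,3}){3})\|\|/g);
  // De-duplicate while preserving the order in which IPs first appear.
  const ips = [...new Set([...matches].map((m) => m[1]))];
  return { type, ips };
};

const reply = '[ALERT] Repeated probing of /wp-login.php from ||203.0.113.7||';
console.log(parseAiResponse(reply));
```

The returned `ips` list is what a ban step would iterate over, ideally after confirming that each address also appears in the log lines that were sent for analysis.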
 ### Discord Integration
 The system integrates with Discord to send alerts and notifications. This is particularly useful for real-time monitoring, allowing administrators to receive and act on security alerts instantly.
 **Integration Details**:
 - **Webhook-Based Alerts**: The system uses a Discord webhook to send alerts as embedded messages.
 - **Formatted Messages**: Alerts are formatted with titles, descriptions, and timestamps to ensure they are easy to read and understand.
 - **IP Banning Alerts**: When an IP is banned, the system includes the IP address in the Discord alert.
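A webhook alert of the kind described above could be assembled along these lines. The payload fields (`embeds`, `title`, `description`, `color`, `timestamp`, `fields`) follow Discord's embed object; the helper name `buildAlertPayload`, the red colour, and the "Banned IP" field label are assumptions made for the sketch.

```javascript
// Sketch only: `buildAlertPayload` is a hypothetical helper; field names follow Discord's embed object.
const buildAlertPayload = ({ title, description, bannedIp, now = new Date() }) => {
  const embed = {
    title,
    description: description.slice(0, 4096), // Discord caps embed descriptions at 4096 characters
    color: 0xff0000, // red, chosen arbitrarily for alerts
    timestamp: now.toISOString(),
  };
  if (bannedIp) {
    embed.fields = [{ name: 'Banned IP', value: bannedIp, inline: true }];
  }
  return { embeds: [embed] };
};

// Sending it (requires Node.js 18+ for the global fetch; WEBHOOKURL comes from .env):
// await fetch(process.env.WEBHOOKURL, {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify(buildAlertPayload({ title: 'Security Alert', description: '...' })),
// });
```

Keeping payload construction separate from the HTTP call makes the formatting easy to check without posting to a real channel.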
 ## API Endpoints
 The backend server (`ai_log_backend.js`) exposes several API endpoints for interacting with the AI service, managing conversation history, and controlling the system.
 ### POST /api/v1/chat
 This endpoint processes incoming NGINX logs by sending them to the AI model for analysis.
 - **Request Body**:
  - `message`: A string containing one or more NGINX log lines.
 - **Response**:
  - A JSON object with the AI's analysis and any detected alerts.
 **Example Request**:
 ```json
 {
  "message": "127.0.0.1 - - [12/Mar/2024:10:12:33 +0000] \"GET /index.html HTTP/1.1\" 200 3050"
 }
 ```
 **Example Response**:
 ```json
 {
  "role": "assistant",
  "content": "GENERAL: No suspicious activity detected. Routine request logged."
 }
 ```
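For reference, a client call to this endpoint might be written as below. This is a sketch rather than the code in `ai_log.js` (which uses axios): `buildChatRequest` and `analyzeLogs` are hypothetical names, the backend address is assumed to be the default `http://127.0.0.1:3001`, and the global `fetch` requires Node.js 18 or later.

```javascript
// Sketch only: `buildChatRequest` and `analyzeLogs` are hypothetical names.
const BACKEND_URL = 'http://127.0.0.1:3001'; // assumed default backend address

// Build the URL and fetch options for a batch of log lines.
const buildChatRequest = (logLines) => ({
  url: `${BACKEND_URL}/api/v1/chat`,
  options: {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: logLines.join('\n') }),
  },
});

// Send the batch and return the parsed JSON reply (Node.js 18+ for global fetch).
const analyzeLogs = async (logLines) => {
  const { url, options } = buildChatRequest(logLines);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Backend returned HTTP ${response.status}`);
  return response.json(); // e.g. { role: 'assistant', content: '...' }
};
```

Several log lines are joined with newlines into the single `message` string that the endpoint expects.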
 ### GET /api/v1/conversation-history
 This endpoint retrieves the conversation history for the IP address making the request. It is useful for reviewing the AI's past analyses and actions.
 - **Response**:
  - A JSON array containing the conversation history.
 **Example Response**:
 ```json
 [
  { "role": "system", "content": "You are a security AI..." },
  { "role": "user", "content": "127.0.0.1 - - [12/Mar/2024:10:12:33 +0000] ..." },
  { "role": "assistant", "content": "GENERAL: No suspicious activity detected..." }
 ]
 ```
 ### POST /api/v1/restart-core
 This endpoint restarts the core AI service running in a Docker container. This is useful if the AI service becomes unresponsive or needs to refresh its state.
 - **Response**:
  - A JSON object with the output of the Docker restart command.
 **Example Request**:
 ```bash
 curl -X POST http://localhost:3001/api/v1/restart-core
 ```
 **Example Response**:
 ```json
 {
  "stdout": "llama-gpu-server\n"
 }
 ```
 ### POST /api/v1/reset-conversation
 This endpoint resets the conversation history for the requesting IP address, effectively starting a new session with the AI. This can be useful for clearing outdated context and beginning a fresh analysis.
 - **Response**:
  - A JSON object confirming the reset action.
 **Example Request**:
 ```bash
 curl -X POST http://localhost:3001/api/v1/reset-conversation
 ```
 **Example Response**:
 ```json
 {
  "message": "Conversation history reset for IP: 127.0.0.1"
 }
 ```
 ## Logging
-The server logs key actions, including incoming requests, conversation history management, and errors. Logs are timestamped and include IP addresses for traceability.
+Logging is a critical component of the AI Log Monitoring System, providing insights into system operations, debugging information, and records of security actions.
-## Notes
+### Log Levels
- Ensure the `llama-gpu-server` Docker container is running before starting the Express.js server.
+The system categorizes logs into different levels to help you quickly identify the nature and severity of messages:
 - Conversation history is stored in memory and will be lost when the server restarts. Consider implementing persistent storage if long-term history is required.
-## Contributions
+- **INFO**: General information about the system's operations.
 - **WARN**: Indications of potential issues that may require attention.
 - **ERROR**: Logs generated when an error occurs during processing.
 - **SUCCESS**: Messages that indicate successful operations or actions.
 - **DEBUG**: Detailed messages intended for debugging purposes, enabled when `DEBUG=true`.
-Contributions are welcome! Please fork the repository and submit a pull request with your changes.
+### Log Structure
-This project leverages cutting-edge AI to enhance web security analysis, making it easier to identify and respond to threats in real-time. 
+Each log message includes a timestamp, log level, and message content. For example:
---
+```plaintext
 [2024-03-12 10:12:33] [INFO] Starting to read log from: /dockerData/logs/access.log
 [2024-03-12 10:12:35] [DEBUG] Read line: 127.0.0.1 - - [12/Mar/2024:10:12:33 +0000] "GET /index.html HTTP/1.1" 200 3050
 [2024-03-12 10:12:35] [SUCCESS] Log buffer sent to backend successfully.
 ```
-Let me know if you need any further changes!
+### Debugging
 When `DEBUG=true`, the system provides detailed logs that include every step of the processing workflow. This includes reading log lines, checking for ignored IPs, sending data to the backend, and receiving responses from the AI.
 These logs are invaluable during development and troubleshooting, as they offer full visibility into the system's inner workings.
 ## Security Considerations
 Security is a paramount concern when monitoring logs and responding to potential threats. The AI Log Monitoring System includes several mechanisms to enhance security and minimize false positives.
 ### IP Whitelisting
 The system allows you to specify IP addresses and subnets that should be ignored during analysis. This is particularly useful for avoiding alerts from known and trusted sources, such as public DNS servers or internal IP ranges.
 ### Rate Limiting and Banning
 To protect your infrastructure from repeated attacks, the system can automatically ban IP addresses identified as malicious. The banning process is executed via shell commands, and the system includes a delay mechanism to prevent overloading the network with too many ban requests in a short period.
 ### Data Privacy
 The system handles IP addresses and conversation history. Conversation history is kept in memory, keyed by client IP, and is lost when the backend restarts. Alerts sent to Discord include the IP address when an IP is banned, so access to the alert channel and the webhook URL should be restricted accordingly.
 ## Performance Optimization
 The AI Log Monitoring System is designed to be efficient and scalable, handling large volumes of log data with minimal overhead. However, some optimizations can further enhance performance, especially in high-traffic environments.
 ### Managing Token Limits
 By carefully managing the number of tokens in the AI's conversation history, the system prevents memory overuse and ensures faster response times. The `MAX_TOKENS` and `TOLERANCE` variables allow you to fine-tune this behavior.
 ### Efficient Log Parsing
 The system uses regular expressions and filtering techniques to parse and analyze log files efficiently. By focusing only on relevant log entries and ignoring unnecessary ones, the system reduces processing time and improves accuracy.
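As an example of regex-based parsing, the sketch below splits a log line into fields. It assumes the access-log layout used in this README's examples (client IP, timestamp in brackets, quoted request, status, bytes); logs written in a different format would need a different pattern. `LOG_LINE` and `parseLogLine` are hypothetical names.

```javascript
// Sketch only: matches lines shaped like the examples in this README.
const LOG_LINE = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) ([^"]*)" (\d{3}) (\d+|-)/;

// Return the parsed fields, or null when the line does not match the expected layout.
const parseLogLine = (line) => {
  const match = LOG_LINE.exec(line);
  if (!match) return null;
  const [, ip, time, method, path, protocol, status, bytes] = match;
  return {
    ip,
    time,
    method,
    path,
    protocol,
    status: Number(status),
    bytes: bytes === '-' ? 0 : Number(bytes), // '-' means no body was sent
  };
};

const sample = '127.0.0.1 - - [12/Mar/2024:10:12:33 +0000] "GET /index.html HTTP/1.1" 200 3050';
console.log(parseLogLine(sample));
```

Extracting the IP early lets ignored addresses be discarded before any further work is done on the line.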
 ## Troubleshooting
 If you encounter issues while using the AI Log Monitoring System, this section provides guidance on common problems and how to resolve them.
 ### Common Issues
 - **No Logs Detected**: Ensure that the log directory is correctly specified and that log files are present.
 - **AI Service Unresponsive**: Restart the AI service using the `/api/v1/restart-core` endpoint.
 - **Excessive False Positives**: Review the ignored IPs and subnets to ensure that known safe traffic is excluded.
 ### Restarting Services
 If the AI service becomes unresponsive or you need to apply changes, use the `/api/v1/restart-core` endpoint to restart the core AI Docker container. This refreshes the model and clears any stale state.
 ### Log Analysis
 Review the logs generated by the system to identify potential issues. Focus on `ERROR` and `WARN` level messages to spot critical problems quickly. Use the `DEBUG` logs for deeper investigation during development or when troubleshooting specific issues.
 ## Customization
 The AI Log Monitoring System is highly customizable, allowing you to tailor its behavior to your specific needs.
 ### Modifying the AI Prompt
 The AI's behavior is guided by a custom prompt that defines how it should interpret log data and what actions it should take. You can modify this prompt in `ai_log_backend.js` to adjust the AI's focus or add new rules.
 **Example Customization**:
 ```javascript
 const prompt = `
 You are a security AI responsible for analyzing web traffic...
 - Ignore any IPs within the range 10.0.0.0/8.
 - Flag any requests to /admin as potential threats.
 - Use emojis to convey the severity of alerts.
 `;
 ```
 ### Adjusting Buffer Limits
 The log buffer size determines how many log lines are collected before they are sent to the backend. Adjust this size to balance network usage and processing frequency.
 **Example Customization**:
 ```javascript
 const LOG_BUFFER_LIMIT = 30; // Increase the buffer size to 30 lines
 ```
 ### Extending Log Parsing Capabilities
 You can extend the log parsing functionality by adding new regular expressions or parsing logic to handle different log formats or detect new types of threats.
 **Example Customization**:
 ```javascript
 tail.on('line', async (line) => {
  // Add custom logic to detect SQL injection attempts
  if (/SELECT.*FROM/i.test(line)) {
    log.warn(`Potential SQL injection detected: ${line}`);
  }
 });
 ```
 ## Contributing
 Contributions are welcome! To contribute, fork the repository, create a new branch for your changes, and submit a pull request. Please ensure that your code adheres to the existing style and that you include tests for any new features.
--- a/ai_log.js
+++ b/ai_log.js
@@ -1,279 +1,332 @@
 // Import necessary modules for the application
-import express from 'express'; // Express framework for building web server applications and handling HTTP requests and responses
+const axios = require('axios'); // Axios is used to make HTTP requests to external APIs or services
-import axios from 'axios'; // Axios is used to make HTTP requests to external APIs or services
+const { exec } = require('child_process'); // exec is used to execute shell commands in a child process
-import bodyParser from 'body-parser'; // Middleware for parsing incoming request bodies, specifically for handling JSON data
+const moment = require('moment'); // Moment.js is a library for parsing, validating, manipulating, and formatting dates
-import cmd from 'cmd-promise'; // A module that allows execution of shell commands in a promise-based manner, making it easier to manage async operations
+const fs = require('fs'); // File system module for interacting with the file system
-import cors from 'cors'; // Middleware to enable Cross-Origin Resource Sharing, allowing resources to be requested from another domain
+const path = require('path'); // Path module for handling and transforming file paths
-import cheerio from 'cheerio'; // Cheerio is a server-side jQuery-like library for parsing and manipulating HTML content
+const Tail = require('tail').Tail; // Tail module is used for monitoring log files and reacting to new lines as they are added
 import 'dotenv/config'; // Loads environment variables from a .env file into process.env, allowing secure storage of sensitive information
 import llamaTokenizer from 'llama-tokenizer-js'; // A library for tokenizing text, which is crucial for managing the length of text inputs to the AI model
-// Define a prompt that will guide the AI's behavior when analyzing NGINX logs for potential security issues
+// Configuration constants
-const prompt = `
+const LOG_DIRECTORY = '/dockerData/logs'; // Directory where NGINX logs are stored
-You are a security AI responsible for analyzing web traffic from NGINX logs and blocking malicious IPs. Your task is to review the logs for potential attacks and issues. If you identify a verified problem, include [ALERT] followed by a detailed description of the issue in your response. Ensure your message is formatted in Markdown compatible with Discord.
+const BACKEND_URL = 'http://127.0.0.1:3001'; // URL for the backend process that handles log processing
 const DISCORD_WEBHOOK_URL = process.env.WEBHOOKURL; // URL of the Discord webhook for sending alerts and notifications, read from the WEBHOOKURL variable in .env
-**Guidelines:**
+// Environment-dependent configuration
- Domains on the server: shells.lol, hehe.rest, dcord.rest, nodejs.lol, dht.rest, my-mc.link, ssh.surf, tcp.quest
+const DEBUG = process.env.DEBUG === 'true'; // Enable or disable debug logging based on environment variable
- Use emojis to enhance communication.
+const LOG_BUFFER_LIMIT = 15; // Number of log lines to accumulate before sending them to the backend
- Do not report IP addresses for scraping or crawling.
+const TIME_LIMIT = 10 * 60 * 1000; // Time interval (in milliseconds) to send logs even if buffer is not full (10 minutes)
 - Ignore IPs: x.x.x.x, x.x.x.x, x.x.x.x, x.x.x.x. Do not mention these in reports.
 - Ignore IP addresses with BOGONs such as 192.168.0.1 or 127.0.0.2, etc.
 - Avoid alerting for false positives or irregular activity.
 - If there are no alerts but you have interesting findings, write: GENERAL followed by your insights in Markdown.
 - Only send GENERAL messages for noteworthy events, not for routine traffic reports.
 - In a GENERAL message, feel free to provide a long explainer on your deductions.
 - Be decisive. If an IP is being malicious, block it. Do not monitor IPs for further traffic.
 - Do not ban an IP address without a corresponding log entry, provide this in your response.
 - Block all bot IPs and information scanners except Google.
 - Provide proof and reasoning for each ban.
 - DO NOT BAN AN IP THAT IS NOT IN A LOG EVER! YOU MUST SEE THE IP ADDRESS!
 - To ban an IP or flag it as a security risk, wrap it in a Markdown spoiler: ||IPHERE||
 `;
-// Initialize the Express application and define the port on which the server will run
+let logBuffer = []; // Array to store log lines temporarily before sending to backend
-const app = express(); // Create an instance of an Express application
+let logTails = []; // Array to store active Tail instances for each log file being monitored
-const port = 3001; // Define the port number for the server, 3001 is commonly used for development
+let isSendingLogs = false; // Flag to prevent multiple simultaneous log sending operations
-// Middleware to enable CORS for all routes
+// List of IP addresses to ignore in logs (e.g., trusted IPs, public DNS servers)
-app.use(cors()); // This allows the server to accept requests from any origin, useful for APIs that may be accessed by web applications from different domains
+const ignoredIPs = ['1.1.1.1', '1.0.0.1', '8.8.8.8', '8.8.4.4'];
-// Set a larger limit for the request body to handle large data payloads
+// List of IP subnets to ignore, commonly used to filter out traffic from known sources like Cloudflare
-app.use(bodyParser.json({ limit: '50mb' })); // The JSON body parser is configured with a 50MB limit, suitable for handling large JSON payloads
+const ignoredSubnets = [
  '173.245.48.0/20', '103.21.244.0/22', '103.22.200.0/22', '103.31.4.0/22',
  '141.101.64.0/18', '108.162.192.0/18', '190.93.240.0/20', '188.114.96.0/20',
  '197.234.240.0/22', '198.41.128.0/17', '162.158.0.0/15', '104.16.0.0/13',
  '104.24.0.0/14', '172.64.0.0/13', '131.0.72.0/22'
 ];
-// Define constants for the application, used to control various aspects of the server's behavior
+// List of specific log files to ignore (e.g., specific proxy logs)
-const TIMEOUT_DURATION = 100000; // The maximum time (in milliseconds) the server will wait before timing out a request, set to 100 seconds
+const ignoredFiles = ['proxy-host-149_access.log', 'proxy-host-2_access.log', 'proxy-host-99_access.log'];
 const MAX_TOKENS = 8000; // The maximum number of tokens (subword units produced by the tokenizer) allowed in a conversation; this limit helps manage API usage
 const TOLERANCE = 100; // A buffer value used to prevent exceeding the MAX_TOKENS limit, ensuring the conversation stays within safe bounds
 let conversationHistory = {}; // An object to store conversation history for each IP address, allowing the server to maintain context for each user
-// Helper function to get the current timestamp in a formatted string
+// Function to get current timestamp in a formatted string (e.g., 'YYYY-MM-DD HH:mm:ss')
-const getTimestamp = () => {
+const getTimestamp = () => moment().format('YYYY-MM-DD HH:mm:ss');
-    const now = new Date(); // Get the current date and time
+
-    const date = now.toLocaleDateString('en-US'); // Format the date in the US locale
+// Logging functions for different log levels (INFO, WARN, ERROR, SUCCESS, DEBUG)
-    const time = now.toLocaleTimeString('en-US'); // Format the time in the US locale
+const log = {
-    return `${date} [${time}]`; // Return the formatted date and time as a string
+  info: (message) => console.log(`[${getTimestamp()}] [INFO] ${message}`), // Log informational messages
  warn: (message) => console.log(`[${getTimestamp()}] [WARN] ${message}`), // Log warning messages
  error: (message) => console.log(`[${getTimestamp()}] [ERROR] ${message}`), // Log error messages
  success: (message) => console.log(`[${getTimestamp()}] [SUCCESS] ${message}`), // Log success messages
  debug: (message) => {
    if (DEBUG) { // Log debug messages only if DEBUG mode is enabled
      console.log(`[${getTimestamp()}] [DEBUG] ${message}`);
    }
  }
 };
-// Middleware to track conversation history based on the client's IP address
+// Function to check if an IP address is in the ignored list or subnets
-app.use((req, res, next) => {
+const isIgnoredIP = async (ip) => {
-    // Extract the client's IP address from various possible headers (CF-Connecting-IP, X-Forwarded-For, X-Real-IP) or fallback to req.ip
+  if (ignoredIPs.includes(ip)) {
-    const ip = req.headers['cf-connecting-ip'] || req.headers['x-forwarded-for'] || req.headers['x-real-ip'] || req.ip;
+    return true; // Immediately return true if the IP is in the ignored IPs list
-    req.clientIp = ip; // Store the client's IP address in the request object for easy access later
+  }
  const { default: CIDR } = await import('ip-cidr'); // Dynamically import the ip-cidr module for CIDR range checking
  return ignoredSubnets.some((subnet) => new CIDR(subnet).contains(ip)); // Check if the IP is within any ignored subnets
 };
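The subnet check above delegates to `ip-cidr`. As a rough illustration of what IPv4 CIDR containment amounts to, here is a dependency-free sketch. `ipToInt` and `inCidr` are hypothetical helper names introduced only for this example; they are not part of `ip-cidr` or of the application, and IPv6 is not handled.

```javascript
// Hypothetical helpers for illustration; the application itself uses ip-cidr.
// Convert a dotted-quad IPv4 string into an unsigned 32-bit integer.
const ipToInt = (ip) =>
  ip.split('.').reduce((acc, octet) => ((acc << 8) + Number(octet)) >>> 0, 0);

// Return true when `ip` falls inside the IPv4 range written as "base/prefix".
const inCidr = (ip, cidr) => {
  const [base, bits] = cidr.split('/');
  const prefix = Number(bits);
  // A /0 prefix matches everything; shifting by 32 is a no-op in JavaScript, so handle it explicitly.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
};

console.log(inCidr('173.245.48.7', '173.245.48.0/20')); // true
console.log(inCidr('8.8.8.8', '173.245.48.0/20')); // false
```

The mask keeps only the network bits, so two addresses are in the same subnet exactly when their masked values are equal.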
-    // Log the incoming request along with the client's IP address and current timestamp
+// Function to read and monitor log files in the specified directory using the Tail module
-    console.log(`${getTimestamp()} [INFO] Incoming request from IP: ${req.clientIp}`);
+const readLogs = () => {
  log.info('Initiating log reading process...');
-    // If this IP address has not been seen before, initialize a new conversation history for it
+  // Stop and clear any existing Tail instances
-    if (!conversationHistory[req.clientIp]) {
+  logTails.forEach(tail => tail.unwatch());
-        console.log(`${getTimestamp()} [INFO] Initializing conversation history for new IP: ${req.clientIp}`);
+  logTails = [];
-        // Start the conversation with the predefined prompt that instructs the AI on how to analyze the logs
+
-        conversationHistory[req.clientIp] = [
+  // Read the directory to get all log files
-            { role: 'system', content: prompt }
+  fs.readdir(LOG_DIRECTORY, (err, files) => {
-        ];
+    if (err) {
      log.error(`Error reading directory: ${err}`); // Log an error if the directory cannot be read
      return;
    }
    next(); // Move on to the next middleware or route handler
 });
-// Function to count the number of tokens in a conversation history using the llama tokenizer
+    // Filter log files, excluding those in the ignoredFiles list
-async function countLlamaTokens(messages) {
+    const logFiles = files.filter(file => file.endsWith('.log') && !ignoredFiles.includes(file));
-    let totalTokens = 0; // Initialize a counter for the total number of tokens
+    if (logFiles.length === 0) {
-    for (const message of messages) {
+      log.warn(`No log files found in directory: ${LOG_DIRECTORY}`); // Warn if no log files are found
-        // Only count tokens for user and assistant messages, not system messages
+      return;
        if (message.role === 'user' || message.role === 'assistant') {
            const encodedTokens = llamaTokenizer.encode(message.content); // Tokenize the message content
            totalTokens += encodedTokens.length; // Add the number of tokens in the current message to the total
        }
    }
    return totalTokens; // Return the total number of tokens
 }
-// Function to trim the conversation history to fit within the token limit
+    log.info(`Found ${logFiles.length} log files to tail.`); // Log the number of log files to be monitored
 async function trimConversationHistory(messages, maxLength, tolerance) {
    let tokenLength = await countLlamaTokens(messages); // Get the total number of tokens in the conversation
    // Continue trimming the conversation history until it's within the allowed token limit
    while (tokenLength > maxLength - tolerance && messages.length > 1) {
        messages.splice(1, 1); // Remove the oldest user/assistant message (the second item in the array)
        tokenLength = await countLlamaTokens(messages); // Recalculate the total number of tokens after trimming
        console.log(`${getTimestamp()} [CLEANUP] Trimmed conversation history to ${tokenLength} tokens.`);
    }
 }
-// Function to scrape a web page and extract its content
+    // For each log file, start a new Tail instance to monitor the file
-async function scrapeWebPage(url) {
+    logFiles.forEach(file => {
-    console.log(`${getTimestamp()} [INFO] Starting to scrape URL: ${url}`);
+      const filePath = path.join(LOG_DIRECTORY, file); // Create the full path to the log file
-    try {
+      log.info(`Starting to read log from: ${filePath}`); // Log the start of monitoring for this file
-        // Perform an HTTP GET request to fetch the content of the specified URL
+      try {
-        const res = await axios.get(url);
+        const tail = new Tail(filePath); // Create a new Tail instance for the log file
        const html = res.data; // Extract the HTML content from the response
        const $ = cheerio.load(html); // Load the HTML into Cheerio for parsing and manipulation
-        // Extract specific elements from the HTML: the page title, meta description, and body content
+        // Event listener for new lines added to the log file
-        const pageTitle = $('head title').text().trim(); // Get the text of the <title> tag
+        tail.on('line', async (line) => {
-        const pageDescription = $('head meta[name="description"]').attr('content'); // Get the content of the meta description
+          if (line.includes('git.ssh.surf')) {
-        const pageContent = $('body').text().trim(); // Get all text content within the <body> tag
+            log.debug(`Ignoring line involving git.ssh.surf: ${line}`); // Ignore lines related to specific domains
        // Construct a response message with the extracted details
        let response = `Title: ${pageTitle}\n`; // Start with the page title
        if (pageDescription) {
            response += `Description: ${pageDescription}\n`; // Add the meta description if it exists
        }
        if (pageContent) {
            const MAX_CONTENT_LENGTH = process.env.MAX_CONTENT_LENGTH || 2000; // Set a maximum length for the content
            // Clean the page content to remove unnecessary whitespace and special characters
            let plainTextContent = $('<div>').html(pageContent).text().trim().replace(/[\r\n\t]+/g, ' ');
            // Define a regular expression pattern to identify code-like content
            const codePattern = /\/\/|\/\*|\*\/|\{|\}|\[|\]|\bfunction\b|\bclass\b|\b0x[0-9A-Fa-f]+\b|\b0b[01]+\b/;
            const isCode = codePattern.test(plainTextContent); // Check if the content resembles code
            if (isCode) {
                plainTextContent = plainTextContent.replace(codePattern, ''); // Remove code-like patterns if detected
            }
            // Further clean the content by removing text within parentheses
            plainTextContent = plainTextContent.replace(/ *\([^)]*\) */g, '');
            // If the content is too long, truncate it and add an ellipsis
            if (plainTextContent.length > MAX_CONTENT_LENGTH) {
                plainTextContent = plainTextContent.substring(0, MAX_CONTENT_LENGTH) + '...';
            }
            response += `Content: ${plainTextContent.trim()}`; // Add the cleaned and possibly truncated content to the response
        }
        response += `\nURL: ${url}`; // Include the original URL in the response
        console.log(`${getTimestamp()} [INFO] Successfully scraped URL: ${url}`);
        return response; // Return the constructed response
    } catch (err) {
        // If the scraping process fails, log an error with details and return null
        console.error(`${getTimestamp()} [ERROR] Failed to scrape URL: ${url}`, err);
        return null;
    }
 }
 // Function to process incoming requests, handle AI interactions, and return a response
 async function processRequest(req, res) {
    const startTime = Date.now(); // Record the start time of the request processing for performance tracking
    const ip = req.clientIp; // Retrieve the client's IP address from the request object
    console.log(`${getTimestamp()} [INFO] Handling chat request from IP: ${ip}`); // Log the request details
    // Set a timeout for the request processing, ensuring it doesn't hang indefinitely
    const timeout = setTimeout(() => {
        console.error(`${getTimestamp()} [ERROR] Request timed out for IP: ${ip}`);
        res.status(408).json({ message: "Request timed out" }); // Send a timeout response if the processing takes too long
    }, TIMEOUT_DURATION);
    try {
        let userMessage = req.body.message; // Extract the user's message from the request body
        console.log(`${getTimestamp()} [INFO] Received user message: ${userMessage}`);
        userMessage = req.body.message + `\nDate/Time:${getTimestamp()}`; // Append the current date and time to the user's message
        // Initialize conversation history if it doesn't exist for the IP
        if (!conversationHistory[ip]) {
            console.log(`${getTimestamp()} [INFO] Initializing conversation history for new IP: ${ip}`);
            conversationHistory[ip] = [{ role: 'system', content: prompt }]; // Start the conversation with the predefined prompt
        }
        // Add the user's message to the conversation history for the IP
        conversationHistory[ip].push({ role: 'user', content: userMessage });
        // Trim the conversation history if it exceeds the token limit
        await trimConversationHistory(conversationHistory[ip], MAX_TOKENS, TOLERANCE);
        // Split the user's message into individual log lines
        const logLines = userMessage.split('\n');
        // Define a regex pattern to identify lines containing client IP addresses
        const clientIpRegex = /\[Client (\d{1,3}\.){3}\d{1,3}\]/;
        // Filter the log lines to only include those with valid client IP addresses
        const filteredLogLines = logLines.filter(line => clientIpRegex.test(line));
        // If no valid IP addresses are found in the log lines, send a response indicating this
        if (filteredLogLines.length === 0) {
            console.log(`${getTimestamp()} [INFO] No valid client IP addresses found in the log.`);
            res.json({ message: "No valid client IP addresses found in the log." });
            return;
-        }
+          }
-        // Join the filtered log lines back into a single string for processing
+          log.debug(`Read line: ${line}`); // Debug log for each line read
-        const filteredMessage = filteredLogLines.join('\n');
+          const ipMatch = line.match(/\[Client (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]/); // Regex to extract client IP from the log line
          if (ipMatch) {
            const ip = ipMatch[1];
            const isIgnored = await isIgnoredIP(ip); // Check if the IP should be ignored
            if (isIgnored) {
              log.debug(`Ignored line with IP: ${ip}`); // Debug log for ignored IPs
              return;
            }
          }
-        // Send the request to the llama API for processing and response generation
+          // Add the line to the log buffer
-        console.log(`${getTimestamp()} [INFO] Sending request to llama API for response`);
+          logBuffer.push(line);
-        const response = await axios.post('http://127.0.0.1:8003/v1/chat/completions', {
+          // If buffer reaches the limit, send logs to the backend
-            model: 'gpt-3.5-turbo', // Specify the AI model to use
+          if (logBuffer.length >= LOG_BUFFER_LIMIT) {
-            messages: [...conversationHistory[ip], { role: 'user', content: filteredMessage }] // Include the conversation history and the filtered message
+            await sendLogsToBackend();
          }
        });
        // Extract the AI's response from the API's response data
        const assistantMessage = response.data.choices[0].message;
        // Add the AI's response to the conversation history
        conversationHistory[ip].push(assistantMessage);
-        // Log the AI's response and additional details like the finish reason and token usage
+        // Event listener for errors in Tail instance
-        console.log(`${getTimestamp()} [INFO] Received response from llama API: ${assistantMessage.content}`);
+        tail.on('error', (error) => {
-        console.log(`${getTimestamp()} [DEBUG] Finish Reason: ${response.data.choices[0].finish_reason}`);
+          log.error(`Tail error: ${error}`); // Log errors that occur while tailing the file
-        console.log(`${getTimestamp()} [STATS] Usage: prompt_tokens=${response.data.usage.prompt_tokens}, completion_tokens=${response.data.usage.completion_tokens}, total_tokens=${response.data.usage.total_tokens}`);
+        });
-        clearTimeout(timeout); // Clear the timeout to prevent it from triggering
+        tail.watch(); // Start watching the log file for new lines
-        res.json(assistantMessage); // Send the AI's response back to the client
+
-    } catch (error) {
+        log.debug(`Started tailing file: ${filePath}`); // Debug log indicating the file is being monitored
-        // If an error occurs during request processing, log the error and send a 500 response
+        logTails.push(tail); // Add the Tail instance to the list of active Tails
-        console.error(`${getTimestamp()} [ERROR] An error occurred while handling chat request`, error);
+      } catch (ex) {
-        clearTimeout(timeout); // Clear the timeout to prevent it from triggering
+        log.error(`Failed to tail file ${filePath}: ${ex}`); // Log any exceptions that occur while starting the Tail
-        res.status(500).json({ message: "An error occurred", error: error.message }); // Send an error response
+      }
-    } finally {
+    });
-        // Record the end time and calculate the total processing time in seconds
+  });
-        const endTime = Date.now();
+};
-        const processingTime = ((endTime - startTime) / 1000).toFixed(2); // Convert milliseconds to seconds
+
-        console.log(`${getTimestamp()} [STATS] Processing Time: ${processingTime} seconds`); // Log the processing time
+// Function to count the number of tokens in a message using llama-tokenizer-js
-        console.log(`${getTimestamp()} [INFO] Finished processing chat request for IP: ${ip}`);
+async function countLlamaTokens(messages) {
  const llamaTokenizer = await import('llama-tokenizer-js'); // Dynamically import the tokenizer module
  let totalTokens = 0; // Initialize token counter
  for (const message of messages) {
    if (message.role === 'user' || message.role === 'assistant') {
      const encodedTokens = llamaTokenizer.default.encode(message.content); // Encode message content to count tokens
      totalTokens += encodedTokens.length; // Accumulate the total number of tokens
    }
  }
  return totalTokens; // Return the total token count
 }
-// Route to handle incoming chat requests, trim the message content, and process the request
+// Function to trim conversation history to fit within token limits
-app.post('/api/v1/chat', async (req, res) => {
+async function trimConversationHistory(messages, maxLength, tolerance) {
-    // Trim the incoming message to fit within token limits
+  let tokenLength = await countLlamaTokens(messages); // Get the current token length
-    const messageContent = req.body.message; // Get the user's message from the request body
+  if (tokenLength > maxLength + tolerance) {
-    const encodedTokens = llamaTokenizer.encode(messageContent); // Tokenize the message to determine its length in tokens
+    const diff = tokenLength - (maxLength + tolerance); // Calculate how many tokens need to be removed
-    const MAX_MESSAGE_TOKENS = MAX_TOKENS - (await countLlamaTokens([{ role: 'system', content: prompt }])) - TOLERANCE; // Calculate the maximum allowed tokens for the user's message
+    let removedTokens = 0;
-    // If the message exceeds the allowed token limit, trim it to fit
+    // Iterate from the end of the array; note that this visits the newest messages first, not the oldest
-    let trimmedMessageContent = messageContent;
+    for (let i = messages.length - 1; i >= 0; i--) {
-    if (encodedTokens.length > MAX_MESSAGE_TOKENS) {
+      const message = messages[i];
-        trimmedMessageContent = llamaTokenizer.decode(encodedTokens.slice(0, MAX_MESSAGE_TOKENS)); // Truncate the message and decode it back to a string
+      const messageTokens = await countLlamaTokens([message]); // Count tokens in the current message
      if (removedTokens + messageTokens <= diff) {
        messages.splice(i, 1); // Remove the message if it helps reduce the token count sufficiently
        removedTokens += messageTokens;
        console.log(`${getTimestamp()} [CLEANUP] ${removedTokens} removed | After Resize: ${await countLlamaTokens(messages)}`);
      } else {
        const messagesToRemove = Math.floor(diff / messageTokens); // Determine how many messages need to be removed
        for (let j = 0; j < messagesToRemove; j++) {
          messages.splice(i, 1); // Remove the determined number of messages
          removedTokens += messageTokens;
        }
        break; // Exit the loop once enough tokens have been removed
      }
    }
  }
 }
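For comparison, the following is a minimal sketch of oldest-first trimming that always preserves the system prompt. It is not the code from this commit: `trimOldestFirst` is a hypothetical name, and `countTokens` is a stand-in that counts whitespace-separated words rather than real Llama tokens, so the numbers are only illustrative.

```javascript
// Stand-in token counter (assumption): whitespace-separated words, not real Llama tokens.
const countTokens = (messages) =>
  messages
    .filter((m) => m.role === 'user' || m.role === 'assistant')
    .reduce((sum, m) => sum + m.content.split(/\s+/).filter(Boolean).length, 0);

// Drop the oldest non-system message until the history fits within maxLength - tolerance.
const trimOldestFirst = (messages, maxLength, tolerance) => {
  while (countTokens(messages) > maxLength - tolerance) {
    const idx = messages.findIndex((m) => m.role !== 'system');
    if (idx === -1 || messages.length <= 1) break; // nothing left that can be removed
    messages.splice(idx, 1);
  }
  return messages;
};

const history = [
  { role: 'system', content: 'analyze logs' },
  { role: 'user', content: 'one two three four' },
  { role: 'assistant', content: 'five six' },
  { role: 'user', content: 'seven eight nine' },
];
trimOldestFirst(history, 6, 1);
console.log(history.map((m) => m.content)); // [ 'analyze logs', 'five six', 'seven eight nine' ]
```

Removing whole messages only helps when the history holds several of them; a single oversized message would need to be truncated at the token level instead.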
 // Function to send accumulated log buffer to the backend server
 const sendLogsToBackend = async () => {
  if (logBuffer.length === 0) {
    log.info('Log buffer is empty, skipping sending to backend'); // Log if there are no logs to send
    return;
  }
  if (isSendingLogs) {
    log.info('Log sending is already in progress, skipping...'); // Prevent concurrent log sending operations
    return;
  }
  isSendingLogs = true; // Set the flag to indicate logs are being sent
  log.info('Sending logs to backend...'); // Log the start of the log sending process
  try {
    const messages = [{ role: 'user', content: logBuffer.join('\n') }]; // Combine the log buffer into a single message
    await trimConversationHistory(messages, 2000, 100); // Trim the message if it exceeds token limits
    const response = await axios.post(BACKEND_URL, { message: messages.map(msg => msg.content).join('\n') }); // Send the logs to the backend
    // Check the response for any alerts, actions, or reports
    if (response.data.content.includes('ALERT') || response.data.content.includes('ACTION') || response.data.content.includes('REPORT')) {
      log.warn('ALERT detected in response'); // Log if an alert is detected
      const ips = extractIPsFromAlert(response.data.content); // Extract IP addresses from the alert message
      if (ips.length > 0) {
        const nonIgnoredIPs = [];
        for (const ip of ips) {
          if (await isIgnoredIP(ip)) {
            log.debug(`Skipping banning for ignored IP: ${ip}`); // Skip banning if the IP is in the ignored list
            continue;
          }
          log.info(`Detected IP for banning: ${ip}`); // Log the IP address that will be banned
          await banIP(ip); // Execute the ban command for the IP
          await delay(3000); // Add a 3-second delay between bans to avoid overloading the system
          nonIgnoredIPs.push(ip); // Keep track of banned IPs that are not ignored
        }
        await sendAlertToDiscord(response.data.content, nonIgnoredIPs); // Send the alert message to Discord
      } else {
        log.warn('No IPs detected for banning.'); // Log if no IPs were found for banning
        await sendAlertToDiscord(response.data.content, []); // Still send the alert to Discord, even without IPs
      }
    } else if (response.data.content.includes('GENERAL')) {
      await sendGeneralToDiscord(response.data.content); // Send general information to Discord if present
    } else {
      log.info('No alerts detected in response'); // Log if no significant alerts are found
      log.info(`Response:\n ${response.data.content}`); // Log the response content for review
    }
-    // Process the trimmed message and send the response
+    // Clear the log buffer after successful sending
-    await processRequest({ ...req, body: { message: trimmedMessageContent } }, res);
+    logBuffer = [];
-});
+    log.info('Log buffer cleared');
-// Route to fetch the conversation history for a specific IP address
+    // Reset the conversation history on the backend to start fresh
-app.get('/api/v1/conversation-history', (req, res) => {
+    await resetConversationHistory();
-    const ip = req.clientIp; // Get the client's IP address from the request object
+  } catch (error) {
-    console.log(`${getTimestamp()} [INFO] Fetching conversation history for IP: ${ip}`); // Log the request details
+    log.error(`Error sending logs to backend: ${error.message}`); // Log any errors that occur during the process
-    res.json(conversationHistory[ip]); // Send the conversation history for the IP as a JSON response
+  } finally {
-});
+    isSendingLogs = false; // Reset the flag to allow new log sending operations
  }
 };
-// Route to restart the core AI service via Docker, typically used to refresh the model or resolve issues
+// Function to extract IP addresses from the alert message using a regular expression
-app.post('/api/v1/restart-core', (req, res) => {
+const extractIPsFromAlert = (message) => {
-    console.log(`${getTimestamp()} [INFO] Restarting core service`); // Log the restart action
+  const ipPattern = /\|\|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\|\|/g; // Regex to find IPs wrapped in ||...||
-    cmd(`docker restart llama-gpu-server`).then(out => { // Execute a shell command to restart the Docker container running the AI model
+  let matches;
-        console.log(`${getTimestamp()} [INFO] Core service restarted`); // Log the successful restart
+  const uniqueIPs = new Set(); // Use a Set to store unique IP addresses
-        res.json(out.stdout); // Send the output of the restart command back to the client
+
-    }).catch(err => { // Handle any errors that occur during the restart
+  while ((matches = ipPattern.exec(message)) !== null) {
-        console.error(`${getTimestamp()} [ERROR] Failed to restart core service`, err); // Log the error
+    uniqueIPs.add(matches[1]); // Add each found IP to the Set
-        res.status(500).json({ message: "An error occurred while restarting the core service", error: err.message }); // Send an error response
+  }
  return Array.from(uniqueIPs); // Convert the Set to an array of unique IPs
 };
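The pattern above accepts any one-to-three-digit groups, so a string such as `||999.1.1.1||` would also be captured. A possible variant that additionally checks each octet is sketched below; `extractValidIPs` is a hypothetical name used only for this example, and the sample message is made up.

```javascript
// Hypothetical variant: extract ||ip|| spoilers and keep only addresses whose octets are <= 255.
const extractValidIPs = (message) => {
  const ipPattern = /\|\|(\d{1,3}(?:\.\d{1,3}){3})\|\|/g;
  const unique = new Set(); // a Set removes duplicate addresses
  for (const match of message.matchAll(ipPattern)) {
    const octets = match[1].split('.').map(Number);
    if (octets.every((o) => o <= 255)) unique.add(match[1]);
  }
  return [...unique];
};

const sample = '[ALERT] Scanner ||203.0.113.9|| seen again ||203.0.113.9|| and bogus ||999.1.1.1||';
console.log(extractValidIPs(sample)); // [ '203.0.113.9' ]
```

Validating before banning matters here because the extracted value is later passed to a shell command.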
 // Function to ban an IP address using a shell command
 const banIP = (ip) => {
  return new Promise((resolve, reject) => {
    log.info(`Banning IP address: ${ip}`); // Log the IP address being banned
    exec(`/usr/bin/banIP ${ip}`, (error, stdout, stderr) => {
      if (error) {
        log.error(`Error banning IP address: ${error.message}`); // Log any errors that occur during the ban
        reject(error); // Reject the promise if an error occurs
        return;
      }
      if (stderr) {
        log.warn(`stderr: ${stderr}`); // Log any warnings or errors from the command's stderr
      }
      log.success(`IP address ${ip} has been banned`); // Log a success message if the ban was successful
      resolve(); // Resolve the promise to indicate success
    });
-});
+  });
 };
-// Route to reset the conversation history for a specific IP address, effectively starting a new session
+// Function to send alert messages to a Discord channel via webhook
-app.post('/api/v1/reset-conversation', (req, res) => {
+const sendAlertToDiscord = async (alertMessage, ips) => {
-    const ip = req.clientIp; // Get the client's IP address from the request object
+  log.info('Sending alert to Discord...'); // Log the start of the Discord alert sending process
-    console.log(`${getTimestamp()} [INFO] Resetting conversation history for IP: ${ip}`); // Log the reset action
+  try {
    await axios.post(DISCORD_WEBHOOK_URL, {
      embeds: [
        {
          title: 'Alert Detected', // Title for the Discord embed
          description: alertMessage, // The alert message content
          color: 15158332, // Red color for alerts
          fields: ips.filter(ip => !ignoredIPs.includes(ip)).map(ip => ({
            name: 'Banned IP', // Field name in the Discord embed
            value: ip, // The banned IP address
            inline: true // Display fields inline for better readability
          })),
          timestamp: new Date() // Timestamp for when the alert was sent
        }
      ]
    });
    log.success('Alert sent to Discord'); // Log a success message if the alert was successfully sent
  } catch (error) {
    log.error(`Error sending alert to Discord: ${error.message}`); // Log any errors that occur during the Discord alert sending process
  }
 };
-    // Reset the conversation history to its initial state for the given IP
+// Function to send general information messages to a Discord channel via webhook
-    conversationHistory[ip] = [
+const sendGeneralToDiscord = async (generalMessage) => {
-        { role: 'system', content: prompt }
+  log.info('Sending general information to Discord...'); // Log the start of the general information sending process
-    ];
+  try {
-    console.log(`${getTimestamp()} [INFO] Conversation history reset for IP: ${ip}`); // Log the successful reset
+    await axios.post(DISCORD_WEBHOOK_URL, {
-    res.json({ message: "Conversation history reset for IP: " + ip }); // Send a confirmation message back to the client
+      embeds: [
-});
+        {
          title: 'General Information', // Title for the Discord embed
          description: generalMessage, // The general information content
          color: 3066993, // Green color (0x2ECC71) for general information
          timestamp: new Date() // Timestamp for when the information was sent
        }
      ]
    });
    log.success('General information sent to Discord'); // Log a success message if the general information was successfully sent
  } catch (error) {
    log.error(`Error sending general information to Discord: ${error.message}`); // Log any errors that occur during the Discord general information sending process
  }
 };
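Discord embed colors are plain decimal integers that encode an RGB value. A small sketch for converting between the two forms, useful when picking or checking a color; `toHex` and `fromHex` are hypothetical helper names, not part of any Discord library.

```javascript
// Hypothetical helpers: convert between Discord's decimal embed color and a #rrggbb string.
const toHex = (decimal) => '#' + decimal.toString(16).padStart(6, '0');
const fromHex = (hex) => parseInt(hex.replace('#', ''), 16);

console.log(toHex(15158332)); // #e74c3c (red)
console.log(toHex(3066993)); // #2ecc71 (green)
console.log(fromHex('#3498db')); // 3447003 (a blue)
```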
-// Start the Express server on the defined port, making the API available for requests
+// Function to reset the conversation history in the backend
-app.listen(port, () => {
+const resetConversationHistory = async () => {
-    console.log(`${getTimestamp()} [INFO] Server running at http://localhost:${port}`); // Log the server startup and its URL
+  log.info('Resetting conversation history...'); // Log the start of the conversation history reset process
-});
+  try {
    await axios.post(`${BACKEND_URL.replace('/api/v1/chat', '')}/api/v1/reset-conversation`); // Send a request to reset the conversation history
    log.success('Conversation history reset'); // Log a success message if the reset was successful
  } catch (error) {
    log.error(`Error resetting conversation history: ${error.message}`); // Log any errors that occur during the conversation history reset process
  }
 };
 // Utility function to introduce a delay between operations, useful for rate limiting
 const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms)); // Create a promise that resolves after the specified delay
 // Start reading logs continuously from the specified directory
 readLogs();
 // Flush the log buffer to the backend every TIME_LIMIT milliseconds (10 minutes); the LOG_BUFFER_LIMIT flush is triggered separately in the tail 'line' handler
 setInterval(sendLogsToBackend, TIME_LIMIT);
--- a/ai_log_backend.js
+++ b/ai_log_backend.js
@@ -0,0 +1,279 @@
 // Import necessary modules for the application
 import express from 'express'; // Express framework for building web server applications and handling HTTP requests and responses
 import axios from 'axios'; // Axios is used to make HTTP requests to external APIs or services
 import bodyParser from 'body-parser'; // Middleware for parsing incoming request bodies, specifically for handling JSON data
 import cmd from 'cmd-promise'; // A module that allows execution of shell commands in a promise-based manner, making it easier to manage async operations
 import cors from 'cors'; // Middleware to enable Cross-Origin Resource Sharing, allowing resources to be requested from another domain
 import cheerio from 'cheerio'; // Cheerio is a server-side jQuery-like library for parsing and manipulating HTML content
 import 'dotenv/config'; // Loads environment variables from a .env file into process.env, allowing secure storage of sensitive information
 import llamaTokenizer from 'llama-tokenizer-js'; // A library for tokenizing text, which is crucial for managing the length of text inputs to the AI model
 // Define a prompt that will guide the AI's behavior when analyzing NGINX logs for potential security issues
 const prompt = `
 You are a security AI responsible for analyzing web traffic from NGINX logs and blocking malicious IPs. Your task is to review the logs for potential attacks and issues. If you identify a verified problem, include [ALERT] followed by a detailed description of the issue in your response. Ensure your message is formatted in Markdown compatible with Discord.
 **Guidelines:**
 - Domains on the server: shells.lol, hehe.rest, dcord.rest, nodejs.lol, dht.rest, my-mc.link, ssh.surf, tcp.quest
 - Use emojis to enhance communication.
 - Do not report IP addresses for scraping or crawling.
 - Ignore IPs: x.x.x.x, x.x.x.x, x.x.x.x, x.x.x.x. Do not mention these in reports.
 - Ignore IP addresses with BOGONs such as 192.168.0.1 or 127.0.0.2, etc.
 - Avoid alerting for false positives or irregular activity.
 - If there are no alerts but you have interesting findings, write: GENERAL followed by your insights in Markdown.
 - Only send GENERAL messages for noteworthy events, not for routine traffic reports.
 - In a GENERAL message, feel free to provide a long explainer on your deductions.
 - Be decisive. If an IP is being malicious, block it. Do not monitor IPs for further traffic.
 - Do not ban an IP address without a corresponding log entry, provide this in your response.
 - Block all bot IPs and information scanners except Google.
 - Provide proof and reasoning for each ban.
 - DO NOT BAN AN IP THAT IS NOT IN A LOG EVER! YOU MUST SEE THE IP ADDRESS!
 - To ban an IP or flag it as a security risk, wrap it in a Markdown spoiler: ||IPHERE||
 `;
 // Initialize the Express application and define the port on which the server will run
 const app = express(); // Create an instance of an Express application
 const port = 3001; // Define the port number for the server, 3001 is commonly used for development
 // Middleware to enable CORS for all routes
 app.use(cors()); // This allows the server to accept requests from any origin, useful for APIs that may be accessed by web applications from different domains
 // Set a larger limit for the request body to handle large data payloads
 app.use(bodyParser.json({ limit: '50mb' })); // The JSON body parser is configured with a 50MB limit, suitable for handling large JSON payloads
 // Define constants for the application, used to control various aspects of the server's behavior
 const TIMEOUT_DURATION = 100000; // The maximum time (in milliseconds) the server will wait before timing out a request, set to 100 seconds
 const MAX_TOKENS = 8000; // The maximum number of tokens (subword units produced by the tokenizer, not whole words) allowed in a conversation; this limit helps manage API usage
 const TOLERANCE = 100; // A buffer value used to prevent exceeding the MAX_TOKENS limit, ensuring the conversation stays within safe bounds
 let conversationHistory = {}; // An object to store conversation history for each IP address, allowing the server to maintain context for each user
 // Helper function to get the current timestamp in a formatted string
 const getTimestamp = () => {
    const now = new Date(); // Get the current date and time
    const date = now.toLocaleDateString('en-US'); // Format the date in the US locale
    const time = now.toLocaleTimeString('en-US'); // Format the time in the US locale
    return `${date} [${time}]`; // Return the formatted date and time as a string
 };
 // Middleware to track conversation history based on the client's IP address
 app.use((req, res, next) => {
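    // Extract the client's IP address from proxy headers (CF-Connecting-IP, X-Forwarded-For, X-Real-IP) or fall back to req.ip
    // X-Forwarded-For can hold a comma-separated chain of addresses, so only the first (client) entry is used
    const forwardedFor = (req.headers['x-forwarded-for'] || '').split(',')[0].trim();
    const ip = req.headers['cf-connecting-ip'] || forwardedFor || req.headers['x-real-ip'] || req.ip;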
    req.clientIp = ip; // Store the client's IP address in the request object for easy access later
    // Log the incoming request along with the client's IP address and current timestamp
    console.log(`${getTimestamp()} [INFO] Incoming request from IP: ${req.clientIp}`);
    // If this IP address has not been seen before, initialize a new conversation history for it
    if (!conversationHistory[req.clientIp]) {
        console.log(`${getTimestamp()} [INFO] Initializing conversation history for new IP: ${req.clientIp}`);
        // Start the conversation with the predefined prompt that instructs the AI on how to analyze the logs
        conversationHistory[req.clientIp] = [
            { role: 'system', content: prompt }
        ];
    }
    next(); // Move on to the next middleware or route handler
 });
 // Function to count the number of tokens in a conversation history using the llama tokenizer
 async function countLlamaTokens(messages) {
    let totalTokens = 0; // Initialize a counter for the total number of tokens
    for (const message of messages) {
        // Only user and assistant messages are counted; the system prompt is excluded, so its tokens must be budgeted separately
        if (message.role === 'user' || message.role === 'assistant') {
            const encodedTokens = llamaTokenizer.encode(message.content); // Tokenize the message content
            totalTokens += encodedTokens.length; // Add the number of tokens in the current message to the total
        }
    }
    return totalTokens; // Return the total number of tokens
 }
 // Function to trim the conversation history to fit within the token limit
 async function trimConversationHistory(messages, maxLength, tolerance) {
    let tokenLength = await countLlamaTokens(messages); // Get the total number of tokens in the conversation
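    // Drop the oldest messages until the user/assistant token count is at or below maxLength - tolerance
    // (7900 tokens with the defaults above); the system prompt at index 0 is always kept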
    while (tokenLength > maxLength - tolerance && messages.length > 1) {
        messages.splice(1, 1); // Remove the oldest user/assistant message (the second item in the array)
        tokenLength = await countLlamaTokens(messages); // Recalculate the total number of tokens after trimming
        console.log(`${getTimestamp()} [CLEANUP] Trimmed conversation history to ${tokenLength} tokens.`);
    }
 }
 // Function to scrape a web page and extract its content
 async function scrapeWebPage(url) {
    console.log(`${getTimestamp()} [INFO] Starting to scrape URL: ${url}`);
    try {
        // Perform an HTTP GET request to fetch the content of the specified URL
        const res = await axios.get(url);
        const html = res.data; // Extract the HTML content from the response
        const $ = cheerio.load(html); // Load the HTML into Cheerio for parsing and manipulation
        // Extract specific elements from the HTML: the page title, meta description, and body content
        const pageTitle = $('head title').text().trim(); // Get the text of the <title> tag
        const pageDescription = $('head meta[name="description"]').attr('content'); // Get the content of the meta description
        const pageContent = $('body').text().trim(); // Get all text content within the <body> tag
        // Construct a response message with the extracted details
        let response = `Title: ${pageTitle}\n`; // Start with the page title
        if (pageDescription) {
            response += `Description: ${pageDescription}\n`; // Add the meta description if it exists
        }
        if (pageContent) {
            const MAX_CONTENT_LENGTH = Number(process.env.MAX_CONTENT_LENGTH) || 2000; // Maximum content length in characters; environment values are strings, so convert to a number
            // Clean the page content to remove unnecessary whitespace and special characters
            let plainTextContent = $('<div>').html(pageContent).text().trim().replace(/[\r\n\t]+/g, ' ');
            // Define a regular expression pattern to identify code-like content
            const codePattern = /\/\/|\/\*|\*\/|\{|\}|\[|\]|\bfunction\b|\bclass\b|\b0x[0-9A-Fa-f]+\b|\b0b[01]+\b/;
            const isCode = codePattern.test(plainTextContent); // Check if the content resembles code
            if (isCode) {
                // Use a global copy of the pattern so that every match is removed, not just the first one
                plainTextContent = plainTextContent.replace(new RegExp(codePattern.source, 'g'), '');
            }
            // Further clean the content by removing text within parentheses
            plainTextContent = plainTextContent.replace(/ *\([^)]*\) */g, '');
            // If the content is too long, truncate it and add an ellipsis
            if (plainTextContent.length > MAX_CONTENT_LENGTH) {
                plainTextContent = plainTextContent.substring(0, MAX_CONTENT_LENGTH) + '...';
            }
            response += `Content: ${plainTextContent.trim()}`; // Add the cleaned and possibly truncated content to the response
        }
        response += `\nURL: ${url}`; // Include the original URL in the response
        console.log(`${getTimestamp()} [INFO] Successfully scraped URL: ${url}`);
        return response; // Return the constructed response
    } catch (err) {
        // If the scraping process fails, log an error with details and return null
        console.error(`${getTimestamp()} [ERROR] Failed to scrape URL: ${url}`, err);
        return null;
    }
 }
 // Function to process incoming requests, handle AI interactions, and return a response
 async function processRequest(req, res) {
    const startTime = Date.now(); // Record the start time of the request processing for performance tracking
    const ip = req.clientIp; // Retrieve the client's IP address from the request object
    console.log(`${getTimestamp()} [INFO] Handling chat request from IP: ${ip}`); // Log the request details
    // Set a timeout for the request processing, ensuring it doesn't hang indefinitely
    const timeout = setTimeout(() => {
        console.error(`${getTimestamp()} [ERROR] Request timed out for IP: ${ip}`);
        if (!res.headersSent) {
            res.status(408).json({ message: "Request timed out" }); // Send a timeout response only if nothing has been sent yet
        }
    }, TIMEOUT_DURATION);
    try {
        let userMessage = req.body.message; // Extract the user's message from the request body
        console.log(`${getTimestamp()} [INFO] Received user message: ${userMessage}`);
        userMessage = req.body.message + `\nDate/Time:${getTimestamp()}`; // Append the current date and time to the user's message
        // Initialize conversation history if it doesn't exist for the IP
        if (!conversationHistory[ip]) {
            console.log(`${getTimestamp()} [INFO] Initializing conversation history for new IP: ${ip}`);
            conversationHistory[ip] = [{ role: 'system', content: prompt }]; // Start the conversation with the predefined prompt
        }
        // Add the user's message to the conversation history for the IP
        conversationHistory[ip].push({ role: 'user', content: userMessage });
        // Trim the conversation history if it exceeds the token limit
        await trimConversationHistory(conversationHistory[ip], MAX_TOKENS, TOLERANCE);
        // Split the user's message into individual log lines
        const logLines = userMessage.split('\n');
        // Define a regex pattern to identify lines containing client IP addresses
        const clientIpRegex = /\[Client (\d{1,3}\.){3}\d{1,3}\]/;
        // Filter the log lines to only include those with valid client IP addresses
        const filteredLogLines = logLines.filter(line => clientIpRegex.test(line));
        // If no valid IP addresses are found in the log lines, send a response indicating this
        if (filteredLogLines.length === 0) {
            console.log(`${getTimestamp()} [INFO] No valid client IP addresses found in the log.`);
            clearTimeout(timeout); // Clear the timeout so it cannot fire after this response has been sent
            res.json({ message: "No valid client IP addresses found in the log." });
            return;
        }
        // Join the filtered log lines back into a single string for processing
        const filteredMessage = filteredLogLines.join('\n');
        // Send the request to the llama API for processing and response generation
        console.log(`${getTimestamp()} [INFO] Sending request to llama API for response`);
        const response = await axios.post('http://127.0.0.1:8003/v1/chat/completions', {
            model: 'gpt-3.5-turbo', // Model name sent to the OpenAI-compatible chat completions endpoint
            messages: [...conversationHistory[ip], { role: 'user', content: filteredMessage }] // The history already ends with the full user message; the filtered log lines follow as a second user turn
        });
        // Extract the AI's response from the API's response data
        const assistantMessage = response.data.choices[0].message;
        // Add the AI's response to the conversation history
        conversationHistory[ip].push(assistantMessage);
        // Log the AI's response and additional details like the finish reason and token usage
        console.log(`${getTimestamp()} [INFO] Received response from llama API: ${assistantMessage.content}`);
        console.log(`${getTimestamp()} [DEBUG] Finish Reason: ${response.data.choices[0].finish_reason}`);
        console.log(`${getTimestamp()} [STATS] Usage: prompt_tokens=${response.data.usage.prompt_tokens}, completion_tokens=${response.data.usage.completion_tokens}, total_tokens=${response.data.usage.total_tokens}`);
        clearTimeout(timeout); // Clear the timeout to prevent it from triggering
        if (!res.headersSent) {
            res.json(assistantMessage); // Send the AI's response back to the client, unless the timeout handler already replied
        }
    } catch (error) {
        // If an error occurs during request processing, log the error and send a 500 response
        console.error(`${getTimestamp()} [ERROR] An error occurred while handling chat request`, error);
        clearTimeout(timeout); // Clear the timeout to prevent it from triggering
        if (!res.headersSent) {
            res.status(500).json({ message: "An error occurred", error: error.message }); // Send an error response, unless a response has already gone out
        }
    } finally {
        // Record the end time and calculate the total processing time in seconds
        const endTime = Date.now();
        const processingTime = ((endTime - startTime) / 1000).toFixed(2); // Convert milliseconds to seconds
        console.log(`${getTimestamp()} [STATS] Processing Time: ${processingTime} seconds`); // Log the processing time
        console.log(`${getTimestamp()} [INFO] Finished processing chat request for IP: ${ip}`);
    }
 }
 // Route to handle incoming chat requests, trim the message content, and process the request
 app.post('/api/v1/chat', async (req, res) => {
    // Trim the incoming message to fit within token limits
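    const messageContent = req.body.message; // Get the user's message from the request body
    if (typeof messageContent !== 'string') {
        // Reject malformed bodies up front; the tokenizer expects a string
        return res.status(400).json({ message: "Request body must include a string 'message' field" });
    }
    const encodedTokens = llamaTokenizer.encode(messageContent); // Tokenize the message to determine its length in tokens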
    // countLlamaTokens skips system messages, so tokenize the system prompt directly when reserving room for it
    const MAX_MESSAGE_TOKENS = MAX_TOKENS - llamaTokenizer.encode(prompt).length - TOLERANCE; // Calculate the maximum allowed tokens for the user's message
    // If the message exceeds the allowed token limit, trim it to fit
    let trimmedMessageContent = messageContent;
    if (encodedTokens.length > MAX_MESSAGE_TOKENS) {
        trimmedMessageContent = llamaTokenizer.decode(encodedTokens.slice(0, MAX_MESSAGE_TOKENS)); // Truncate the message and decode it back to a string
    }
    // Process the trimmed message and send the response
    await processRequest({ ...req, body: { message: trimmedMessageContent } }, res);
 });
 // Route to fetch the conversation history for a specific IP address
 app.get('/api/v1/conversation-history', (req, res) => {
    const ip = req.clientIp; // Get the client's IP address from the request object
    console.log(`${getTimestamp()} [INFO] Fetching conversation history for IP: ${ip}`); // Log the request details
    res.json(conversationHistory[ip]); // Send the conversation history for the IP as a JSON response
 });
 // Route to restart the core AI service via Docker, typically used to refresh the model or resolve issues
 app.post('/api/v1/restart-core', (req, res) => {
    console.log(`${getTimestamp()} [INFO] Restarting core service`); // Log the restart action
    cmd(`docker restart llama-gpu-server`).then(out => { // Execute a shell command to restart the Docker container running the AI model
        console.log(`${getTimestamp()} [INFO] Core service restarted`); // Log the successful restart
        res.json(out.stdout); // Send the output of the restart command back to the client
    }).catch(err => { // Handle any errors that occur during the restart
        console.error(`${getTimestamp()} [ERROR] Failed to restart core service`, err); // Log the error
        res.status(500).json({ message: "An error occurred while restarting the core service", error: err.message }); // Send an error response
    });
 });
 // Route to reset the conversation history for a specific IP address, effectively starting a new session
 app.post('/api/v1/reset-conversation', (req, res) => {
    const ip = req.clientIp; // Get the client's IP address from the request object
    console.log(`${getTimestamp()} [INFO] Resetting conversation history for IP: ${ip}`); // Log the reset action
    // Reset the conversation history to its initial state for the given IP
    conversationHistory[ip] = [
        { role: 'system', content: prompt }
    ];
    console.log(`${getTimestamp()} [INFO] Conversation history reset for IP: ${ip}`); // Log the successful reset
    res.json({ message: "Conversation history reset for IP: " + ip }); // Send a confirmation message back to the client
 });
 // Start the Express server on the defined port, making the API available for requests
 app.listen(port, () => {
    console.log(`${getTimestamp()} [INFO] Server running at http://localhost:${port}`); // Log the server startup and its URL
 });
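 // --- Illustrative sketch: calling the chat endpoint from a client ---
 // Not part of the server; it only shows the request shape the /api/v1/chat route expects.
 // Assumptions: Node 18+ (global fetch), the server above reachable at http://localhost:3001,
 // and log lines that carry a "[Client <ip>]" tag, which is what clientIpRegex requires.
 // buildChatRequest and sendLogBatch are hypothetical names.
 function buildChatRequest(logLines, baseUrl = 'http://localhost:3001') {
    return {
        url: `${baseUrl}/api/v1/chat`,
        options: {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ message: logLines.join('\n') }) // The route reads req.body.message
        }
    };
 }
 async function sendLogBatch(logLines, baseUrl) {
    const { url, options } = buildChatRequest(logLines, baseUrl);
    const res = await fetch(url, options);
    return res.json(); // The assistant message ({ role, content }) on success, otherwise an object with a "message" field
 }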