Josh Fraser 2020-02-24 18:06:29 -08:00
commit 516599ce9e
6 changed files with 99 additions and 35 deletions


@@ -1,40 +1,46 @@
![origin_github_banner](https://user-images.githubusercontent.com/673455/37314301-f8db9a90-2618-11e8-8fee-b44f38febf38.png) ![origin_github_banner](https://user-images.githubusercontent.com/673455/37314301-f8db9a90-2618-11e8-8fee-b44f38febf38.png)
Head to https://www.originprotocol.com/developers to learn more about what we're building and how to get involved. Head to https://www.originprotocol.com/developers to learn more about what we're building and how to get involved.
# Telegram Bot # Telegram Bot
- Deletes messages matching specified patterns - Deletes messages matching specified patterns
- Bans users for posting messagses matching specified patterns - Bans users for posting messages matching specified patterns
- Bans users with usernames matching specified patterns - Bans users with usernames matching specified patterns
- Records logs of converstations - Records logs of conversations
- Logs an English translation of foreign-language messages using Google Translate
- Uses TextBlob for basic sentiment analysis (polarity and subjectivity)
## Installation ## Installation
- Required: Python 3.x, pip, PostgreSQL - Required: Python 3.x, pip, PostgreSQL
- Create virtualenv - Create virtualenv
- Clone this repo - Clone this repo
- `pip install --upgrade -r requirements.txt` - `pip install --upgrade -r requirements.txt`
## Database setup ## Database setup
- Store database URL in environment variable.
``` - Store database URL in environment variable.
export TELEGRAM_BOT_POSTGRES_URL="postgresql://<user>:<password>@localhost:5432/<databasename>"
``` ```
- Run: `python model.py` to set up the DB tables. export TELEGRAM_BOT_POSTGRES_URL="postgresql://<user>:<password>@localhost:5432/<databasename>"
```
- Run: `python model.py` to set up the DB tables.
## Setup ## Setup
- Create a Telegram bot by talking to `@BotFather` : https://core.telegram.org/bots#creating-a-new-bot - Create a Telegram bot by talking to `@BotFather` : https://core.telegram.org/bots#creating-a-new-bot
- Use `/setprivacy` with `@BotFather` in order to allow it to see all messages in a group. - Use `/setprivacy` with `@BotFather` in order to allow it to see all messages in a group.
- Store your Telegram Bot Token in environment variable `TELEGRAM_BOT_TOKEN`. It will look similar to this: - Store your Telegram Bot Token in environment variable `TELEGRAM_BOT_TOKEN`. It will look similar to this:
``` ```
export TELEGRAM_BOT_TOKEN="4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya" export TELEGRAM_BOT_TOKEN="4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya"
``` ```
- Create your Telegram group.
- Add your bot to the group like so: https://stackoverflow.com/questions/37338101/how-to-add-a-bot-to-a-telegram-group - Create your Telegram group.
- Make your bot an admin in the group - Add your bot to the group like so: https://stackoverflow.com/questions/37338101/how-to-add-a-bot-to-a-telegram-group
- Make your bot an admin in the group
## Configuration with ENV vars ## Configuration with ENV vars
@@ -44,11 +50,28 @@ Head to https://www.originprotocol.com/developers to learn more about what we're
- `CHAT_IDS` : **REQUIRED**. Comma-separated list of IDs of chat(s) that should be monitored. To find out the ID of a chat, add the bot to a chat and type some messages there. The bot log will report an error that it got messages `from chat_id not being monitored: XXX` where XXX is the chat ID. e.g. `-240532994,-150531679` - `CHAT_IDS` : **REQUIRED**. Comma-separated list of IDs of chat(s) that should be monitored. To find out the ID of a chat, add the bot to a chat and type some messages there. The bot log will report an error that it got messages `from chat_id not being monitored: XXX` where XXX is the chat ID. e.g. `-240532994,-150531679`
- `TELEGRAM_BOT_TOKEN` : **REQUIRED**. Token for bot to control. e.g. `4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya` - `TELEGRAM_BOT_TOKEN` : **REQUIRED**. Token for bot to control. e.g. `4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya`
- `TELEGRAM_BOT_POSTGRES_URL` : **REQUIRED**. URI for postgres instance to log activity to. e.g. `postgresql://localhost/postgres` - `TELEGRAM_BOT_POSTGRES_URL` : **REQUIRED**. URI for postgres instance to log activity to. e.g. `postgresql://localhost/postgres`
- `DEBUG` : If set to anything except `false`, will put bot into debug mode. This means that all actions will be logged into the chat itself, and more things will be logged. - `DEBUG` : If set to anything except `false`, will put bot into debug mode. This means that all actions will be logged into the chat itself, and more things will be logged.
- `ADMIN_EXEMPT` : If set to anything except `false`, admin users will be exempt from monitoring. Recommended to be set, but useful to turn off for debugging. - `ADMIN_EXEMPT` : If set to anything except `false`, admin users will be exempt from monitoring. Recommended to be set, but useful to turn off for debugging.
- `NOTIFY_CHAT` : ID of chat to report actions. Can be useful if you have an admin-only chat where you want to monitor the bot's activity. E.g. `-140532994` - `NOTIFY_CHAT` : ID of chat to report actions. Can be useful if you have an admin-only chat where you want to monitor the bot's activity. E.g. `-140532994`
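The flag semantics described above ("anything except `false`") and the comma-separated `CHAT_IDS` format can be sketched as follows. This is a minimal illustration, not the code in `bot.py`: the helper names `env_flag` and `parse_chat_ids` are hypothetical, and converting chat IDs to integers is an assumption.

```python
import os


def env_flag(name, environ=os.environ):
    """Return True when `name` is set to anything except "false".

    The comparison is case-insensitive, mirroring the `.lower() != "false"`
    check in this commit. An unset variable counts as off.
    """
    value = environ.get(name)
    return value is not None and value.lower() != "false"


def parse_chat_ids(raw):
    """Split a comma-separated CHAT_IDS string into integer chat IDs.

    Assumption: IDs are plain integers such as -240532994. Empty entries
    (for example from a trailing comma) are skipped.
    """
    return [int(part) for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print("debug:", env_flag("DEBUG"))
    print("chat ids:", parse_chat_ids(os.environ.get("CHAT_IDS", "")))
```

Passing a plain dict as `environ` makes the helpers easy to exercise without touching the real environment.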
## Download the corpus for Textblob
For sentiment analysis to work, you'll need to download the latest corpus file for textblob. You can do this by running:
```
python -m textblob.download_corpora
```
If you're running the bot on Heroku, set an environment variable named `NLTK_DATA` to `/app/nltk_data` by running:
```
heroku config:set NLTK_DATA='/app/nltk_data'
```
## Message ban patterns
Sample bash file to set `MESSAGE_BAN_PATTERNS`: Sample bash file to set `MESSAGE_BAN_PATTERNS`:
``` ```
read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF' read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
# ETH Address # ETH Address
@@ -60,15 +83,17 @@ read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
EOF EOF
``` ```
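One way such a pattern list could be consumed is sketched below. It assumes one regular expression per line, with blank lines and lines starting with `#` treated as comments, as the sample suggests. The function names and the ETH-address regex in the usage example are illustrative assumptions; the actual parsing in `bot.py` is not shown in this diff and may differ.

```python
import re


def compile_ban_patterns(raw):
    """Compile a newline-separated pattern list into regex objects.

    Assumptions: one regex per line; blank lines and lines beginning
    with "#" are comments and are skipped.
    """
    patterns = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def matches_any(patterns, text):
    """Return True if any compiled pattern is found anywhere in `text`."""
    return any(pattern.search(text) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical pattern, for illustration only.
    sample = "# ETH Address\n0x[a-fA-F0-9]{40}\n"
    compiled = compile_ban_patterns(sample)
    print(matches_any(compiled, "send funds to 0x" + "ab" * 20))
```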
## Attachements ## Attachments
By default, any attachments other than images or animations will cause the message to be hidden. By default, any attachments other than images or animations will cause the message to be hidden.
## Running ## Running
### Locally ### Locally
- Run: `python bot.py` to start logger
- Messages will be displayed on `stdout` as they are logged. - Run: `python bot.py` to start logger
- Messages will be displayed on `stdout` as they are logged.
### On Heroku ### On Heroku
- You must enable the worker on Heroku app dashboard. (By default it is off.)
- You must enable the worker on Heroku app dashboard. (By default it is off.)


@@ -0,0 +1,19 @@
#!/usr/bin/env bash
source "$BIN_DIR/utils"
echo "-----> Starting corpora installation"
# Assumes NLTK_DATA environment variable is already set
# $ heroku config:set NLTK_DATA='/app/nltk_data'
# Install the default corpora to NLTK_DATA directory
python -m textblob.download_corpora
# Open the NLTK_DATA directory
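# Abort if NLTK_DATA is unset or empty, so the delete below never runs in the wrong directory
cd "${NLTK_DATA:?NLTK_DATA must be set}" || exit 1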
# Delete all of the zip files in the NLTK DATA directory
find . -name "*.zip" -type f -delete
echo "-----> Finished corpora installation"
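
For reference, the cleanup step of the script above could also be written in Python, with an explicit guard so that an unset `NLTK_DATA` never results in deleting archives from some other directory. This is a sketch under that assumption; `remove_zip_archives` is a hypothetical helper and is not part of this commit.

```python
import os
from pathlib import Path


def remove_zip_archives(data_dir):
    """Delete every regular *.zip file under `data_dir`, recursively.

    Returns the list of removed paths. Raises ValueError when `data_dir`
    is empty or is not an existing directory.
    """
    if not data_dir or not Path(data_dir).is_dir():
        raise ValueError("data_dir must be an existing directory")
    removed = []
    # Materialize the matches first so deletion does not disturb the walk.
    for archive in list(Path(data_dir).rglob("*.zip")):
        if archive.is_file() and not archive.is_symlink():
            archive.unlink()
            removed.append(archive)
    return removed


if __name__ == "__main__":
    target = os.environ.get("NLTK_DATA", "")
    if target:
        print("removed", len(remove_zip_archives(target)), "archives")
    else:
        print("NLTK_DATA is not set; nothing to do")
```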

9
bin/post_compile Normal file

@@ -0,0 +1,9 @@
#!/usr/bin/env bash
if [ -f bin/install_textblob_corpora ]; then
echo "-----> Running install_textblob_corpora"
chmod +x bin/install_textblob_corpora
bin/install_textblob_corpora
fi
echo "-----> Post-compile done"

17
bot.py

@@ -19,6 +19,7 @@ import re
import unidecode import unidecode
from mwt import MWT from mwt import MWT
from googletrans import Translator from googletrans import Translator
from textblob import TextBlob
class TelegramMonitorBot: class TelegramMonitorBot:
@@ -26,12 +27,12 @@ class TelegramMonitorBot:
def __init__(self): def __init__(self):
self.debug = ( self.debug = (
(os.environ.get('DEBUG') is not None) and (os.environ.get('DEBUG') is not None) and
(os.environ.get('DEBUG').upper() != "false")) (os.environ.get('DEBUG').lower() != "false"))
# Are admins exempt from having messages checked? # Are admins exempt from having messages checked?
self.admin_exempt = ( self.admin_exempt = (
(os.environ.get('ADMIN_EXEMPT') is not None) and (os.environ.get('ADMIN_EXEMPT') is not None) and
(os.environ.get('ADMIN_EXEMPT').upper() != "false")) (os.environ.get('ADMIN_EXEMPT').lower() != "false"))
if (self.debug): if (self.debug):
print("🔵 debug:", self.debug) print("🔵 debug:", self.debug)
@@ -304,20 +305,26 @@ class TelegramMonitorBot:
return bool_set return bool_set
def log_message(self, user_id, user_message, chat_id): def log_message(self, user_id, user_message, chat_id):
try: try:
s = session() s = session()
language_code = english_message = "" language_code = english_message = ""
polarity = subjectivity = 0.0
try: try:
# translate to English & log the original language
translator = Translator() translator = Translator()
translated = translator.translate(user_message) translated = translator.translate(user_message)
language_code = translated.src language_code = translated.src
english_message = translated.text english_message = translated.text
# run basic sentiment analysis on the translated English string
analysis = TextBlob(english_message)
polarity = analysis.sentiment.polarity
subjectivity = analysis.sentiment.subjectivity
except Exception as e: except Exception as e:
print(e) print(e)
msg1 = Message(user_id=user_id, message=user_message, msg1 = Message(user_id=user_id, message=user_message, chat_id=chat_id,
chat_id=chat_id, language_code=language_code, english_message=english_message) language_code=language_code, english_message=english_message, polarity=polarity,
subjectivity=subjectivity)
s.add(msg1) s.add(msg1)
s.commit() s.commit()
s.close() s.close()
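
The fallback behaviour in this hunk (defaults set first, then overwritten step by step inside one `try`) can be isolated in a small sketch. The `translate` and `analyze` parameters below are hypothetical stand-ins for the googletrans and TextBlob calls, so the example runs without either library; `enrich_message` is not a function in `bot.py`.

```python
def enrich_message(user_message, translate, analyze):
    """Return (language_code, english_message, polarity, subjectivity).

    `translate` takes the message text and returns (language_code, english_text).
    `analyze` takes English text and returns (polarity, subjectivity).
    Both are placeholders for the real translation and sentiment calls.
    """
    # Defaults are kept whenever the corresponding step fails.
    language_code = english_message = ""
    polarity = subjectivity = 0.0
    try:
        language_code, english_message = translate(user_message)
        polarity, subjectivity = analyze(english_message)
    except Exception as e:
        # Log and continue: the message is still stored with whatever succeeded.
        print(e)
    return language_code, english_message, polarity, subjectivity
```

With this ordering, a failure in translation leaves all four fields at their defaults, while a failure in sentiment analysis still keeps the translated text and language code.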


@@ -1,4 +1,4 @@
from sqlalchemy import Column, DateTime, BigInteger, String, Integer, ForeignKey, func from sqlalchemy import Column, DateTime, BigInteger, String, Integer, Numeric, ForeignKey, func
from sqlalchemy.orm import relationship, backref from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.ext.declarative import declarative_base
import os import os
@@ -30,9 +30,10 @@ class Message(Base):
language_code = Column(String) language_code = Column(String)
english_message = Column(String) english_message = Column(String)
chat_id = Column(BigInteger) chat_id = Column(BigInteger)
polarity = Column(Numeric)
subjectivity = Column(Numeric)
time = Column(DateTime, default=func.now()) time = Column(DateTime, default=func.now())
class MessageHide(Base): class MessageHide(Base):
__tablename__ = 'telegram_message_hides' __tablename__ = 'telegram_message_hides'
id = Column(Integer, primary_key=True) id = Column(Integer, primary_key=True)


@@ -4,3 +4,6 @@ SQLAlchemy==1.2.2
configparser==3.5.0 configparser==3.5.0
Unidecode==1.0.22 Unidecode==1.0.22
googletrans==2.4.0 googletrans==2.4.0
textblob==0.15.3
ipython==5.5.0