22 Commits

Author SHA1 Message Date
6fd5bd1020 Fix disabling of DEBUG and ADMIN_EXEMPT
As written (by unknown idiot), there was no way to actually set `DEBUG` and `ADMIN_EXEMPT` env vars in a way that disabled them (as claimed by docs), as their value was converted to uppercase and then tested for a lower case `false`. Fixed by forcing to _lowercase_ with `.lower()`.
2020-01-27 20:14:20 -08:00
4e1e586123 setup textblob for sentiment analysis 2020-01-27 20:08:00 -08:00
82787cc428 update readme 2020-01-27 19:52:52 -08:00
0e94b860e4 translate telegram messages 2020-01-27 18:24:25 -08:00
b1a57bb917 log chat IDs 2020-01-24 17:51:34 -08:00
5e6f9b4eba Merge branch 'master' of github.com:OriginProtocol/telegram-moderator 2018-10-22 15:36:25 -06:00
400f51b381 Facepalm. Backwards logic for detecting admins.
🤦‍♀️
2018-10-22 15:35:45 -06:00
c272b88809 Readme typo 2018-10-19 20:47:41 +02:00
13e6ce7847 Clarify readme about env vars 2018-10-19 20:46:41 +02:00
79678abd72 ENV vars to readme 2018-10-19 20:39:01 +02:00
92a8426261 Fix bug when user has no first or last name
Fixing: https://github.com/OriginProtocol/telegram-moderator/issues/15

Using `.format()` to handle `None` instead of the conditional check.
2018-10-19 18:30:28 +02:00
a398742b96 Fixed checking of attachments.
Plus cleanup of old memorize util
2018-10-19 13:05:17 +02:00
34ba0c4d04 Hide messages with attachements 2018-10-19 02:30:28 +02:00
a5a7f30c93 Merge pull request #19 from OriginProtocol/stan/mainnet
Stan/mainnet
2018-10-16 19:13:04 +03:00
e1a29f06a9 Changes for mainnet.
Hide forwarded messages.
2018-10-16 17:54:47 +02:00
ffe3790946 Merge branch 'master' of https://github.com/OriginProtocol/telegram-moderator 2018-10-03 06:18:41 -07:00
8b8018a93a Merge pull request #13 from OriginProtocol/dev-page-link
Link to developer landing page from README
2018-07-01 17:00:14 -07:00
c5c1ae5140 Link to developer landing page from README 2018-07-01 16:48:20 -07:00
68aa8b85b8 Move notification chat to end of function
...So if notification channel is not set up right, it doesn't stop blocking from happening.
2018-06-28 11:54:17 -06:00
7874dbfbfe syntax fix 2018-06-28 11:23:34 -06:00
b2460d8e3a better handling of debug setting 2018-06-28 11:22:23 -06:00
a0e3922bed Added better logging output. 2018-06-28 10:45:17 -06:00
9 changed files with 218 additions and 73 deletions

3
.gitignore vendored

@ -1,6 +1,9 @@
# Never commit config data
config.cnf
# Env vars for "real" installation
env.sh
# Byte-compiled / optimized / DLL files
*.py[cod]


@ -1,9 +1,14 @@
![origin_github_banner](https://user-images.githubusercontent.com/673455/37314301-f8db9a90-2618-11e8-8fee-b44f38febf38.png)
Head to https://www.originprotocol.com/developers to learn more about what we're building and how to get involved.
# Telegram Bot
- Deletes messages matching specified patterns
- Bans users for posting messages matching specified patterns
- Bans users with usernames matching specified patterns
- Records logs of conversations
- Logs an English translation of any foreign-language messages using Google Translate
## Installation
@ -13,10 +18,13 @@
- `pip install --upgrade -r requirements.txt`
## Database setup
- Store database URL in environment variable.
```
export TELEGRAM_BOT_POSTGRES_URL="postgresql://<user>:<password>@localhost:5432/<databasename>"
```
- Run: `python model.py` to setup the DB tables.
## Setup
@ -28,19 +36,25 @@
```
export TELEGRAM_BOT_TOKEN="4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya"
```
- Create your Telegram group.
- Add your bot to the group like so: https://stackoverflow.com/questions/37338101/how-to-add-a-bot-to-a-telegram-group
- Make your bot an admin in the group
## Configuring patterns
## Configuration with ENV vars
- Regex patterns will be read from the following env variables
- `MESSAGE_BAN_PATTERNS` Messages matching this will ban the user.
- `MESSAGE_HIDE_PATTERNS` Messages matching this will be hidden/deleted
- `NAME_BAN_PATTERNS` Users with usernames or first/last names matching this will be banned from the group.
- `SAFE_USER_IDS` User IDs that are exempt from these checks. Note that the bot cannot ban admin users, but can delete their messages.
- `MESSAGE_BAN_PATTERNS` : **REQUIRED** Regex pattern. Messages matching this will ban the user.
- `MESSAGE_HIDE_PATTERNS` : **REQUIRED** Regex pattern. Messages matching this will be hidden/deleted
- `NAME_BAN_PATTERNS` : **REQUIRED** Regex pattern. Users with usernames or first/last names matching this will be banned from the group.
- `CHAT_IDS` : **REQUIRED**. Comma-separated list of IDs of chat(s) that should be monitored. To find out the ID of a chat, add the bot to a chat and type some messages there. The bot log will report an error that it got messages `from chat_id not being monitored: XXX` where XXX is the chat ID. e.g. `-240532994,-150531679`
- `TELEGRAM_BOT_TOKEN` : **REQUIRED**. Token for bot to control. e.g. `4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya`
- `TELEGRAM_BOT_POSTGRES_URL` : **REQUIRED**. URI for postgres instance to log activity to. e.g. `postgresql://localhost/postgres`
- `DEBUG` : If set to anything except `false`, will put bot into debug mode. This means that all actions will be logged into the chat itself, and more things will be logged.
- `ADMIN_EXEMPT` : If set to anything except `false`, admin users will be exempt from monitoring. Recommended to be set, but useful to turn off for debugging.
- `NOTIFY_CHAT` : ID of chat to report actions. Can be useful if you have an admin-only chat where you want to monitor the bot's activity. E.g. `-140532994`
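A minimal sketch of how the flag and list variables above could be read, assuming the semantics described in this list (unset means off, any value other than `false` means on). The helper names `env_flag` and `parse_chat_ids` are hypothetical and do not appear in `bot.py`.

```python
import os


def env_flag(name, environ=os.environ):
    """Return True when `name` is set to anything other than 'false' (any case).

    Hypothetical helper: mirrors the DEBUG / ADMIN_EXEMPT rule described above.
    """
    value = environ.get(name)
    return value is not None and value.lower() != "false"


def parse_chat_ids(raw):
    """Turn a comma-separated string such as '-240532994,-150531679' into ints."""
    return [int(part) for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print("debug:", env_flag("DEBUG"))
    print("admin_exempt:", env_flag("ADMIN_EXEMPT"))
    print("chat_ids:", parse_chat_ids(os.environ.get("CHAT_IDS", "")))
```

Lowercasing before the comparison is what lets `False`, `FALSE` and `false` all disable a flag.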
Sample bash file to set `MESSAGE_BAN_PATTERNS`:
```
read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
# ETH Address
@ -52,11 +66,17 @@ read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
EOF
```
## Attachments
By default, any attachments other than images or animations will cause the message to be hidden.
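The rule can be sketched roughly as follows, assuming the four message fields that `attachment_check` in `bot.py` inspects (`audio`, `document`, `game`, `voice`). The function name and the `SimpleNamespace` stand-ins for Telegram messages are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace

# Fields that mark a message as carrying a blocked attachment
# (taken from the attachment_check method in bot.py).
BLOCKED_ATTACHMENT_FIELDS = ("audio", "document", "game", "voice")


def has_blocked_attachment(message):
    """Return True if any blocked attachment field is present and non-empty."""
    return any(getattr(message, field, None) for field in BLOCKED_ATTACHMENT_FIELDS)


# Stand-in objects; a real bot would receive telegram.Message instances.
photo_message = SimpleNamespace(photo=[object()], document=None)
pdf_message = SimpleNamespace(document={"file_name": "invoice.pdf"})

if __name__ == "__main__":
    print(has_blocked_attachment(photo_message))
    print(has_blocked_attachment(pdf_message))
```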
## Running
### Locally
- Run: `python bot.py` to start logger
- Messages will be displayed on `stdout` as they are logged.
### On Heroku
- You must enable the worker on Heroku app dashboard. (By default it is off.)


@ -0,0 +1,19 @@
#!/usr/bin/env bash
source $BIN_DIR/utils
echo "-----> Starting corpora installation"
# Assumes NLTK_DATA environment variable is already set
# $ heroku config:set NLTK_DATA='/app/nltk_data'
# Install the default corpora to NLTK_DATA directory
python -m textblob.download_corpora
# Open the NLTK_DATA directory
cd ${NLTK_DATA}
# Delete all of the zip files in the NLTK DATA directory
find . -name "*.zip" -type f -delete
echo "-----> Finished corpora installation"

9
bin/post_compile Normal file

@ -0,0 +1,9 @@
#!/usr/bin/env bash
if [ -f bin/install_textblob_corpora ]; then
echo "-----> Running install_textblob_corpora"
chmod +x bin/install_textblob_corpora
bin/install_textblob_corpora
fi
echo "-----> Post-compile done"

162
bot.py

@ -18,17 +18,33 @@ from time import strftime
import re
import unidecode
from mwt import MWT
from googletrans import Translator
class TelegramMonitorBot:
def __init__(self):
self.debug = os.environ.get('DEBUG') is not None
self.debug = (
(os.environ.get('DEBUG') is not None) and
(os.environ.get('DEBUG').lower() != "false"))
# Users to notify of violations
self.notify_user_ids = (
list(map(int, os.environ['NOTIFY_USER_IDS'].split(',')))
if "NOTIFY_USER_IDS" in os.environ else [])
# Are admins exempt from having messages checked?
self.admin_exempt = (
(os.environ.get('ADMIN_EXEMPT') is not None) and
(os.environ.get('ADMIN_EXEMPT').lower() != "false"))
if (self.debug):
print("🔵 debug:", self.debug)
print("🔵 admin_exempt:", self.admin_exempt)
print("🔵 TELEGRAM_BOT_POSTGRES_URL:", os.environ["TELEGRAM_BOT_POSTGRES_URL"])
print("🔵 TELEGRAM_BOT_TOKEN:", os.environ["TELEGRAM_BOT_TOKEN"])
print("🔵 NOTIFY_CHAT:", os.environ['NOTIFY_CHAT'] if 'NOTIFY_CHAT' in os.environ else "<undefined>")
print("🔵 MESSAGE_BAN_PATTERNS:\n", os.environ['MESSAGE_BAN_PATTERNS'])
print("🔵 MESSAGE_HIDE_PATTERNS:\n", os.environ['MESSAGE_HIDE_PATTERNS'])
print("🔵 NAME_BAN_PATTERNS:\n", os.environ['NAME_BAN_PATTERNS'])
# Channel to notify of violations, e.g. '@channelname'
self.notify_chat = os.environ['NOTIFY_CHAT'] if 'NOTIFY_CHAT' in os.environ else None
# List of chat ids that bot should monitor
self.chat_ids = (
@ -71,19 +87,15 @@ class TelegramMonitorBot:
def security_check_username(self, bot, update):
""" Test username for security violations """
full_name = (update.message.from_user.first_name + " "
+ update.message.from_user.last_name)
full_name = "{} {}".format(
update.message.from_user.first_name,
update.message.from_user.last_name)
if self.name_ban_re and self.name_ban_re.search(full_name):
# Logging
log_message = "Ban match full name: {}".format(full_name.encode('utf-8'))
log_message = "❌ 🙅‍♂️ BAN MATCH FULL NAME: {}".format(full_name.encode('utf-8'))
if self.debug:
update.message.reply_text(log_message)
print(log_message)
for notify_user_id in self.notify_user_ids:
print (notify_user_id,"gets notified")
bot.send_message(
chat_id=notify_user_id,
text=log_message)
# Ban the user
self.ban_user(update)
# Log in database
@ -94,17 +106,15 @@ class TelegramMonitorBot:
s.add(userBan)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
if self.name_ban_re and self.name_ban_re.search(update.message.from_user.username or ''):
# Logging
log_message = "Ban match username: {}".format(update.message.from_user.username.encode('utf-8'))
log_message = "❌ 🙅‍♂️ BAN MATCH USERNAME: {}".format(update.message.from_user.username.encode('utf-8'))
if self.debug:
update.message.reply_text(log_message)
print(log_message)
for notify_user_id in self.notify_user_ids:
bot.send_message(
chat_id=notify_user_id,
text=log_message)
# Ban the user
self.ban_user(update)
# Log in database
@ -115,26 +125,47 @@ class TelegramMonitorBot:
s.add(userBan)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
def security_check_message(self, bot, update):
""" Test message for security violations """
if not update.message.text:
return
# Remove accents from letters (é->e, ñ->n, etc...)
message = unidecode.unidecode(update.message.text)
# TODO: Replace lookalike unicode characters:
# https://github.com/wanderingstan/Confusables
if self.message_ban_re and self.message_ban_re.search(message):
# Hide forwarded messages
if update.message.forward_date is not None:
# Logging
log_message = "Ban message match: {}".format(update.message.text.encode('utf-8'))
log_message = "❌ HIDE FORWARDED: {}".format(update.message.text.encode('utf-8'))
if self.debug:
update.message.reply_text(log_message)
print(log_message)
# Delete the message
update.message.delete()
# Log in database
s = session()
messageHide = MessageHide(
user_id=update.message.from_user.id,
message=update.message.text)
s.add(messageHide)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
if self.message_ban_re and self.message_ban_re.search(message):
# Logging
log_message = "❌ 🙅‍♂️ BAN MATCH: {}".format(update.message.text.encode('utf-8'))
if self.debug:
update.message.reply_text(log_message)
print(log_message)
for notify_user_id in self.notify_user_ids:
bot.send_message(
chat_id=notify_user_id,
text=log_message)
# Any message that causes a ban gets deleted
update.message.delete()
# Ban the user
@ -147,17 +178,15 @@ class TelegramMonitorBot:
s.add(userBan)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
elif self.message_hide_re and self.message_hide_re.search(message):
# Logging
log_message = "Hide match: {}".format(update.message.text.encode('utf-8'))
log_message = "❌ 🙈 HIDE MATCH: {}".format(update.message.text.encode('utf-8'))
if self.debug:
update.message.reply_text(log_message)
print(log_message)
for notify_user_id in self.notify_user_ids:
bot.send_message(
chat_id=notify_user_id,
text=log_message)
# Delete the message
update.message.delete()
# Log in database
@ -168,6 +197,36 @@ class TelegramMonitorBot:
s.add(messageHide)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
def attachment_check(self, bot, update):
""" Hide messages with attachments (except photo or video) """
if (update.message.audio or
update.message.document or
update.message.game or
update.message.voice):
# Logging
if update.message.document:
log_message = "❌ HIDE DOCUMENT: {}".format(update.message.document.__dict__)
else:
log_message = "❌ HIDE NON-DOCUMENT ATTACHMENT"
if self.debug:
update.message.reply_text(log_message)
print(log_message)
# Delete the message
update.message.delete()
# Log in database
s = session()
messageHide = MessageHide(
user_id=update.message.from_user.id,
message=update.message.text)
s.add(messageHide)
s.commit()
s.close()
# Notify channel
bot.sendMessage(chat_id=self.notify_chat, text=log_message)
def logger(self, bot, update):
@ -184,7 +243,8 @@ class TelegramMonitorBot:
return
if self.id_exists(user.id):
self.log_message(user.id, update.message.text)
self.log_message(user.id, update.message.text,
update.message.chat_id)
else:
add_user_success = self.add_user(
user.id,
@ -193,30 +253,43 @@ class TelegramMonitorBot:
user.username)
if add_user_success:
self.log_message(user.id, update.message.text)
self.log_message(
user.id, update.message.text, update.message.chat_id)
print("User added: {}".format(user.id))
else:
print("Something went wrong adding the user {}".format(user.id), file=sys.stderr)
user_name = (
user.username or
"{} {}".format(user.first_name, user.last_name) or
"<none>").encode('utf-8')
if update.message.text:
print("{} {} ({}) : {}".format(
strftime("%Y-%m-%dT%H:%M:%S"),
user.id,
(user.username or (user.first_name + " " + user.last_name) or "").encode('utf-8'),
user_name,
update.message.text.encode('utf-8'))
)
else:
print("{} {} ({}) : non-message".format(
strftime("%Y-%m-%dT%H:%M:%S"),
user.id,
user_name)
)
if (self.debug or
update.message.from_user.id not in self.get_admin_ids(bot, update.message.chat_id)):
# Don't check admin activity
is_admin = update.message.from_user.id in self.get_admin_ids(bot, update.message.chat_id)
if is_admin and self.admin_exempt:
print("👮‍♂️ Skipping checks. User is admin: {}".format(user.id))
else:
# Security checks
self.attachment_check(bot, update)
self.security_check_username(bot, update)
self.security_check_message(bot, update)
else:
print("Skipping checks. User is admin: {}".format(user.id))
except Exception as e:
print("Error: {}".format(e))
print('Error on line {}'.format(sys.exc_info()[-1].tb_lineno), type(e).__name__, e)
# DB queries
def id_exists(self, id_value):
@ -231,10 +304,19 @@ class TelegramMonitorBot:
return bool_set
def log_message(self, user_id, user_message):
def log_message(self, user_id, user_message, chat_id):
try:
s = session()
msg1 = Message(user_id=user_id, message=user_message)
language_code = english_message = ""
try:
translator = Translator()
translated = translator.translate(user_message)
language_code = translated.src
english_message = translated.text
except Exception as e:
# Python 3 exceptions have no `.message` attribute; print the exception itself
print(e)
msg1 = Message(user_id=user_id, message=user_message,
chat_id=chat_id, language_code=language_code, english_message=english_message)
s.add(msg1)
s.commit()
s.close()
@ -277,7 +359,7 @@ class TelegramMonitorBot:
# on noncommand i.e message - echo the message on Telegram
dp.add_handler(MessageHandler(
Filters.text,
Filters.all,
lambda bot, update : self.logger(bot, update)
))
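One detail worth noting about the `user_name` expression in `logger`: `"{} {}".format(...)` never returns an empty string, so the `"<none>"` fallback is not reached and a missing name is rendered as the text `None`. A minimal sketch of one alternative, assuming the goal is a readable label; the helper name `display_name` is hypothetical and not part of this change.

```python
def display_name(username, first_name, last_name):
    """Build a log label, skipping name parts that are None or empty.

    Hypothetical helper, not part of bot.py.
    """
    if username:
        return username
    parts = [part for part in (first_name, last_name) if part]
    return " ".join(parts) or "<none>"


if __name__ == "__main__":
    # str.format renders None as the literal text "None".
    print("{} {}".format(None, None))
    print(display_name(None, "Ada", None))
    print(display_name(None, None, None))
```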


@ -1,6 +1,7 @@
# Example env vars for bot
# Copy this to `env.sh` and edit with your real vars -- it is ignored by git
export TELEGRAM_BOT_POSTGRES_URL="postgresql://postgres:postgres@localhost/origindb"
export TELEGRAM_BOT_POSTGRES_URL="postgresql://localhost/postgres"
read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
# ETH
@ -25,3 +26,7 @@ export TELEGRAM_BOT_TOKEN="XXXXXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
export NAME_BAN_PATTERNS="admin$"
export CHAT_IDS="-250531994"
# Needed to make these env vars visible to python
export MESSAGE_BAN_PATTERNS=$MESSAGE_BAN_PATTERNS
export MESSAGE_HIDE_PATTERNS=$MESSAGE_HIDE_PATTERNS


@ -1,8 +1,9 @@
from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func
from sqlalchemy import Column, DateTime, BigInteger, String, Integer, ForeignKey, Numeric, func
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
import os
# Localhost url: postgresql://localhost/postgres
postgres_url = os.environ["TELEGRAM_BOT_POSTGRES_URL"]
@ -26,9 +27,13 @@ class Message(Base):
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey('telegram_users.id'), nullable=False)
message = Column(String)
language_code = Column(String)
english_message = Column(String)
chat_id = Column(BigInteger)
polarity = Column(Numeric)
subjectivity = Column(Numeric)
time = Column(DateTime, default=func.now())
class MessageHide(Base):
__tablename__ = 'telegram_message_hides'
id = Column(Integer, primary_key=True)

6
mwt.py

@ -1,7 +1,7 @@
import time
class MWT(object):
"""Memoize With Timeout"""
"""Memorize With Timeout"""
_caches = {}
_timeouts = {}
@ -26,11 +26,11 @@ class MWT(object):
key = (args, tuple(kw))
try:
v = self.cache[key]
print("cache")
# print("cache")
if (time.time() - v[1]) > self.timeout:
raise KeyError
except KeyError:
print("new")
# print("new")
v = self.cache[key] = f(*args,**kwargs),time.time()
return v[0]
func.func_name = f.__name__
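For comparison, the memoize-with-timeout idea behind `MWT` can be sketched in Python 3 as below. This is an illustrative rewrite, not the code in `mwt.py`: the name `memoize_with_timeout` and the injectable `clock` parameter are assumptions added here so that expiry can be tested without sleeping.

```python
import functools
import time


def memoize_with_timeout(timeout=2, clock=time.monotonic):
    """Cache results per argument set and recompute once they are older than `timeout` seconds."""
    def decorator(f):
        cache = {}  # key -> (value, time stored)

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = cache.get(key)
            if hit is not None and now - hit[1] <= timeout:
                return hit[0]  # entry is still fresh
            value = f(*args, **kwargs)
            cache[key] = (value, now)
            return value

        return wrapper
    return decorator


if __name__ == "__main__":
    @memoize_with_timeout(timeout=60)
    def slow_lookup(chat_id):
        print("computing for", chat_id)
        return [chat_id]

    slow_lookup(1)  # computes
    slow_lookup(1)  # served from the cache
```

`functools.wraps` copies `__name__` and the docstring onto the wrapper, which covers what the `func.func_name = f.__name__` line does by hand.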


@ -3,3 +3,5 @@ python-telegram-bot==9.0.0
SQLAlchemy==1.2.2
configparser==3.5.0
Unidecode==1.0.22
googletrans==2.4.0
textblob