Fix disabling of DEBUG and ADMIN_EXEMPT

As written (by unknown idiot), there was no way to set the `DEBUG` and `ADMIN_EXEMPT` env vars in a way that actually disabled them (as the docs claim): their value was converted to uppercase and then compared against a lowercase `false`, so the comparison could never match. Fixed by forcing the value to _lowercase_ with `.lower()`.
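
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from `bot.py`: `env_flag` and `buggy_env_flag` are hypothetical helper names, and passing the environment in as a plain mapping is an assumption made so the example can run without touching real env vars.

```python
import os


def buggy_env_flag(name, environ=os.environ):
    """Shape of the bug: uppercase the value, then compare to lowercase "false"."""
    value = environ.get(name)
    # "FALSE" != "false" is always True, so a set variable can never disable the flag.
    return value is not None and value.upper() != "false"


def env_flag(name, environ=os.environ):
    """Shape of the fix: a set variable enables the flag unless it is "false" in any case."""
    value = environ.get(name)
    return value is not None and value.lower() != "false"


if __name__ == "__main__":
    for raw in ("false", "False", "FALSE", "1", "true"):
        env = {"DEBUG": raw}
        print(raw, buggy_env_flag("DEBUG", env), env_flag("DEBUG", env))
    # An unset variable leaves the flag off in both versions.
    print("<unset>", buggy_env_flag("DEBUG", {}), env_flag("DEBUG", {}))
```

With the lowercase comparison, `false`, `False` and `FALSE` all disable the flag, while any other value (or an empty string) enables it.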
9 changed files with 218 additions and 73 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,9 @@
 # Never commit config data
 config.cnf

+# Env vars for "real" installation
+env.sh
+
 # Byte-compiled / optimized / DLL files
 *.py[cod]

--- a/README.md
+++ b/README.md
@@ -1,9 +1,14 @@
+![origin_github_banner](https://user-images.githubusercontent.com/673455/37314301-f8db9a90-2618-11e8-8fee-b44f38febf38.png)
+
+Head to https://www.originprotocol.com/developers to learn more about what we're building and how to get involved.
+
 # Telegram Bot

 - Deletes messages matching specified patterns
 - Bans users for posting messages matching specified patterns
 - Bans users with usernames matching specified patterns
 - Records logs of conversations
+- Logs an English translation of any foreign languages using Google Translate

 ## Installation

@@ -13,10 +18,13 @@
 - `pip install --upgrade -r requirements.txt`

 ## Database setup
+
 - Store database URL in environment variable.
+
 ```
 export TELEGRAM_BOT_POSTGRES_URL="postgresql://<user>:<password>@localhost:5432/<databasename>"
 ```
+
 - Run: `python model.py` to setup the DB tables.

 ## Setup
@@ -28,19 +36,25 @@
 ```
 export TELEGRAM_BOT_TOKEN="4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya"
 ```
+
 - Create your Telegram group.
 - Add your bot to the group like so: https://stackoverflow.com/questions/37338101/how-to-add-a-bot-to-a-telegram-group
 - Make your bot an admin in the group

-## Configuring patterns
+## Configuration with ENV vars

-- Regex patterns will be read from the following env variables
-	- `MESSAGE_BAN_PATTERNS` Messages matching this will ban the user.
-	- `MESSAGE_HIDE_PATTERNS` Messages matching this will be hidden/deleted
-	- `NAME_BAN_PATTERNS` Users with usernames or first/last names maching this will be banned from the group.
-	- `SAFE_USER_IDS` User ID's that are except from these checkes. Note that the bot cannot ban admin users, but can delete their messages.
+- `MESSAGE_BAN_PATTERNS` : **REQUIRED** Regex pattern. Messages matching this will ban the user.
+- `MESSAGE_HIDE_PATTERNS` : **REQUIRED** Regex pattern. Messages matching this will be hidden/deleted.
+- `NAME_BAN_PATTERNS` : **REQUIRED** Regex pattern. Users with usernames or first/last names matching this will be banned from the group.
+- `CHAT_IDS` : **REQUIRED**. Comma-separated list of IDs of chat(s) that should be monitored. To find out the ID of a chat, add the bot to a chat and type some messages there. The bot log will report an error that it got messages `from chat_id not being monitored: XXX` where XXX is the chat ID. e.g. `-240532994,-150531679`
+- `TELEGRAM_BOT_TOKEN` : **REQUIRED**. Token for bot to control. e.g. `4813829027:ADJFKAf0plousH2EZ2jBfxxRWFld3oK34ya`
+- `TELEGRAM_BOT_POSTGRES_URL` : **REQUIRED**. URI for postgres instance to log activity to. e.g. `postgresql://localhost/postgres`
+- `DEBUG` : If set to anything except `false`, will put bot into debug mode. This means that all actions will be logged into the chat itself, and more things will be logged.
+- `ADMIN_EXEMPT` : If set to anything except `false`, admin users will be exempt from monitoring. Recommended to be set, but useful to turn off for debugging.
+- `NOTIFY_CHAT` : ID of chat to report actions. Can be useful if you have an admin-only chat where you want to monitor the bot's activity. e.g. `-140532994`

 Sample bash file to set `MESSAGE_BAN_PATTERNS`:
+
 ```
 read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
 # ETH Address
@@ -52,11 +66,17 @@ read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
 EOF
 ```

+## Attachments
+
+By default, any attachments other than images or animations will cause the message to be hidden.
+
 ## Running

 ### Locally
+
 - Run: `python bot.py` to start logger
 - Messages will be displayed on `stdout` as they are logged.

 ### On Heroku
+
 - You must enable the worker on Heroku app dashboard. (By default it is off.)
--- a/bin/install_textblob_corpora
+++ b/bin/install_textblob_corpora
@@ -0,0 +1,19 @@
+#!/usr/bin/env bash
+
+source $BIN_DIR/utils
+
+echo "-----> Starting corpora installation"
+
+# Assumes NLTK_DATA environment variable is already set
+# $ heroku config:set NLTK_DATA='/app/nltk_data'
+
+# Install the default corpora to NLTK_DATA directory
+python -m textblob.download_corpora
+
+# Open the NLTK_DATA directory
+cd ${NLTK_DATA}
+
+# Delete all of the zip files in the NLTK DATA directory
+find . -name "*.zip" -type f -delete
+
+echo "-----> Finished corpora installation"
--- a/bin/post_compile
+++ b/bin/post_compile
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+if [ -f bin/install_textblob_corpora ]; then
+    echo "-----> Running install_textblob_corpora"
+    chmod +x bin/install_textblob_corpora
+    bin/install_textblob_corpora
+fi
+
+echo "-----> Post-compile done"
--- a/bot.py
+++ b/bot.py
@@ -18,17 +18,33 @@ from time import strftime
 import re
 import unidecode
 from mwt import MWT
+from googletrans import Translator

 class TelegramMonitorBot:


    def __init__(self):
-        self.debug = os.environ.get('DEBUG') is not None
+        self.debug = (
+            (os.environ.get('DEBUG') is not None) and
+            (os.environ.get('DEBUG').lower() != "false"))

-        # Users to notify of violoations
-        self.notify_user_ids = (
-            list(map(int, os.environ['NOTIFY_USER_IDS'].split(',')))
-            if "NOTIFY_USER_IDS" in os.environ else [])
+        # Are admins exempt from having messages checked?
+        self.admin_exempt = (
+            (os.environ.get('ADMIN_EXEMPT') is not None) and
+            (os.environ.get('ADMIN_EXEMPT').lower() != "false"))
+
+        if (self.debug):
+            print("🔵 debug:", self.debug)
+            print("🔵 admin_exempt:", self.admin_exempt)
+            print("🔵 TELEGRAM_BOT_POSTGRES_URL:", os.environ["TELEGRAM_BOT_POSTGRES_URL"])
+            print("🔵 TELEGRAM_BOT_TOKEN:", os.environ["TELEGRAM_BOT_TOKEN"])
+            print("🔵 NOTIFY_CHAT:", os.environ['NOTIFY_CHAT'] if 'NOTIFY_CHAT' in os.environ else "<undefined>")
+            print("🔵 MESSAGE_BAN_PATTERNS:\n", os.environ['MESSAGE_BAN_PATTERNS'])
+            print("🔵 MESSAGE_HIDE_PATTERNS:\n", os.environ['MESSAGE_HIDE_PATTERNS'])
+            print("🔵 NAME_BAN_PATTERNS:\n", os.environ['NAME_BAN_PATTERNS'])
+
+        # Channel to notify of violations, e.g. '@channelname' (None if unset)
+        self.notify_chat = os.environ.get('NOTIFY_CHAT')

        # List of chat ids that bot should monitor
        self.chat_ids = (
@@ -71,19 +87,15 @@ class TelegramMonitorBot:
    def security_check_username(self, bot, update):
        """ Test username for security violations """

-        full_name = (update.message.from_user.first_name + " "
-            + update.message.from_user.last_name)
+        full_name = "{} {}".format(
+            update.message.from_user.first_name,
+            update.message.from_user.last_name)
        if self.name_ban_re and self.name_ban_re.search(full_name):
            # Logging
-            log_message = "Ban match full name: {}".format(full_name.encode('utf-8'))
+            log_message = "❌ 🙅‍♂️ BAN MATCH FULL NAME: {}".format(full_name.encode('utf-8'))
            if self.debug:
                update.message.reply_text(log_message)
            print(log_message)
-            for notify_user_id in self.notify_user_ids:
-                print (notify_user_id,"gets notified")
-                bot.send_message(
-                    chat_id=notify_user_id,
-                    text=log_message)
            # Ban the user
            self.ban_user(update)
            # Log in database
@@ -94,17 +106,15 @@ class TelegramMonitorBot:
            s.add(userBan)
            s.commit()
            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)

        if self.name_ban_re and self.name_ban_re.search(update.message.from_user.username or ''):
            # Logging
-            log_message = "Ban match username: {}".format(update.message.from_user.username.encode('utf-8'))
+            log_message = "❌ 🙅‍♂️ BAN MATCH USERNAME: {}".format(update.message.from_user.username.encode('utf-8'))
            if self.debug:
                update.message.reply_text(log_message)
            print(log_message)
-            for notify_user_id in self.notify_user_ids:
-                bot.send_message(
-                    chat_id=notify_user_id,
-                    text=log_message)
            # Ban the user
            self.ban_user(update)
            # Log in database
@@ -115,26 +125,47 @@ class TelegramMonitorBot:
            s.add(userBan)
            s.commit()
            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)


    def security_check_message(self, bot, update):
        """ Test message for security violations """

+        if not update.message.text:
+            return
+
        # Remove accents from letters (é->e, ñ->n, etc...)
        message = unidecode.unidecode(update.message.text)
        # TODO: Replace lookalike unicode characters:
        # https://github.com/wanderingstan/Confusables

-        if self.message_ban_re and self.message_ban_re.search(message):
+        # Hide forwarded messages
+        if update.message.forward_date is not None:
            # Logging
-            log_message = "Ban message match: {}".format(update.message.text.encode('utf-8'))
+            log_message = "❌ HIDE FORWARDED: {}".format(update.message.text.encode('utf-8'))
+            if self.debug:
+                update.message.reply_text(log_message)
+            print(log_message)
+            # Delete the message
+            update.message.delete()
+            # Log in database
+            s = session()
+            messageHide = MessageHide(
+                user_id=update.message.from_user.id,
+                message=update.message.text)
+            s.add(messageHide)
+            s.commit()
+            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)
+
+        if self.message_ban_re and self.message_ban_re.search(message):
+            # Logging
+            log_message = "❌ 🙅‍♂️ BAN MATCH: {}".format(update.message.text.encode('utf-8'))
            if self.debug:
                update.message.reply_text(log_message)
            print(log_message)
-            for notify_user_id in self.notify_user_ids:
-                bot.send_message(
-                    chat_id=notify_user_id,
-                    text=log_message)
            # Any message that causes a ban gets deleted
            update.message.delete()
            # Ban the user
@@ -147,17 +178,15 @@ class TelegramMonitorBot:
            s.add(userBan)
            s.commit()
            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)

        elif self.message_hide_re and self.message_hide_re.search(message):
            # Logging
-            log_message = "Hide match: {}".format(update.message.text.encode('utf-8'))
+            log_message = "❌ 🙈 HIDE MATCH: {}".format(update.message.text.encode('utf-8'))
            if self.debug:
                update.message.reply_text(log_message)
            print(log_message)
-            for notify_user_id in self.notify_user_ids:
-                bot.send_message(
-                    chat_id=notify_user_id,
-                    text=log_message)
            # Delete the message
            update.message.delete()
            # Log in database
@@ -168,6 +197,36 @@ class TelegramMonitorBot:
            s.add(messageHide)
            s.commit()
            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)
+
+
+    def attachment_check(self, bot, update):
+        """ Hide messages with attachments (except photo or video) """
+        if (update.message.audio or
+            update.message.document or
+            update.message.game or
+            update.message.voice):
+            # Logging
+            if update.message.document:
+                log_message = "❌ HIDE DOCUMENT: {}".format(update.message.document.__dict__)
+            else:
+                log_message = "❌ HIDE NON-DOCUMENT ATTACHMENT"
+            if self.debug:
+                update.message.reply_text(log_message)
+            print(log_message)
+            # Delete the message
+            update.message.delete()
+            # Log in database
+            s = session()
+            messageHide = MessageHide(
+                user_id=update.message.from_user.id,
+                message=update.message.text)
+            s.add(messageHide)
+            s.commit()
+            s.close()
+            # Notify channel
+            bot.sendMessage(chat_id=self.notify_chat, text=log_message)


    def logger(self, bot, update):
@@ -184,7 +243,8 @@ class TelegramMonitorBot:
                return

            if self.id_exists(user.id):
-                self.log_message(user.id, update.message.text)
+                self.log_message(user.id, update.message.text,
+                                 update.message.chat_id)
            else:
                add_user_success = self.add_user(
                    user.id,
@@ -193,30 +253,43 @@ class TelegramMonitorBot:
                    user.username)

                if add_user_success:
-                    self.log_message(user.id, update.message.text)
+                    self.log_message(
+                        user.id, update.message.text, update.message.chat_id)
                    print("User added: {}".format(user.id))
                else:
                    print("Something went wrong adding the user {}".format(user.id), file=sys.stderr)

+            user_name = (
+                user.username or
+                "{} {}".format(user.first_name, user.last_name) or
+                "<none>").encode('utf-8')
            if update.message.text:
                print("{} {} ({}) : {}".format(
                    strftime("%Y-%m-%dT%H:%M:%S"),
                    user.id,
-                    (user.username or (user.first_name + " " + user.last_name) or "").encode('utf-8'),
+                    user_name,
                    update.message.text.encode('utf-8'))
                )
+            else:
+                print("{} {} ({}) : non-message".format(
+                    strftime("%Y-%m-%dT%H:%M:%S"),
+                    user.id,
+                    user_name)
+                )

-            if (self.debug or
-                update.message.from_user.id not in self.get_admin_ids(bot, update.message.chat_id)):
+            # Don't check admin activity
+            is_admin = update.message.from_user.id in self.get_admin_ids(bot, update.message.chat_id)
+            if is_admin and self.admin_exempt:
+                print("👮‍♂️ Skipping checks. User is admin: {}".format(user.id))
+            else:
                # Security checks
+                self.attachment_check(bot, update)
                self.security_check_username(bot, update)
                self.security_check_message(bot, update)
-            else:
-                print("Skipping checks. User is admin: {}".format(user.id))

        except Exception as e:
            print("Error: {}".format(e))
-
+            print('Error on line {}'.format(sys.exc_info()[-1].tb_lineno), type(e).__name__, e)

    # DB queries
    def id_exists(self, id_value):
@@ -231,10 +304,19 @@ class TelegramMonitorBot:
        return bool_set


-    def log_message(self, user_id, user_message):
+    def log_message(self, user_id, user_message, chat_id):
        try:
            s = session()
-            msg1 = Message(user_id=user_id, message=user_message)
+            language_code = english_message = ""
+            try:
+                translator = Translator()
+                translated = translator.translate(user_message)
+                language_code = translated.src
+                english_message = translated.text
+            except Exception as e:
+                print(e)
+            msg1 = Message(user_id=user_id, message=user_message,
+                           chat_id=chat_id, language_code=language_code, english_message=english_message)
            s.add(msg1)
            s.commit()
            s.close()
@@ -277,7 +359,7 @@ class TelegramMonitorBot:

        # on noncommand i.e message - echo the message on Telegram
        dp.add_handler(MessageHandler(
-            Filters.text,
+            Filters.all,
            lambda bot, update : self.logger(bot, update)
        ))

--- a/env_sample.sh
+++ b/env_sample.sh
@@ -1,6 +1,7 @@
 # Example env vars for bot
+# Copy this to `env.sh` and edit with your real vars -- it is ignored by git

-export TELEGRAM_BOT_POSTGRES_URL="postgresql://postgres:postgres@localhost/origindb"
+export TELEGRAM_BOT_POSTGRES_URL="postgresql://localhost/postgres"

 read -r -d '' MESSAGE_BAN_PATTERNS << 'EOF'
 # ETH
@@ -25,3 +26,7 @@ export TELEGRAM_BOT_TOKEN="XXXXXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
 export NAME_BAN_PATTERNS="admin$"

 export CHAT_IDS="-250531994"
+
+# Needed to make these env vars visible to python
+export MESSAGE_BAN_PATTERNS=$MESSAGE_BAN_PATTERNS
+export MESSAGE_HIDE_PATTERNS=$MESSAGE_HIDE_PATTERNS
--- a/model.py
+++ b/model.py
@@ -1,8 +1,9 @@
-from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func
+from sqlalchemy import Column, DateTime, BigInteger, Numeric, String, Integer, ForeignKey, func
 from sqlalchemy.orm import relationship, backref
 from sqlalchemy.ext.declarative import declarative_base
 import os

+# Localhost url: postgresql://localhost/postgres
 postgres_url = os.environ["TELEGRAM_BOT_POSTGRES_URL"]


@@ -26,9 +27,13 @@ class Message(Base):
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('telegram_users.id'), nullable=False)
    message = Column(String)
+    language_code = Column(String)
+    english_message = Column(String)
+    chat_id = Column(BigInteger)
+    polarity = Column(Numeric)
+    subjectivity = Column(Numeric)
    time = Column(DateTime, default=func.now())

-
 class MessageHide(Base):
    __tablename__ = 'telegram_message_hides'
    id = Column(Integer, primary_key=True)
--- a/mwt.py
+++ b/mwt.py
@@ -1,7 +1,7 @@
 import time

 class MWT(object):
-    """Memoize With Timeout"""
+    """Memorize With Timeout"""
    _caches = {}
    _timeouts = {}

@@ -26,11 +26,11 @@ class MWT(object):
            key = (args, tuple(kw))
            try:
                v = self.cache[key]
-                print("cache")
+                # print("cache")
                if (time.time() - v[1]) > self.timeout:
                    raise KeyError
            except KeyError:
-                print("new")
+                # print("new")
                v = self.cache[key] = f(*args,**kwargs),time.time()
            return v[0]
        func.func_name = f.__name__
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,3 +3,5 @@ python-telegram-bot==9.0.0
 SQLAlchemy==1.2.2
 configparser==3.5.0
 Unidecode==1.0.22
+googletrans==2.4.0
+textblob
Author	SHA1	Message	Date
Stan James	6fd5bd1020	Fix disabling of `DEBUG` and `ADMIN_EXEMPT` As written (by unknown idiot), there was no way to set the `DEBUG` and `ADMIN_EXEMPT` env vars in a way that actually disabled them (as the docs claim): their value was converted to uppercase and then compared against a lowercase `false`, so the comparison could never match. Fixed by forcing the value to _lowercase_ with `.lower()`.	2020-01-27 20:14:20 -08:00
Josh Fraser	4e1e586123	setup textblob for sentiment analysis	2020-01-27 20:08:00 -08:00
Josh Fraser	82787cc428	update readme	2020-01-27 19:52:52 -08:00
Josh Fraser	0e94b860e4	translate telegram messages	2020-01-27 18:24:25 -08:00
Josh Fraser	b1a57bb917	log chat IDs	2020-01-24 17:51:34 -08:00
Stan James	5e6f9b4eba	Merge branch 'master' of github.com:OriginProtocol/telegram-moderator	2018-10-22 15:36:25 -06:00
Stan James	400f51b381	Facepalm. Backwards logic for detecting admins. 🤦‍♀️	2018-10-22 15:35:45 -06:00
Stan James	c272b88809	Readme typo	2018-10-19 20:47:41 +02:00
Stan James	13e6ce7847	Clarify readme about env vars	2018-10-19 20:46:41 +02:00
Stan James	79678abd72	ENV vars to readme	2018-10-19 20:39:01 +02:00
Stan James	92a8426261	Fix bug when user has no first or last name Fixing: https://github.com/OriginProtocol/telegram-moderator/issues/15 Using `.format()` to handle `None` instead of the conditional check.	2018-10-19 18:30:28 +02:00
Stan James	a398742b96	Fixed checking of attachments. Plus cleanup of old memorize util	2018-10-19 13:05:17 +02:00
Stan James	34ba0c4d04	Hide messages with attachements	2018-10-19 02:30:28 +02:00
Stan James	a5a7f30c93	Merge pull request #19 from OriginProtocol/stan/mainnet Stan/mainnet	2018-10-16 19:13:04 +03:00
Stan James	e1a29f06a9	Changes for mainnet. Hide forwarded messages.	2018-10-16 17:54:47 +02:00
Stan James	ffe3790946	Merge branch 'master' of https://github.com/OriginProtocol/telegram-moderator	2018-10-03 06:18:41 -07:00
Josh Fraser	8b8018a93a	Merge pull request #13 from OriginProtocol/dev-page-link Link to developer landing page from README	2018-07-01 17:00:14 -07:00
Micah Alcorn	c5c1ae5140	Link to developer landing page from README	2018-07-01 16:48:20 -07:00
Stan James	68aa8b85b8	Move notification chat to end of function ...So if notification channel is not set up right, it doesn't stop blocking from happening.	2018-06-28 11:54:17 -06:00
Stan James	7874dbfbfe	syntax fix	2018-06-28 11:23:34 -06:00
Stan James	b2460d8e3a	better handling of debug setting	2018-06-28 11:22:23 -06:00
Stan James	a0e3922bed	Added better logging output.	2018-06-28 10:45:17 -06:00
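
Commit 92a8426261 above switches to `.format()` so that a missing first or last name no longer raises. One side effect worth knowing is that `"{} {}".format(None, None)` yields the non-empty string `"None None"`, so an `or "<none>"` fallback placed after it is never reached. The following is a minimal sketch of an alternative, assuming names arrive as `str` or `None`; `display_name` is a hypothetical helper and is not part of the repository.

```python
def display_name(first_name, last_name, username=None):
    """Build a printable name, skipping parts that are None or empty."""
    if username:
        return username
    # Keep only the name parts that are actually present.
    parts = [part for part in (first_name, last_name) if part]
    return " ".join(parts) or "<none>"


if __name__ == "__main__":
    # Plain .format() renders a missing value as the literal text "None".
    print("{} {}".format("Ada", None))
    # The helper drops missing parts and falls back only when nothing is left.
    print(display_name("Ada", None))
    print(display_name(None, None))
    print(display_name("Ada", "Lovelace", username="ada"))
```

Joining only the parts that are present avoids logging the literal text `None` as part of a user's name.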