Reads comments from all of Reddit and picks out phrases, then sends any matches to 'localhost' when run locally, or to another Heroku application when run on Heroku.
npm run start
- Starts the program
npm run test
- Runs tests, not including 'live' tests, which require environment variables filled with tokens.
npm run eslint
- Used to keep a consistent format. This should pass before every commit
- Have the postgresql service running
- Ensure you have a database user 'postgres' with password 'postgresql' (or modify the batch file below to use the correct username/password)
- Run reddit-comment-reader\database\create_local_database.bat (or .sh for linux). This will drop the database if it exists and recreate it
- Create environment variables DATABASE_URL, OUTPUT_URL, REDDIT_USERNAME, REDDIT_PASSWORD, REDDIT_APP_ID, REDDIT_APP_SECRET. Or create a .env file with these values. You can look at example.env for an example.
- Install dependencies by running npm install
- Start the application by running npm run start
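The environment variables listed in the setup steps can be checked before the application starts. This is a minimal sketch, not code from the repository: `findMissingEnv` is a hypothetical helper, and warning (rather than exiting) on a missing variable is an assumption.

```javascript
// Variable names taken from the setup steps in this README.
const REQUIRED_ENV = [
  'DATABASE_URL',
  'OUTPUT_URL',
  'REDDIT_USERNAME',
  'REDDIT_PASSWORD',
  'REDDIT_APP_ID',
  'REDDIT_APP_SECRET',
];

// Hypothetical helper: returns the names of required variables that are
// unset or empty in the given environment object.
function findMissingEnv(env) {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

const missing = findMissingEnv(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

Passing the environment in as an argument keeps the check easy to test with a plain object.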
The database connection string is expected in the environment variable 'DATABASE_URL'
Example: SET DATABASE_URL=postgres://postgres:postgresql@localhost:5432/reddit_comment_reader
Note: on Windows I get an error when setting the above, but it works regardless of the error
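To see which parts of the connection string are which, the example value can be split with Node's built-in `URL` class. This is only an illustrative sketch; `parseDatabaseUrl` is a hypothetical helper, and the application may simply hand the whole string to its database client.

```javascript
// Hypothetical helper: splits a postgres:// connection string into its parts
// using Node's built-in WHATWG URL class.
function parseDatabaseUrl(databaseUrl) {
  const url = new URL(databaseUrl);
  return {
    user: decodeURIComponent(url.username),
    password: decodeURIComponent(url.password),
    host: url.hostname,
    port: Number(url.port) || 5432, // 5432 is PostgreSQL's default port
    database: url.pathname.replace(/^\//, ''),
  };
}

// Example value from this README.
console.log(parseDatabaseUrl(
  'postgres://postgres:postgresql@localhost:5432/reddit_comment_reader'
));
```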
RegexpComment - Phrases to look for are taken from the PostgreSQL database
| SubredditMatch | CommentMatch | ReplyMessage | IsReplyRegexp | id |
|---|---|---|---|---|
RegexpComment Creation script
-- Table: public."RegexpComment"
-- DROP TABLE public."RegexpComment";
CREATE TABLE public."RegexpComment"
(
"SubredditMatch" text COLLATE pg_catalog."default" NOT NULL DEFAULT '.*'::text,
"CommentMatch" text COLLATE pg_catalog."default" NOT NULL,
"ReplyMessage" text COLLATE pg_catalog."default" NOT NULL,
"IsReplyRegexp" boolean DEFAULT false,
id integer NOT NULL DEFAULT nextval('"RegexpComment_id_seq"'::regclass)
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public."RegexpComment"
OWNER to uuhsiyqcwwsszg;
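One way rows from RegexpComment could be applied to an incoming comment is sketched below. Several things here are assumptions rather than documented behavior: that 'SubredditMatch' and 'CommentMatch' are tested case-insensitively, that the first matching row wins, and that 'IsReplyRegexp' means 'ReplyMessage' may reference capture groups as $1, $2, and so on. `findReply` is a hypothetical name.

```javascript
// Hypothetical helper: returns the reply for the first row whose patterns
// match the subreddit and comment body, or null when nothing matches.
function findReply(rows, subreddit, body) {
  for (const row of rows) {
    const subredditRe = new RegExp(row.SubredditMatch, 'i');
    const commentRe = new RegExp(row.CommentMatch, 'i');
    if (!subredditRe.test(subreddit)) continue;
    const match = commentRe.exec(body);
    if (!match) continue;
    if (!row.IsReplyRegexp) return row.ReplyMessage;
    // Assumption: $1..$9 in ReplyMessage refer to capture groups of CommentMatch.
    return row.ReplyMessage.replace(/\$(\d)/g, (_, n) => match[Number(n)] || '');
  }
  return null;
}

// Illustrative rows only; these are not rows from the real database.
const exampleRows = [
  { SubredditMatch: '.*', CommentMatch: 'hello (\\w+)', ReplyMessage: 'Hi, $1!', IsReplyRegexp: true },
  { SubredditMatch: '^aww$', CommentMatch: 'good bot', ReplyMessage: 'Thanks', IsReplyRegexp: false },
];
console.log(findReply(exampleRows, 'test', 'Well hello world'));
```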
ErrorTable - Errors are logged here. The application is hosted on Heroku, which doesn't keep a separate log for errors
| id | ErrorDescription | ErrorTrace | AdditionalInfo | CreatedOn |
|---|---|---|---|---|
ErrorTable Creation script
-- Table: public."ErrorTable"
-- DROP TABLE public."ErrorTable";
CREATE TABLE public."ErrorTable"
(
id integer NOT NULL DEFAULT nextval('errortable_id_seq'::regclass),
errordescription character varying(255) COLLATE pg_catalog."default",
errortrace character varying(5000) COLLATE pg_catalog."default",
additionalinfo character varying(1000) COLLATE pg_catalog."default",
createdon timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT errortable_pkey PRIMARY KEY (id)
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public."ErrorTable"
OWNER to uuhsiyqcwwsszg;
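Because the ErrorTable columns have fixed maximum lengths, values need to fit before they are inserted. The sketch below builds a parameterized INSERT with truncated values. `buildErrorInsert` is a hypothetical helper, and the returned `{ text, values }` shape assumes a client such as `pg` that accepts query config objects; the README does not say which client the project uses.

```javascript
// Column length limits copied from the ErrorTable creation script.
const LIMITS = { errordescription: 255, errortrace: 5000, additionalinfo: 1000 };

// Hypothetical helper: builds a parameterized INSERT for ErrorTable,
// truncating each value so it fits its column.
function buildErrorInsert(error, additionalInfo = '') {
  const clip = (value, max) => String(value || '').slice(0, max);
  return {
    text:
      'INSERT INTO public."ErrorTable" ' +
      '(errordescription, errortrace, additionalinfo) VALUES ($1, $2, $3)',
    values: [
      clip(error.message, LIMITS.errordescription),
      clip(error.stack, LIMITS.errortrace),
      clip(additionalInfo, LIMITS.additionalinfo),
    ],
  };
}
```

The 'id' and 'createdon' columns are left out because the creation script gives both a default.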
Before Reddit's 2023 API changes, this program did not use an authenticated client, since none was required for reading data from Reddit's API. Since those changes, it uses an authenticated client, and the number of comments that can be retrieved has decreased.
Uses an https client (require('https')) to make requests to 'reddit.com/all/comments.json' and occasionally 'reddit.com/subreddit/moderators.json'.
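Reddit returns comment listings as JSON in which each comment is a child object of kind 't1'. A minimal sketch of flattening such a response follows; `extractComments` is a hypothetical helper, and the four fields it keeps are an assumption about what the rest of the program needs.

```javascript
// Hypothetical helper: flattens a Reddit comment listing (the parsed JSON
// body of a comments.json response) into plain objects.
function extractComments(listing) {
  const children = (listing && listing.data && listing.data.children) || [];
  return children
    .filter((child) => child.kind === 't1') // 't1' is Reddit's kind for comments
    .map(({ data }) => ({
      id: data.id,
      author: data.author,
      subreddit: data.subreddit,
      body: data.body,
    }));
}
```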
Uses Faye (require('faye')) to send data to either another Heroku application or localhost.com when comments match regular expressions taken from the database
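Publishing a match through Faye could look roughly like the sketch below. The client is passed in as an argument so the function can be tried without a running server. The channel name '/matches', the message fields, and the idea that OUTPUT_URL holds the Faye endpoint are all assumptions, and `publishMatch` is a hypothetical name.

```javascript
// Hypothetical helper: publishes a matched comment through a Faye client.
// In the application the client would be created with something like:
//   const faye = require('faye');
//   const client = new faye.Client(process.env.OUTPUT_URL); // assumed endpoint
function publishMatch(client, comment, reply) {
  // '/matches' is an illustrative channel name, not taken from the project.
  return client.publish('/matches', {
    subreddit: comment.subreddit,
    author: comment.author,
    body: comment.body,
    reply,
  });
}
```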
The program is hardcoded to ignore moderator comments. This is done by querying the moderators URL for the appropriate subreddit. The result is kept in a variable, so the request for a given subreddit is made only once per program run
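The "request once per subreddit" behavior can be sketched as a small cache. This is an illustration, not the project's code: `createModeratorCache` is a hypothetical name, and `fetchModerators` is an assumed async function that returns an array of moderator usernames for a subreddit.

```javascript
// Hypothetical cache: maps subreddit name -> Promise of a Set of moderator names.
function createModeratorCache(fetchModerators) {
  const cache = new Map();
  return async function isModerator(subreddit, author) {
    const key = subreddit.toLowerCase();
    if (!cache.has(key)) {
      // Store the promise so concurrent lookups share a single request.
      const pending = fetchModerators(subreddit).then((names) => new Set(names));
      // Drop the entry on failure so a later lookup can retry.
      pending.catch(() => cache.delete(key));
      cache.set(key, pending);
    }
    const moderators = await cache.get(key);
    return moderators.has(author);
  };
}
```

Caching the promise rather than the result means two comments from the same subreddit arriving together still trigger only one moderators request.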
Makes a request to a Reddit URL every 1100 milliseconds. Reddit may block connections that make requests less than 1000 milliseconds apart, and I've found that using that exact limit causes issues
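A polling loop with this spacing could be sketched as follows. `startPolling` is a hypothetical helper, and measuring the interval from the start of each request (rather than from its end) is an assumption about the intended behavior.

```javascript
const POLL_INTERVAL_MS = 1100; // interval stated in this README

// Hypothetical helper: calls `poll` repeatedly, keeping at least `intervalMs`
// between the start of consecutive requests. Returns a function that stops it.
function startPolling(poll, intervalMs = POLL_INTERVAL_MS) {
  let stopped = false;
  let timer = null;
  async function tick() {
    const startedAt = Date.now();
    try {
      await poll();
    } catch (err) {
      console.error('Poll failed:', err.message);
    }
    if (stopped) return;
    const elapsed = Date.now() - startedAt;
    timer = setTimeout(tick, Math.max(0, intervalMs - elapsed));
  }
  tick();
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Scheduling the next request only after the previous one settles avoids overlapping requests when Reddit responds slowly.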
Reddit as of 2023 has silently changed the number of comments that can be retrieved.
Ignores comments from blacklisted subreddits. Some serious subreddits are hardcoded to be ignored, such as /r/depression
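The blacklist check amounts to a set lookup. In this sketch only /r/depression is listed, because it is the only subreddit named in this README; the full list lives in the source code. `isBlacklisted` is a hypothetical name.

```javascript
// Assumption: only the subreddit named in this README is listed here.
const BLACKLISTED_SUBREDDITS = new Set(['depression']);

// Hypothetical helper: accepts 'depression', 'r/depression' and
// '/r/depression', in any letter case.
function isBlacklisted(subreddit) {
  const name = subreddit.replace(/^\/?r\//i, '').toLowerCase();
  return BLACKLISTED_SUBREDDITS.has(name);
}
```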
Doesn't post the same comment to the same subreddit too many times within a duration
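Limiting repeated replies per subreddit can be sketched with a sliding window of timestamps. The README does not state the actual limit or duration, so `maxPosts` and `windowMs` below are illustrative values, and `createReplyLimiter` is a hypothetical name.

```javascript
// Hypothetical limiter: allows at most `maxPosts` identical replies per
// subreddit within `windowMs`. Both defaults are illustrative.
function createReplyLimiter(maxPosts = 1, windowMs = 60 * 60 * 1000) {
  const history = new Map(); // "subreddit|reply" -> timestamps of recent posts
  return function shouldPost(subreddit, reply, now = Date.now()) {
    const key = `${subreddit.toLowerCase()}|${reply}`;
    // Keep only the posts that still fall inside the window.
    const recent = (history.get(key) || []).filter((t) => now - t < windowMs);
    const allowed = recent.length < maxPosts;
    if (allowed) recent.push(now);
    history.set(key, recent);
    return allowed;
  };
}
```

Accepting `now` as a parameter makes the limiter easy to exercise without waiting for real time to pass.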