GitHub - LionelBergen/reddit-comment-reader: Node.js program hosted on Heroku which uses the Reddit API to constantly read from all comments. Reads from a PostgreSQL database to find which comments to look for, and send POST requests when found

Reddit Comment Reader

Reads comments from all of Reddit and picks out phrases, then sends any matches to 'localhost' when run locally, or to another Heroku application when run on Heroku.

npm run start - Starts the program
npm run test - Runs the tests, excluding the 'live' tests, which require environment variables populated with tokens.
npm run eslint - Lints the code to keep the format consistent. This should pass before every commit.

Quick Start

Have the PostgreSQL service running
Ensure you have a database user postgres with password postgresql (or modify the script below to use the correct username/password)
Run reddit-comment-reader\database\create_local_database.bat (or the .sh equivalent on Linux). This will drop the database if it exists and recreate it.
Create the environment variables DATABASE_URL, OUTPUT_URL, REDDIT_USERNAME, REDDIT_PASSWORD, REDDIT_APP_ID, REDDIT_APP_SECRET. Alternatively, create a .env file with these values; see example.env for an example.
Install dependencies by running npm install
Start the application by running npm run start
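
Before starting, it can help to confirm that every variable from the list above is set. The following is a minimal sketch, not code from this repository; the helper name findMissingEnvVars is hypothetical.

```javascript
// Variable names taken from the Quick Start list above.
const REQUIRED_ENV_VARS = [
  'DATABASE_URL',
  'OUTPUT_URL',
  'REDDIT_USERNAME',
  'REDDIT_PASSWORD',
  'REDDIT_APP_ID',
  'REDDIT_APP_SECRET',
];

// Hypothetical helper: returns the names of required variables that are unset or empty.
function findMissingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]);
}

const missing = findMissingEnvVars();
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```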

Database Connection

The database connection string is expected in the environment variable 'DATABASE_URL'.

Example: SET DATABASE_URL=postgres://postgres:postgresql@localhost:5432/reddit_comment_reader

Note: on Windows I get an error when setting the above, but it works regardless of the error.
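
To illustrate how the parts of the example URL map to connection settings, here is a small sketch using Node's built-in URL class. The helper parseDatabaseUrl is hypothetical and is not how this program necessarily reads the variable; the fallback port 5432 is PostgreSQL's default.

```javascript
// Hypothetical helper: splits a postgres:// connection URL into its parts.
function parseDatabaseUrl(databaseUrl) {
  const url = new URL(databaseUrl);
  return {
    user: decodeURIComponent(url.username),
    password: decodeURIComponent(url.password),
    host: url.hostname,
    // Fall back to PostgreSQL's default port when the URL omits one.
    port: url.port ? Number(url.port) : 5432,
    // The pathname is "/<database name>"; strip the leading slash.
    database: url.pathname.replace(/^\//, ''),
  };
}

const settings = parseDatabaseUrl(
  'postgres://postgres:postgresql@localhost:5432/reddit_comment_reader'
);
console.log(settings.database);
```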

Database Tables

RegexpComment - Phrases to look for are taken from the PostgreSQL database

SubredditMatch	CommentMatch	ReplyMessage	IsReplyRegexp	id

RegexpComment Creation script

-- Table: public."RegexpComment"
-- DROP TABLE public."RegexpComment";

CREATE TABLE public."RegexpComment"
(
	"SubredditMatch" text COLLATE pg_catalog."default" NOT NULL DEFAULT '.*'::text,
	"CommentMatch" text COLLATE pg_catalog."default" NOT NULL,
	"ReplyMessage" text COLLATE pg_catalog."default" NOT NULL,
	"IsReplyRegexp" boolean DEFAULT false,
	id integer NOT NULL DEFAULT nextval('"RegexpComment_id_seq"'::regclass)
)
WITH (
	OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public."RegexpComment"
	OWNER to uuhsiyqcwwsszg;
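
As a rough illustration of how rows from this table could be applied to a comment, the sketch below tests SubredditMatch and CommentMatch as regular expressions. The function name, the comment fields (subreddit, body), and the case-insensitive flag are assumptions for this example, not taken from the program.

```javascript
// Hypothetical helper: returns the first row whose patterns match the comment, or null.
// `rows` are objects shaped like RegexpComment rows; `comment` is assumed to have
// `subreddit` and `body` string fields.
function findMatchingRow(rows, comment) {
  for (const row of rows) {
    // The 'i' flag (case-insensitive) is an assumption of this sketch.
    const subredditPattern = new RegExp(row.SubredditMatch, 'i');
    const commentPattern = new RegExp(row.CommentMatch, 'i');
    if (subredditPattern.test(comment.subreddit) && commentPattern.test(comment.body)) {
      return row;
    }
  }
  return null;
}

const exampleRows = [
  { id: 1, SubredditMatch: '.*', CommentMatch: 'hello world', ReplyMessage: 'hi', IsReplyRegexp: false },
];
const hit = findMatchingRow(exampleRows, { subreddit: 'test', body: 'Well, Hello World!' });
console.log(hit ? hit.ReplyMessage : 'no match');
```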

ErrorTable - Errors are logged here. Application is hosted on Heroku, which doesn't keep a second log for errors

id	ErrorDescription	ErrorTrace	AdditionalInfo	CreatedOn

ErrorTable Creation script

-- Table: public."ErrorTable"
-- DROP TABLE public."ErrorTable";

CREATE TABLE public."ErrorTable"
(
	id integer NOT NULL DEFAULT nextval('errortable_id_seq'::regclass),
	errordescription character varying(255) COLLATE pg_catalog."default",
	errortrace character varying(5000) COLLATE pg_catalog."default",
	additionalinfo character varying(1000) COLLATE pg_catalog."default",
	createdon timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
	CONSTRAINT errortable_pkey PRIMARY KEY (id)
)
WITH (
	OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public."ErrorTable"
	OWNER to uuhsiyqcwwsszg;
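
Because the text columns above have fixed maximum lengths (255, 5000, and 1000 characters), values need to fit before they are inserted. A minimal sketch of preparing one row follows; buildErrorRow and truncate are hypothetical helpers, and the actual insert query is left out.

```javascript
// Column length limits taken from the ErrorTable creation script above.
const MAX_LENGTHS = { errordescription: 255, errortrace: 5000, additionalinfo: 1000 };

// Returns null for missing values, otherwise the value as a string cut to maxLength.
function truncate(value, maxLength) {
  if (value === undefined || value === null) return null;
  return String(value).slice(0, maxLength);
}

// Hypothetical helper: builds the values for one ErrorTable row from an Error object.
// `id` and `createdon` are omitted because the table fills them with defaults.
function buildErrorRow(error, additionalInfo) {
  return {
    errordescription: truncate(error.message, MAX_LENGTHS.errordescription),
    errortrace: truncate(error.stack, MAX_LENGTHS.errortrace),
    additionalinfo: truncate(additionalInfo, MAX_LENGTHS.additionalinfo),
  };
}
```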

Reddit API connection

~~This program does not use an authenticated client, since none is required for reading data from Reddit's API.~~
Since Reddit's 2023 API changes, this program uses an authenticated client. The number of comments that can be retrieved has also decreased.

Uses an https client (require('https')) to make requests to 'reddit.com/all/comments.json' and occasionally 'reddit.com/subreddit/moderators.json'.

Sending data

Uses Faye (require('faye')) to send data to either another Heroku application or localhost.com when comments are found that match the regular expressions taken from the database.
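
A rough sketch of publishing a match is shown below. It relies only on the publish(channel, message) method that Faye's Client provides; the helper name, the channel '/matches', the payload field names, and the use of OUTPUT_URL as the Faye endpoint are all assumptions for illustration.

```javascript
// Hypothetical helper: publishes one matched comment through a Faye-style client.
// `client` only needs a publish(channel, message) method, as Faye's Client has.
function publishMatch(client, channel, comment, replyMessage) {
  return client.publish(channel, {
    subreddit: comment.subreddit, // assumed field name
    body: comment.body,           // assumed field name
    reply: replyMessage,
  });
}

// Usage with the real library (not run here; requires `npm install faye`):
// const faye = require('faye');
// const client = new faye.Client(process.env.OUTPUT_URL);
// publishMatch(client, '/matches', comment, row.ReplyMessage);
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object during tests.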

Other notes

The program is hardcoded to ignore moderator comments. This is done by querying the moderators URL for the appropriate subreddit. The results are kept in a variable, so the request for any single subreddit is made only once per program run.
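
The once-per-subreddit behaviour described above can be sketched as a small cache around a lookup function. This is an illustration rather than the program's actual code; createModeratorCache is a hypothetical name, and fetchModerators is assumed to return a Promise of moderator usernames.

```javascript
// Hypothetical helper: wraps a moderator lookup so each subreddit is fetched at most once.
// `fetchModerators(subreddit)` is assumed to return a Promise of moderator usernames.
function createModeratorCache(fetchModerators) {
  const cache = new Map(); // lowercased subreddit name -> Promise of moderators

  return function getModerators(subreddit) {
    const key = subreddit.toLowerCase();
    if (!cache.has(key)) {
      // Store the Promise itself so concurrent callers share one request.
      cache.set(key, fetchModerators(subreddit));
    }
    return cache.get(key);
  };
}
```

One consequence of caching the Promise is that a failed lookup is also remembered; a fuller version might delete the entry when the request rejects.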

Makes a request to a Reddit URL every 1100 milliseconds. Reddit may block connections that make requests less than 1000 milliseconds apart, and I've found that using that exact limit causes issues.
As of 2023, Reddit has silently changed the number of comments that can be retrieved.
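
The 1100 millisecond spacing can be sketched as follows. The function names are hypothetical and makeRequest stands in for whatever performs the Reddit call; this is one way to keep the spacing, not necessarily how the program schedules its requests.

```javascript
const REQUEST_INTERVAL_MS = 1100; // interval stated above

// Hypothetical helper: how long to wait before the next request may be sent.
function msUntilNextRequest(lastRequestAt, now = Date.now(), intervalMs = REQUEST_INTERVAL_MS) {
  return Math.max(0, lastRequestAt + intervalMs - now);
}

// Runs `makeRequest` repeatedly, waiting out the remainder of the interval between calls.
// The loop stops (the returned Promise rejects) if `makeRequest` throws.
async function pollForever(makeRequest) {
  for (;;) {
    const startedAt = Date.now();
    await makeRequest();
    const waitMs = msUntilNextRequest(startedAt);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
}
```

Measuring from the start of each request means a slow response shortens the following wait instead of adding to it.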

Ignores comments from blacklisted subreddits. Some serious subreddits, such as /r/depression, are hardcoded to be ignored.

Doesn't post the same comment to the same subreddit too many times within a given duration.
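
A minimal sketch of this kind of limit is shown below. The limits used here (2 posts per hour) are illustrative values only and are not taken from the program; createReplyLimiter is a hypothetical name.

```javascript
// Hypothetical helper: limits how often the same reply goes to the same subreddit.
// `maxPosts` and `windowMs` are illustrative defaults, not the program's real settings.
function createReplyLimiter(maxPosts = 2, windowMs = 60 * 60 * 1000) {
  const history = new Map(); // key -> timestamps (ms) of recent posts

  return function shouldPost(subreddit, replyMessage, now = Date.now()) {
    const key = `${subreddit.toLowerCase()}\u0000${replyMessage}`;
    // Keep only the posts that still fall inside the time window.
    const recent = (history.get(key) || []).filter((t) => now - t < windowMs);
    if (recent.length >= maxPosts) {
      history.set(key, recent);
      return false;
    }
    recent.push(now);
    history.set(key, recent);
    return true;
  };
}
```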
