Edit: PRAW has some severe limitations that made me change my mind.
After some recent updates, PRAW removed all the date range functions and added hard limits on the number of posts it can fetch. A few years ago, when I first tried this, I used a hybrid method combining PSAW (PushShift.io) and PRAW. However, after the API changes, PushShift.io is now locked down to Reddit moderators only, sadly.
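With the date range functions gone, about the only thing left on the PRAW side is to filter client-side on each submission's created_utc timestamp, within whatever the listing still returns. A rough sketch of that idea, not a real replacement for PushShift: the helper name in_date_range is made up for illustration, and the SimpleNamespace objects are stand-ins for PRAW submissions so the snippet runs without credentials.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def in_date_range(submissions, start, end):
    """Yield items whose created_utc falls in [start, end).

    Hypothetical helper: works on any object with a created_utc
    attribute (Unix timestamp), which PRAW submissions expose.
    """
    start_ts, end_ts = start.timestamp(), end.timestamp()
    for item in submissions:
        if start_ts <= item.created_utc < end_ts:
            yield item


# Stand-ins for PRAW submissions (assumption: only id and created_utc matter here).
fake_posts = [
    SimpleNamespace(id="a", created_utc=datetime(2023, 1, 15, tzinfo=timezone.utc).timestamp()),
    SimpleNamespace(id="b", created_utc=datetime(2023, 3, 1, tzinfo=timezone.utc).timestamp()),
]

january = list(
    in_date_range(
        fake_posts,
        datetime(2023, 1, 1, tzinfo=timezone.utc),
        datetime(2023, 2, 1, tzinfo=timezone.utc),
    )
)
```

With real data you would pass something like subreddit.new(limit=None) instead of fake_posts, but the listing cap still applies, so older posts simply never show up.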
Obfuscation and API limitations
You may also have noticed that the karma values change from run to run. This inconsistency is due to Reddit’s obfuscation of the upvotes and downvotes. The obfuscation is applied to everything and everybody to thwart potential cheaters. There’s nothing we can do to prevent this.
From the docs:
How is a submission’s score determined?
A submission’s score is simply the number of upvotes minus the number of downvotes. If five users like the submission and three users don’t it will have a score of 2. Please note that the vote numbers are not “real” numbers, they have been “fuzzed” to prevent spam bots etc. So taking the above example, if five users upvoted the submission, and three users downvote it, the upvote/downvote numbers may say 23 upvotes and 21 downvotes, or 12 upvotes, and 10 downvotes. The points score is correct, but the vote totals are “fuzzed”.
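The arithmetic in that quote can be checked quickly. This is only an illustration of the numbers quoted above (5/3, 23/21, 12/10); the score function is a made-up helper, not anything from PRAW.

```python
def score(upvotes: int, downvotes: int) -> int:
    """Score as described in the quoted docs: upvotes minus downvotes."""
    return upvotes - downvotes


real_votes = (5, 3)                  # the "true" counts from the example
fuzzed_votes = [(23, 21), (12, 10)]  # the fuzzed counts from the example

real_score = score(*real_votes)
fuzzed_scores = [score(up, down) for up, down in fuzzed_votes]
```

All three pairs give the same difference, which is the point the docs are making: the totals are fuzzed, the difference is kept.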
Logging in PRAW
Any API rate limits from POST actions that PRAW handles will produce a log entry with a message similar to the following:
Rate limit hit, sleeping for 5.5 seconds
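To actually see messages like this, the loggers need a handler attached. A minimal sketch with the standard logging module, assuming the logger names "praw" and "prawcore" (the package names) and DEBUG as the level so nothing gets filtered out:

```python
import logging

# Send log records to stderr; DEBUG lets everything through.
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)

# Attach the handler to both the PRAW and prawcore loggers.
for logger_name in ("praw", "prawcore"):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
```

Run this once before creating the Reddit instance. DEBUG is noisy, so raising the level later is a reasonable choice once things work.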
Storing credentials and configs
Create a praw.ini file to store configs and credentials. Add this file to .gitignore if needed!
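For reference, a praw.ini is a plain INI file with one section per site. The sketch below keeps a sample in a string and parses it with configparser just to show the layout; the section name bot1 and every value are placeholders, not real credentials.

```python
import configparser

# Hypothetical praw.ini contents; "bot1" and all values are placeholders.
SAMPLE_PRAW_INI = """
[bot1]
client_id=YOUR_CLIENT_ID
client_secret=YOUR_CLIENT_SECRET
user_agent=script:my-scraper:v0.1 (by u/your_username)
"""

# Parse the sample to confirm it is well-formed INI.
parser = configparser.ConfigParser()
parser.read_string(SAMPLE_PRAW_INI)
site = dict(parser["bot1"])

# With PRAW installed and the text above saved as praw.ini in the
# working directory, the site can be loaded by name:
#   import praw
#   reddit = praw.Reddit("bot1")
```

Keeping the secrets in praw.ini rather than in the script is what makes the .gitignore entry worthwhile.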