Very cool project! Quick question: is the underlying Pushshift dataset updated with new Reddit data on any regular cadence (daily/weekly/monthly), or is this essentially a fixed historical snapshot up to a certain date? Just want to understand if self-hosters would need to periodically re-download for fresh content or if it's archival-only.
The data from 2025-12 has already been released; it usually comes out every month, it just needs to be split and reprocessed for 2025 by watchful1. I will probably eventually add support for importing data from the monthly Arctic Shift dumps so that archives can be updated monthly.
We have been seeing quite a lot of conversations in customer support teams around tagging tickets (used as part of triggers, macros, analytics, etc.), and we know what a hard and time-consuming process it is.
This is why we built a MonkeyLearn extension for Zendesk (which we are releasing today) to help with this tagging process using machine learning.
With this integration, MonkeyLearn will automatically tag and categorize incoming tickets in Zendesk. It predicts the value of a given field based on the subject and content of a ticket, using your historical data to train the machine learning model.
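To make the idea concrete, here is a minimal stdlib-only sketch of that kind of field prediction: train on historical tickets (subject, content, field value) and score new tickets by word overlap. This is a toy stand-in, not MonkeyLearn's actual model, and all names and sample data below are hypothetical.

```python
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase bag-of-words tokenization.
    return text.lower().split()

def train(historical_tickets):
    """Build per-label word counts from (subject, content, label) tuples."""
    word_counts = defaultdict(Counter)
    for subject, content, label in historical_tickets:
        word_counts[label].update(tokenize(subject + " " + content))
    return word_counts

def predict(word_counts, subject, content):
    """Score each candidate field value by overlap with its training words."""
    words = tokenize(subject + " " + content)
    scores = {
        label: sum(counts[w] for w in words)
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

# Hypothetical historical tickets: (subject, content, field value).
history = [
    ("Refund request", "I want my money back", "billing"),
    ("Invoice problem", "The invoice amount is wrong", "billing"),
    ("App crashes", "The app crashes when I log in", "bug"),
    ("Login error", "I get an error when I log in", "bug"),
]
model = train(history)
print(predict(model, "Wrong charge", "My invoice shows the wrong amount"))  # billing
```

A real classifier would use TF-IDF weighting and a proper learning algorithm rather than raw counts, but the input/output shape is the same: historical tickets in, a predicted field value out.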
This is an initial version and it’s free to use (at least for most CS teams).
We are trying to understand the value it provides and whether it helps support teams with this process, so any kind of feedback is greatly appreciated. Also, if you need any help fine-tuning the model, we're more than happy to assist.
For creating Tarsier, we used Tweepy to extract tweets via the Twitter Public API, MonkeyLearn to analyze the tweets, and Plotly to create the visualizations.
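The pipeline shape (extract → analyze → visualize) can be sketched with stdlib stand-ins; the functions below are hypothetical placeholders, not the real Tweepy, MonkeyLearn, or Plotly calls, which all require API credentials.

```python
from collections import Counter

def extract_tweets(query):
    # Stand-in for a Tweepy search against the Twitter Public API.
    return [
        "I love this product",
        "Terrible customer service",
        "Great experience, would recommend",
    ]

def analyze_sentiment(tweets):
    # Stand-in for a MonkeyLearn sentiment classifier: label each tweet.
    negative_words = {"terrible", "bad", "awful"}
    labels = []
    for tweet in tweets:
        words = set(tweet.lower().split())
        labels.append("Negative" if words & negative_words else "Positive")
    return labels

def summarize_for_plot(labels):
    # Aggregate label counts; with Plotly these would feed a bar chart's x/y.
    return Counter(labels)

tweets = extract_tweets("example query")
labels = analyze_sentiment(tweets)
print(summarize_for_plot(labels))  # Counter({'Positive': 2, 'Negative': 1})
```

Swapping each stand-in for its real counterpart (Tweepy client, MonkeyLearn classifier, Plotly figure) preserves the same three-stage flow.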