Gauging US Politics with Reddit

Reddit is an entertainment, social networking, and news site where registered users can vote submissions up or down in a bulletin board-like fashion. Content entries are organized by areas of interest called “subreddits.” This post uses the subreddits /r/Republican and /r/Democrats to analyze US politics as of July 22, 2015.

Thanks to Dr. Randal Olson and his reddit-analysis script, we crawled /r/Republican and /r/Democrats. The word clouds below visualize word frequency: the more often a word appears, the larger it is displayed.

/r/Republican

redditrepublicanwordcloud

/r/Democrats

redditdemocratswordcloud

The word clouds provide a high-level view of the subreddits. Now let’s dive in to gain insight!

/r/Republican has 16,942 readers, and /r/Democrats has 15,152.

During the timespan 6/22/15 – 7/22/15, 86,609 words appeared in /r/Republican and 73,156 words appeared in /r/Democrats. We will compare word frequency as % of total. In the event of significant difference, the greater of the two will be bolded.

Word           /r/Republican % of total   /r/Democrats % of total

“Good”         0.11                       0.20
“Bad”          0.06                       0.10

“Love”         0.05                       0.05
“Hate”         0.05                       0.05

“GOP”          0.07                       0.27
“Fox”          0.02                       0.08

“Trump”        0.30                       0.15
“Hillary”      0.07                       0.28

“Obama”        0.12                       0.18
“Bush”         0.07                       0.13

“Country”      0.11                       0.14
“States”       0.14                       0.07

“Students”     0.04                       0.00
“School”       0.04                       0.02

“Gay”          0.06                       0.09
“Marriage”     0.10                       0.13

“Inequality”   0.01                       0.02
“Equality”     0.00                       0.03

“White”        0.06                       0.10
“Black”        0.04                       0.04

“Health”       0.02                       0.10
“Insurance”    0.03                       0.05

“Workers”      0.02                       0.04
“Unions”       0.05                       0.01

“Gun”          0.05                       0.04
“Control”      0.03                       0.10

“Minimum”      0.02                       0.06
“Wage”         0.02                       0.08

“Church”       0.06                       0.01
“Religion”     0.01                       0.01
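
As a rough illustration of how a “% of total” figure like those above can be computed, here is a minimal Python sketch. The function name `word_share` and the simple regex tokenizer are our own assumptions for illustration; the reddit-analysis script may tokenize and filter words differently, so its exact numbers will not necessarily match.

```python
import re
from collections import Counter


def word_share(text, words):
    """Return each word's frequency as a percentage of all words in `text`.

    Tokenization here is an assumption: lowercase runs of letters and
    apostrophes. A different tokenizer would give different totals.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    if total == 0:
        # No words at all: report 0.0 rather than dividing by zero.
        return {word: 0.0 for word in words}
    return {
        word: round(100 * counts[word.lower()] / total, 2)
        for word in words
    }


if __name__ == "__main__":
    # Toy text, not real subreddit data.
    sample = "Good day, good night. Bad day!"
    print(word_share(sample, ["Good", "Bad", "Love"]))
```

For a sense of scale, 0.30% of the 86,609 words in /r/Republican works out to roughly 260 occurrences of “Trump.”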

While we’ll let you come to your own conclusions, here are the insights we found surprising:

  • Greater frequency of “GOP” and “Fox” in /r/Democrats
  • Greater frequency of “Students” in /r/Republican
  • Greater frequency of “White” in /r/Democrats
  • Greater frequency of “Unions” in /r/Republican

That’s it for now. Please comment with additional insights or reach out directly at:

andrewshamlet@gmail.com

Part 3: Most common words used in tweets by Taylor Swift, Katy Perry, and Britney Spears

Hi there! For part 3, we will use visualization to analyze the most common words used in tweets among Taylor Swift, Katy Perry, and Britney Spears. In Part 1, we noted a favorite bias among entertainers. That is, entertainers receive more favorites than retweets. A clue to the favorite bias may lie in word choice. Let’s see.

Taylor Swift:

taylorswift_copyright

Katy Perry:

katyperrycopyright

Britney Spears:

britneyspearscopyright

Each visualization displays the most common words, by usage, for the respective Twitter profiles during June 2015. The more often a word is used, the larger the word is displayed.

“Love” and “Day”/“Today” appear in large typeface on all three. A sense of temporality is also evident throughout: we see words such as new, now, hour, tonight, and night.

Based on our findings, followers of Taylor Swift, Katy Perry, and Britney Spears receive strong messages of Love Now/Today/Tonight. There’s positive sentiment in these messages, followed by an implicit call to action.

Once again, the charts from part 1.

Twitter10TenJune2015RetweetFavorite_v4

monthlyaverageretweettable_V2

Part 2: Does Tweeting More Often Increase Favorites per Tweet and Retweets per Tweet?

In the last post, we examined the relationship between MARpT and MAFpT and found a 0.83 correlation. That is, as Retweets increase, Favorites increase. We also found a Favorite bias among entertainers and a Retweet bias with Barack Obama.

In this post, we will examine whether tweeting more often increases favorites per tweet and/or retweets per tweet. Using the sample of Top Twitter Profiles, as listed by twittercounter.com, we will plot the number of monthly tweets against MAFpT and MARpT. Remember, MAFpT and MARpT give us the monthly average number of Favorites and Retweets per tweet. If tweeting more often increases retweets or favorites per tweet, we should see higher MARpT and MAFpT with higher Monthly Tweets. Let’s look and see!

tweetstomafpt_v1

This plot displays MAFpT as a function of Monthly Tweets. A previous plot included Justin Bieber, who single-handedly increased the correlation by 0.4. Considering Justin Bieber an outlier and removing him from the sample, we find a weak correlation, r = 0.25. That is, there’s no strong link between monthly tweets and MAFpT. As for Justin Bieber, as his monthly tweets increase, his favorites per tweet increase. Keep on tweeting, Biebs!
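
To show how a single outlier can inflate a correlation, here is a minimal Python sketch of Pearson’s r computed with and without one extreme point. The profile names and numbers are hypothetical and chosen only for illustration; they are not the values from our sample, and `pearson_r` is our own helper rather than the exact code behind the plot.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: name -> (monthly tweets, average favorites per tweet).
profiles = {
    "profile_a": (10, 5000),
    "profile_b": (20, 3000),
    "profile_c": (30, 6000),
    "profile_d": (40, 5000),
    "outlier": (160, 60000),
}


def correlation(data):
    """Correlate monthly tweets with favorites per tweet for a dict of profiles."""
    tweets = [t for t, _ in data.values()]
    favs = [f for _, f in data.values()]
    return pearson_r(tweets, favs)


r_all = correlation(profiles)
r_trimmed = correlation({k: v for k, v in profiles.items() if k != "outlier"})

if __name__ == "__main__":
    print(f"with outlier:    r = {r_all:.2f}")
    print(f"without outlier: r = {r_trimmed:.2f}")
```

In this toy data, one extreme profile is enough to turn a weak correlation into a strong one, which is why we report r with the outlier removed.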

retweetsmonthlytweets_v1

This plot displays MARpT as a function of Monthly Tweets. We find an even weaker correlation here, r = 0.16. That is, there’s no strong link between monthly tweets and MARpT. The weaker correlation is consistent with the favorite bias we found among entertainers in Part 1: entertainers receive more favorites than retweets.

In conclusion, for this sample, we find little evidence that tweeting more often increases either favorites per tweet or retweets per tweet. While this may or may not be transferable to your own Twitter account, the findings lead us to ask: what factors increase engagement with a tweet, or with a message in general?

Until next time!

Part 1: Top Twitter Profiles, June 2015

Hello! My name is Andrew Hamlet, and I am an MBA student at NYU. I am developing proficiency with Python for data mining applications, and I plan to publish research on this blog. Since my background is in social media and web analytics, that’s where I will start. As is the nature of scientific inquiry, collaboration is welcome! Please comment with suggestions or further lines of inquiry. Now let’s begin.

Using Python, I gathered tweets, including the retweet and favorite counts, occurring in June 2015 for the Top Twitter Personalities (no companies), as listed by twittercounter.com. Here’s a table displaying the data, in ascending order by Followers. The Tweets, Retweets, and Favorites columns display totals for each Twitter Profile during June 2015.

celebrityanalysistable_v2

Because the total number of Tweets varies widely (Katy Perry tweeted 35 times in June 2015, while Justin Bieber tweeted 167 times), I normalized Retweets and Favorites, dividing them by Tweets to give Monthly Average Retweets per Tweet (MARpT) and Monthly Average Favorites per Tweet (MAFpT). Here’s a table displaying the results.

monthlyaverageretweettable_V2

There appears to be a relationship between MARpT and MAFpT: as MARpT increases, MAFpT increases. Let’s plot to verify. (For visual display, Justin Timberlake, Britney Spears, Ellen Degeneres, and Justin Bieber are removed from the chart.)

Twitter10TenJune2015RetweetFavorite_v5

Yes! There’s a 0.83 correlation between MARpT and MAFpT, so Favorites increase as Retweets increase. What’s more, we see a Favorite bias among this sample: the Twitter profiles receive more MAFpT than MARpT. However, we see a Retweet bias with Barack Obama. Perhaps politicians receive more retweets, while entertainers receive more favorites?
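
The per-tweet normalization described above amounts to two divisions. Here is a minimal Python sketch, assuming monthly totals are already in hand; the function name `per_tweet_averages` and the example totals are hypothetical and are not taken from the tables.

```python
def per_tweet_averages(tweets, retweets, favorites):
    """Return (MARpT, MAFpT): monthly retweets and favorites divided by tweets."""
    if tweets <= 0:
        raise ValueError("tweets must be positive to compute per-tweet averages")
    marpt = retweets / tweets
    mafpt = favorites / tweets
    return marpt, mafpt


if __name__ == "__main__":
    # Hypothetical monthly totals for one profile (not real data).
    marpt, mafpt = per_tweet_averages(tweets=35, retweets=70000, favorites=140000)
    print(f"MARpT = {marpt:.0f}, MAFpT = {mafpt:.0f}")
```

Dividing by the tweet count puts a profile that tweeted 35 times on the same footing as one that tweeted 167 times.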

In the next post, using this sample, we will investigate whether there is a relationship between the total number of Tweets and MARpT or MAFpT. That is, does tweeting more during a month boost the number of interactions per tweet? Check back to find out!