If They Google You, Do You Win?

In a way, this election is a referendum on whether actions speak louder than words: is what people do in the privacy of their internet browsing more reflective of their future behavior than what they tell pollsters? While I have focused on Twitter as a barometer of public opinion, other data sources could also signal the private thoughts and future actions of voters. The linked NYT article, “If They Google You, Do You Win?”, mentions using the Google queries “Trump Clinton” and “Clinton Trump” as signals of voter interest, with each query reflecting a bias towards the candidate listed first; for example, “Clinton Trump” would reflect a bias towards Clinton. Using this methodology, I looked at Google Trends for the battleground states to see where public opinion may be. The data are displayed below.

screen-shot-2016-11-03-at-5-49-10-pm

For the month of October 2016, “Trump Clinton” leads “Clinton Trump” in every state with the exception of Nevada.

You might say Trump is a polarizing celebrity, and for that reason he may be top of mind even for someone who plans to vote for Clinton. Okay, then let’s penalize Trump 10%. Even in that case, the ‘Factored “Trump Clinton”’ figures indicate that, Nevada aside, the three states in play are Virginia, Iowa, and Florida.
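As a rough sketch of that penalty, the snippet below scales the “Trump Clinton” score down by 10% before comparing it with “Clinton Trump”. It assumes the penalty is multiplicative (the score is multiplied by 0.9), which is one possible reading of the step above. The function name `factored_lead` and the sample scores are hypothetical and are not the figures from the chart.

```python
def factored_lead(trump_first, clinton_first, penalty=0.10):
    """Lead of the penalized "Trump Clinton" score over "Clinton Trump".

    Assumption: the penalty is applied multiplicatively to the
    "Trump Clinton" score. A positive result means "Trump Clinton"
    still leads after the penalty.
    """
    return trump_first * (1 - penalty) - clinton_first


# Hypothetical search-interest scores, for illustration only.
sample_scores = {
    "State A": (60, 50),
    "State B": (52, 50),
    "State C": (45, 50),
}

for state, (trump_first, clinton_first) in sample_scores.items():
    lead = factored_lead(trump_first, clinton_first)
    leader = "Trump Clinton" if lead > 0 else "Clinton Trump"
    print(f"{state}: {leader} leads after the penalty ({lead:+.1f})")
```

With these made-up numbers, a state where “Trump Clinton” leads narrowly before the penalty (State B) flips once the penalty is applied, which is the sense in which a state could be described as in play.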

So while it is unclear which way the election will go, I believe we may be surprised at how close the results turn out to be, and that one thing we may remember is the discrepancy between what was reported in the polls leading up to the election and what actually happened online. We only have 4 days left to see which source provides a clearer signal of truth, and until then... good luck to both candidates!

Moral Foundations: Reddit Political Communities

Moral Foundations Theory is a social psychological theory intended to explain the origins of and variation in human moral reasoning. The theory proposes moral foundations such as fairness, care, in-group, authority, and purity, and has been popularized by psychologist Jonathan Haidt in his book The Righteous Mind.

Haidt describes human morality as it relates to politics and proposes differences between conservatives and liberals as they relate to the moral foundations (TED Talk). Specifically, whereas conservatives appeal to fairness, care, in-group, authority, and purity equally, liberals appeal to fairness and care more than they appeal to in-group, authority, and purity.

Setting out to observe this phenomenon within Reddit Political Communities, I performed word frequency analyses on the /r/Republican and /r/Democrats corpora, totaling the words for each moral foundation, as defined by the LIWC dictionary. Comparing the totals, I found a trend consistent with Moral Foundations Theory. The visualization shows the moral foundations for /r/Democrats normalized against those for /r/Republican, with each value for /r/Republican set at 100%.
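The counting and normalization described above could look roughly like the sketch below. The tiny `FOUNDATIONS` word lists, the function names, and the sample sentences are hypothetical stand-ins; the actual analysis used the dictionary's full word lists and the crawled subreddit text, neither of which is reproduced here.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary, for illustration only.
FOUNDATIONS = {
    "care": {"care", "protect"},
    "fairness": {"fair", "equal"},
    "authority": {"law", "order"},
}


def foundation_totals(text, foundations):
    """Total the occurrences of each foundation's words in a corpus."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {
        name: sum(counts[word] for word in vocab)
        for name, vocab in foundations.items()
    }


def normalize(totals, baseline):
    """Express each total as a percentage of the baseline (baseline = 100%)."""
    return {
        name: 100.0 * totals[name] / baseline[name]
        for name in totals
        if baseline[name]  # skip foundations the baseline never mentions
    }


republican_text = "we must protect fair and equal law"
democrats_text = "care care protect fair"

rep_totals = foundation_totals(republican_text, FOUNDATIONS)
dem_totals = foundation_totals(democrats_text, FOUNDATIONS)
print(normalize(dem_totals, rep_totals))
```

This sketch compares raw totals. Dividing each total by the size of its corpus first would be a reasonable variation; the post does not say which was used.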

reddit_moralfoundations

Content Strategy: The Cat’s Meeeow

It has been written that 15% of all Internet traffic is cat-related. Whether or not you believe this statistic, there is no doubt cats inhabit the digital space. To cite some popular examples, we have encountered LOL Cats, Grumpy Cat, and Lil’ Bub... what cuties! To date, there are 72 million media tagged as “cat” on Instagram.

With so many kitties purring around the interwebz, how might a content creator know where to start? Well, I have created a visualization showing the most popular cat breeds by hashtag on Instagram. Enjoy, and MEEEEEEEOW!!!
catbreedsvisualization_1

Triangles, Networks, and New Connections: Donald Trump and Bill O’Reilly

oreillyfactorrealdonaldtrumpnetwork

Have you ever wanted to send a message to someone with whom you do not have direct contact? What if you knew a friend who had direct contact with this someone? You could send the message through your friend.

This principle can be modeled geometrically. Let’s say you are A and the person with whom you want to speak is C. There is no direct link between A and C. However, your friend, B, knows this someone, C. So there are links between you, A, and your friend, B, as well as between your friend, B, and this someone, C.

Thus, your friend, B, can serve as a bridge, linking you and this someone, and a triangle is formed.

Although when presented geometrically this process may appear abstract, we are familiar with the practice in everyday settings. For instance, we link two contacts via an introduction email, or we introduce two friends at a cocktail party. Triangles are a divine geometry, as they serve to create new connections.

With the US political season on the horizon, let’s use Twitter and apply this geometry to Republican presidential candidate Donald Trump, @realdonaldtrump. Trump follows 44 Twitter profiles, one of which is that of Bill O’Reilly, @oreillyfactor. O’Reilly follows 37 Twitter profiles, one of which is that of Donald Trump. So there is a two-way connection between Donald Trump and Bill O’Reilly.

Well, let’s imagine this two-way connection did not exist and that Donald Trump wanted to connect with Bill O’Reilly. How might the two connect? We must find the common links between Trump and O’Reilly, that is, the Twitter profiles that they both follow. Conducting the analysis in Python, we find that there are 4 Twitter profiles that both Trump and O’Reilly follow: @foxandfriends, @BretBaier, @greta, and @ericbollinger. Again, here’s the visualization from above.

oreillyfactorrealdonaldtrumpnetwork

To connect with O’Reilly, Trump could form a triangle through any of the four. The more links, the greater the likelihood a connection will be made. Triangles create opportunities for our message to reach peripheral networks, without us having to directly transmit the information to the end recipient.
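The post does not show the Python used for this analysis, but finding the shared profiles comes down to a set intersection. The sketch below assumes the two follow lists have already been retrieved; the function name `common_follows` and the `@example_*` handles are hypothetical placeholders, and only the four shared handles come from the post.

```python
def common_follows(follows_a, follows_b):
    """Return the profiles followed by both accounts, sorted case-insensitively."""
    return sorted(set(follows_a) & set(follows_b), key=str.lower)


# Abbreviated follow lists: the four shared handles are from the post,
# the @example_* handles are placeholders for the remaining profiles.
trump_follows = [
    "@oreillyfactor", "@foxandfriends", "@BretBaier", "@greta",
    "@ericbollinger", "@example_a", "@example_b",
]
oreilly_follows = [
    "@realdonaldtrump", "@foxandfriends", "@BretBaier", "@greta",
    "@ericbollinger", "@example_c",
]

bridges = common_follows(trump_follows, oreilly_follows)
print(bridges)  # each of these profiles could act as B in an A-B-C triangle
```

Each handle in `bridges` closes one triangle between the two accounts, so the length of the list is the number of possible paths described above.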

Therein lies the true power of social networks.

Targeting an Audience, Mapping a Tour: Luther Dickinson

In this post, we will map Luther Dickinson’s US Twitter followers, by count and by influence, and examine how these distributions match his band’s upcoming tour routing, with the intent of demonstrating the value of Twitter data for targeting audiences and planning performances.

Screen Shot 2015-08-11 at 10.37.43 PM

Luther Dickinson is the lead guitarist and vocalist for the North Mississippi Allstars. As of August 11, 2015, Luther has 1,010 Twitter followers. Of these 1,010 followers, 483 identify their location as based in the US (not all followers identify a location). The map below shows the concentrations of US followers, with the greatest numbers in darkest blue.

Followers

lutherdickinson follower map

Top 10 states by follower count (darkest blue):

State Followers
Tennessee 78
Mississippi 60
California 45
New York 35
Pennsylvania 25
Georgia 22
Colorado 20
Louisiana 18
Washington 17
Illinois 15


Now we will map Luther’s followers by influence, i.e. the followers of Luther’s followers. In other words, if each of Luther’s followers retweeted, how many individuals would see the retweet?

Influence

luterhdickinson_follower_influence

Top 10 states by influence (darkest blue):

State Influence
California 1,137,586
Tennessee 479,783
New York 70,690
Georgia 64,776
Louisiana 63,685
Mississippi 59,011
Illinois 35,373
Colorado 29,206
Texas 26,612
Rhode Island 23,377

We see differences between followers and influence, with Mississippi, Pennsylvania, and Washington hosting greater concentrations of followers, who have less influence. Conversely, we see Rhode Island and Texas hosting lower concentrations of followers, who have more influence. California and Tennessee are strong points for both followers and influence.
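The two rankings can be produced from the same follower records: count followers per state for the first, and sum each follower's own follower count per state for the second. The sketch below assumes each record carries a `state` (or `None` when no US location is given) and a `followers_count`; these field names and the sample records are hypothetical and are not Luther's actual follower data.

```python
from collections import defaultdict


def followers_and_influence_by_state(followers):
    """Return (follower count, summed follower-of-follower count) per state."""
    counts = defaultdict(int)
    influence = defaultdict(int)
    for follower in followers:
        state = follower.get("state")
        if state is None:  # follower did not give a usable US location
            continue
        counts[state] += 1
        influence[state] += follower["followers_count"]
    return dict(counts), dict(influence)


# Hypothetical records, for illustration only.
sample_followers = [
    {"state": "TN", "followers_count": 200},
    {"state": "TN", "followers_count": 150},
    {"state": "CA", "followers_count": 1000},
    {"state": None, "followers_count": 5000},
]

counts, influence = followers_and_influence_by_state(sample_followers)
print(counts)
print(influence)
```

In this made-up sample, TN has more followers but CA has more influence, the same kind of divergence the two maps show.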

Let’s see if this aligns with Luther’s plans for Fall 2015.

According to www.nmallstars.com, the band will tour the following cities in October 2015:

10.1 – San Francisco, CA
10.2 – San Francisco, CA
10.3 – Los Angeles, CA
10.4 – Anaheim, CA
10.5 – Solana Beach, CA
10.6 – Las Vegas, NV
10.9 – Boulder, CO
10.10 – Denver, CO
10.12 – Chicago, IL
10.13 – Pittsburgh, PA
10.14 – Washington D.C.
10.15 – Glenside, PA
10.16 – New York, NY
10.17 – Boston, MA
10.24 – Placerville, CA
10.25 – Placerville, CA

While we do not see a Tennessee performance during the stretch, all dates except two, in Las Vegas and Boston, match the list of top 10 states by followers. Furthermore, we see almost half of the performances, 44%, in California, a strong point for both followers and influence. We view this as strong support for the value of Twitter data in targeting audiences and planning performances.
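The California share quoted above can be checked directly from the tour list. The snippet below transcribes the state for each of the 16 October dates (with "DC" standing in for Washington D.C.) and computes the share; the variable names are only illustrative.

```python
from collections import Counter

# One entry per tour date, in the order listed above.
tour_states = [
    "CA", "CA", "CA", "CA", "CA",  # San Francisco x2, Los Angeles, Anaheim, Solana Beach
    "NV",                          # Las Vegas
    "CO", "CO",                    # Boulder, Denver
    "IL",                          # Chicago
    "PA",                          # Pittsburgh
    "DC",                          # Washington D.C.
    "PA",                          # Glenside
    "NY",                          # New York
    "MA",                          # Boston
    "CA", "CA",                    # Placerville x2
]

dates_per_state = Counter(tour_states)
california_share = 100 * dates_per_state["CA"] / len(tour_states)
print(f"{california_share:.2f}%")  # 43.75%, which rounds to the 44% quoted above
```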

Luther’s map serves as a guide for up-and-coming artists within the genre. Using the raw data, one could target influencers within each state who would welcome the genre.

To this end, I envision developing a platform that leverages Twitter data to help artists better identify audiences and geographic strong points within the genre. If you are an artist, manager, data scientist, or entrepreneur, and are interested in this work, contact me at andrewshamlet@gmail.com

Gauging US Politics with Reddit

Reddit is an entertainment, social networking, and news site where registered users can vote submissions up or down in a bulletin board-like fashion. Content entries are organized by areas of interest called “subreddits.” This post uses the subreddits /r/Republican and /r/Democrats to analyze US politics as of July 22, 2015.

Thanks to Dr. Randal Olson and his reddit-analysis script, we crawled /r/Republican and /r/Democrats. Making word clouds, we visualize word frequency, largest to smallest by count.

/r/Republican

redditrepublicanwordcloud

/r/Democrats

redditdemocratswordcloud

The word clouds provide a high-level view of the subreddits. Now let’s dive in to gain insight!

/r/Republican has 16,942 readers, and /r/Democrats has 15,152.

During the timespan 6/22/15 – 7/22/15, 86,609 words appeared in /r/Republican and 73,156 words appeared in /r/Democrats. We will compare word frequency as % of total. In the event of significant difference, the greater of the two will be bolded.

Word            /r/Republican % of total    /r/Democrats % of total
“Good”          0.11                        0.20
“Bad”           0.06                        0.10

“Love”          0.05                        0.05
“Hate”          0.05                        0.05

“GOP”           0.07                        0.27
“Fox”           0.02                        0.08

“Trump”         0.30                        0.15
“Hillary”       0.07                        0.28

“Obama”         0.12                        0.18
“Bush”          0.07                        0.13

“Country”       0.11                        0.14
“States”        0.14                        0.07

“Students”      0.04                        0.00
“School”        0.04                        0.02

“Gay”           0.06                        0.09
“Marriage”      0.10                        0.13

“Inequality”    0.01                        0.02
“Equality”      0.00                        0.03

“White”         0.06                        0.10
“Black”         0.04                        0.04

“Health”        0.02                        0.10
“Insurance”     0.03                        0.05

“Workers”       0.02                        0.04
“Unions”        0.05                        0.01

“Gun”           0.05                        0.04
“Control”       0.03                        0.10

“Minimum”       0.02                        0.06
“Wage”          0.02                        0.08

“Church”        0.06                        0.01
“Religion”      0.01                        0.01
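For readers who want to reproduce figures like these, a word's "% of total" is its count divided by the total number of words in the subreddit's text, times 100. The sketch below is a minimal version under simple assumptions: the tokenizer is a basic lowercase regular expression, and the function name `percent_of_total` and the sample sentence are hypothetical. The crawl itself used the reddit-analysis script mentioned above, which may tokenize differently.

```python
import re
from collections import Counter


def percent_of_total(text, targets):
    """Frequency of each target word as a percentage of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return {target: 0.0 for target in targets}
    return {
        target: round(100 * counts[target.lower()] / total, 2)
        for target in targets
    }


# Hypothetical sample text, for illustration only.
sample_text = "Trump said the GOP and Trump"
shares = percent_of_total(sample_text, ["Trump", "GOP", "Hillary"])
print(shares)
```

Comparing percentages rather than raw counts matters here because the two subreddits produced different amounts of text over the month (86,609 versus 73,156 words).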

While we’ll let you come to your own conclusions, here are the insights we found surprising:

  • Greater frequency of “GOP” and “Fox” in /r/Democrats
  • Greater frequency of “Students” in /r/Republican
  • Greater frequency of “White” in /r/Democrats
  • Greater frequency of “Unions” in /r/Republican

That’s it for now. Please comment with additional insights or reach out directly at:

andrewshamlet@gmail.com

Part 3: Most common words used in tweets by Taylor Swift, Katy Perry, and Britney Spears

Hi there! For part 3, we will use visualization to analyze the most common words used in tweets among Taylor Swift, Katy Perry, and Britney Spears. In Part 1, we noted a favorite bias among entertainers. That is, entertainers receive more favorites than retweets. A clue to the favorite bias may lie in word choice. Let’s see.

Taylor Swift:

taylorswift_copyright

Katy Perry:

katyperrycopyright

Britney Spears:

britneyspearscopyright

Each visualization displays the most common words, by usage, for the respective Twitter profiles during June 2015. The more often a word is used, the larger the word is displayed.

“Love” and “Day”/“Today” appear in large typeface on all three. Furthermore, a sense of temporality is evident in all three: we see words such as new, now, hour, tonight, and night.

Based on our findings, followers of Taylor Swift, Katy Perry, and Britney Spears receive strong messages of Love Now/Today/Tonight. There’s positive sentiment in these messages, followed by an implicit call to action.

Once again, here are the charts from Part 1.

Twitter10TenJune2015RetweetFavorite_v4

monthlyaverageretweettable_V2