Python Tutorial: RSI

Download the accompanying IPython Notebook for this Tutorial from Github. 

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data.

In the last tutorial, we outlined the steps for calculating Price Channels.

In this tutorial, we introduce a new technical indicator, the Relative Strength Index (RSI).

The Relative Strength Index (RSI) is a momentum indicator developed by technical analyst Welles Wilder. It compares the magnitude of recent gains and losses over a specified time period to measure the speed and change of a security's price movements. It is primarily used to identify overbought or oversold conditions in the trading of an asset.

The Relative Strength Index (RSI) is calculated as follows:

RSI = 100 - 100 / (1 + RS)

RS = Average gain of last 14 trading days / Average loss of last 14 trading days

RSI values range from 0 to 100.
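As a quick sanity check on the formula, here is a minimal sketch that plugs average gains and losses straight into it. The function name `rsi_from_averages` and the input numbers are illustrative assumptions, not part of the tutorial code.

```python
def rsi_from_averages(avg_gain, avg_loss):
    """Return RSI given the average gain and average loss over the lookback window."""
    if avg_loss == 0:
        # No losses in the window: RS is unbounded, so RSI sits at its ceiling
        return 100.0
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

print(rsi_from_averages(1.0, 1.0))  # equal gains and losses -> 50.0
print(rsi_from_averages(3.0, 1.0))  # RS = 3 -> 75.0
```

When gains and losses balance, RS is 1 and the RSI lands exactly in the middle of its 0 to 100 range.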

Traditional interpretation and usage of the RSI is that RSI values of 70 or above indicate that a security is becoming overbought or overvalued, and therefore may be primed for a trend reversal or corrective pullback in price. On the other side, an RSI reading of 30 or below is commonly interpreted as indicating an oversold or undervalued condition that may signal a trend change or corrective price reversal to the upside.
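The traditional 70/30 reading can be expressed as a small helper. This is only a sketch: `classify_rsi` is a hypothetical name, and the thresholds are parameters so they can be changed.

```python
def classify_rsi(value, overbought=70, oversold=30):
    """Label an RSI reading using the traditional thresholds."""
    if value >= overbought:
        return 'overbought'
    if value <= oversold:
        return 'oversold'
    return 'neutral'

for reading in (82.5, 50.0, 24.1):
    print(reading, classify_rsi(reading))
```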

Let’s use Python to compute the Relative Strength Index (RSI).

1.) Import modules (numpy included).

import pandas as pd
import numpy as np
from pandas_datareader import data as web
import matplotlib.pyplot as plt
%matplotlib inline

2.) Define function for querying daily close.

def get_stock(stock,start,end):
 return web.DataReader(stock,'google',start,end)['Close']

3.) Define function for RSI.

def RSI(series, period):
    # Day-over-day price changes; the first difference is NaN, so drop it
    delta = series.diff().dropna()
    u = delta * 0  # gains
    d = u.copy()   # losses, stored as positive numbers
    u[delta > 0] = delta[delta > 0]
    d[delta < 0] = -delta[delta < 0]
    # Seed value: simple average of the first `period` gains
    u[u.index[period-1]] = np.mean(u[:period])
    u = u.drop(u.index[:(period-1)])
    # Seed value: simple average of the first `period` losses
    d[d.index[period-1]] = np.mean(d[:period])
    d = d.drop(d.index[:(period-1)])
    # pd.stats.moments.ewma was removed from pandas; Series.ewm(...).mean() replaces it
    rs = u.ewm(com=period-1, adjust=False).mean() / \
         d.ewm(com=period-1, adjust=False).mean()
    return 100 - 100 / (1 + rs)

How does the RSI function work?

– 3.a.) The function computes the daily price differences and splits them into two series.

– 3.b.) One series holds the positive daily differences, i.e. gains.

– 3.c.) The other series holds the negative daily differences, i.e. losses, stored as positive numbers.

– 3.d.) The first value of the gains series is set to the average gain over the period specified.

– 3.e.) The first value of the losses series is set to the average loss over the period specified.

– 3.f.) RS is set equal to the Exponential Moving Average of the gains for the period specified / the Exponential Moving Average of the losses for the period specified.

– 3.g.) Return 100 - 100 / (1 + RS)
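The steps above can also be sketched more compactly on synthetic data, with no download required. Note the assumptions: `rsi_wilder` is a hypothetical name, the price series is a seeded random walk, and the smoothing starts from the first observation rather than from a simple-average seed, so early values will differ slightly from the tutorial's function.

```python
import numpy as np
import pandas as pd

def rsi_wilder(close, period=14):
    """RSI using exponential smoothing with alpha = 1/period (same as com = period - 1)."""
    delta = close.diff()
    gains = delta.clip(lower=0)        # positive moves, zero otherwise
    losses = (-delta).clip(lower=0)    # negative moves as positive numbers
    avg_gain = gains.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = losses.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Synthetic random-walk prices (an assumption for illustration, not market data)
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum())

rsi = rsi_wilder(close)
print(rsi.dropna().describe())
```

Using `clip` avoids the boolean-mask assignments, at the cost of hiding the seeding step that the tutorial's version makes explicit.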

4.) Query daily close for ‘FB’ during 2016.

df = pd.DataFrame(get_stock('FB', '1/1/2016', '12/31/2016'))

5.) Run daily close through RSI function. Save series to new column in dataframe.

df['RSI'] = RSI(df['Close'], 14)
df.tail()

6.) Plot daily close and RSI.

df.plot(y=['Close'])
df.plot(y=['RSI'])

There you have it! We created our RSI indicator. Here’s the full code:

import pandas as pd
import numpy as np
from pandas_datareader import data as web
import matplotlib.pyplot as plt
%matplotlib inline

def get_stock(stock,start,end):
 return web.DataReader(stock,'google',start,end)['Close']
 
def RSI(series, period):
    # Day-over-day price changes; the first difference is NaN, so drop it
    delta = series.diff().dropna()
    u = delta * 0  # gains
    d = u.copy()   # losses, stored as positive numbers
    u[delta > 0] = delta[delta > 0]
    d[delta < 0] = -delta[delta < 0]
    # Seed value: simple average of the first `period` gains
    u[u.index[period-1]] = np.mean(u[:period])
    u = u.drop(u.index[:(period-1)])
    # Seed value: simple average of the first `period` losses
    d[d.index[period-1]] = np.mean(d[:period])
    d = d.drop(d.index[:(period-1)])
    # pd.stats.moments.ewma was removed from pandas; Series.ewm(...).mean() replaces it
    rs = u.ewm(com=period-1, adjust=False).mean() / \
         d.ewm(com=period-1, adjust=False).mean()
    return 100 - 100 / (1 + rs)
 
df = pd.DataFrame(get_stock('FB', '1/1/2016', '12/31/2016'))
df['RSI'] = RSI(df['Close'], 14)
df.tail()

Python Tutorial: Price Channels

Download the accompanying IPython Notebook for this Tutorial from Github. 

 

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on stock ticker data.

Last post, we outlined steps for calculating Bollinger Bands.

In this post, we introduce a new technical indicator, Price Channels.

Price Channels

Price Channels are lines set above and below the price of a security. The upper channel is set at the x-period high and the lower channel is set at the x-period low. For a 20-day Price Channel, the upper channel would equal the 20-day high and the lower channel would equal the 20-day low.

Price Channels can be used to identify upward thrusts that signal the start of an uptrend or downward plunges that signal the start of a downtrend. Price Channels can also be used to identify overbought or oversold levels within a bigger downtrend or uptrend.

Price Channels are calculated as follows:

Upper Channel: 20-day high
Lower Channel: 20-day low
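To see the calculation on a scale that can be checked by hand, here is a minimal sketch with a 3-day window instead of 20. The high and low values are made up for illustration.

```python
import pandas as pd

# Made-up daily highs and lows (illustrative only)
high = pd.Series([10.0, 12.0, 11.0, 13.0, 12.5, 14.0])
low = pd.Series([9.0, 10.5, 10.0, 11.5, 11.0, 12.5])

window = 3  # use 20 for the standard 20-day channel
upper = high.rolling(window).max()  # x-period high
lower = low.rolling(window).min()   # x-period low

channels = pd.DataFrame({'High': high, 'Low': low, 'Upper': upper, 'Lower': lower})
print(channels)
```

The first `window - 1` rows are NaN because a full window of data is not yet available.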

Let’s use Python to compute Price Channels.

1. Import modules.

import pandas as pd
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
import matplotlib.pyplot as plt
%matplotlib inline

2. Define function for querying the daily high.

def get_high(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['High']

3. Define function for querying the daily low.

def get_low(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['Low']

4. Define function for querying daily close.

def get_close(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['Adj Close']

5. Query daily high, daily low, and daily close for ‘FB’ during 2016.

x = pd.DataFrame(get_high('FB', '1/1/2016', '12/31/2016'))
x['Low'] = pd.DataFrame(get_low('FB', '1/1/2016', '12/31/2016'))
x['Close'] = pd.DataFrame(get_close('FB', '1/1/2016', '12/31/2016'))

6. Compute 4 week high and 4 week low using rolling max/min. Add 50 day simple moving average for good measure.

x['4WH'] = x['High'].rolling(20).max()
x['4WL'] = x['Low'].rolling(20).min()
x['50 sma'] = x['Close'].rolling(50).mean()

7. Plot 4WH, 4WL, 50 sma, and daily close.

x.plot(y=['4WH', '4WL', '50 sma', 'Close'])

There you have it! We created our Price Channels. Here’s the full code:

import pandas as pd 
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
import matplotlib.pyplot as plt
%matplotlib inline

def get_high(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['High']
def get_low(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['Low']
def get_close(stock, start, end): 
     return web.get_data_yahoo(stock, start, end)['Adj Close']

x = pd.DataFrame(get_high('FB', '1/1/2016', '12/31/2016'))
x['Low'] = pd.DataFrame(get_low('FB', '1/1/2016', '12/31/2016'))
x['Close'] = pd.DataFrame(get_close('FB', '1/1/2016', '12/31/2016'))

x['4WH'] = x['High'].rolling(20).max()
x['4WL'] = x['Low'].rolling(20).min()
x['50 sma'] = x['Close'].rolling(50).mean()

x.plot(y=['4WH', '4WL', '50 sma', 'Close'])

In celebration of completing this tutorial, let’s watch Ed Seykota sing ‘The Whipsaw Song’.

Python Tutorial: Bollinger Bands

Download the accompanying IPython Notebook for this Tutorial from Github. 

 

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data.

Last post, we outlined steps for calculating MACD Signal Line & Centerline Crossovers.

In this post, we introduce a new technical indicator, Bollinger Bands.

Bollinger Bands

Developed by John Bollinger, Bollinger Bands® are volatility bands placed above and below a moving average. Volatility is based on the standard deviation, which changes as volatility increases and decreases. The bands automatically widen when volatility increases and narrow when volatility decreases. This dynamic nature of Bollinger Bands also means they can be used on different securities with the standard settings. For signals, Bollinger Bands can be used to identify Tops and Bottoms or to determine the strength of the trend.

Bollinger Bands reflect direction with the 20-period SMA and volatility with the upper/lower bands. As such, they can be used to determine if prices are relatively high or low. According to Bollinger, the bands should contain 88-89% of price action, which makes a move outside the bands significant. Technically, prices are relatively high when above the upper band and relatively low when below the lower band. However, relatively high should not be regarded as bearish or as a sell signal. Likewise, relatively low should not be considered bullish or as a buy signal. Prices are high or low for a reason. As with other indicators, Bollinger Bands are not meant to be used as a stand alone tool. Chartists should combine Bollinger Bands with basic trend analysis and other indicators for confirmation.

Bollinger Bands are calculated as follows:

Middle Band = 20 day moving average
Upper Band = 20 day moving average + (20 Day standard deviation of price x 2) 
Lower Band = 20 day moving average - (20 Day standard deviation of price x 2)

Bollinger Bands consist of a middle band with two outer bands. The middle band is a simple moving average that is usually set at 20 periods. A simple moving average is used because the standard deviation formula also uses a simple moving average. The look-back period for the standard deviation is the same as for the simple moving average. The outer bands are usually set 2 standard deviations above and below the middle band.
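The band formulas can be sketched on synthetic prices before touching real data. The assumptions here are the function name `bollinger_bands` and the seeded random-walk series. Note also that pandas computes the sample standard deviation by default; pass `ddof=0` to `std` if you want the population version.

```python
import numpy as np
import pandas as pd

def bollinger_bands(close, window=20, num_sd=2):
    """Return middle, upper, and lower bands for a price series."""
    middle = close.rolling(window).mean()
    sd = close.rolling(window).std()  # sample standard deviation (ddof=1) by default
    return pd.DataFrame({
        'Middle': middle,
        'Upper': middle + num_sd * sd,
        'Lower': middle - num_sd * sd,
    })

# Synthetic random-walk prices (an assumption for illustration, not market data)
rng = np.random.default_rng(1)
close = pd.Series(50 + rng.normal(0, 1, 120).cumsum())

bands = bollinger_bands(close)
print(bands.tail())
```

Because both outer bands are offset from the middle band by the same amount, they always sit symmetrically around the moving average.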

Let’s use Python to compute Bollinger Bands.

1. Start with the 30 Day Moving Average Tutorial code.

import pandas as pd
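from pandas_datareader import data as web  # pandas.io.data was removed from pandas
%matplotlib inline
import matplotlib.pyplot as plt

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})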
px

2. Compute the 20 Day Moving Average.

px['20 ma'] = px['FB'].rolling(20).mean()

3. Compute 20 Day Standard Deviation.

px['20 sd'] = px['FB'].rolling(20).std()

4. Create Upper Band.

px['Upper Band'] = px['20 ma'] + (px['20 sd']*2)

5. Create Lower Band.

px['Lower Band'] = px['20 ma'] - (px['20 sd']*2)

6. Plot Bollinger Bands.

px.plot(y=['FB','20 ma', 'Upper Band', 'Lower Band'], title='Bollinger Bands')

There you have it! We created our Bollinger Bands. Here’s the full code:

import pandas as pd
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
%matplotlib inline
import matplotlib.pyplot as plt

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})
px['20 ma'] = px['FB'].rolling(20).mean()
px['20 sd'] = px['FB'].rolling(20).std()
px['Upper Band'] = px['20 ma'] + (px['20 sd']*2)
px['Lower Band'] = px['20 ma'] - (px['20 sd']*2)
px.plot(y=['FB','20 ma', 'Upper Band', 'Lower Band'], title='Bollinger Bands')

Python Tutorial: MACD Signal Line & Centerline Crossovers

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data.

Last post, we outlined steps for calculating a stock’s MACD indicator.

In this post, we take MACD a step further by introducing Signal Line and Centerline Crossovers.

Signal Line Crossovers

Signal Line is defined as:

Signal Line: 9-day EMA of MACD Line

Signal line crossovers are the most common MACD signals. The signal line is a 9-day EMA of the MACD Line. As a moving average of the indicator, it trails the MACD and makes it easier to spot MACD turns. A bullish crossover occurs when the MACD turns up and crosses above the signal line. A bearish crossover occurs when the MACD turns down and crosses below the signal line. Crossovers can last a few days or a few weeks, it all depends on the strength of the move.
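A crossover is a change in which line is on top, so it can be detected by comparing today's relationship with yesterday's. The sketch below uses hand-made MACD and signal values, chosen only to illustrate one bullish and one bearish crossover.

```python
import numpy as np
import pandas as pd

# Hand-made values for illustration (not computed from real prices)
macd = pd.Series([-0.5, -0.2, 0.1, 0.4, 0.2, -0.1])
signal = pd.Series([-0.3, -0.3, -0.1, 0.1, 0.3, 0.1])

# +1 when MACD is above the signal line, -1 when below
position = np.sign(macd - signal)
previous = position.shift(1)

bullish = (position > 0) & (previous < 0)  # MACD crosses above the signal line
bearish = (position < 0) & (previous > 0)  # MACD crosses below the signal line

print(pd.DataFrame({'MACD': macd, 'Signal': signal,
                    'Bullish': bullish, 'Bearish': bearish}))
```

The first row can never be flagged, since there is no previous day to compare against.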

Let’s use Python to compute the Signal Line.

1.  Start with the MACD Tutorial code.

import pandas as pd
import numpy as np  # needed for np.where and np.sign below
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
%matplotlib inline
import matplotlib.pyplot as plt

names = ['FB']

def get_px(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
px = pd.DataFrame({n: get_px(n, '1/1/2016', '1/17/2017') for n in names})
px['26 ema'] = px['FB'].ewm(span=26).mean()
px['12 ema'] = px['FB'].ewm(span=12).mean()
px['MACD'] = (px['12 ema'] - px['26 ema'])

2. Compute the 9 Day Exponential Moving Average of MACD.

px['Signal Line'] = px['MACD'].ewm(span=9).mean()

3. Create Signal Line Crossover Indicator. When MACD > Signal Line, 1. When MACD < Signal Line, -1. When the two are equal, 0.

px['Signal Line Crossover'] = np.where(px['MACD'] > px['Signal Line'], 1, 0)
px['Signal Line Crossover'] = np.where(px['MACD'] < px['Signal Line'], -1, px['Signal Line Crossover'])

Centerline Crossovers

Centerline crossovers are the next most common MACD signals. A bullish centerline crossover occurs when the MACD Line moves above the zero line to turn positive. This happens when the 12-day EMA of the underlying security moves above the 26-day EMA. A bearish centerline crossover occurs when the MACD moves below the zero line to turn negative. This happens when the 12-day EMA moves below the 26-day EMA.

Centerline crossovers can last a few days or a few months. It all depends on the strength of the trend. The MACD will remain positive as long as there is a sustained uptrend. The MACD will remain negative when there is a sustained downtrend.

4. Create Centerline Crossover Indicator. When MACD > 0, 1. When MACD < 0, -1. When MACD is exactly 0, 0.

px['Centerline Crossover'] = np.where(px['MACD'] > 0, 1, 0)
px['Centerline Crossover'] = np.where(px['MACD'] < 0, -1, px['Centerline Crossover'])

Plotting Crossovers

Last post, we posed the question: ‘When would you enter the position 😕 ?’

Now that we understand Signal Line Crossovers, let’s propose that we enter the position, ‘buy’, on 1, and we exit the position, ‘sell’, on -1.

5. Create Buy/Sell Indicator, based on Signal Line Crossovers. Multiply by 2 to increase size of indicator when plotted, so ‘buy’ on 2 and ‘sell’ on -2.

px['Buy Sell'] = (2*(np.sign(px['Signal Line Crossover'] - px['Signal Line Crossover'].shift(1))))

6. Plot close price, MACD & Signal Line, and Signal Line & Centerline Crossovers.

px.plot(y=['FB'], title='Close')
px.plot(y= ['MACD', 'Signal Line'], title='MACD & Signal Line')
px.plot(y= ['Centerline Crossover', 'Buy Sell'], title='Signal Line & Centerline Crossovers', ylim=(-3,3))

There you have it! We created MACD Signal Line and Centerline Crossovers, and based on the Crossovers, plotted ‘buy’ and ‘sell’ indicators.

Based on the entry and exit points, can you calculate the P&L? Stay tuned to find out.

Here’s the full code:

import pandas as pd
import numpy as np  # needed for np.where and np.sign below
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
%matplotlib inline
import matplotlib.pyplot as plt

names = ['FB']

def get_px(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
px = pd.DataFrame({n: get_px(n, '1/1/2016', '1/17/2017') for n in names})
px['26 ema'] = px['FB'].ewm(span=26).mean()
px['12 ema'] = px['FB'].ewm(span=12).mean()
px['MACD'] = (px['12 ema'] - px['26 ema'])
px['Signal Line'] = px['MACD'].ewm(span=9).mean()
px['Signal Line Crossover'] = np.where(px['MACD'] > px['Signal Line'], 1, 0)
px['Signal Line Crossover'] = np.where(px['MACD'] < px['Signal Line'], -1, px['Signal Line Crossover'])
px['Centerline Crossover'] = np.where(px['MACD'] > 0, 1, 0)
px['Centerline Crossover'] = np.where(px['MACD'] < 0, -1, px['Centerline Crossover'])
px['Buy Sell'] = (2*(np.sign(px['Signal Line Crossover'] - px['Signal Line Crossover'].shift(1))))

px.plot(y=['FB'], title='Close')
px.plot(y= ['MACD', 'Signal Line'], title='MACD & Signal Line')
px.plot(y= ['Centerline Crossover', 'Buy Sell'], title='Signal Line & Centerline Crossovers', ylim=(-3,3))

Python Tutorial: MACD (Moving Average Convergence/Divergence)

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data.

In this post, we outline steps for calculating a stock’s MACD indicator. But first, what is MACD (Moving Average Convergence/Divergence)?

Developed by Gerald Appel in the late seventies, MACD is one of the simplest and most effective momentum indicators available. MACD turns two trend-following indicators, moving averages, into a momentum oscillator by subtracting the longer moving average from the shorter moving average. As a result, MACD offers the best of both worlds: trend following and momentum.

To calculate MACD, the formula is:

MACD: (12-day EMA - 26-day EMA)

EMA stands for Exponential Moving Average.
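The formula can be tried on a synthetic series first. In this sketch, `macd_line` is a hypothetical helper and the prices are a straight upward line, chosen because the expected behaviour is easy to reason about: in a steady uptrend the faster EMA stays above the slower one, so MACD is positive.

```python
import numpy as np
import pandas as pd

def macd_line(close, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA."""
    fast_ema = close.ewm(span=fast).mean()
    slow_ema = close.ewm(span=slow).mean()
    return fast_ema - slow_ema

# Steadily rising synthetic prices (an assumption for illustration)
close = pd.Series(np.linspace(100, 150, 60))

macd = macd_line(close)
print(macd.tail())
```

On the very first row both EMAs equal the first price, so MACD starts at zero before turning positive.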

With that background, let’s use Python to compute MACD.

1. Start with the 30 Day Moving Average Tutorial code.

import pandas as pd
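from pandas_datareader import data as web  # pandas.io.data was removed from pandas

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})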
px

2. Compute the 26 Day Exponential Moving Average. We must call the column by the stock ticker.

px['26 ema'] = px['FB'].ewm(span=26).mean()

3. Then the 12 Day Exponential Moving Average.

px['12 ema'] = px['FB'].ewm(span=12).mean()

4. Subtract the 26 Day EMA from the 12 Day EMA, arriving at the MACD.

px['MACD'] = (px['12 ema'] - px['26 ema'])

5. Plot close price against MACD.

px.plot(y= ['FB'], title='FB')
px.plot(y= ['MACD'], title='MACD')

There you have it! We created our MACD indicator. Here’s the full code:

from pandas_datareader import data as web  # pandas.io.data was removed from pandas
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt

names = ['FB']
def get_px(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
px = pd.DataFrame({n: get_px(n, '1/1/2016', '1/17/2017') for n in names})
px['26 ema'] = px['FB'].ewm(span=26).mean()
px['12 ema'] = px['FB'].ewm(span=12).mean()
px['MACD'] = (px['12 ema'] - px['26 ema'])
px.plot(y= ['FB'], title='FB')
px.plot(y= ['MACD'], title='MACD')

So when would you enter the position 😕 ?

Python Tutorial: Plot 30 Day Moving Average

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data.

Last post we created a DataFrame containing the daily ticker data for a specific stock and calculated its 30 day moving average. In this post, we will take it a step further and plot the DataFrame in order to visualize its contents.

1. Code from last post.

import pandas as pd
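from pandas_datareader import data as web  # pandas.io.data was removed from pandas

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})
px['30 mavg'] = px['FB'].rolling(30).mean()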
px

2. Import the matplotlib modules.

import matplotlib.pyplot as plt
%matplotlib inline

3. Call the plot function.

plt.plot(px)

There you have it! We used matplotlib to visualize our DataFrame. Looks like the stock tanked towards the end of 2016, perhaps due to the US Presidential Election 😉 

Here is the full code:

import pandas as pd 
from pandas_datareader import data as web  # pandas.io.data was removed from pandas
import matplotlib.pyplot as plt
%matplotlib inline

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})
px['30 mavg'] = px['FB'].rolling(30).mean()
plt.plot(px)

Python Tutorial: Query Stock Data, Calculate 30 Day Moving Average

Python streamlines tasks requiring multiple steps in a single block of code. For this reason, it is a great tool for querying and performing analysis on data. In this post we will use Python to pull ticker data for a specific stock and then calculate its 30 day moving average. Here are the steps:

1. Import the pandas modules.

import pandas as pd
from pandas_datareader import data as web  # pandas.io.data was removed from pandas

2. Create a list of the stocks for which you would like to query ticker data. For this example, we will pull ticker data for Facebook, ‘FB’. If you would like to add other stocks, simply add the symbols to the list separated by commas.

stocks = ['FB']

Or

stocks = ['FB','AAPL','GOOG','AMZN']

3. Write a function to query data from Yahoo Finance. The function takes three arguments: the stock, the start date, and the end date. It returns the daily ‘Adj Close’. If you would like to pull a different value, simply replace ‘Adj Close’ inside the brackets with the column you want, such as ‘Volume’.

def get_stock(stock,start,end):
     return web.get_data_yahoo(stock,start,end)['Adj Close']

Or

def get_stock(stock,start,end):
     return web.get_data_yahoo(stock,start,end)['Volume']

4. Call function for the date range, 1/1/2016 – 12/31/2016. Use the ‘for n in stocks’ logic in case you have more than one stock for which you would like to pull data. Compile query in DataFrame, saved to variable ‘px’.

px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})

5. Add new column to DataFrame in which you calculate the 30 day moving average. Call DataFrame to view contents.

px['30 mavg'] = px['FB'].rolling(30).mean()
px
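To check what a rolling average does without downloading anything, here is a minimal sketch with a 3-day window on made-up prices. The numbers are illustrative only.

```python
import pandas as pd

# Made-up closing prices (illustrative only)
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Each value is the mean of the current row and the two before it
mavg = prices.rolling(3).mean()
print(mavg.tolist())  # [nan, nan, 11.0, 12.0, 13.0]
```

With a 30-day window on real data, the first 29 rows of the ‘30 mavg’ column are NaN for the same reason.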

There you have it! We created a DataFrame containing the daily ticker data for a specific stock and then calculated its 30 day moving average. Here is the full code:

import pandas as pd
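from pandas_datareader import data as web  # pandas.io.data was removed from pandas

stocks = ['FB']
def get_stock(stock, start, end):
     return web.get_data_yahoo(stock, start, end)['Adj Close']
# Call the function and list defined above (get_px and names were undefined here)
px = pd.DataFrame({n: get_stock(n, '1/1/2016', '12/31/2016') for n in stocks})
px['30 mavg'] = px['FB'].rolling(30).mean()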
px

Part 2: Does Tweeting More Often Increase Favorites per Tweet and Retweets per Tweet?

In the last post we examined the relationship between MARpT and MAFpT, and found a 0.83 correlation. That is, as Retweets increase, Favorites increase. We also found a Favorite bias among entertainers and a Retweet bias with Barack Obama. In this post, we will examine whether tweeting more often increases favorites per tweet and/or retweets per tweet. Using the sample of Top Twitter Profiles, as listed by twittercounter.com, we will plot the number of monthly tweets against MAFpT and MARpT. Remember that MAFpT and MARpT give us the monthly average number of Favorites and Retweets per tweet. By this logic, if tweeting more often increases retweets or favorites per tweet, we should see higher MARpT and MAFpT with higher Monthly Tweets. Let’s look and see!

[Plot: MAFpT vs. Monthly Tweets (tweetstomafpt_v1)]

This plot displays MAFpT as a function of Monthly Tweets. A previous plot included Justin Bieber, who single handedly increased the correlation 0.4. Considering Justin Bieber an outlier and removing him from the sample, we find a weak correlation, r = 0.25. That is, there’s no strong link between monthly tweets and MAFpT. As for Justin Bieber, as his monthly tweets increase, his favorites per tweet increase–keep on tweeting, Biebs!

[Plot: MARpT vs. Monthly Tweets (retweetsmonthlytweets_v1)]

This plot displays MARpT as a function of Monthly Tweets. We find an even weaker correlation here, r = 0.16. That is, there’s no strong link between monthly tweets and MARpT. The weaker correlation supports the favorite bias we found among entertainers in part 1. That is, entertainers receive more favorites than retweets.

In conclusion, for this sample, we find little evidence that tweeting more often increases either favorites per tweet or retweets per tweet. While this may or may not be transferable to your own Twitter account, the findings lead us to ask: what factors increase engagement with a tweet, or with a message in general?

Until next time!

Part 1: Top Twitter Profiles, June 2015

Hello! My name is Andrew Hamlet, and I am an MBA student at NYU. I am developing proficiency with Python for data mining applications, and I plan to publish research on this blog. Since my background is in social media and web analytics, that’s where I will start. As is the nature of scientific inquiry, collaboration is welcome! Please comment with suggestions or further lines of inquiry. Now let’s begin.

Using Python, I gathered tweets, including the retweet and favorite counts, occurring in June 2015 for the Top Twitter Personalities (no companies), as listed by twittercounter.com. Here’s a table displaying the data, in ascending order by Followers. The Tweets, Retweets, and Favorites columns display totals for each Twitter Profile during June 2015.

[Table: celebrityanalysistable_v2]

The total number of Tweets varies widely: Katy Perry tweeted 35 times in June 2015, while Justin Bieber tweeted 167 times. I therefore normalized Retweets and Favorites, dividing them by Tweets to give Monthly Average Retweet per Tweet (MARpT) and Monthly Average Favorite per Tweet (MAFpT). Here’s a table displaying the results.

[Table: monthlyaverageretweettable_V2]

There appears to be a relationship between Monthly Average Retweet per Tweet and Monthly Average Favorite per Tweet: as MARpT increases, MAFpT increases. Let’s plot to verify. (For visual display, Justin Timberlake, Britney Spears, Ellen Degeneres, and Justin Bieber are removed from the chart.)

[Plot: Twitter10TenJune2015RetweetFavorite_v5]

Yes! There’s a 0.83 correlation between MARpT and MAFpT, so Favorites increase as Retweets increase. Even more, we see a Favorite bias among this sample: the Twitter profiles receive more MAFpT than MARpT. However, we see a Retweet bias with Barack Obama. Perhaps politicians receive more retweets, while entertainers receive more favorites?

In the next post, using this sample, we will investigate whether there is a relationship between total number of Tweets and MARpT or MAFpT. That is, does tweeting more during a month boost the number of interactions per tweet? Check back to find out!
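For readers who want to reproduce the normalization and correlation steps, here is a minimal sketch. Every number in it is invented for illustration; none comes from the June 2015 dataset, and the four profiles are anonymous placeholders.

```python
import numpy as np

# Invented monthly totals for four placeholder profiles (not the real data)
tweets = np.array([35, 167, 50, 80])
retweets = np.array([70000, 501000, 25000, 160000])
favorites = np.array([105000, 835000, 50000, 200000])

# Normalize by tweet count to get per-tweet monthly averages
marpt = retweets / tweets    # Monthly Average Retweet per Tweet
mafpt = favorites / tweets   # Monthly Average Favorite per Tweet

# Pearson correlation between the two per-tweet averages
r = np.corrcoef(marpt, mafpt)[0, 1]
print(marpt, mafpt, round(r, 2))
```

Dividing by the tweet count is what keeps a prolific account from dominating the comparison simply by posting more.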