Twitter Sentiment Strategy

1. Introduction

This is an exploratory research project that implements a sentiment-analysis strategy for ten of the most discussed technology stocks in today's market. By collecting people's discussions about these stocks on Twitter, we used natural language processing (NLP) to analyze people's reactions to and expectations of the market, producing a score that indicates how positive or negative their attitude is toward each stock. Based on the daily scores, we adjusted our positions automatically. The strategy was backtested from January 1, 2020 to May 31, 2020 and yields a Sharpe ratio of 3.01, which suggests a decent, tradable signal exists in the market.

2. Data Preparation

In this section, we discuss the source of our data and how we prepared it for the sentiment strategy.

import os
import twint
import pandas as pd
import nest_asyncio
import flair
import warnings
from tqdm.auto import tqdm
import re
import nltk
from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import wordnet
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import seaborn as sns
from collections import Counter
import datetime
import quandl
import pickle

nest_asyncio.apply()
warnings.filterwarnings('ignore')
plt.rcParams['figure.figsize'] = (20, 5)
quandl.ApiConfig.api_key = 'yFs2mPKxvfCC26C4vG3K'

A. Data Crawling

We chose ten of the most discussed technology stocks in today's market: Tesla (TSLA), Netflix (NFLX), Microsoft (MSFT), Zoom (ZM), Apple (AAPL), Amazon (AMZN), Twitter (TWTR), Google (GOOGL), Sony (SNE), and Nvidia (NVDA). Technology is changing the world, especially during the COVID-19 quarantine. Unlike traditional retail businesses, social media and online shopping companies such as Twitter and Amazon kept growing and making profits, and companies like Zoom doubled their market value. And whatever else excites people, Tesla and its leader Elon Musk always attract attention; with the success of the SpaceX mission a few days before this writing, Tesla is one of the hottest topics on social media.

We want to know how the market moves with people's attitudes toward and reactions to stocks and investing, so we used Twitter, one of the most popular social media platforms in the world, to collect people's thoughts about the stocks in our portfolio. We used the companies' names and their tickers as keywords when crawling tweets.

To get the complete set of tweets for the chosen time period, we used a package named twint, an advanced Twitter scraping tool. With twint, we could retrieve every tweet about our stocks during the backtest period; the package offers many options for scraping whatever information you need from Twitter.

def get_tweets(start_date, end_date, company_name, ticker):
    c = twint.Config()
    # search for tweets mentioning either the company name or its ticker
    c.Search = f'{company_name} OR {ticker}'
    c.Since = start_date
    c.Until = end_date
    c.Store_csv = True
    c.Lang = 'en'
    c.Count = True
    c.Hide_output = True
    c.Format = 'Tweet id: {id} | Date: {date} | Time: {time} | Tweet: {tweet}'
    c.Custom['tweet'] = ['id', 'date', 'time', 'tweet']
    # write to the same path that the caching check below looks for
    c.Output = f'data/tweets_{company_name}_202001.csv'
    twint.run.Search(c)

start_date = '2020-01-01'
end_date = '2020-05-31'
stock_pool = {'tesla': 'tsla', 'netflix': 'nflx', 'microsoft': 'msft', 'zoom': 'zm', 'apple': 'aapl',
              'amazon': 'amzn', 'twitter': 'twtr', 'google': 'googl', 'sony': 'sne', 'nvidia': 'nvda'}

for k, v in tqdm(stock_pool.items()):
    if os.path.isfile(f'data/tweets_{k}_202001.csv'):
        print(f'{k} finished')
    else:  
        print(f'downloading {k}')
        get_tweets(start_date, end_date, k, v)

B. Collect EOD Data

From Quandl, we also collected the daily Adjusted Close Price of the stocks in our portfolio.

def get_eod(tic):
    return quandl.get(f'EOD/{tic}', start_date=start_date, end_date=end_date).Adj_Close

if os.path.isfile('data/EOD.pkl'):
    with open('data/EOD.pkl', 'rb') as f:
        EOD = pickle.load(f)
else:
    EOD = {}
    for k, v in tqdm(stock_pool.items()):
        EOD[k] = get_eod(v.upper())
    with open('data/EOD.pkl', 'wb') as f:
        pickle.dump(EOD, f)

For the EOD data, we converted the daily adjusted close prices into daily returns for convenience in the later steps.

EOD = pd.DataFrame(EOD)
return_df = EOD.pct_change()
return_df = return_df.dropna()
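
Here pct_change computes the simple daily return from the adjusted close price \(P_t\):

\[ r_t = \frac{P_t}{P_{t-1}} - 1 . \]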

3. Sentiment Strength Analysis

To analyze Twitter users' attitudes toward the stocks, after crawling the data we analyzed each tweet and generated a sentiment score for it.

Here, we used a package named flair. Flair's mechanism is straightforward: it provides a powerful library that lets users apply and combine different word and document embeddings, and, given a training corpus, it can classify the attitude of the writer. Flair was originally built around character-level LSTM language models ("contextual string embeddings") that take sequences of characters as well as words into account when predicting; the pre-trained 'en-sentiment' classifier we load below is a DistilBERT-based model, as the loading message shows. Because predictions work below the word level, the model can also assign a sentiment to out-of-vocabulary (OOV) words, including typos.

flair_sentiment = flair.models.TextClassifier.load('en-sentiment')
2020-06-21 15:58:46,287 loading file /Users/christine/.flair/models/sentiment-en-mix-distillbert.pt

We created a function to compute the sentiment score of each tweet. The closer the absolute value of the score is to 1, the more confident the model is and the more extreme the user's attitude. When the sentiment of a sentence is positive, the score is positive; when it is negative, the score is negative. Conveniently, since the classifier recognizes intensifiers such as 'very', the score also changes when intensifiers come into play.

def senti_score(n):
    s = flair.data.Sentence(n)
    flair_sentiment.predict(s)
    total_sentiment = s.labels[0]
    assert total_sentiment.value in ['POSITIVE', 'NEGATIVE']
    sign = 1 if total_sentiment.value == 'POSITIVE' else -1
    score = total_sentiment.score
    return sign * score

Examples: we use several sentences to show how the model works and how sensitive it is to intensifiers.

s = 'Tesla is a great company'
senti_score(s)
0.9977720379829407
s = 'Tesla is a good company'
senti_score(s)
0.9961444139480591
s = 'Tesla is just a company'
senti_score(s)
-0.8988585472106934
s = 'Tesla is a bad company'
senti_score(s)
-0.9998192191123962

We applied the model to every tweet we collected for each stock.

# data holds the per-stock tweet DataFrames (loaded from the crawled CSV files)
for k in tqdm(stock_pool.keys()):
    if os.path.isfile(f'data/{k}_with_sentiment.csv'):
        print('pass', k)
    else:
        data[k]['sentiment_score'] = [senti_score(n) for n in tqdm(data[k]['tweet'])]
        data[k].to_csv(f'data/{k}_with_sentiment.csv')

4. Preliminary Analysis

After getting the sentiment scores of the tweets, and before implementing the strategy, we did some preliminary analysis of the data.

data = {k: pd.read_csv(f'data/{k}_with_sentiment.csv', lineterminator='\n', index_col=0, parse_dates=[3]).set_index('date_time') for k in stock_pool}

A. Clean the Tweets

Since we crawled the tweets directly from Twitter, they contain many website links, emojis, and invalid characters. Here we used the nltk package to clean and tokenize the tweets: we split each tweet into a list of words, removed the invalid characters and links, and applied a stop-word list to keep only the meaningful words.

nltk.download('stopwords')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
stopword_list = stopwords.words('english')
lemmatizer = WordNetLemmatizer()
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/christine/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /Users/christine/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/christine/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/christine/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
def clean_tweets(df):
    df['tweet'] = df['tweet'].astype(str)
    df['tweet_cleaned'] = df['tweet'].map(lambda x: x + ' ')
    # strip links (and anything that follows them on the same line)
    df['tweet_cleaned'] = df['tweet_cleaned'].map(lambda x: re.sub(r'http.*', '', x))
    # keep only letters and '#', replacing everything else with spaces
    df['tweet_cleaned'] = df['tweet_cleaned'].map(lambda x: re.sub(r"[^a-zA-Z#]", ' ', x))
    df['tweet_cleaned'] = df['tweet_cleaned'].map(lambda x: x.lower())
    stopword_list = stopwords.words('english')
    # tokenize each tweet and drop English stop words
    for i in range(len(df['tweet_cleaned'])):
        tokens = word_tokenize(df['tweet_cleaned'][i])
        clean_tokens = [w for w in tokens if w not in stopword_list]
        df['tweet_cleaned'][i] = clean_tokens

for k in tqdm(stock_pool.keys()):
    if os.path.isfile(f'data/{k}_with_sentiment_lemmatized.csv'):
        print('pass', k)
    else:
        clean_tweets(data[k])

data['tesla'].head()
time id tweet sentiment_score tweet_cleaned
2020-01-01 00:18:25 1212256747051577344 #Update: Tesla CEO Elon Musk spends part of #N… -0.539993 [#, update, tesla, ceo, elon, musk, spends, pa…
2020-01-01 00:19:58 1212257135121162240 Executive mismanagement linked to fraud disgui… -0.999139 [executive, mismanagement, linked, fraud, disg…
2020-01-01 00:20:35 1212257289668907008 Delivery announcement, Q4 earnings report and … 0.994400 [delivery, announcement, q, earnings, report, …
2020-01-01 00:25:22 1212258495992651776 Tesla is a great company but it doesn’t mean t… -0.994792 [tesla, great, company, mean, success, guarant…
2020-01-01 00:28:44 1212259343606964224 This year was amazing. It was so great to be a… 0.997422 [year, amazing, great, among, incredible, peop…

B. Lemmatization

We created a function to lemmatize each list of words by removing inflectional endings and returning the base (dictionary) form of each word. This is a standard step in NLP preprocessing.

def lemmatize_tweet(tweet):
    for i in range(len(tweet)):
        # POS-tag the token list so each word is lemmatized with the right part of speech
        pos_tag_list = nltk.pos_tag(tweet[i])
        wordnet_tags = []
        for j in pos_tag_list:
            if j[1].startswith('J'):
                wordnet_tags.append(wordnet.ADJ)
            elif j[1].startswith('N'):
                wordnet_tags.append(wordnet.NOUN)
            elif j[1].startswith('R'):
                wordnet_tags.append(wordnet.ADV)
            elif j[1].startswith('V'):
                wordnet_tags.append(wordnet.VERB)
            else:
                wordnet_tags.append(wordnet.NOUN)
        # lemmatize each token with its mapped WordNet tag and rebuild the tweet string
        lem_words = []
        for k in range(len(tweet[i])):
            lem_words.append(lemmatizer.lemmatize(tweet[i][k], pos=wordnet_tags[k]))
        tweet[i] = ' '.join(lem_words)

for k in tqdm(stock_pool.keys()):
    if os.path.isfile(f'data/{k}_with_sentiment_lemmatized.csv'):
        print('pass', k)
    else:
        lemmatize_tweet(data[k]['tweet_cleaned'])

for k in stock_pool.keys():
    data[k].to_csv(f'data/{k}_with_sentiment_lemmatized.csv')
data['tesla']
time id tweet sentiment_score tweet_cleaned
2020-01-01 00:18:25 1212256747051577344 #Update: Tesla CEO Elon Musk spends part of #N… -0.539993 # update tesla ceo elon musk spend part # newy…
2020-01-01 00:19:58 1212257135121162240 Executive mismanagement linked to fraud disgui… -0.999139 executive mismanagement link fraud disguise te…
2020-01-01 00:20:35 1212257289668907008 Delivery announcement, Q4 earnings report and … 0.994400 delivery announcement q earnings report batter…
2020-01-01 00:25:22 1212258495992651776 Tesla is a great company but it doesn’t mean t… -0.994792 tesla great company mean success guarantee sti…
2020-01-01 00:28:44 1212259343606964224 This year was amazing. It was so great to be a… 0.997422 year amazing great among incredible people # t…
2020-05-31 18:13:29 1267232763570249732 My latest @cleantechnica article shares footag… -0.995451 late cleantechnica article share footage erday…
2020-05-31 18:21:45 1267234842401550336 PUBLISHERS NOTE/\nComing On The Heals Of The @… 0.961094 publisher note come heals spacex usa nasa st m…
2020-05-31 18:31:45 1267237359021625353 they can really can ramp up and squeeze shorts… 0.919704 really ramp squeeze short sure want play elon …
2020-05-31 18:35:46 1267238371107041280 Quick Update on $TSLA - Wave 4 triangle has co… 0.999554 quick update tsla wave triangle complete # tes…
2020-05-31 18:42:40 1267240106018181120 Tesla continues to dominate electric vehicle t… 0.940820 tesla continue dominate electric vehicle techn…

C. Word Cloud and Frequency Analysis

After cleaning, tokenizing, and lemmatizing the data, we wanted to see which words appear most often when people mention our stocks on Twitter. We ran a frequency analysis on the tweet word lists and produced a WordCloud, a word-frequency plot, and a dataframe of the most frequent words for each stock. We removed single letters, special characters, and uninformative tokens such as "inc", "com", and "pic" to keep the word lists relevant.

data = {k: pd.read_csv(f'data/{k}_with_sentiment_lemmatized.csv', lineterminator='\n', index_col=0, parse_dates=[0]) for k in stock_pool}
stwds = ['inc', 'pic', 'com', '#'] + [chr(i) for i in range(ord('A'), ord('z') + 1)]
def freq_analysis(k, plot=False):
    # Wordcloud
    df = data[k].copy()
    df['tweet_cleaned'] = df['tweet_cleaned'].astype(str).map(lambda x: ' '.join([w for w in x.split() if w not in stwds]))
    complete_string = ' '.join([tweet for tweet in df['tweet_cleaned']])
    word_list = complete_string.split()
    frequency = nltk.FreqDist(word_list)
    
    if plot:
        wordcloud = WordCloud(
            width=900,
            height=500,
            max_words=500,
            max_font_size=100,
            relative_scaling=0.5,
            colormap='Blues',
            background_color=None,
            mode='RGBA',
            normalize_plurals=True
        ).generate_from_frequencies(frequency)
        plt.figure(figsize=(17,14))
        plt.imshow(wordcloud, interpolation='bilinear')
        plt.axis("off")
        plt.tight_layout()
        plt.show()
    
    # Word frequency analysis
    word_frequency = nltk.FreqDist(word_list)
    frequency_df = pd.DataFrame({'Word': list(word_frequency.keys()), 'Count': list(word_frequency.values())}).sort_values(by=['Count'], ascending=False)
    frequency_df = frequency_df[(frequency_df['Word'] != k) & (frequency_df['Word'] != stock_pool[k])]
    frequency_df = frequency_df.nlargest(columns="Count", n=25).reset_index(drop=True)
    
    if plot:
        # Word frequency plot
        plt.figure(figsize=(16,5))
        ax = sns.barplot(data=frequency_df, x="Word", y="Count")
        ax.set_ylabel('Count')
        ax.set_xlabel('Word')
        ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right", fontsize=12)
        ax.set_title("Word Frequency", fontsize=20)
        plt.tight_layout()
        plt.show()
    
    return frequency_df

To illustrate the process, we show the full analysis for TSLA. Below are the WordCloud of the Tesla tweets and the word-frequency plot.

freq_analysis('tesla', plot=True);

[Figure: WordCloud of Tesla tweets]

[Figure: word-frequency plot for Tesla tweets]

For readability, we hide the WordClouds and plots for the remaining nine stocks; instead, we summarize the most frequently used words for all ten stocks in our portfolio in the following dataframe.

freq_analysis_result = {k: freq_analysis(k)['Word'] for k in stock_pool}

pd.concat(freq_analysis_result, keys=stock_pool, axis=1)
id tesla netflix microsoft zoom apple amazon twitter google sony nvidia
0 stock stock corp video stock nan stock goog twitter stock
1 tslaq twitter stock stock market check jack alphabet aapl amd
2 twitter earnings market security price stock fb stock apple earnings
3 market price rating verb iphone twitter price cloud playstation price
4 car dis twitter communication twitter market like twitter nikkei corporation
5 elonmusk new change twitter new buy tweet apple tokyo target
6 elon high result msft buy company go coronavirus jpm twitter
7 go buy surprise use coronavirus aapl get aapl msft high
8 short stream cloud company close apple trump amzn market buy
9 price market team conferencing rating earnings user new rkuny new
10 model amzn amzn secure earnings get company fb amzn ai
11 buy disney apple buy china fb facebook company nflx corp
12 musk subscriber new platform change say time amazon goog indicator
13 get share buy tech store year share msft get share
14 share watch aapl orcl amzn msft would facebook rakuten nasdaq
15 year trade amazon work result new close microsoft pt raise
16 day long work co year time say say stock short
17 make short earnings launch share black make make therealthing market
18 company content company crm msft bezos ceo work theoneandonly analyst
19 say see say user say share buy business fb trade
20 sell close price market company high day year sensor close
21 like amazon year shop surprise one new ad bac data
22 new year coronavirus go sell wireless see revenue year center
23 china go corporation people nasdaq microsoft year youtube buy rating
24 time target report partner amazon day account covid game report

5. Trading Strategy

Warren Buffett once said, "We simply attempt to be fearful when others are greedy and to be greedy only when others are fearful." His words are the source of the idea behind this strategy.

Our strategy is based on people's reaction to today's market. We collect every tweet posted from today's 16:00 (market close) to the next day's 9:00 (market open). Based on people's attitude toward today's market performance, we make the opposite move on the next trading day.
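
Concretely, the daily score for stock k on trading day d (computed by the night_filter function below) is the sum of the tweet-level sentiment scores posted between that day's close and the next day's open:

\[ S_{k,d} \;=\; \sum_{t \,\in\, (d\ 16{:}00,\ (d+1)\ 9{:}00)} \mathrm{senti\_score}(\mathrm{tweet}_t). \]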

A. Original Plan

Go long the three stocks with the lowest scores and short the three stocks with the highest scores.

def night_filter(df):
    # sentiment of tweets posted after today's close (16:00 onwards), summed per day
    same_night = df[df.index.to_series().apply(lambda t: t.hour > 15)]['sentiment_score'].resample('D').sum()
    # sentiment of tweets posted before the open (00:00-08:59), summed per day
    next_morning = df[df.index.to_series().apply(lambda t: t.hour < 9)]['sentiment_score'].resample('D').sum()
    # shift the morning piece back one day so it pairs with the previous evening
    return (same_night + next_morning.shift(-1)).fillna(0)

thresh = .5
score_df = {k: night_filter(data[k]) for k in stock_pool}
score_df = pd.concat(score_df, keys=score_df, axis=1)

pos_df_2 = score_df.shift().loc[return_df.index]
for i in range(len(pos_df_2)):
    long = pos_df_2.iloc[i].nsmallest(3).keys()
    short = pos_df_2.iloc[i].nlargest(3).keys()
    for key in pos_df_2:
        if key in long:
            pos_df_2.iloc[i][key] = 1/3
        elif key in short:
            pos_df_2.iloc[i][key] = -1/3
        else:
            pos_df_2.iloc[i][key] = 0
pos_df_2
time tesla netflix microsoft zoom apple amazon twitter google sony nvidia
2020-01-03 0.333333 0.333333 0.000000 -0.333333 0.333333 0.000000 -0.333333 0.000000 -0.333333 0.000000
2020-01-06 0.333333 0.000000 0.000000 -0.333333 0.333333 0.333333 0.000000 0.000000 -0.333333 -0.333333
2020-01-07 0.333333 0.333333 -0.333333 -0.333333 0.333333 0.000000 0.000000 0.000000 0.000000 -0.333333
2020-01-08 0.333333 0.000000 0.000000 -0.333333 0.333333 0.000000 0.333333 0.000000 -0.333333 -0.333333
2020-01-09 0.333333 0.000000 0.333333 -0.333333 0.333333 -0.333333 0.000000 0.000000 0.000000 -0.333333
2020-05-22 0.333333 0.333333 0.000000 -0.333333 0.000000 0.000000 0.000000 -0.333333 -0.333333 0.333333
2020-05-26 0.000000 0.000000 0.000000 -0.333333 0.333333 -0.333333 0.333333 -0.333333 0.000000 0.333333
2020-05-27 0.333333 0.000000 -0.333333 0.000000 0.333333 0.000000 0.333333 0.000000 -0.333333 -0.333333
2020-05-28 0.333333 0.000000 -0.333333 0.000000 0.333333 -0.333333 0.333333 0.000000 0.000000 -0.333333
2020-05-29 0.333333 0.000000 -0.333333 0.000000 0.333333 -0.333333 0.333333 0.000000 -0.333333 0.000000

We set the benchmark for this strategy to be holding the ten stocks with equal weights every day, and we define alpha as the strategy's excess return over that benchmark.
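
In formulas, with \(r_{i,t}\) the daily return of stock \(i\) and \(w_{i,t}\) the weight from the table above:

\[ r^{\mathrm{bench}}_t = \frac{1}{10}\sum_{i=1}^{10} r_{i,t}, \qquad r^{\mathrm{strat}}_t = \sum_{i} w_{i,t}\, r_{i,t}, \qquad \alpha_t = r^{\mathrm{strat}}_t - r^{\mathrm{bench}}_t. \]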

benchmark_2 = return_df.mean(axis=1)
benchmark_2.cumsum().plot(color='k', label='benchmark')
strategy_2 = (return_df * pos_df_2).sum(axis=1)
strategy_2.cumsum().plot(color='r', label='strategy (absolute return)')
alpha_2 = strategy_2 - benchmark_2
alpha_2.cumsum().plot(color='c', label='alpha (excess return)')
plt.xticks(rotation=0, ha='center')
plt.tight_layout()
plt.legend(frameon=False)
plt.ylabel('Cum PnL (%)')
plt.show()

[Figure: cumulative PnL of the benchmark, the strategy, and alpha (original plan)]

sharpe = lambda x: x.mean() / x.std() * 252**.5
sortino = lambda x: x.mean() / (((x - x.mean())**2 * (x < 0)).mean())**.5 * 252**.5


def maximum_drawdown(x):
    cumx = (x + 1).cumprod()
    return (1 - cumx / cumx.cummax()).max()


print(f'SR(benchmark) = {sharpe(benchmark_2):.2f}')
print(f'SR(strategy) = {sharpe(strategy_2):.2f}')
print(f'SR(alpha) = {sharpe(alpha_2):.2f}')
print()
print(f'Sortino(benchmark) = {sortino(benchmark_2):.2f}')
print(f'Sortino(strategy) = {sortino(strategy_2):.2f}')
print(f'Sortino(alpha) = {sortino(alpha_2):.2f}')
print()
print(f'Max_Drawdown(benchmark) = {maximum_drawdown(benchmark_2):.1%}')
print(f'Max_Drawdown(strategy) = {maximum_drawdown(strategy_2):.1%}')
print(f'Max_Drawdown(alpha) = {maximum_drawdown(alpha_2):.1%}')
SR(benchmark) = 1.87
SR(strategy) = -0.10
SR(alpha) = -1.78

Sortino(benchmark) = 2.45
Sortino(strategy) = -0.13
Sortino(alpha) = -2.33

Max_Drawdown(benchmark) = 27.9%
Max_Drawdown(strategy) = 32.5%
Max_Drawdown(alpha) = 44.1%
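
For reference, the metrics computed above are the annualized Sharpe ratio, the annualized Sortino ratio (with the downside deviation taken over negative-return days), and the maximum drawdown of the compounded return path:

\[ \mathrm{SR} = \frac{\bar r}{\sigma(r)}\sqrt{252}, \qquad \mathrm{Sortino} = \frac{\bar r}{\sqrt{\mathrm{mean}\big[(r_t-\bar r)^2\,\mathbf{1}_{\{r_t<0\}}\big]}}\sqrt{252}, \qquad \mathrm{MDD} = \max_t\Big(1 - \frac{C_t}{\max_{s\le t} C_s}\Big), \quad C_t = \prod_{s\le t}(1+r_s). \]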

From these results we can see that the Sharpe ratio of our strategy is poor (it is negative), so we tried to improve the strategy.

B. Improvement

Instead of taking only the three highest and three lowest scores, we summed every tweet's sentiment score for each stock over the overnight window, so the signal reflects both sentiment and tweet volume: the higher the volume, the more satisfied or disappointed people are. We then normalized the scores by separately dividing the positive entries by the sum of the positive scores and the negative entries by the sum of the negative scores, aiming for a market-neutral allocation that is roughly self-financing while still capturing the signal.

def normalize(s):
    # scale positive scores by the total positive score and negative scores by the total negative score
    pos_sum = s[s > 0].sum()
    neg_sum = s[s < 0].sum()
    for i in range(len(s)):
        s[i] /= pos_sum if s[i] > 0 else neg_sum if s[i] < 0 else 1

# contrarian weights: negate yesterday's overnight score, then normalize row by row
pos_df = -score_df.shift().loc[return_df.index]
for i, row in pos_df.iterrows():
    normalize(row)
pos_df
time tesla netflix microsoft zoom apple amazon twitter google sony nvidia
2020-01-03 0.401892 0.114972 0.059096 -0.000000 0.330018 0.020177 0.009599 0.044186 0.008309 0.011751
2020-01-06 0.270341 0.065981 0.042288 -0.000000 0.393034 0.095976 0.058378 0.040528 0.020787 0.012688
2020-01-07 0.442299 0.105930 0.021281 -0.000000 0.201535 0.080591 0.065456 0.041283 0.026490 0.015135
2020-01-08 0.407693 0.066788 0.079381 0.009896 0.220641 0.052692 0.103467 0.031961 0.003662 0.023819
2020-01-09 0.360541 0.041014 0.081616 0.000709 0.361786 0.014596 0.078035 0.021689 0.024410 0.015605
2020-05-22 0.209791 0.150681 0.040223 0.017995 0.136417 0.125920 0.106557 0.020171 0.032025 0.160221
2020-05-26 0.144673 0.100749 0.102627 0.014581 0.261007 1.000000 0.154036 0.018609 0.032293 0.171425
2020-05-27 0.279436 0.054341 0.014196 0.023837 0.138186 0.071761 0.367798 0.029868 0.000406 0.020169
2020-05-28 0.096364 0.065429 0.617590 0.011994 0.216065 0.382410 0.526869 0.055728 0.019243 0.008308
2020-05-29 0.235185 0.031032 0.331849 0.029525 0.125487 0.668151 0.526221 0.032061 -0.000000 0.020488
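
In symbols, normalize() turns the negated, one-day-lagged overnight score \(s_i\) of each stock into the weight shown in the table above:

\[ w_i = \begin{cases} s_i \big/ \sum_{j:\,s_j>0} s_j, & s_i > 0,\\ s_i \big/ \sum_{j:\,s_j<0} s_j, & s_i < 0,\\ 0, & s_i = 0. \end{cases} \]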

To evaluate performance, as before we set the benchmark to be holding the ten stocks with equal weights every day, and alpha to be the strategy's excess return over the benchmark.

benchmark = return_df.mean(axis=1)
benchmark.cumsum().plot(color='k', label='benchmark')
strategy = (return_df * pos_df).sum(axis=1)
strategy.cumsum().plot(color='r', label='strategy (absolute return)')
alpha = strategy - benchmark
alpha.cumsum().plot(color='c', label='alpha (excess return)')
plt.xticks(rotation=0, ha='center')
plt.tight_layout()
plt.legend(frameon=False)
plt.ylabel('Cum PnL (%)')
plt.show()

[Figure: cumulative PnL of the benchmark, the improved strategy, and alpha]

print(f'SR(benchmark) = {sharpe(benchmark):.2f}')
print(f'SR(strategy) = {sharpe(strategy):.2f}')
print(f'SR(alpha) = {sharpe(alpha):.2f}')
print()
print(f'Sortino(benchmark) = {sortino(benchmark):.2f}')
print(f'Sortino(strategy) = {sortino(strategy):.2f}')
print(f'Sortino(alpha) = {sortino(alpha):.2f}')
print()
print(f'Max_Drawdown(benchmark) = {maximum_drawdown(benchmark):.1%}')
print(f'Max_Drawdown(strategy) = {maximum_drawdown(strategy):.1%}')
print(f'Max_Drawdown(alpha) = {maximum_drawdown(alpha):.1%}')
SR(benchmark) = 1.87
SR(strategy) = 2.94
SR(alpha) = 3.01

Sortino(benchmark) = 2.45
Sortino(strategy) = 4.36
Sortino(alpha) = 5.03

Max_Drawdown(benchmark) = 27.9%
Max_Drawdown(strategy) = 27.1%
Max_Drawdown(alpha) = 12.2%

From the Sharpe ratio there is a huge improvement. Instead of picking only the best and worst scores and going long and short with equal weights, we distribute our positions in proportion to the strength of the scores, which makes better use of the signal. This is the strategy we use going forward.

C. Monthly Analysis

After settling on the strategy, we dug into the monthly ratios to examine the month-by-month performance.

performance = pd.concat([benchmark, strategy, alpha], axis=1)
performance.columns = ['Benchmark', 'Strategy', 'Alpha']

a. Sharpe Ratio

sharpe_month = performance.resample('M').apply(sharpe)
sharpe_month = sharpe_month.stack().reset_index()
sharpe_month.columns = ['month', 'class', 'value']
sharpe_month['month'] = pd.to_datetime(sharpe_month['month']).dt.strftime('%b')

g_sharpe = sns.catplot(x='month', y='value', hue='class', kind='bar', data=sharpe_month, legend=False)
g_sharpe.despine(left=True)
plt.legend(loc='upper right', frameon=False)
plt.tight_layout()
plt.show()

[Figure: monthly Sharpe ratios of the benchmark, the strategy, and alpha]

b. Sortino Ratio

performance.resample('M').apply(sortino)

sortino_month = performance.resample('M').apply(sortino)
sortino_month = sortino_month.stack().reset_index()
sortino_month.columns = ['month', 'class', 'value']
sortino_month['month'] = pd.to_datetime(sortino_month['month']).dt.strftime('%b')

g_sortino = sns.catplot(x='month', y='value', hue='class', kind='bar', data=sortino_month, legend=False)
g_sortino.despine(left=True)
plt.legend(loc='upper right', frameon=False)
plt.tight_layout()
plt.show()

[Figure: monthly Sortino ratios of the benchmark, the strategy, and alpha]

c. Maximum Drawdown

performance.resample('M').apply(maximum_drawdown)

maxdraw_month = performance.resample('M').apply(maximum_drawdown)
maxdraw_month = maxdraw_month.stack().reset_index()
maxdraw_month.columns = ['month', 'class', 'value']
maxdraw_month['month'] = pd.to_datetime(maxdraw_month['month']).dt.strftime('%b')

g_maxdraw = sns.catplot(x='month', y='value', hue='class', kind='bar', data=maxdraw_month, legend=False)
g_maxdraw.despine(left=True)
plt.legend(loc='upper right', frameon=False)
plt.tight_layout()
plt.show()

[Figure: monthly maximum drawdowns of the benchmark, the strategy, and alpha]

From the PnL plot in the previous section, we can see that the strategy tends to be less effective in April and May. After plotting the three monthly metrics, it is clear that the strategy's and alpha's edge fades from April to May: the Sharpe and Sortino ratios drop significantly and the maximum drawdown increases. In contrast, the benchmark (holding the ten stocks with equal weights every day) stays steady and decent compared with the other two. So, to further improve performance, we could fall back to the benchmark and simply buy and hold equal positions in the ten stocks during April and May. Considering what was happening in the United States (COVID-19, the quarantine, the administration's missteps), we conclude that the strategy gradually lost its edge in this period because people's attitudes and judgments about the market were driven not only by stock performance and rational analysis but also by emotion (especially anger). Under such conditions, sentiment analysis is less effective than before. This reveals the limitation of our strategy: people are emotional when unusual things happen, so investing based on people's attitudes toward the market can be profitable but also risky.
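
As a quick illustration of that suggestion (not part of the original backtest), one could splice the benchmark returns into the strategy from April onward and compare Sharpe ratios; the cut-over date below is an assumption, not something we optimized.

# A minimal sketch of the fallback idea: keep the sentiment strategy through
# March, hold the equal-weight benchmark in April and May, then re-evaluate.
blended = strategy.copy()
fallback = blended.index >= '2020-04-01'   # assumed cut-over date
blended[fallback] = benchmark[fallback]
print(f'SR(strategy) = {sharpe(strategy):.2f}')
print(f'SR(blended)  = {sharpe(blended):.2f}')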

6. Conclusion

Our strategy is based on NLP analysis of tweets about the stock market. The performance of our strategy shows the significance of the signal; however, during unusual periods the strategy can be less effective and more volatile. To improve performance further, we could enlarge the portfolio pool, which should increase the accuracy of the signal, and we could also extend the backtest period.

