What is the fraction of tweets that are neutral?

For this question, I converted my lists of happy and sad keywords into strings and saved them in a different list.

a. My question is: what if the same tweet contains multiple happy or sad keywords? How should I optimize my code?
Ex: "I’m happy that my good code has compiled without any errors"

b. Also, please explain the logic behind the neutral keywords.

I’m assuming here that your variables sad_words and happy_words are lists.

If they are lists, then your code is turning total_words into one really long string. That is not what you want, because the for word in total_words loop will then iterate over each character of that string instead of over each word.
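
To see the difference, here is a small sketch. The word lists are shortened, and the join step is only my guess at what the code in the screenshot does, so treat it as an illustration rather than your exact code.

```python
happy_words = ['happy', 'good']
sad_words = ['sad', 'bad']

# Assumption: the original code merged the lists into one string, roughly like this.
total_as_string = ''.join(sad_words) + ''.join(happy_words)

# Concatenating the lists keeps every keyword as a separate element.
total_as_list = sad_words + happy_words

chars = [item for item in total_as_string]  # iterating a string yields characters
words = [item for item in total_as_list]    # iterating a list yields its elements

print(chars[:3])  # ['s', 'a', 'd']
print(words)      # ['sad', 'bad', 'happy', 'good']
```

With the string version, a check like word in tweet ends up testing single letters, so almost every tweet would match.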

I have the code below that outputs the correct neutral tweet count and explains the logic behind the neutral variable.

happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
sad_words = ['sad', 'bad', 'tragic', 'unhappy', 'worst']

number_of_neutral_tweets = 0

total_words = sad_words + happy_words
for tweet in tweets:
    neutral = True  # Assume the tweet is neutral until a keyword is found
    for word in total_words:
        if word in tweet:
            neutral = False  # Any happy or sad word means the tweet is not neutral
            break  # Stop checking keywords for this tweet and move on to the next tweet
    number_of_neutral_tweets += 1 if neutral else 0  # Only increments if neutral is still True

print(number_of_neutral_tweets)  # Prints 2, as there are only 2 neutral tweets

Don’t make it harder than it is.
There are no neutral keywords, only happy and sad.
By this point you have already calculated number_of_happy_tweets and number_of_sad_tweets by filtering on happy_words and sad_words, and have these values in variables. Neutral tweets are what is left.
hint ~ number of all tweets can be found with the len() function

I simply looked for either a happy or sad word and then exited the loop. I don’t think any further analysis is required in this example. In the real world, you may want to sum up the number of happy and sad words and form a conclusion based on which one is higher.
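
A rough sketch of that "real world" idea, in case it helps. The classify function, the tokenizing regex, and the rule that ties count as neutral are all my own assumptions, not part of the exercise.

```python
import re

happy_words = {'great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best'}
sad_words = {'sad', 'bad', 'tragic', 'unhappy', 'worst'}

def classify(tweet):
    """Hypothetical helper: label a tweet by whichever keyword group occurs more often."""
    # Split into lowercase words so that 'unhappy' is not also counted as 'happy'.
    tokens = re.findall(r"[a-z]+", tweet.lower())
    happy_score = sum(1 for token in tokens if token in happy_words)
    sad_score = sum(1 for token in tokens if token in sad_words)
    if happy_score > sad_score:
        return 'happy'
    if sad_score > happy_score:
        return 'sad'
    return 'neutral'  # assumption: no keywords, or a tie, counts as neutral

print(classify("I'm happy that my good code has compiled without any errors"))  # happy
print(classify("bad day and the worst commute, but a nice lunch"))              # sad
print(classify("the code compiled"))                                            # neutral
```

Note that this matches whole words, which is stricter than the word in tweet substring check used in the exercise.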

If we treat all 3 categories as mutually exclusive, happy+sad+neutral=total tweets, so neutral=total-happy-sad.
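
Here is a small sketch of that subtraction. The tweets list and the contains_any helper are made up for illustration. It also shows what happens when the categories are not mutually exclusive: a tweet with both a happy and a sad word gets subtracted twice, so it has to be added back once.

```python
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
sad_words = ['sad', 'bad', 'tragic', 'unhappy', 'worst']

# Hypothetical sample data, not the tweets from the exercise.
tweets = [
    "What a great day",
    "This is the worst traffic",
    "good start but a sad ending",
    "The meeting is at noon",
    "Compiling the code now",
]

def contains_any(tweet, keywords):
    """Hypothetical helper: True if at least one keyword occurs in the tweet."""
    return any(word in tweet for word in keywords)

number_of_happy_tweets = sum(contains_any(t, happy_words) for t in tweets)
number_of_sad_tweets = sum(contains_any(t, sad_words) for t in tweets)
number_of_both = sum(
    contains_any(t, happy_words) and contains_any(t, sad_words) for t in tweets
)

# Plain subtraction is only right if no tweet is both happy and sad.
neutral_by_subtraction = len(tweets) - number_of_happy_tweets - number_of_sad_tweets

# Adding back the overlap corrects for tweets that were subtracted twice.
neutral_corrected = neutral_by_subtraction + number_of_both

print(neutral_by_subtraction)           # 1
print(neutral_corrected)                # 2
print(neutral_corrected / len(tweets))  # 0.4
```

If your data has no tweet containing both kinds of keyword, the two numbers are the same and the simple subtraction is enough.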

That, IMHO, is what makes (anti)SocialMedia polling so biased in the first place.
It’s all a matter of perspective.

At the same time there is nothing stopping a researcher from looking at whether the words chosen for focus occur individually or in groups.

Good question. If your tweet contains multiple happy or sad words, it would be counted more than once, which is a problem.
So you have to use a break statement right after incrementing your counter, to exit the for loop once you find one occurrence of a happy or sad word.
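
For example, with the tweet from the question (the variable names here are just for illustration):

```python
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
tweet = "I'm happy that my good code has compiled without any errors"

# Without break: the same tweet is counted once per matching keyword.
count_without_break = 0
for word in happy_words:
    if word in tweet:
        count_without_break += 1

# With break: the loop stops at the first match, so the tweet is counted once.
count_with_break = 0
for word in happy_words:
    if word in tweet:
        count_with_break += 1
        break

print(count_without_break)  # 2, because both 'happy' and 'good' match
print(count_with_break)     # 1
```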

Hope it helps.
