Clean Up Your Personal Twitter Timeline by Clustering Tweets

Here's a little bit of gossip for you: The original project for this title had to do with detecting foreign influence on US elections in social media. At about the same time, I was also applying for a visa to the United States, to give a series of talks. It later transpired that I hadn't needed the visa after all; ESTA covered all the things I had wanted to do in the United States. But as I was preparing for the visa, an attorney gave me a very stern talking-to about writing a book on the politics of the United States. The general advice was this: if I don't want trouble with US Customs and Border Protection, I should not write or say anything on social media about American politics, and especially not write a chapter of a book on it. So, I had to hastily rewrite this chapter. The majority of methods used in this chapter can be used for the original purpose, but the content is a lot milder.

I use Twitter a lot. I mainly tweet and read Twitter in my downtime. I follow many people who share my interests, which include machine learning, artificial intelligence, Go, linguistics, and programming languages. These people not only share interests with me; they also share interests with one another. As such, multiple people will sometimes tweet about the same topic.

As may be obvious from the fact that I use Twitter a lot, I am a novelty junkie. I like new things. Multiple people tweeting about the same topic is nice if I am interested in the differing viewpoints, but I don't use Twitter like that. I use Twitter as a sort of summary of interesting topics. Events X, Y, and Z happened. It's good enough that I know they happened. For most topics, there is no benefit for me to go deep and learn what the finer points are, and 140 characters is not a lot of characters for nuance anyway. Therefore, a shallow overview is enough to keep my general knowledge abreast of the rest of the population's.

Thus, when multiple people tweet about the same topic, that's repetition in my newsfeed. That's annoying. What if, instead of that, my feed could just be one instance of each topic?

I think of my Twitter-reading habit as happening in sessions. Each session is typically five minutes. I really only read about 100 tweets each session. If the people I follow overlap on topics so much that only 30% of those 100 tweets cover a distinct topic, then I have really only read 30 tweets of real content. That's not efficient at all! Efficiency means being able to cover more topics per session.

So, how do you increase efficiency in reading tweets? Well, remove the repeated tweets that cover the same topic, of course! There is the secondary matter of choosing the best tweet that summarizes the topic, but that's a subject for another day.
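
To make the idea concrete, here is a minimal sketch in Go of what "one instance of each topic" could look like. It assumes each tweet already carries a topic label; the `Tweet` type, its `Topic` field, and the `Dedupe` and `Efficiency` helpers are all hypothetical names made up for this illustration, and the labels are written by hand rather than produced by any clustering. Efficiency is taken here to mean distinct topics divided by tweets read, which is one reasonable reading of "more topics per session".

```go
package main

import "fmt"

// Tweet is a hypothetical record. Topic stands in for a label that a
// clustering step would assign; here it is simply filled in by hand.
type Tweet struct {
	Author string
	Topic  string
	Text   string
}

// Dedupe keeps the first tweet seen for each topic and preserves feed order.
// Picking the "best" tweet per topic is deliberately left out.
func Dedupe(tweets []Tweet) []Tweet {
	seen := make(map[string]bool)
	var out []Tweet
	for _, t := range tweets {
		if seen[t.Topic] {
			continue // this topic has already appeared in the feed
		}
		seen[t.Topic] = true
		out = append(out, t)
	}
	return out
}

// Efficiency is the fraction of tweets that introduce a new topic.
func Efficiency(tweets []Tweet) float64 {
	if len(tweets) == 0 {
		return 0
	}
	return float64(len(Dedupe(tweets))) / float64(len(tweets))
}

func main() {
	feed := []Tweet{
		{"alice", "go-release", "New Go release is out"},
		{"bob", "go-release", "Go release notes are up"},
		{"carol", "ml-paper", "Interesting new ML paper"},
		{"dave", "go-release", "Upgraded to the new Go"},
	}
	clean := Dedupe(feed)
	fmt.Printf("%d of %d tweets kept, efficiency before: %.2f\n",
		len(clean), len(feed), Efficiency(feed))
	// prints "2 of 4 tweets kept, efficiency before: 0.50"
}
```

After deduplication every remaining tweet covers a new topic, so the same five-minute session covers more ground. The hard part, of course, is producing the topic labels in the first place.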
