With a user base of over 200 million people, who collectively post 70 million new photographs and short videos a day, Instagram provides a timely view of events around the world that people are interested in. But how does the service spot emerging trends while they are still on the rise?
Two of the company’s engineers, Danilo Resende and Udi Weinsberg, have posted a blog item that reveals some of the company’s secrets in spotting emerging trends.
The details of the company’s algorithms might interest other Internet-facing services that also want to harness the collective interest of the masses to produce more timely content. They also shed more light on how Instagram, and presumably other social networking sites, determines what counts as a trend. Like Twitter, Instagram generates its trending topics automatically.
Instagram is a mobile app for sharing photos and videos, with the majority of its users posting content captured on their phones. As a result, many people capture pivotal moments of events as they unfold, at least for events with pictorial elements.
For instance, a week ago, when the U.S. Supreme Court handed down its affirmative ruling on gay marriage, Instagram received thousands of new photos with the hashtag “#equality,” many taken on the steps of the Supreme Court building in Washington. A hashtag is a way to annotate content on social media services by placing the hash symbol in front of a keyword or phrase that describes the topic.
Trends and hashtags play an increasingly important role in getting users to spend more time on the service. Last week, the company revamped its “Explore” feature to include more trending content to peruse. It also revised its search feature to highlight current trends.
With so many users posting with multiple hashtags, how does Instagram spot emerging hot topics for its users? The company keeps a database of all the hashtags ever used, along with how often each has appeared, on average, in every five-minute interval over the past seven days. If a tag is suddenly more popular than usual, a trend may be afoot.
The researchers noted that they could use more complex neural network-driven models to calculate when a hashtag becomes popular enough to count as a genuine trend. But a simple comparison to the prior seven days’ worth of measurements does the job well enough, they wrote, and can spot the big trends with relatively modest processing and memory requirements.
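To make the idea concrete, here is a minimal Python sketch of that kind of comparison. It is not Instagram's code: the function name, the ratio threshold and the minimum count are assumptions chosen for illustration, and the company's actual scoring is not described in this detail.

```python
from statistics import mean

# Seven days of five-minute intervals: 7 * 24 * 12 = 2016 counts per hashtag.
WINDOWS_PER_WEEK = 7 * 24 * 12


def is_spiking(history, current, ratio=3.0, min_count=20):
    """Return True when the current five-minute count is far above the weekly average.

    history   -- past five-minute counts for one hashtag (hypothetical input)
    current   -- count for the interval that just ended
    ratio     -- assumed threshold: how many times the baseline counts as a spike
    min_count -- assumed floor so very rare tags do not trend on a handful of posts
    """
    if not history:
        return False
    baseline = mean(history)
    # Adding 1 keeps the ratio finite for tags whose baseline is zero.
    return current >= min_count and current / (baseline + 1.0) >= ratio
```

A tag that usually appears 10 times per interval and suddenly appears 100 times would be flagged, while one that moves from 10 to 12 would not.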
The model also takes into account the decline of hashtags when an event is over.
“The amount of posts using a hashtag that is trending at the moment will naturally decrease as soon as the event is finished,” they write. This can be problematic insofar as people still want to see pictures of an event after it has happened, so Instagram built in a decay, or half-life, function that helps highlight the trend in the hours following the event itself.
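A half-life function of this sort can be sketched in a few lines of Python. The two-hour half-life, the function names and the rule of taking the larger of the live and decayed scores are all assumptions made for illustration, not details from the engineers' post.

```python
def decayed_score(peak_score, hours_since_peak, half_life_hours=2.0):
    """Halve the peak score once for every half-life that has elapsed."""
    if hours_since_peak <= 0:
        return peak_score
    return peak_score * 0.5 ** (hours_since_peak / half_life_hours)


def trend_score(live_score, peak_score, hours_since_peak, half_life_hours=2.0):
    """Keep a fading trend visible: use whichever is larger, the live score
    or the decayed remainder of its best score (an assumed combination rule)."""
    return max(live_score, decayed_score(peak_score, hours_since_peak, half_life_hours))
```

Under these assumptions, a tag that peaked at a score of 8 would still score 4 two hours later and 2 after four hours, even if almost nobody is posting with it anymore.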
Another potential confounding issue is that multiple hashtags may be applied to the same event. For instance, the #fashionweek tag is frequently joined by #model and #fashion.
So the development teams wrote an algorithm that clusters together hashtags that refer to the same event. It looks at how often hashtags are paired together, such as #equality and #lovewins. It looks at words that are very similar, in order to detect misspellings, so that #valentinesday and #valentineday are clumped together. It also runs an internal tool that classifies tags into a predefined set of topics.
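The first two of those signals, how often tags are paired and how similar their spellings are, can be illustrated with a short Python sketch. It is a simplified stand-in rather than Instagram's algorithm: the thresholds and the function name are assumptions, spelling similarity is measured with the standard library's difflib, and the internal topic classifier is left out because nothing is known about it.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def cluster_hashtags(posts, min_pairs=2, min_similarity=0.9):
    """Group hashtags that co-occur often or are spelled almost identically.

    posts          -- iterable of hashtag lists, one list per post (hypothetical input)
    min_pairs      -- assumed minimum number of posts in which two tags appear together
    min_similarity -- assumed difflib ratio above which two tags count as near-duplicates
    """
    pair_counts = Counter()
    tags = set()
    for post_tags in posts:
        unique = sorted(set(post_tags))
        tags.update(unique)
        pair_counts.update(combinations(unique, 2))

    # Union-find: every tag starts in its own cluster.
    parent = {tag: tag for tag in tags}

    def find(tag):
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]  # path compression
            tag = parent[tag]
        return tag

    for a, b in combinations(sorted(tags), 2):
        similar = SequenceMatcher(None, a, b).ratio() >= min_similarity
        if pair_counts[(a, b)] >= min_pairs or similar:
            parent[find(a)] = find(b)

    clusters = {}
    for tag in tags:
        clusters.setdefault(find(tag), []).append(tag)
    return sorted(sorted(members) for members in clusters.values())
```

Comparing every pair of tags is fine for a demonstration but would not scale to a full hashtag database; a production system would need to narrow the candidates first.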
“When we approached trending, we tried to break this project down into smaller problems that could be tackled separately by components with a very specific function. As a result, each individual in our team was able to focus on one problem at a time before moving onto the next one,” the researchers wrote.