Twitter publishes code that it claims determines what tweets people see and why.

Twitter delivered on one of CEO Elon Musk’s many promises by posting on Friday afternoon what it claims is the code for its tweet recommendation algorithm on GitHub.

The code, published under the GNU Affero General Public License v3.0, contains numerous details about which factors make a tweet more or less likely to appear in a user’s timeline.

In a blog post accompanying the release of the code, the Twitter engineering team (no individual byline) notes that the system for determining “the most popular tweets that end up showing up on your device’s timeline for you” “comprises many interconnected services and jobs.” Every time the Twitter home screen is refreshed, Twitter pulls “the top 1,500 tweets from hundreds of millions,” the post says.

The largest source of these tweets is “In-Network Sources,” or users that someone follows. The top tweets from this pile are ranked by the likelihood of a user interacting with the author of that tweet; the higher the likelihood, the more that author’s tweets appear in For You. For “Out-of-Network Sources” not followed by the user, Twitter says it considers tweets that draw engagement from people the user follows and tweets liked by those who like tweets similar to the ones the user likes.
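To make the two candidate pools concrete, here is a minimal Python sketch of the split described above. It is not Twitter's code: the `Tweet` type, the `score` field (standing in for an engagement-likelihood estimate) and both function names are assumptions made purely for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author: str
    score: float  # hypothetical engagement-likelihood estimate, higher is better


def split_by_network(tweets, following):
    """Return (in_network, out_of_network), each sorted by score, best first."""
    in_network = [t for t in tweets if t.author in following]
    out_of_network = [t for t in tweets if t.author not in following]

    def by_score(tweet):
        return tweet.score

    return (
        sorted(in_network, key=by_score, reverse=True),
        sorted(out_of_network, key=by_score, reverse=True),
    )


def top_candidates(tweets, following, limit=1500):
    """Merge the two ranked pools and keep the `limit` best tweets overall."""
    in_network, out_of_network = split_by_network(tweets, following)
    merged = heapq.merge(
        in_network, out_of_network, key=lambda t: t.score, reverse=True
    )
    return list(islice(merged, limit))
```

The default `limit` of 1,500 mirrors the figure quoted from the blog post; how the real system weighs the two pools against each other is not something this sketch tries to reproduce.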

Already those who have looked at the code have noticed considerations that raise many more questions. Many posted them, of course, on Twitter itself.

Twitter just released the source code of the “algorithm”.

Oh what is this file? Predicates for tweets on home timeline?

Oh what is that second picture? pic.twitter.com/UE3dU8e3Os

March 31, 2023

Olafur Waage, senior software engineer at Norwegian software consultancy TurtleSec, noted that inside “HomeTweetTypePredicates.scala,” some of the possible considerations for whether a tweet could be a candidate for the “For You” section are as follows:

  • author_is_elon
  • author_is_power_user
  • author_is_democrat
  • author_is_republican

Elsewhere in the code, a code comment purportedly left by a Twitter engineer clarifies that these identification values are “used solely for collecting metrics.” The comment reads:

These author ID lists are used solely for collecting metrics. We track how often we serve these authors’ tweets and how often their tweets impress users. This helps us confirm on our A/B experimentation platform that we’re not submitting changes that negatively impact one group over others.

The names of the objects in question, such as “DDGStatsDemocratsFeature” or “DDGStatsElonFeature,” seem to support this interpretation, though it may not be possible to confirm this with the available code. Still, it is notable that Twitter checks and correlates these variables. During a Twitter Spaces audio session, a Twitter engineer noted that the labels used for the metrics were Democrats and Republicans. Musk, who claimed he didn’t know about the labels until today, suggested they shouldn’t be there.

Other things considered in relation to a tweet include whether it is less than 30 minutes old, whether it has images, and whether its author is a “power user,” which some say means a “legacy” verified account.
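As a rough illustration of how named predicates like these can label a tweet and feed metrics without touching ranking, consider the Python sketch below. It is an assumption-laden stand-in, not a port of the Scala file: the `Candidate` type, the `POWER_USERS` list, and the names `is_recent`, `has_image`, `matching_labels` and `count_labels` are hypothetical, and only `author_is_power_user` echoes a name reported from the released code.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Candidate:
    author_id: int
    created_at: datetime
    has_image: bool


# Hypothetical author ID list, consulted here only to label tweets for metrics.
POWER_USERS = frozenset({42})

# Each predicate maps a label to a yes/no test on a candidate tweet.
PREDICATES = {
    "is_recent": lambda c, now: now - c.created_at < timedelta(minutes=30),
    "has_image": lambda c, now: c.has_image,
    "author_is_power_user": lambda c, now: c.author_id in POWER_USERS,
}


def matching_labels(candidate, now):
    """Return the labels whose predicate holds for this candidate."""
    return [name for name, test in PREDICATES.items() if test(candidate, now)]


def count_labels(candidates, now):
    """Tally how often each label applies across the served candidates."""
    counts = Counter()
    for candidate in candidates:
        counts.update(matching_labels(candidate, now))
    return counts
```

In this sketch the tallies are only counted, never used to reorder tweets, which is the behavior the code comment describes; whether the released code behaves the same way throughout is exactly what outside readers are still checking.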

Today, most of the recommendation algorithm will be made open source. The rest will follow.

The acid test is that independent third parties must be able to determine with reasonable accuracy what is likely to be shown to users.

No doubt there will be many awkward moments… https://t.co/41U4oexIev

March 31, 2023

Musk tweeted a link to the recommendation algorithm along with the company blog post, arguing that the “acid test” would be whether “independent third parties” could “determine with reasonable accuracy what is likely to be shown to users.”

Twitter’s release of its algorithm code comes just days after the social network’s broader source code was discovered on GitHub, where it had potentially been sitting for months, according to The New York Times. Twitter then obtained a subpoena compelling GitHub to reveal information about the person who posted it.

A report by Platformer earlier this week said that Twitter used a secret list of 35 top Twitter users, including President Biden, LeBron James, Ben Shapiro and Musk. Evidence of the implementation of this list, reportedly prompted in part by Musk’s dissatisfaction with his own engagement, has yet to be found in the codebase posted by Twitter.

Notably, the code arrives just a few hours before “legacy verified” users (those who were given a blue tick to indicate authenticity or notability before Musk bought the service) are due to be deprecated in favor of paid Twitter Blue subscribers. While some users associated with governments and large organizations may apply for other colored checkmarks, only $8/month Twitter Blue subscribers will receive “priority rating in conversations,” among other things.

All of these changes take place on April 1, or April Fools’ Day.
