Updated · May 15 2026
interactive guide

The X algorithm, reverse-engineered in plain sight.

A visual, interactive walkthrough of the open-sourced xai-org/x-algorithm stack — Home Mixer, Thunder, Phoenix and the candidate pipeline. Read the code, tune the weights, watch the feed re-rank in real time.

Phoenix transformer
Predicts 19 actions per candidate
Two retrieval surfaces
Thunder in-net · Phoenix out-of-net
9-stage pipeline
Source → score → diversify → blend in ~320ms
Candidate isolation
Batch-independent, cacheable scores
02 Pipeline architecture

Nine stages from post to feed.

The Home Mixer assembles your For You feed through a strictly ordered sequence. Sources run in parallel, hydrators enrich, filters strip, scorers rank, and a blender finally weaves ads and modules into the organic stream.

total budget
~320 ms
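The ordering described above can be sketched in a few lines. This is a minimal illustration, assuming each stage is a plain function; `Candidate`, `run_pipeline` and the toy stages are hypothetical names and are not taken from the Home Mixer source, which is written in Rust. The final ad and module blending step is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Candidate:
    post_id: int
    author_id: int
    features: dict = field(default_factory=dict)
    score: float = 0.0


def run_pipeline(sources, hydrators, filters, scorers, top_k):
    # 1. Sources run in parallel; their batches are merged into one list.
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda source: source(), sources))
    candidates = [c for batch in batches for c in batch]
    # 2. Hydrators enrich every candidate in place.
    for hydrate in hydrators:
        for c in candidates:
            hydrate(c)
    # 3. Filters strip candidates; a filter returns True to keep one.
    for keep in filters:
        candidates = [c for c in candidates if keep(c)]
    # 4. Scorers run in order; each one returns the candidate's new score.
    for score in scorers:
        for c in candidates:
            c.score = score(c)
    # 5. Selection: highest score first, truncated to top_k.
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:top_k]


# Toy stages, purely illustrative.
def in_network():
    return [Candidate(1, 10), Candidate(2, 9)]


def out_of_network():
    return [Candidate(3, 11)]


def add_likes(c):
    c.features["likes"] = c.post_id * 5


feed = run_pipeline(
    sources=[in_network, out_of_network],
    hydrators=[add_likes],
    filters=[lambda c: c.author_id != 9],  # pretend author 9 is blocked
    scorers=[lambda c: float(c.features["likes"])],
    top_k=2,
)
print([c.post_id for c in feed])
```

Filters sit before scorers here for the same reason the guide gives later: anything removed early never costs a model call.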
stage 01
Who is asking, and what do we know about them?

Query hydration

21 parallel hydrators load the viewer's followed accounts, mutual-follow MinHash, impression bloom filters, served history, blocked users, muted keywords, demographic context and the user's recent engagement sequence.

  • FollowedUserIdsHydrator
  • ImpressionBloomFilterHydrator: fast 'seen?' lookups
  • UserActionSeqHydrator: engagement history → Phoenix
  • ServedHistoryHydrator
  • BlockedUserIdsHydrator
  • MutualFollowQueryHydrator
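Running hydrators in parallel means total latency is set by the slowest one, not by the sum. A rough sketch of that fan-out using `asyncio.gather`; the three coroutine bodies and the `hydrate_query` helper are hypothetical stand-ins, and only the hydrator names echo the list above.

```python
import asyncio


async def followed_user_ids(viewer_id):
    await asyncio.sleep(0.01)  # stands in for a remote lookup
    return {101, 102}


async def blocked_user_ids(viewer_id):
    await asyncio.sleep(0.01)
    return {666}


async def served_history(viewer_id):
    await asyncio.sleep(0.01)
    return [9001, 9002]


HYDRATORS = {
    "followed_user_ids": followed_user_ids,
    "blocked_user_ids": blocked_user_ids,
    "served_history": served_history,
}


async def hydrate_query(viewer_id):
    names = list(HYDRATORS)
    # gather() starts every hydrator at once and returns results in call order.
    results = await asyncio.gather(*(HYDRATORS[n](viewer_id) for n in names))
    return dict(zip(names, results))


query = asyncio.run(hydrate_query(viewer_id=42))
print(sorted(query))
```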
03 Post → feed

The full journey of a single post.

Every post you read in the For You feed travelled this path: from a Kafka message at the moment of creation to a ranked, blended item in someone's timeline. The serving half of the journey, from feed request to response, usually takes a few hundred milliseconds.

step 01
Post created

Author posts. Thunder consumes the Kafka event and stores it for sub-millisecond lookups.

step 02
Thunder ingests

Realtime in-memory store. Original posts, replies, video posts kept per-user.

step 03
Phoenix indexes

Candidate tower embedding stored for ANN retrieval over the global corpus.

step 04
Viewer requests feed

Home Mixer hydrates the viewer's context: follows, mutes, MinHash, history.

step 05
Two retrieval surfaces

Thunder returns in-network candidates; Phoenix returns out-of-network candidates.

step 06
Hydrate + filter

Add author info & engagement counts. Strip blocked, muted, stale, duplicates.

step 07
Phoenix predicts

Transformer predicts P(action) for every candidate × 19 actions.

step 08
Score · diversify · blend

Σ wᵢ·P(actionᵢ), attenuate repeated authors, slot ads, return top 250.

total path · ~320 ms server-side
04 Candidate sourcing

In-network and out-of-network, in parallel.

Two completely separate retrieval systems run side by side and feed their results into the same ranker. One is a realtime in-memory store. The other is a transformer-based nearest-neighbour search over the global post corpus.

thunder
in-network

Realtime in-memory store of posts from accounts you follow.

Thunder consumes post create/delete events from Kafka and keeps per-user collections for original posts, replies, reposts and video posts. Old posts are trimmed automatically. Lookups for in-network content happen in sub-millisecond time without ever hitting a DB.

  • Source: Kafka topic of post events
  • Storage: Per-user buckets, trimmed by age
  • Latency: <1 ms median lookup
  • Recall: 100% of follows · recent posts only
home-mixer/sources/thunder_source.rs
rust
let request = GetInNetworkPostsRequest {
    user_id: query.user_id,
    following_user_ids: following_list,
    max_results: ThunderMaxResults,
    exclude_tweet_ids: query.seen_ids,
    algorithm: ThunderAlgorithm,
};
// returns recent posts from people you follow
let response = client.get_in_network_posts(request).await?;
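To make the storage model concrete, here is a small sketch of a per-author in-memory store with age-based trimming. It is an illustration under stated assumptions, not Thunder's implementation: the class name, the one-hour retention window and the method names are all hypothetical, and events are assumed to arrive in timestamp order for each author.

```python
from collections import defaultdict, deque


class InMemoryPostStore:
    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        # author_id -> deque of (created_at, post_id), oldest on the left
        self.posts_by_author = defaultdict(deque)

    def ingest_create(self, author_id, post_id, created_at):
        self.posts_by_author[author_id].append((created_at, post_id))

    def ingest_delete(self, author_id, post_id):
        bucket = self.posts_by_author[author_id]
        self.posts_by_author[author_id] = deque(p for p in bucket if p[1] != post_id)

    def trim(self, now):
        # Drop posts older than the retention window from every bucket.
        for bucket in self.posts_by_author.values():
            while bucket and now - bucket[0][0] > self.max_age_s:
                bucket.popleft()

    def get_in_network_posts(self, following_user_ids, exclude_ids, max_results, now):
        self.trim(now)
        hits = [
            (created_at, post_id)
            for author_id in following_user_ids
            for created_at, post_id in self.posts_by_author.get(author_id, ())
            if post_id not in exclude_ids
        ]
        hits.sort(reverse=True)  # newest first
        return [post_id for _, post_id in hits[:max_results]]


store = InMemoryPostStore(max_age_s=3600)  # hypothetical retention window
store.ingest_create(author_id=1, post_id=100, created_at=0)
store.ingest_create(author_id=1, post_id=101, created_at=3000)
store.ingest_create(author_id=2, post_id=200, created_at=3500)
store.ingest_create(author_id=3, post_id=300, created_at=3600)

fresh = store.get_in_network_posts([1, 2], exclude_ids={200}, max_results=10, now=4000)
everything = store.get_in_network_posts([1, 2], exclude_ids=set(), max_results=10, now=4000)
print(fresh, everything)
```

Post 100 is trimmed because it is older than the window, post 300 never appears because author 3 is not followed, and post 200 is skipped in the first call because the viewer has already seen it.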
phoenix
out-of-network

Two-tower transformer that finds posts you haven't seen yet.

A user tower encodes your engagement history into one embedding; a candidate tower has already encoded every post in the global corpus. Approximate nearest-neighbour search by dot product returns a few hundred top candidates from outside your follow graph.

  • Architecture: Two-tower, Grok-based transformer
  • Embedding: 128-dim · multi-hash lookups
  • Index: ~537K-item demo corpus · production is far larger
  • Output: Top-K candidates by dot product (equal to cosine similarity when embeddings are L2-normalised)
phoenix/recsys_retrieval_model.py
python
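import jax.numpy as jnp

def retrieve(user_features, history, corpus_embeddings, k=200):
    # User tower encodes the viewer + their recent actions.
    # `user_tower` is the model's user encoder; its definition is not shown here.
    user_emb = user_tower(user_features, history)  # [B, D]
    # Candidate tower has pre-encoded the corpus: corpus_embeddings is [N, D]
    scores = jnp.matmul(user_emb, corpus_embeddings.T)  # [B, N]
    # Indices of the k highest-scoring posts per viewer, best first
    top_k = jnp.argsort(-scores, axis=-1)[:, :k]  # [B, k]
    return top_k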
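A dependency-free toy version shows why dot product and cosine similarity give the same ranking once both sides are L2-normalised. Everything here is illustrative: the two-dimensional vectors, the `retrieve_top_k` helper and the exhaustive scan are assumptions, and a production system would use an approximate nearest-neighbour index instead of scoring every post.

```python
import heapq
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_emb, corpus, k):
    user = l2_normalize(user_emb)
    # On unit vectors the dot product is exactly the cosine similarity.
    scored = ((dot(user, l2_normalize(emb)), post_id) for post_id, emb in corpus.items())
    return [post_id for _, post_id in heapq.nlargest(k, scored)]


corpus = {
    "a": [1.0, 0.0],
    "b": [0.6, 0.8],
    "c": [-1.0, 0.0],
    "d": [0.0, 5.0],  # long vector, but length is ignored after normalising
}
top = retrieve_top_k([3.0, 1.0], corpus, k=2)
print(top)
```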
Why two sources?

In-network sourcing guarantees coverage of people you've chosen to follow. Out-of-network sourcing prevents the feed from collapsing into a small clique — and it's the only way new creators ever reach you. Production weights the two with the OonWeightFactor multiplier (educational default 0.75).
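The effect of the multiplier is easy to see with two numbers. A minimal sketch, assuming the factor is simply multiplied into the score of out-of-network candidates; `apply_oon_factor` and the example scores are hypothetical.

```python
OON_WEIGHT_FACTOR = 0.75  # the default quoted above


def apply_oon_factor(score, in_network, factor=OON_WEIGHT_FACTOR):
    # In-network scores pass through; out-of-network scores are scaled down.
    return score if in_network else score * factor


ranked = sorted(
    [
        ("followed_post", apply_oon_factor(1.0, in_network=True)),
        ("oon_post", apply_oon_factor(1.2, in_network=False)),
    ],
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

With a factor of 0.75, an out-of-network post needs a raw score more than a third higher than an in-network post to outrank it.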

05 Signals and weights

Every engagement has a price. Some are wildly more expensive than others.

Phoenix predicts the probability of each action, and a weighted sum of those probabilities becomes the score. By the weights below, a follow is worth 100× a favorite and 50,000× a passive dwell, and a single report subtracts about 740× what a favorite adds.

weighted score contribution · sorted by |w|
  • Report: -369
  • Block author: -120
  • Mute author: -100
  • Not interested: -74
  • Follow author: +50
  • Reply: +27
  • Profile click: +12
  • Share → DM: +2.5
  • Not-dwelled: -2
  • Share: +1.5
  • Quote: +1.2
  • Share → Copy link: +1.2
  • Repost: +1
  • Favorite: +0.5
  • Link click: +0.12
  • Quoted click: +0.08
  • Photo expand: +0.03
  • Dwell time: +0.01
  • Video quality view: +0.005
  • Dwell: +0.001

Each row above represents one of the action probabilities Phoenix predicts. The bar shows the weight applied to that probability before summing into the final score.

formula
final_score = Σᵢ wᵢ · P(actionᵢ)
where wᵢ is the weight and P(actionᵢ) is the probability the Phoenix transformer predicts.
positive signals: 15 (amplify reach)
negative signals: 5 (suppress reach)
strongest +: +50 (follow_author)
strongest −: −369 (report)
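The weighted sum is short enough to write out in full. The weights below are copied from the list in this section; the snake_case keys, the `final_score` helper and the example probabilities are assumptions made for illustration.

```python
WEIGHTS = {
    "report": -369,
    "block_author": -120,
    "mute_author": -100,
    "not_interested": -74,
    "follow_author": 50,
    "reply": 27,
    "profile_click": 12,
    "share_dm": 2.5,
    "not_dwelled": -2,
    "share": 1.5,
    "quote": 1.2,
    "share_copy_link": 1.2,
    "repost": 1,
    "favorite": 0.5,
    "link_click": 0.12,
    "quoted_click": 0.08,
    "photo_expand": 0.03,
    "dwell_time": 0.01,
    "video_quality_view": 0.005,
    "dwell": 0.001,
}


def final_score(probabilities):
    # final_score = sum over actions of weight * predicted probability
    return sum(WEIGHTS[action] * p for action, p in probabilities.items())


# Hypothetical predictions for one candidate.
example = {"favorite": 0.10, "reply": 0.02, "report": 0.001}
print(final_score(example))  # 0.05 + 0.54 - 0.369, about 0.221
```

Even a 0.1% predicted report probability cancels most of the value of a 2% reply probability, which is why the negative weights dominate in practice.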
06 How posts go viral

Distribution unlocks in stages.

Virality on X isn't a single event. It is a chain of unlocks where engagement at one tier qualifies the post for the next, larger pool. Each tier amplifies the reach by orders of magnitude.

[Interactive chart: distribution unlock, 'slow burn' scenario. Axis: P(impressions). Values shown: favorite 32, reply 11, repost 4, follow 2. Amplification headroom ×14.]

Each tier doesn't add reach — it multiplies the pool Phoenix considers for retrieval. Profile clicks and follows are the cleanest viral catalysts because the model treats them as evidence the author is worth long-term re-exposure.

what actually drives the unlock
Dwell time, not likes.

A 2-second dwell on a long video weighs more than ten silent likes. The cont_dwell_time signal is one of the highest-impact passive metrics.

Replies > reposts.

A reply costs effort. Phoenix correctly treats it as harder evidence of interest. Replies also keep your post pinned to the conversation tree, which DedupConversation rewards.

Follows are the catalyst.

A follow during a single session is the rarest, strongest positive. It signals 'I want this author in my feed permanently.' Subsequent posts get boosted on retrieval AND scoring.

07 Decay · timing · diversity

Posts get cheaper as they age.

Three mechanisms quietly suppress reach: the hard 36-hour age filter kills old posts entirely; freshness reduces effective score; and author-diversity attenuation drops your second, third and fourth post per feed by a geometric ratio.

freshness multiplier
half-life: 90m

Score halves every ~90 minutes.

The For You feed strongly prefers fresh content. With a 90-minute half-life, a post at 24h has been through 16 halvings, so its multiplier is about 1/65,536 of its peak. Past 36h, the AgeFilter kills the candidate entirely before scoring even runs.
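A half-life translates directly into an exponential multiplier. The sketch below assumes plain exponential decay with the 90-minute half-life and the 36-hour cutoff quoted above; the exact functional form used in production is not stated in this guide, and `freshness_multiplier` is a hypothetical name.

```python
HALF_LIFE_MIN = 90.0
MAX_AGE_MIN = 36 * 60  # hard cutoff applied by the age filter


def freshness_multiplier(age_minutes):
    if age_minutes > MAX_AGE_MIN:
        return 0.0  # candidate is dropped before scoring
    # Halve once per elapsed half-life.
    return 0.5 ** (age_minutes / HALF_LIFE_MIN)


for hours in (1, 6, 12, 24, 36):
    print(f"{hours:>2}h -> {freshness_multiplier(hours * 60):.6f}")
```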

[Chart: freshness multiplier, from 1.0 down to 0.0, against post age at 1h, 6h, 12h, 24h and 36h.]
author diversity decay
AuthorDiversityScorer

Your second post per feed already pays a tax.

When the same author has multiple candidates in a feed, the second is multiplied by 0.7, the third by 0.55, then 0.475, then 0.4375, and so on, approaching the 0.4 floor. Posting back-to-back hurts your individual scores.

#1 1.00 · #2 0.70 · #3 0.55 · #4 0.48 · #5 0.44 · #6 0.42 · #7 0.41 · #8 0.40
Hard age cutoff
36h

AgeFilter drops anything older.

Diversity decay
0.50

Multiplier = (1 − 0.4) · 0.5^pos + 0.4
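The multiplier formula above can be checked in a few lines. In this sketch `pos` counts from 0 for an author's first post in the feed; the `attenuate` helper, which walks a best-first list and tracks how often each author has appeared, is a hypothetical illustration and not the AuthorDiversityScorer source.

```python
DECAY = 0.5
FLOOR = 0.4


def diversity_multiplier(pos):
    # Multiplier = (1 - floor) * decay^pos + floor
    return (1 - FLOOR) * DECAY ** pos + FLOOR


def attenuate(ranked):
    """ranked: list of (author_id, score), best first."""
    seen = {}
    out = []
    for author_id, score in ranked:
        pos = seen.get(author_id, 0)
        out.append((author_id, score * diversity_multiplier(pos)))
        seen[author_id] = pos + 1
    return out


print([round(diversity_multiplier(p), 4) for p in range(5)])
print(attenuate([("a", 1.0), ("b", 0.9), ("a", 0.8)]))
```

In the second call, author "a" keeps the full 1.0 on the first post but the second is scaled by 0.7, which drops it from 0.8 to about 0.56.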

Min video for VQV
10s

Shorter videos lose the VQV bonus entirely.

09 Cold-start · distribution tiers

Every post starts on a small bench.

The ranker tests new posts against a tiny audience first. Pass that test, and Phoenix expands the audience — fail it, and the post gets quietly retired regardless of who the author is.

tier 00
0 – 200 imp
Author bench

First exposure to an in-network slice. The Phoenix ranker decides if the post even deserves to live past this tier based on the early dwell, replies and like-rate.

tier 01
200 – 5K imp
Follower cohort

Thunder pushes the post into more in-network feeds. If reply rate stays above the author's baseline, it qualifies for OON Phoenix retrieval.

tier 02
5K – 100K imp
OON expansion

Phoenix retrieval starts including the post for users whose engagement embedding is close to the post's. Profile clicks at this tier matter more than reposts.

tier 03
100K+ imp
Cascade

Network effects compound. Each tier-2 follow re-ranks all of that user's future feeds toward the author. Reach now grows non-linearly.

08 Negative signals & suppression

The fastest way to disappear.

Suppression on X is not a single shadowban switch. It is many quiet, cumulative penalties — some scored, some filtered, some applied long before your post ever reaches the ranker.

-120
Block

Strongest single negative on the author. Subsequent posts from them get a discounted score in your feed and your follow-graph features quietly de-weight them globally.

-369
Report

Triggers visibility filtering at the VFFilter stage. If enough reports cluster, the post can hit Action::Drop and never appear, regardless of score.

-100
Mute author

Soft block. AuthorSocialgraphFilter strips their candidates entirely for you, and your action sequence carries the negative for future authors of similar style.

-74
Not interested

User-facing 'show me less' tap. Phoenix learns this as a labelled negative and downstream similar posts get scored lower.

-2
Not-dwelled

Implicit negative for posts the user scrolls past without stopping. Individually small, but accumulates fast.

policy filters

These don't lower your score — they delete the candidate.

Visibility filters and policy filters run as kill-switches, not re-rankers. If your post matches one, no amount of engagement can push it back into anyone's feed.

Filters run pre-score, so suppressed posts never even reach Phoenix.
VFFilter

Removes posts the visibility-filtering system has marked spam, violence, gore or deleted. Anything with Action::Drop on its safety result is gone before the user ever sees it.

MutedKeywordFilter

Tokenizes both your muted keywords and the post text via TweetTokenizer. Any match removes the candidate. Acts pre-score so muted topics never enter ranking at all.
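A rough sketch of token-level matching follows. It assumes a simple lowercase word tokenizer in place of TweetTokenizer and assumes a multi-word keyword must appear as a contiguous run of tokens; both are illustrative guesses, and the function names are hypothetical.

```python
import re


def tokenize(text):
    # Stand-in tokenizer: lowercase runs of word characters.
    return re.findall(r"\w+", text.lower())


def contains_phrase(tokens, phrase):
    n = len(phrase)
    return n > 0 and any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def passes_muted_keywords(post_text, muted_keywords):
    tokens = tokenize(post_text)
    # Any match removes the candidate, so the post passes only if none match.
    return not any(contains_phrase(tokens, tokenize(k)) for k in muted_keywords)


print(passes_muted_keywords("Big crypto giveaway today!", ["Crypto"]))
print(passes_muted_keywords("Cryptography lecture notes", ["crypto"]))
print(passes_muted_keywords("Watching the world cup final", ["World Cup"]))
```

Matching on tokens instead of raw substrings is what keeps a muted "crypto" from also hiding "cryptography".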

AuthorSocialgraphFilter

Drops posts whose author you've blocked or muted, or whose author has blocked you. Also drops posts that quote or repost users you've blocked.

IneligibleSubscriptionFilter

Paywalled / subscription-locked content that the viewer can't access. Removed pre-score so the model never wastes capacity ranking it.

11 Before · After

Tiny edits, huge score deltas.

Three pairs of nearly identical posts. The Phoenix predictions below come from the same model and the same demo viewer for both variants; only the post changed.

before

I've been thinking a lot lately about how the way we measure engineering productivity is completely broken in most companies — most teams optimize for output, not impact, and you end up with people shipping a lot of low-quality features instead of doing the deep, slow work that actually moves a product…

P(fav) 4.0% · P(reply) 0.5% · P(rt) 0.6% · P(dwell) 22.0%
after

Most engineering productivity metrics measure activity, not impact. Here's a better one. ↓

P(fav) 21.0% · P(reply) 3.8% · P(rt) 4.4% · P(dwell) 68.0%

Hookless openings rarely hold attention. Here a single tight sentence lifts P(dwell) about 3× (22% → 68%) and P(reply) about 7.6× (0.5% → 3.8%).
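Plugging the four probabilities into the weighted sum gives a feel for the size of the delta. This is a worked example under assumptions: P(rt) is mapped to the Repost weight (+1) and P(dwell) to the Dwell weight (+0.001), the other predicted actions are ignored, and the variable names are hypothetical.

```python
WEIGHTS = {"favorite": 0.5, "reply": 27, "repost": 1, "dwell": 0.001}

before = {"favorite": 0.040, "reply": 0.005, "repost": 0.006, "dwell": 0.22}
after = {"favorite": 0.210, "reply": 0.038, "repost": 0.044, "dwell": 0.68}


def partial_score(probabilities):
    return sum(WEIGHTS[action] * p for action, p in probabilities.items())


score_before = partial_score(before)  # about 0.161
score_after = partial_score(after)    # about 1.176
ratio = score_after / score_before    # roughly 7.3x
print(round(score_before, 3), round(score_after, 3), round(ratio, 1))
```

Almost all of the gain comes from the reply term, because its weight is 54 times the favorite weight.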

10 Reference

Glossary — words you'll keep meeting.

term
Home Mixer

The Rust orchestration service that runs the candidate pipeline and returns a ranked feed via gRPC. Owns the For You endpoint.

term
Thunder

In-memory realtime store of recent posts from accounts you follow. Serves in-network candidates in sub-ms via the InNetworkPostsService.

term
Phoenix

Grok-based transformer system that handles two things: retrieval (two-tower model) and ranking (transformer with candidate isolation attention).
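Candidate isolation is easiest to picture as an attention mask. In the sketch below, the sequence is laid out as context positions (viewer and history) followed by candidate positions; each candidate may attend to the context and to itself, never to another candidate. The layout, the full context-to-context attention and the function name are assumptions for illustration, not the Phoenix implementation.

```python
def candidate_isolation_mask(num_context, num_candidates):
    """mask[q][k] is True when query position q may attend to key position k."""
    n = num_context + num_candidates
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if k < num_context:
                mask[q][k] = True   # every position can read the context
            elif q == k:
                mask[q][k] = True   # a candidate can read itself
    return mask


mask = candidate_isolation_mask(num_context=3, num_candidates=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Because no candidate ever sees another, a candidate's score does not depend on which other posts share its batch, which is what makes the scores cacheable.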

term
Candidate Pipeline

A reusable Rust framework defining traits — Source, Hydrator, Filter, Scorer, Selector, SideEffect — that compose into recommendation pipelines.

term
VQV signal

Video Quality View — a binary 'meaningful watch' label produced only for videos longer than MIN_VIDEO_DURATION_MS (10 seconds).

term
OON factor

Out-of-network multiplier applied after scoring. Defaults to 0.75 to slightly penalise OON posts; bumped to 1.0 for new users with few follows.

term
Author diversity

Geometric decay applied to repeated authors in a feed. Multiplier = (1−floor) · decay^pos + floor. Production defaults: decay=0.5, floor=0.4.

term
Bloom filter (impressions)

A probabilistic 'has the user seen this?' set delivered with every request. Pre-score filter strips anything that may have already been served.
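A Bloom filter answers "definitely not seen" or "possibly seen", which is why the filter can strip posts that were never actually served but will never let a served one through. A minimal sketch, with the bit count, hash count and class name chosen arbitrarily for illustration:

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "never added"; True means "probably added".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


seen = BloomFilter()
for post_id in (101, 102):
    seen.add(post_id)

candidates = [101, 102, 103, 104]
unseen = [c for c in candidates if not seen.might_contain(c)]
print(unseen)
```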

term
MinHash Jaccard

256-bit signature stored per user. The MutualFollowJaccardHydrator computes overlap with the candidate author's signature — a fast 'shared social graph' score.
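MinHash estimates the Jaccard overlap of two sets from short signatures instead of the sets themselves. The sketch below keeps one full 64-bit minimum per hash function and does not reproduce the 256-bit packing mentioned above; the hash count, helper names and example follow sets are hypothetical, and input sets are assumed to be non-empty.

```python
import hashlib

NUM_HASHES = 64


def _hash(seed, item):
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(ids, num_hashes=NUM_HASHES):
    # For each salted hash function, keep the smallest hash over the set.
    return [min(_hash(seed, i) for i in ids) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    # The fraction of matching slots estimates |A ∩ B| / |A ∪ B|.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


viewer_follows = set(range(0, 100))
author_follows = set(range(50, 150))   # true Jaccard = 50 / 150
stranger_follows = set(range(1000, 1100))

viewer_sig = minhash_signature(viewer_follows)
print(estimated_jaccard(viewer_sig, minhash_signature(author_follows)))
print(estimated_jaccard(viewer_sig, minhash_signature(stranger_follows)))
```

Comparing two fixed-length signatures costs the same no matter how many accounts each user follows, which is what makes this usable per candidate at request time.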

term
VFFilter

Visibility filter — a policy stage that drops candidates whose safety result is Action::Drop (spam, gore, violence, deletes).

term
Grox

Content-understanding service for spam detection, post-category classification and PTOS policy enforcement. Feeds VF and brand-safety stages.

term
DPP

Determinantal Point Process — diversification used optionally inside the VMRanker with theta and max_selected_rank parameters.

try it yourself

Turn the dials. Watch the feed rearrange.

Open the simulator to see how a real ranking pipeline reorders an example feed when you change Phoenix weights, toggle author diversity, or shift the OON multiplier.