Updated · May 15 2026

interactive guide

chapter 01 · the for-you feed

The X algorithm, reverse-engineered in plain sight.

A visual, interactive walkthrough of the open-sourced xai-org/x-algorithm stack — Home Mixer, Thunder, Phoenix and the candidate pipeline. Read the code, tune the weights, watch the feed re-rank in real time.


Phoenix transformer

Predicts 19 actions per candidate

Two retrieval surfaces

Thunder in-net · Phoenix out-of-net

9-stage pipeline

Source → score → diversify → blend in ~320ms

Candidate isolation

Batch-independent, cacheable scores

02 · Pipeline architecture

Nine stages from post to feed.

The Home Mixer assembles your For You feed through a strictly ordered sequence. Sources run in parallel, hydrators enrich, filters strip, scorers rank, and a blender finally weaves ads and modules into the organic stream.

total budget

~321ms

stage 01 · Who is asking, and what do we know about them?

Query hydration

21 parallel hydrators load the viewer's followed accounts, mutual-follow MinHash, impression bloom filters, served history, blocked users, muted keywords, demographic context and the user's recent engagement sequence.

FollowedUserIdsHydrator
ImpressionBloomFilterHydrator · fast 'seen?' lookups
UserActionSeqHydrator · engagement history → Phoenix
ServedHistoryHydrator
BlockedUserIdsHydrator
MutualFollowQueryHydrator
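
To make the "fast 'seen?' lookup" concrete, here is a minimal Bloom filter sketch. It is an illustration only: the class name, bit-array size, hash count and SHA-256 hashing are assumptions for this example, not what ImpressionBloomFilterHydrator actually loads. The property that matters is that a Bloom filter can return false positives but never false negatives, so a pre-score filter built on it may occasionally drop an unseen post but will not re-serve a seen one.

```python
import hashlib


class ImpressionBloomFilter:
    """Toy Bloom filter (hypothetical layout): false positives possible, false negatives not."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, post_id: int):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{post_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, post_id: int) -> None:
        for pos in self._positions(post_id):
            self.bits[pos] = 1

    def might_contain(self, post_id: int) -> bool:
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(post_id))


def drop_seen(candidate_ids, seen: ImpressionBloomFilter):
    """Keep only candidates the filter says were definitely not served before."""
    return [pid for pid in candidate_ids if not seen.might_contain(pid)]


seen = ImpressionBloomFilter()
seen.add(101)
seen.add(102)
fresh = drop_seen([101, 102, 103], seen)
```

In this usage, posts 101 and 102 are always removed; 103 survives unless it happens to collide with the bits already set.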


03 · Post → feed

The full journey of a single post.

Every post you read in the For You feed travelled this path, from a Kafka message at the moment of creation to a ranked, blended item in someone's timeline. The request-time part of the journey usually finishes within a few hundred milliseconds.

step 01

Post created

Author posts. Thunder consumes the Kafka event and stores it for sub-ms lookups.

step 02

Thunder ingests

Realtime in-memory store. Original posts, replies, video posts kept per-user.

step 03

Phoenix indexes

Candidate tower embedding stored for ANN retrieval over the global corpus.

step 04

Viewer requests feed

Home Mixer hydrates the viewer's context: follows, mutes, MinHash, history.

step 05

Two retrieval surfaces

Thunder returns in-network candidates; Phoenix returns out-of-network candidates.

step 06

Hydrate + filter

Add author info & engagement counts. Strip blocked, muted, stale, duplicates.

step 07

Phoenix predicts

Transformer predicts P(action) for every candidate × 19 actions.

step 08

Score · diversify · blend

Σ wᵢ·P(actionᵢ), attenuate repeated authors, slot ads, return top 250.

total path · ~320 ms server-side

04 · Candidate sourcing

In-network and out-of-network, in parallel.

Two completely separate retrieval systems run side by side and feed their results into the same ranker. One is a realtime in-memory store. The other is a transformer-based nearest-neighbour search over the global post corpus.

thunder

in-network

Realtime in-memory store of posts from accounts you follow.

Thunder consumes post create/delete events from Kafka and keeps per-user collections for original posts, replies, reposts and video posts. Old posts are trimmed automatically. Lookups for in-network content happen in sub-millisecond time without ever hitting a DB.

Source · Kafka topic of post events
Storage · Per-user buckets, trimmed by age
Latency · <1 ms median lookup
Recall · 100% of follows · recent posts only

home-mixer/sources/thunder_source.rs

rust

let request = GetInNetworkPostsRequest {
  user_id: query.user_id,
  following_user_ids: following_list,
  max_results: ThunderMaxResults,
  exclude_tweet_ids: query.seen_ids,
  algorithm: ThunderAlgorithm,
};
// returns recent posts from people you follow
let response = client.get_in_network_posts(request).await?;
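
The storage side of Thunder can be pictured with a small sketch: per-author buckets fed by create/delete events, trimmed by age, and queried with a follow list. Everything here is an assumption made for illustration (the class and method names, tuple layout, and the idea that events arrive in time order per author); it is not Thunder's actual data structure.

```python
from collections import defaultdict, deque


class InMemoryPostStore:
    """Hypothetical per-author store: author_id -> deque of (created_at, post_id), oldest first."""

    def __init__(self, max_age_s: float):
        self.max_age_s = max_age_s
        self.posts = defaultdict(deque)

    def on_create(self, author_id, post_id, created_at):
        # Assumes events arrive in time order per author, so appending keeps buckets sorted.
        self.posts[author_id].append((created_at, post_id))

    def on_delete(self, author_id, post_id):
        self.posts[author_id] = deque(
            p for p in self.posts[author_id] if p[1] != post_id
        )

    def trim(self, now):
        # Drop posts older than max_age_s from the front of every bucket.
        for bucket in self.posts.values():
            while bucket and now - bucket[0][0] > self.max_age_s:
                bucket.popleft()

    def in_network(self, following_ids, exclude_ids, max_results, now):
        """Newest-first posts from followed authors, skipping excluded post ids."""
        self.trim(now)
        found = [
            (created_at, post_id)
            for author in following_ids
            for created_at, post_id in self.posts.get(author, ())
            if post_id not in exclude_ids
        ]
        found.sort(reverse=True)
        return [post_id for _, post_id in found[:max_results]]


store = InMemoryPostStore(max_age_s=100)
store.on_create("a", 1, created_at=0)
store.on_create("a", 2, created_at=50)
store.on_create("b", 3, created_at=60)
store.on_create("c", 4, created_at=70)
first = store.in_network(["a", "b"], exclude_ids={3}, max_results=10, now=120)
store.on_delete("a", 2)
second = store.in_network(["a", "b"], exclude_ids=set(), max_results=10, now=120)
```

Post 1 is trimmed by age, post 3 is excluded as already seen, and post 4 belongs to an author outside the follow list, so the first call returns only post 2.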

phoenix

out-of-network

Two-tower transformer that finds posts you haven't seen yet.

A user tower encodes your engagement history into one embedding; a candidate tower has already encoded every post in the global corpus. Approximate nearest-neighbour search by dot product returns the top few hundred candidates outside your follow graph.

Architecture · Two-tower, Grok-based transformer
Embedding · 128-dim · multi-hash lookups
Index · ~537K-item demo corpus (production is far larger)
Output · Top-K candidates by dot product (equal to cosine similarity when embeddings are L2-normalised)

phoenix/recsys_retrieval_model.py

python

import jax.numpy as jnp

def retrieve(user_features, history, corpus_embeddings, k=200):
    # User tower encodes the viewer + their recent actions.
    # user_tower is the model's encoder, assumed to be defined elsewhere.
    user_emb = user_tower(user_features, history)            # [B, D]
    # Candidate tower has pre-encoded the corpus: corpus_embeddings is [N, D]
    scores = jnp.matmul(user_emb, corpus_embeddings.T)        # [B, N]
    # Indices of the k highest-scoring corpus items for each viewer
    top_k = jnp.argsort(-scores, axis=-1)[:, :k]              # [B, k]
    return top_k

Why two sources?

In-network sourcing guarantees coverage of people you've chosen to follow. Out-of-network sourcing prevents the feed from collapsing into a small clique — and it's the only way new creators ever reach you. Production weights the two with the OonWeightFactor multiplier (educational default 0.75).
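
As a rough sketch of how such a multiplier could shift a ranking, the snippet below scales out-of-network scores by 0.75 before sorting. Only the 0.75 default comes from the text above; the function names and the tuple layout are hypothetical, and the real pipeline may apply the factor at a different point.

```python
OON_WEIGHT_FACTOR = 0.75  # educational default quoted above


def apply_oon_factor(score, in_network, factor=OON_WEIGHT_FACTOR):
    """In-network scores pass through; out-of-network scores are scaled by factor."""
    return score if in_network else score * factor


def rank(candidates, factor=OON_WEIGHT_FACTOR):
    """candidates: list of (post_id, score, in_network). Returns post ids, best first."""
    adjusted = [
        (apply_oon_factor(score, in_network, factor), post_id)
        for post_id, score, in_network in candidates
    ]
    return [post_id for _, post_id in sorted(adjusted, reverse=True)]


# An out-of-network post with a higher raw score can still lose after the multiplier.
order = rank([("followed", 1.0, True), ("stranger", 1.2, False)])
neutral = rank([("followed", 1.0, True), ("stranger", 1.2, False)], factor=1.0)
```

With the default factor the followed account's post ranks first; with a factor of 1.0 the raw scores decide and the out-of-network post wins.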

05 · Signals and weights

Every engagement has a price. Some are wildly more expensive than others.

Phoenix predicts the probability of each action, and the weighted sum of those probabilities becomes the score. With the weights below, a follow (+50) is worth 100× a favorite (+0.5) and 50,000× a passive dwell (+0.001), while a single report (-369) subtracts more than 700× what a favorite adds.

weighted score contribution · sorted by |w|

Report · -369
Block author · -120
Mute author · -100
Not interested · -74
Follow author · +50
Reply · +27
Profile click · +12
Share → DM · +2.5
Not-dwelled · -2
Share · +1.5
Quote · +1.2
Share → Copy link · +1.2
Repost · +1
Favorite · +0.5
Link click · +0.12
Quoted click · +0.08
Photo expand · +0.03
Dwell time · +0.01
Video quality view · +0.005
Dwell · +0.001


Each row above represents one of the action probabilities Phoenix predicts. The bar shows the weight applied to that probability before summing into the final score.


formula

final_score = Σᵢ wᵢ · P(actionᵢ)

where wᵢ is the weight and P(actionᵢ) is the probability the Phoenix transformer predicts.
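
The weighted sum is easy to try out by hand. The sketch below copies the weights from the chart above into a dictionary and scores a set of predicted probabilities. The dictionary keys are hypothetical names chosen for this example (only follow_author and report appear in the guide), and the sample probabilities are invented purely to exercise the function.

```python
# Weights copied from the chart above; key names are illustrative, not the repo's identifiers.
WEIGHTS = {
    "report": -369.0,
    "block_author": -120.0,
    "mute_author": -100.0,
    "not_interested": -74.0,
    "follow_author": 50.0,
    "reply": 27.0,
    "profile_click": 12.0,
    "share_via_dm": 2.5,
    "not_dwelled": -2.0,
    "share": 1.5,
    "quote": 1.2,
    "share_via_copy_link": 1.2,
    "repost": 1.0,
    "favorite": 0.5,
    "link_click": 0.12,
    "quoted_click": 0.08,
    "photo_expand": 0.03,
    "dwell_time": 0.01,
    "video_quality_view": 0.005,
    "dwell": 0.001,
}


def final_score(probabilities, weights=WEIGHTS):
    """Sum of weight * predicted probability; actions missing from the input count as 0."""
    return sum(weights[action] * p for action, p in probabilities.items())


# Made-up probabilities: a post that is often liked but occasionally reported.
liked = final_score({"favorite": 0.20, "reply": 0.02})
liked_but_reported = final_score({"favorite": 0.20, "reply": 0.02, "report": 0.002})

num_positive = sum(1 for w in WEIGHTS.values() if w > 0)
num_negative = sum(1 for w in WEIGHTS.values() if w < 0)
```

In this example a 0.2% report probability is enough to turn a positive score negative, which is the asymmetry the chart is drawing attention to.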

positive signals

15

amplify reach

negative signals

5

suppress reach

strongest +

+50

follow_author

strongest −

−369

report

06 · How posts go viral

Distribution unlocks in stages.

Virality on X isn't a single event. It is a chain of unlocks where engagement at one tier qualifies the post for the next, larger pool. Each tier amplifies the reach by orders of magnitude.

distribution unlock · slow burn

Chart: P(impressions) by engagement tier. favorite 32 · reply 11 · repost 4 · follow 2 · amplification headroom ×14

Each tier doesn't add reach — it multiplies the pool Phoenix considers for retrieval. Profile clicks and follows are the cleanest viral catalysts because the model treats them as evidence the author is worth long-term re-exposure.

what actually drives the unlock

Dwell time, not likes.

A 2-second dwell on a long video weighs more than ten silent likes. The cont_dwell_time signal is one of the highest-impact passive metrics.

Replies > reposts.

A reply costs effort. Phoenix correctly treats it as harder evidence of interest. Replies also keep your post pinned to the conversation tree, which DedupConversation rewards.

Follows are the catalyst.

A follow during a single session is the rarest, strongest positive. It signals 'I want this author in my feed permanently.' Subsequent posts get boosted on retrieval AND scoring.

07 · Decay · timing · diversity

Posts get cheaper as they age.

Three mechanisms quietly suppress reach: the hard 36-hour age filter kills old posts entirely; freshness reduces effective score; and author-diversity attenuation drops your second, third and fourth post per feed by a geometric ratio.

freshness multiplier

half-life · 90m

Score halves every ~90 minutes.

The For You feed strongly prefers fresh content. After 24h (16 half-lives) the freshness multiplier is already down to roughly 1/65,000 of its peak. Past 36h, the AgeFilter kills the candidate entirely before scoring even runs.
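
The arithmetic behind those numbers fits in a few lines. This is a sketch of exponential half-life decay using the 90-minute half-life and 36-hour cutoff quoted above; the function name and the choice to return None for filtered posts are assumptions for illustration, and the production freshness curve may have a different shape.

```python
HALF_LIFE_MIN = 90.0      # half-life quoted above
MAX_AGE_MIN = 36.0 * 60   # hard age cutoff quoted above


def freshness_multiplier(age_min):
    """Return 0.5 ** (age / half-life), or None when the age filter would drop the post."""
    if age_min > MAX_AGE_MIN:
        return None  # filtered before scoring, so there is no multiplier at all
    return 0.5 ** (age_min / HALF_LIFE_MIN)


fresh = freshness_multiplier(0)            # brand new post
one_half_life = freshness_multiplier(90)   # 90 minutes old
one_day = freshness_multiplier(24 * 60)    # 16 half-lives
too_old = freshness_multiplier(37 * 60)    # past the 36h cutoff
```

At 24 hours the multiplier is 0.5 to the power 16, or 1/65,536, so a day-old post needs a raw score tens of thousands of times higher than a new one to tie with it under this model.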

author diversity decay

AuthorDiversityScorer

Your second post per feed already pays a tax.

When the same author has multiple candidates in a feed, the second is multiplied by 0.70, the third by 0.55, then 0.48, then 0.44… asymptoting to the 0.4 floor. Posting back-to-back hurts your individual scores.

#1 · 1.00
#2 · 0.70
#3 · 0.55
#4 · 0.48
#5 · 0.44
#6 · 0.42
#7 · 0.41
#8 · 0.40
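
The multipliers in this chart follow from the formula Multiplier = (1 − floor) · decay^pos + floor with decay = 0.5 and floor = 0.4, where pos counts from 0 for an author's first post. The sketch below reproduces them and shows one plausible way to apply them; walking candidates in descending score order and the function names are assumptions, not a description of AuthorDiversityScorer's internals.

```python
def diversity_multiplier(position, decay=0.5, floor=0.4):
    """Multiplier for an author's post at 0-based position: (1 - floor) * decay**position + floor."""
    return (1.0 - floor) * decay ** position + floor


def attenuate(candidates):
    """candidates: list of (author, score). Returns (author, adjusted_score), best first.

    Assumption: positions are assigned per author by walking candidates in descending score order.
    """
    seen_per_author = {}
    adjusted = []
    for author, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        position = seen_per_author.get(author, 0)
        seen_per_author[author] = position + 1
        adjusted.append((author, score * diversity_multiplier(position)))
    return sorted(adjusted, key=lambda c: c[1], reverse=True)


multipliers = [diversity_multiplier(p) for p in range(8)]
feed = attenuate([("a", 10.0), ("a", 9.0), ("b", 8.0)])
```

In the example, author a's second post falls from 9.0 to about 6.3, which lets author b's 8.0 post move ahead of it.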

Hard age cutoff

36h

AgeFilter drops anything older.

Diversity decay

0.50

Multiplier = (1 − 0.4) · 0.5^pos + 0.4

Min video for VQV

10s

Shorter videos lose the VQV bonus entirely.

09 · Cold-start · distribution tiers

Every post starts on a small bench.

The ranker tests new posts against a tiny audience first. Pass that test, and Phoenix expands the audience — fail it, and the post gets quietly retired regardless of who the author is.

tier 00

0 – 200 imp

Author bench

First exposure to an in-network slice. The Phoenix ranker decides if the post even deserves to live past this tier based on the early dwell, replies and like-rate.

tier 01

200 – 5K imp

Follower cohort

Thunder pushes the post into more in-network feeds. If reply rate stays above the author's baseline, it qualifies for OON Phoenix retrieval.

tier 02

5K – 100K imp

OON expansion

Phoenix retrieval starts including the post for users whose engagement embedding is close to the post's. Profile clicks at this tier matter more than reposts.

tier 03

100K+ imp

Cascade

Network effects compound. Each tier-2 follow re-ranks all of that user's future feeds toward the author. Reach now grows non-linearly.

08 · Negative signals & suppression

The fastest way to disappear.

Suppression on X is not a single shadowban switch. It is many quiet, cumulative penalties — some scored, some filtered, some applied long before your post ever reaches the ranker.

-120

Block

Strongest single negative on the author. Subsequent posts from them get a discounted score in your feed and your follow-graph features quietly de-weight them globally.

-369

Report

Triggers visibility filtering at the VFFilter stage. If enough reports cluster, the post can hit Action::Drop and never appear, regardless of score.

-100

Mute author

Soft block. AuthorSocialgraphFilter strips their candidates entirely for you, and your action sequence carries the negative for future authors of similar style.

-74

Not interested

User-facing 'show me less' tap. Phoenix learns this as a labelled negative and downstream similar posts get scored lower.

-2

Not-dwelled

Implicit negative for posts the user scrolls past without stopping. Individually small, but accumulates fast.

policy filters

These don't lower your score — they delete the candidate.

Visibility filters and policy filters run as kill-switches, not re-rankers. If your post matches one, no amount of engagement can push it back into anyone's feed.

Filters run pre-score, so suppressed posts never even reach Phoenix.

VFFilter

Removes posts the visibility-filtering system has marked spam, violence, gore or deleted. Anything with Action::Drop on its safety result is gone before the user ever sees it.

MutedKeywordFilter

Tokenizes both your muted keywords and the post text via TweetTokenizer. Any match removes the candidate. Acts pre-score so muted topics never enter ranking at all.

AuthorSocialgraphFilter

Drops posts whose author you've blocked or muted, or whose author has blocked you. Also drops quoted/retweeted users that you've blocked.

IneligibleSubscriptionFilter

Paywalled / subscription-locked content that the viewer can't access. Removed pre-score so the model never wastes capacity ranking it.
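
A minimal sketch of how two of the filters above could work as pre-score kill-switches: one on the social graph, one on muted keywords. The dataclasses, function names and the regex tokenizer are stand-ins invented for this example; in particular the tokenizer is a simplification and not the TweetTokenizer mentioned above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Candidate:
    post_id: int
    author_id: str
    text: str


@dataclass
class Viewer:
    blocked: set = field(default_factory=set)
    muted: set = field(default_factory=set)
    blocked_by: set = field(default_factory=set)
    muted_keywords: set = field(default_factory=set)


def tokenize(text):
    """Stand-in tokenizer: lowercase runs of letters, digits, '_', '#' and '@'."""
    return set(re.findall(r"[a-z0-9_#@]+", text.lower()))


def mentions_muted(text, muted_keywords):
    """True if every token of some muted keyword appears in the post text."""
    post_tokens = tokenize(text)
    for keyword in muted_keywords:
        keyword_tokens = tokenize(keyword)
        if keyword_tokens and keyword_tokens <= post_tokens:
            return True
    return False


def prefilter(candidates, viewer):
    """Delete candidates outright; nothing here lowers a score."""
    hidden_authors = viewer.blocked | viewer.muted | viewer.blocked_by
    return [
        c for c in candidates
        if c.author_id not in hidden_authors
        and not mentions_muted(c.text, viewer.muted_keywords)
    ]


viewer = Viewer(blocked={"spammer"}, muted_keywords={"Crypto"})
kept = prefilter(
    [
        Candidate(1, "alice", "Rust is great"),
        Candidate(2, "spammer", "hello"),
        Candidate(3, "bob", "CRYPTO to the moon"),
    ],
    viewer,
)
```

Only post 1 survives: post 2 is removed by the author check and post 3 by the case-insensitive keyword match, and neither would be scored afterwards.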

11 · Before · After

Tiny edits, huge score deltas.

Three pairs of nearly identical posts. The Phoenix predictions shown below come from the same model and the same demo viewer in both variants; only the post changed.

before

I've been thinking a lot lately about how the way we measure engineering productivity is completely broken in most companies — most teams optimize for output, not impact, and you end up with people shipping a lot of low-quality features instead of doing the deep, slow work that actually moves a product…

P(fav) · 4.0%
P(reply) · 0.5%
P(rt) · 0.6%
P(dwell) · 22.0%

after

Most engineering productivity metrics measure activity, not impact. Here's a better one. ↓

P(fav) · 21.0%
P(reply) · 3.8%
P(rt) · 4.4%
P(dwell) · 68.0%

Hookless openings rarely earn dwell. A single tight sentence lifts P(dwell) about 3× (22% → 68%) and the reply rate more than 7× (0.5% → 3.8%).
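
To see what these probabilities do to the final score, the worked example below plugs the before and after predictions into the weighted sum, using the Favorite, Reply, Repost and Dwell weights from the signals chart. Mapping P(fav), P(rt) and P(dwell) onto those four weights is an assumption made for this illustration, and the other predicted actions are ignored.

```python
# Weights taken from the signals chart; the mapping from the P(...) labels is assumed.
WEIGHTS = {"fav": 0.5, "reply": 27.0, "rt": 1.0, "dwell": 0.001}

BEFORE = {"fav": 0.040, "reply": 0.005, "rt": 0.006, "dwell": 0.22}
AFTER = {"fav": 0.210, "reply": 0.038, "rt": 0.044, "dwell": 0.68}


def partial_score(probabilities):
    """Weighted sum over the four actions shown in the before/after cards."""
    return sum(WEIGHTS[action] * p for action, p in probabilities.items())


before_score = partial_score(BEFORE)   # 0.020 + 0.135 + 0.006 + 0.00022
after_score = partial_score(AFTER)     # 0.105 + 1.026 + 0.044 + 0.00068
lift = after_score / before_score

# Share of the "after" score that comes from replies alone.
reply_share = WEIGHTS["reply"] * AFTER["reply"] / after_score
```

Under these assumptions the score rises from about 0.16 to about 1.18, a lift of roughly 7×, and almost all of it comes from the reply term because of its +27 weight.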

10 · Reference

Glossary — words you'll keep meeting.

Home Mixer

The Rust orchestration service that runs the candidate pipeline and returns a ranked feed via gRPC. Owns the For You endpoint.

Thunder

In-memory realtime store of recent posts from accounts you follow. Serves in-network candidates in sub-ms via the InNetworkPostsService.

Phoenix

Grok-based transformer system that handles two things: retrieval (two-tower model) and ranking (transformer with candidate isolation attention).

Candidate Pipeline

A reusable Rust framework defining traits (Source, Hydrator, Filter, Scorer, Selector, SideEffect) that compose into recommendation pipelines.

VQV signal

Video Quality View: a binary 'meaningful watch' label produced only for videos longer than MIN_VIDEO_DURATION_MS (10 seconds).

OON factor

Out-of-network multiplier applied after scoring. Defaults to 0.75 to slightly penalise OON posts; bumped to 1.0 for new users with few follows.

Author diversity

Geometric decay applied to repeated authors in a feed. Multiplier = (1−floor) · decay^pos + floor. Production defaults: decay=0.5, floor=0.4.

Bloom filter (impressions)

A probabilistic 'has the user seen this?' set delivered with every request. Pre-score filter strips anything that may have already been served.

MinHash Jaccard

256-bit signature stored per user. The MutualFollowJaccardHydrator computes overlap with the candidate author's signature, giving a fast 'shared social graph' score.

VFFilter

Visibility filter: a policy stage that drops candidates whose safety result is Action::Drop (spam, gore, violence, deletes).

Grox

Content-understanding service for spam detection, post-category classification and PTOS policy enforcement. Feeds VF and brand-safety stages.

DPP

Determinantal Point Process: diversification used optionally inside the VMRanker with theta and max_selected_rank parameters.
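
The MinHash Jaccard entry above can be illustrated with a generic MinHash sketch: two follow sets are each reduced to a short signature, and the fraction of matching signature slots estimates their Jaccard overlap. This is the textbook technique under assumed parameters (64 seeded SHA-256 hashes, one 64-bit minimum per slot), not the 256-bit signature layout the glossary mentions, and all names are hypothetical.

```python
import hashlib


def _hash(seed, item):
    """64-bit hash of an item under a given seed."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=64):
    """One minimum hash value per seed. Requires a non-empty collection."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of slots where the two signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


viewer_follows = set(range(0, 100))
author_follows = set(range(50, 150))   # 50 shared accounts out of 150 total
stranger_follows = set(range(1000, 1100))

viewer_sig = minhash_signature(viewer_follows)
same = estimate_jaccard(viewer_sig, minhash_signature(viewer_follows))
overlap = estimate_jaccard(viewer_sig, minhash_signature(author_follows))
disjoint = estimate_jaccard(viewer_sig, minhash_signature(stranger_follows))
truth = exact_jaccard(viewer_follows, author_follows)
```

The exact Jaccard for the overlapping pair is 1/3; the estimate from 64 hashes will land near that value with some sampling noise, while signatures only need to be compared slot by slot at request time.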

try it yourself

Turn the dials. Watch the feed rearrange.

Open the simulator to see how a real ranking pipeline reorders an example feed when you change Phoenix weights, toggle author diversity, or shift the OON multiplier.
