If you've ever wondered why some collaborative editors feel instant and others feel like a fight over a Google Doc from 2014, the answer is usually CRDTs. For engineering managers evaluating tools for distributed teams, the data structure under the hood matters more than the marketing page suggests. It decides whether your team can ship a spec together at 4pm on a Friday or whether someone has to be the designated typist.

This is a technical primer on how real-time collaborative editing actually works, why it's hard, and what changes when your project management tool treats pages as a first-class collaborative surface rather than a comment thread with delusions of grandeur.

Key takeaways

  • Real-time editing is hard because two people typing in the same paragraph at the same time creates conflicts that naive sync cannot resolve.
  • CRDTs (Conflict-free Replicated Data Types) solve this by giving every operation a unique identity and merge rule, so the result is the same no matter what order updates arrive.
  • Yjs is one of the most widely adopted CRDT libraries for collaborative text, and it pairs naturally with editors like Lexical, ProseMirror, and Tiptap.
  • Presence indicators (who is on the page right now) are a separate channel from document state, and they matter for awareness even without live cursor tracking.
  • For engineering teams, the practical win is fewer Slack threads asking "are you in this doc?" and fewer merge conflicts on shared specs, runbooks, and RFCs.

The problem: why concurrent editing breaks naive systems

Imagine two engineers editing the same paragraph in a planning doc. One inserts a sentence at character 200. The other deletes a word at character 150. Both positions were computed against the same version of the text. If the server applies the deletion first, everything after character 150 shifts left, and the insertion aimed at character 200 lands a few characters past where its author intended.
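The shift is easy to reproduce on a short string. The sketch below is illustrative only: the `Op` shape, the `apply` helper, and the sample sentence are made up for this example and do not come from any particular editor.

```typescript
// Hypothetical operation shapes, for illustration only.
type Op =
  | { kind: "insert"; index: number; text: string }
  | { kind: "delete"; index: number; length: number };

// Apply one index-based operation to a plain string.
function apply(doc: string, op: Op): string {
  if (op.kind === "insert") {
    return doc.slice(0, op.index) + op.text + doc.slice(op.index);
  }
  return doc.slice(0, op.index) + doc.slice(op.index + op.length);
}

const base = "Ship the very big release on Friday.";

// Both engineers computed their indices against `base`.
const deleteVery: Op = { kind: "delete", index: 9, length: 5 }; // removes "very "
const insertNew: Op = { kind: "insert", index: 18, text: "new " }; // lands before "release"

// Insert first: the delete sits earlier in the text, so its index is still valid.
const insertThenDelete = apply(apply(base, insertNew), deleteVery);

// Delete first: the insert's index is now stale by five characters.
const deleteThenInsert = apply(apply(base, deleteVery), insertNew);

console.log(insertThenDelete); // "Ship the big new release on Friday."
console.log(deleteThenInsert); // "Ship the big releanew se on Friday."
```

Same two edits, two different documents. Nothing in the operations themselves says which index space they were written against, which is the gap both OT and CRDTs set out to close.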

The old fix was operational transformation, or OT. Google Docs famously runs on it. OT works, but in practice it relies on a central server that orders operations and rewrites each one against the edits it has not yet accounted for, and getting the transformation functions right for rich text is notoriously painful. Every new formatting feature is a new edge case.
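To make "rewrites every operation" concrete, here is a minimal sketch of a single transformation function: rebasing an insert over a delete that was applied first. The types and function names are assumptions for illustration. A real OT system also needs insert/insert, delete/insert, and delete/delete cases, plus formatting operations, which is where the pain comes from.

```typescript
// Hypothetical operation shapes, for illustration only.
type Insert = { kind: "insert"; index: number; text: string };
type Delete = { kind: "delete"; index: number; length: number };

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.index) + op.text + doc.slice(op.index);
}

function applyDelete(doc: string, op: Delete): string {
  return doc.slice(0, op.index) + doc.slice(op.index + op.length);
}

// Rewrite `ins` so it targets the same spot after `applied` has already run.
function transformInsertAgainstDelete(ins: Insert, applied: Delete): Insert {
  // The delete starts at or after the insert point: nothing before it moved.
  if (ins.index <= applied.index) return ins;
  const deleteEnd = applied.index + applied.length;
  // The delete ended before the insert point: shift left by the deleted length.
  if (ins.index >= deleteEnd) return { ...ins, index: ins.index - applied.length };
  // The insert point was inside the deleted range: clamp to where the range began.
  return { ...ins, index: applied.index };
}

const original = "Ship the very big release on Friday.";
const removeVery: Delete = { kind: "delete", index: 9, length: 5 }; // removes "very "
const addNew: Insert = { kind: "insert", index: 18, text: "new " }; // before "release"

const afterDelete = applyDelete(original, removeVery);
const rebased = transformInsertAgainstDelete(addNew, removeVery);
const merged = applyInsert(afterDelete, rebased);

console.log(rebased.index); // 13
console.log(merged); // "Ship the big new release on Friday."
```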

CRDTs take a different route. Instead of transforming operations, they design the data so that any order of operations produces the same final result. Math, not orchestration.

How CRDTs make this work

A CRDT assigns every character (or block, or attribute) a unique identifier that includes the client ID and a logical clock. When you type, you are not inserting at "position 200." You are inserting after the character with ID (clientA, 42). That position is stable regardless of what anyone else does upstream or downstream.
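A minimal sketch of ID-anchored insertion follows. It is a simplified RGA-style list written for this article, not the algorithm Yjs uses internally, and it leans on two stated assumptions: clocks behave like Lamport clocks (a client always picks a clock higher than any it has seen), and an item's anchor has been delivered before the item itself. The names `Id`, `Item`, and `integrate` are hypothetical.

```typescript
type Id = { client: string; clock: number };
// `after` is the ID of the character this one was typed after (null = start of doc).
type Item = { id: Id; after: Id | null; char: string };

function sameId(a: Id | null, b: Id | null): boolean {
  if (a === null || b === null) return a === b;
  return a.client === b.client && a.clock === b.clock;
}

// Total order on IDs: higher clock wins, client name breaks ties.
function compareIds(a: Id, b: Id): number {
  if (a.clock !== b.clock) return a.clock - b.clock;
  return a.client < b.client ? -1 : a.client > b.client ? 1 : 0;
}

// Place `item` relative to its anchor, never relative to a numeric position.
function integrate(doc: Item[], item: Item): Item[] {
  // Receiving the same item twice is a no-op.
  if (doc.some((existing) => sameId(existing.id, item.id))) return doc;
  let index = 0;
  if (item.after !== null) {
    const anchor = doc.findIndex((existing) => sameId(existing.id, item.after));
    if (anchor === -1) throw new Error("anchor not delivered yet");
    index = anchor + 1;
  }
  // Concurrent inserts at the same anchor: items with a higher ID stay in front.
  while (index < doc.length) {
    const next = doc[index];
    if (next === undefined || compareIds(next.id, item.id) <= 0) break;
    index++;
  }
  return [...doc.slice(0, index), item, ...doc.slice(index)];
}

const render = (doc: Item[]): string => doc.map((item) => item.char).join("");

// Shared starting point: "ac", typed by client A.
const a: Item = { id: { client: "A", clock: 1 }, after: null, char: "a" };
const c: Item = { id: { client: "A", clock: 2 }, after: a.id, char: "c" };
const shared = integrate(integrate([], a), c);

// Two clients concurrently type right after "a".
const b: Item = { id: { client: "A", clock: 3 }, after: a.id, char: "b" };
const x: Item = { id: { client: "B", clock: 3 }, after: a.id, char: "X" };

const replica1 = integrate(integrate(shared, b), x); // sees b, then X
const replica2 = integrate(integrate(shared, x), b); // sees X, then b

console.log(render(replica1), render(replica2)); // "aXbc" "aXbc"
```

Neither insert mentions an index, so neither can be invalidated by the other. The tie between the two concurrent inserts is settled by comparing IDs, which every replica does the same way.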

When two clients exchange updates, they can apply them in any order, multiple times, even out of sequence after a reconnection, and the document converges to the same state on every machine. This property is called strong eventual consistency, and it is what makes offline editing, P2P sync, and resilient reconnection actually viable.
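Convergence under reordering and duplication comes down to three properties of the merge function: it is commutative, associative, and idempotent. Those properties are easier to see on a far simpler CRDT than text, so the sketch below uses a grow-only counter (one slot per client, merge takes the per-client maximum). The `Counter` type and helper names are illustrative assumptions.

```typescript
// One monotonically increasing count per client.
type Counter = Record<string, number>;

// Merge keeps the highest count seen for each client.
function merge(a: Counter, b: Counter): Counter {
  const out: Counter = { ...a };
  for (const client of Object.keys(b)) {
    out[client] = Math.max(out[client] ?? 0, b[client] ?? 0);
  }
  return out;
}

const total = (counter: Counter): number =>
  Object.keys(counter).reduce((sum, client) => sum + (counter[client] ?? 0), 0);

// Compare two counters regardless of key order.
function equal(a: Counter, b: Counter): boolean {
  const canonical = (counter: Counter): string =>
    JSON.stringify(Object.keys(counter).sort().map((k) => [k, counter[k]]));
  return canonical(a) === canonical(b);
}

const fromA: Counter = { A: 3 };
const fromB: Counter = { B: 2 };
const fromC: Counter = { A: 1, C: 4 }; // carries a stale view of A

// Any order of arrival:
const commutative = equal(merge(fromA, fromB), merge(fromB, fromA));
// Any grouping of updates:
const associative = equal(
  merge(merge(fromA, fromB), fromC),
  merge(fromA, merge(fromB, fromC)),
);
// Any number of redeliveries:
const once = merge(fromA, fromB);
const idempotent = equal(merge(once, fromB), once);

const converged = merge(merge(fromA, fromB), fromC);
console.log(commutative, associative, idempotent, total(converged)); // true true true 9
```

A text CRDT has a much richer state than a map of numbers, but the contract is the same: if merging is order-insensitive and repeat-safe, replicas that have seen the same set of updates hold the same document.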

Why Yjs in particular

Yjs is a CRDT library that has become a common default for collaborative editors that care about performance. It uses a compact binary encoding, shared types (Y.Text, Y.Array, Y.Map) that mirror the data structures you already use, and a provider model that lets you swap the transport layer: WebSocket, WebRTC, or anything that moves bytes.
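Here is a minimal sketch of what "anything that moves bytes" means in practice, assuming the `yjs` package is installed. Two documents are synced by hand, with a plain function call standing in for the network. The shared type name "page" is an arbitrary choice for this example.

```typescript
import * as Y from "yjs";

// Two replicas that have never talked to each other.
const docA = new Y.Doc();
const docB = new Y.Doc();

// Each side edits its own copy of the same shared text.
docA.getText("page").insert(0, "Hello");
docB.getText("page").insert(0, "World");

// The full state of each doc, encoded as a binary update (Uint8Array).
const updateA = Y.encodeStateAsUpdate(docA);
const updateB = Y.encodeStateAsUpdate(docB);

// "Transport": hand the bytes to the other side, in either order.
Y.applyUpdate(docA, updateB);
Y.applyUpdate(docB, updateA);
Y.applyUpdate(docB, updateA); // a redelivered update is harmless

const textA = docA.getText("page").toString();
const textB = docB.getText("page").toString();

// Both replicas hold the same ten characters. Which word comes first
// depends on the randomly assigned client IDs, so it is not fixed here.
console.log(textA === textB, textA.length);
```

A provider is, roughly, this exchange wired to a real connection: it listens for local updates, ships the bytes, and applies whatever arrives from the other side.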

The reason engineering teams gravitate to Yjs over rolling their own: collaborative text editing has roughly a thousand edge cases and Yjs has hit most of them in production already.

Pairing CRDTs with a structured editor

A CRDT gives you a conflict-free document model. It does not give you formatting, blocks, slash commands, or anything users actually interact with. For that you need an editor framework.

Lexical is Meta's editor framework, designed from the start with collaboration in mind. Its node tree maps cleanly to a Y.Doc, which is why a growing number of tools (including Zoobbe's pages) use the Lexical + Yjs combination. You write your editor logic against Lexical's node model and let the Yjs binding handle the wire format.

The same pattern works with ProseMirror and Tiptap. The point is that the editor and the CRDT are decoupled. The CRDT does not care that you added a callout block. The editor does not care that two clients are syncing.

Presence: the other half of "feels collaborative"

Even with a perfect CRDT, a page does not feel alive if you cannot tell who else is there. Presence is a separate concern from document state. It tells the UI: Maya is viewing this page, Dev is editing it.

Presence usually rides a lightweight ephemeral channel, and the Yjs ecosystem provides an Awareness protocol (in the y-protocols package) for exactly this. It is intentionally not persisted, because no one needs to know that you opened a doc last Tuesday.
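To show why presence needs so little machinery, here is a standalone sketch of an ephemeral presence registry. It is not the Yjs Awareness API. The class, its method names, and the 30-second timeout are all assumptions made for illustration: state lives only in memory and simply expires when a client stops sending heartbeats.

```typescript
type PresenceState = { name: string; mode: "viewing" | "editing" };
type Entry = { state: PresenceState; lastSeen: number };

// Hypothetical in-memory registry: nothing here is ever written to disk.
class PresenceRegistry {
  private entries: Map<string, Entry> = new Map();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Called whenever a heartbeat or state change arrives from a client.
  update(clientId: string, state: PresenceState, now: number): void {
    this.entries.set(clientId, { state, lastSeen: now });
  }

  // Called when a client leaves cleanly.
  remove(clientId: string): void {
    this.entries.delete(clientId);
  }

  // Drops anyone not heard from within the timeout, then returns who is left.
  active(now: number): PresenceState[] {
    this.entries.forEach((entry, clientId) => {
      if (now - entry.lastSeen > this.timeoutMs) this.entries.delete(clientId);
    });
    return Array.from(this.entries.values()).map((entry) => entry.state);
  }
}

const registry = new PresenceRegistry(30_000);
registry.update("client-1", { name: "Maya", mode: "viewing" }, 0);
registry.update("client-2", { name: "Dev", mode: "editing" }, 10_000);

const atTwentySeconds = registry.active(20_000).map((p) => p.name);
const atThirtyFiveSeconds = registry.active(35_000).map((p) => p.name);

console.log(atTwentySeconds); // ["Maya", "Dev"]
console.log(atThirtyFiveSeconds); // ["Dev"], Maya's last heartbeat is too old
```

Note what is missing: no merge rules, no history, no persistence. Losing a presence update costs a stale avatar for a few seconds, which is why it can travel on a cheaper channel than the document itself.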

One thing worth being honest about: presence indicators are not the same as live cursor tracking. Showing that a teammate is on the page is a much smaller commitment than rendering their caret position in real time. Both are useful. Many teams get most of the value from presence alone, and Zoobbe's current implementation focuses there.

What this changes for engineering teams

If your planning, specs, and runbooks live in a tool that treats documents as static, every edit becomes a coordination problem. Someone has to "own" the doc. Comments stack up. Decisions happen in Slack and get back-ported.

When the document is a live surface, when an eng manager and a tech lead can sketch a sprint plan together while the standup is still happening, the coordination cost drops sharply. The document becomes the meeting.

That is the actual product story behind real-time collab. Not "we have a fancy editor." The shift is that the tool stops being a system of record you update afterward and starts being the place the work happens.

FAQ

Are CRDTs better than operational transformation?

For modern collaborative editors with rich content, many teams now choose CRDTs. They handle offline reconnection, P2P, and out-of-order delivery without a central transformation server. OT still works fine and Google Docs runs on it; it's more a question of where you want complexity to live.

Do I need a backend to use Yjs?

No. Yjs is transport-agnostic. You can run it over WebSocket with a relay server, peer-to-peer over WebRTC, or even file-based for offline-first apps. Most production deployments use a WebSocket provider because it is the simplest reliable path.

What is the difference between presence and live cursors?

Presence answers "who is on this page right now." Live cursors answer "where exactly is each person's caret." Live cursors are a strict superset of presence: they carry the same who-is-here signal plus a position that changes with every keystroke, so they require more bandwidth and rendering work. Presence indicators are enough for most collaborative-awareness use cases.

Does collaborative editing work offline?

With a CRDT, yes. Edits made offline are applied locally, then synced and merged on reconnection. There are no manual merge conflicts to resolve, because the data structure guarantees convergence.

Why does this matter for engineering managers specifically?

Because the cost of bad collaboration tooling shows up as duplicated work, stale specs, and meetings about meetings. A document model that lets two people work in the same paragraph without stepping on each other removes a class of coordination overhead that compounds across a team.

Where Zoobbe fits

Zoobbe's pages use Lexical for the editor and Yjs for the collaboration layer. Two engineers can edit the same RFC at the same time, you can see who is on the page, and the document converges cleanly even with flaky connections. It is one of the reasons teams who outgrew Trello but found Notion too freeform end up giving it a look.

If you want to see the difference between a document that fights you and one that does not, the fastest path is to open a page with a teammate and start typing.
