
Building Real Time Systems: Architecture Patterns That Scale

Lemorange Team · 9 min read

When You Actually Need Real Time (and When You Do Not)

The first question to answer before building any real time system is whether you genuinely need real time communication. Real time adds complexity to every layer of your architecture: connection management, state synchronization, error handling, scaling, and monitoring. If polling every 30 seconds meets your requirements, polling is the correct answer. It is simpler, more reliable, and easier to scale.

Genuine real time requirements exist when the data is time sensitive and user action depends on immediate awareness. Trading platforms where price staleness means financial loss. Chat applications where users expect sub second message delivery. Live sports dashboards where odds change every few seconds during a match. Collaborative editing where multiple users modify the same document simultaneously. Monitoring dashboards where operators need to see system state changes as they happen.

The gray area is large. Notification systems, activity feeds, and dashboard updates often feel like they need real time but function perfectly well with 5 to 15 second polling intervals. The rule of thumb: if a user would not notice or care about a 10 second delay in data freshness, polling is fine. If a 10 second delay would cause confusion, incorrect decisions, or a degraded experience, you need real time.

The cost of getting this wrong in either direction is real. Building unnecessary real time infrastructure wastes development time and adds operational complexity. Failing to provide real time when users expect it creates a frustrating experience that drives adoption down. Start by documenting the latency requirements for each data flow in your system, and you will find that most of them do not need persistent connections.

WebSocket vs SSE vs Long Polling: Making the Right Choice

WebSocket provides full duplex communication over a single TCP connection. Both client and server can send messages at any time without the overhead of HTTP request response cycles. The protocol starts as an HTTP request that upgrades to WebSocket via the Upgrade header. Once established, the connection stays open and both sides can push data through it. WebSocket is the right choice when you need bidirectional communication: chat, collaborative editing, interactive gaming, or any scenario where the client sends frequent updates to the server.

Server Sent Events (SSE) provides unidirectional streaming from server to client over a standard HTTP connection. The client opens a long lived HTTP GET request, and the server sends events through it as they occur. SSE is built on top of standard HTTP, which means it works through all HTTP proxies, load balancers, and CDNs without special configuration. It also handles reconnection automatically through the EventSource API, including the ability to resume from a specific event ID after disconnection. SSE is the right choice when you only need server to client updates: live dashboards, notification streams, real time feeds, or status updates.
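The SSE wire format itself is simple enough to sketch: each event is a block of `field: value` lines (`data:`, `event:`, `id:`) terminated by a blank line. The following minimal parser is illustrative, not a replacement for the browser's EventSource, and it simplifies a few edge cases (e.g. it strips leading whitespace after the colon rather than exactly one space as the spec requires):

```typescript
interface SseEvent {
  id?: string;
  event: string;
  data: string;
}

// Parse a chunk of an SSE stream into discrete events. Events are blocks
// of "field: value" lines separated by blank lines.
function parseSseStream(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split("\n\n")) {
    if (!block.trim()) continue;
    const evt: SseEvent = { event: "message", data: "" };
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("data:")) dataLines.push(line.slice(5).trimStart());
      else if (line.startsWith("event:")) evt.event = line.slice(6).trim();
      else if (line.startsWith("id:")) evt.id = line.slice(3).trim();
      // Lines starting with ":" are comments/keep-alives and are ignored.
    }
    // Like EventSource, dispatch nothing for blocks with no data.
    if (dataLines.length === 0) continue;
    evt.data = dataLines.join("\n");
    events.push(evt);
  }
  return events;
}
```

The `id:` field is what powers SSE's resume-after-disconnect behavior: the browser sends the last seen ID back in the `Last-Event-ID` header on reconnection.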

Long polling is the fallback when neither WebSocket nor SSE is available or appropriate. The client makes an HTTP request, and the server holds the response open until it has data to send or a timeout occurs. The client then immediately makes another request. Long polling works everywhere, including environments with aggressive proxies and firewalls that terminate WebSocket and SSE connections. The tradeoff is higher latency (the message cannot be sent until the next request arrives) and higher server resource consumption (each pending request holds a thread or connection on the server).
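The long-polling loop can be sketched as follows. The `poll` function here is a stand-in for an HTTP GET that the server holds open until data arrives or a timeout fires; the cursor-based resume is one common design, not a fixed protocol:

```typescript
// A poll request carries a cursor so the server knows which events the
// client has already seen; it resolves with new events (or an empty array
// on timeout) plus the next cursor.
type Poll = (cursor: number) => Promise<{ events: string[]; cursor: number }>;

async function longPoll(
  poll: Poll,
  onEvent: (e: string) => void,
  rounds: number
): Promise<void> {
  let cursor = 0;
  for (let i = 0; i < rounds; i++) {
    // The server holds this request open until it has data or times out.
    const res = await poll(cursor);
    for (const e of res.events) onEvent(e);
    cursor = res.cursor; // resume from here on the next request
  }
}
```

In production this loop runs indefinitely rather than for a fixed number of rounds, with error handling and backoff around the `await`.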

In practice, most applications should default to SSE for server to client streaming and add WebSocket only when bidirectional communication is required. The simplicity of SSE, its built in reconnection logic, and its compatibility with standard HTTP infrastructure make it the pragmatic choice for the majority of real time use cases.

SignalR Architecture: A Deep Dive

SignalR is Microsoft's real time communication library for .NET, and it abstracts the transport layer decision entirely. A SignalR connection attempts WebSocket first, falls back to SSE, and finally to long polling, all transparently to your application code. The hub abstraction lets you write real time logic as method calls rather than raw message handling.

The SignalR architecture has three key concepts. Hubs are the server side entry points that define methods clients can call and methods the server can invoke on clients. Connections represent individual client connections, each with a unique connection ID. Groups allow you to organize connections for targeted message delivery: sending a message to a group name delivers it to all connections that have joined that group.
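The group bookkeeping amounts to a mapping from group names to connection IDs. SignalR maintains this inside its hub infrastructure (in C#), but the concept is simple enough to model language-neutrally; the class and method names below are illustrative, not SignalR's API:

```typescript
// Toy in-memory model of connection/group bookkeeping: a group is just a
// named set of connection IDs that a broadcast fans out to.
class GroupRegistry {
  private groups = new Map<string, Set<string>>(); // group name -> connection IDs

  addToGroup(connectionId: string, group: string): void {
    if (!this.groups.has(group)) this.groups.set(group, new Set());
    this.groups.get(group)!.add(connectionId);
  }

  // Called when a client disconnects: drop it from every group.
  removeConnection(connectionId: string): void {
    for (const members of this.groups.values()) members.delete(connectionId);
  }

  // The connection IDs a group broadcast would target.
  targets(group: string): string[] {
    return [...(this.groups.get(group) ?? [])];
  }
}
```

Note that group membership is per connection, not per user: a user with three open tabs must be added to the group three times, once per connection.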

A common architectural pattern with SignalR separates the real time concern from the business logic. Your API controllers handle HTTP requests and process business operations. When an operation produces an event that clients should receive in real time (a new order, a price update, a chat message), the controller publishes that event to a message broker or in process event bus. A background service subscribes to these events and broadcasts them through SignalR hubs. This keeps the real time broadcasting logic out of your controllers and makes the system testable.

Connection lifecycle management deserves careful attention in production SignalR applications. Clients disconnect frequently due to network changes, device sleep, and browser tab suspension. Your server side code should handle the OnConnectedAsync and OnDisconnectedAsync events to maintain connection state, rejoin groups after reconnection, and replay missed events when appropriate. The SignalR client library handles reconnection automatically, but your application logic must handle the state reconciliation that follows.

Event Driven Patterns with Message Queues

Real time systems in production almost always involve message queues, even if the final delivery to clients is through WebSocket or SSE. The message queue decouples event producers from event consumers, provides durability (events are not lost if the WebSocket server restarts), and enables horizontal scaling of both production and consumption independently.

The typical architecture has three tiers. The event production tier consists of your API services, background workers, or external integrations that generate events. These services publish events to a message broker, either RabbitMQ for simpler topologies or Kafka for high throughput scenarios requiring event replay and partitioned consumption. The event processing tier subscribes to relevant event streams, applies any necessary transformation or filtering, and determines which clients should receive each event. The delivery tier manages the WebSocket or SSE connections to clients and pushes events to the appropriate connections.

Kafka deserves special mention for real time systems because of its log based architecture. Unlike traditional message queues where messages are consumed and removed, Kafka retains events for a configurable duration (often days or weeks). This means a new consumer, or a consumer that was offline, can replay events from any point in the log. For real time dashboards where clients need to display the last N events when they first connect, Kafka's ability to replay from an offset is extremely valuable.
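Kafka's log model reduces to an append-only sequence where the position of each event is its offset. A minimal in-memory sketch (purely illustrative, none of this is Kafka's actual API) shows why replay-from-offset and "last N on connect" fall out naturally:

```typescript
// Append-only log: the array index *is* the offset, so any consumer can
// re-read from any retained position.
class EventLog<T> {
  private log: T[] = [];

  append(event: T): number {
    this.log.push(event);
    return this.log.length - 1; // offset of the appended event
  }

  // Replay everything from `offset` onward, e.g. for a consumer that was
  // offline and remembers the last offset it committed.
  readFrom(offset: number): T[] {
    return this.log.slice(offset);
  }

  // The last N events, e.g. to prime a dashboard on first connect.
  lastN(n: number): T[] {
    return this.log.slice(Math.max(0, this.log.length - n));
  }
}
```

The real system adds partitioning, retention limits, and durable storage on top, but the consumer-controlled read position is the essential difference from a queue that deletes on consumption.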

RabbitMQ with its exchange and binding model is better suited for routing patterns where different event types need to reach different consumers. A fanout exchange delivers events to all bound queues (useful for broadcasting), a topic exchange routes based on pattern matching (useful for filtering by event type or entity), and a direct exchange routes to specific queues (useful for targeted delivery). For most real time systems processing fewer than 50,000 events per second, RabbitMQ provides simpler operations and sufficient throughput.
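Topic exchange routing uses dot-separated routing keys where `*` matches exactly one word and `#` matches zero or more words. A sketch of that matching logic (RabbitMQ implements this broker-side; this is just the semantics, not its code):

```typescript
// Does a topic-exchange binding pattern match a routing key?
// "*" matches exactly one dot-separated word; "#" matches zero or more.
function topicMatches(pattern: string, routingKey: string): boolean {
  const p = pattern.split(".");
  const k = routingKey.split(".");
  // match(i, j): does p[i..] match k[j..]?
  const match = (i: number, j: number): boolean => {
    if (i === p.length) return j === k.length;
    if (p[i] === "#") {
      // "#" may consume zero..all remaining words.
      for (let skip = j; skip <= k.length; skip++) {
        if (match(i + 1, skip)) return true;
      }
      return false;
    }
    if (j === k.length) return false;
    if (p[i] === "*" || p[i] === k[j]) return match(i + 1, j + 1);
    return false;
  };
  return match(0, 0);
}
```

So a binding like `order.*` receives `order.created` but not `order.item.added`, while `order.#` receives both.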

Handling Back Pressure and Reconnection

Back pressure is what happens when your system produces events faster than clients can consume them. This occurs in real time systems more often than most engineers expect. A mobile client on a slow network, a browser tab running in the background with reduced resources, or a sudden burst of events from the server can all create a situation where the outbound message queue for a specific connection grows without bound.

The strategies for handling back pressure depend on the data semantics. For data where only the latest value matters (stock prices, sensor readings, dashboard metrics), the correct approach is conflation: replace the pending message with the newest value. The client receives the most recent state rather than a backlog of stale updates. SignalR does not provide built in conflation, so you need to implement it in your hub logic by maintaining a per connection dictionary of pending updates and sending only the latest value on each flush cycle.
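Conflation is a small amount of code once you see the shape: a per-connection map keyed by entity (ticker symbol, sensor ID) where new values overwrite queued ones, drained on each flush tick. This is an illustrative sketch of that pattern, not SignalR code:

```typescript
// Per-connection conflating buffer: only the latest value per key survives
// until the next flush, so a slow client never receives stale backlog.
class ConflatingBuffer<T> {
  private pending = new Map<string, T>(); // key (e.g. ticker) -> latest value

  update(key: string, value: T): void {
    this.pending.set(key, value); // a newer value replaces any queued one
  }

  // Called on each flush cycle; returns the most recent value per key
  // and clears the buffer.
  flush(): [string, T][] {
    const out = [...this.pending.entries()];
    this.pending.clear();
    return out;
  }
}
```

In a SignalR hub you would hold one such buffer per connection ID and run the flush on a timer (say every 250 ms), sending whatever `flush()` returns through the hub.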

For data where every message matters (chat messages, transaction records, audit events), you need bounded queues with overflow handling. Set a maximum outbound queue depth per connection. When the queue is full, you have two options: drop the connection and force the client to reconnect (at which point it can request a full state sync), or buffer messages server side and deliver them when the client catches up, accepting the memory cost.
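The bounded-queue variant with the drop-the-connection policy can be sketched like this (names are illustrative; the point is that enqueue failure is a signal to the connection layer, not silently swallowed):

```typescript
// Bounded per-connection outbound queue. When full, enqueue() returns false
// to tell the caller to drop the connection and let the client do a full
// state resync on reconnect.
class BoundedOutbox<T> {
  private queue: T[] = [];
  constructor(private readonly maxDepth: number) {}

  enqueue(msg: T): boolean {
    if (this.queue.length >= this.maxDepth) return false; // overflow
    this.queue.push(msg);
    return true;
  }

  dequeue(): T | undefined {
    return this.queue.shift();
  }

  get depth(): number {
    return this.queue.length; // worth exporting as a metric (see monitoring)
  }
}
```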

Reconnection handling requires careful design on both client and server. The client should implement exponential backoff with jitter to avoid thundering herd problems when a server restarts and thousands of clients reconnect simultaneously. The server should support idempotent event delivery, where the client sends the ID of the last event it received, and the server replays all events since that ID. This requires events to be sequenced (typically with a monotonically increasing ID or timestamp) and stored durably for at least the reconnection window period.
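Both halves of that design are short. Below is one common backoff formulation ("full jitter": the delay is drawn uniformly from zero up to the exponentially growing ceiling) and a replay helper over sequenced events; the defaults and names are illustrative choices, not prescriptions:

```typescript
// Client side: exponential backoff with full jitter. Drawing uniformly
// from [0, min(cap, base * 2^attempt)] spreads a thundering herd of
// reconnects out in time.
function reconnectDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

// Server side: events carry a monotonically increasing id; a reconnecting
// client sends the last id it saw and receives everything after it.
interface SequencedEvent {
  id: number;
  payload: string;
}

function eventsSince(log: SequencedEvent[], lastSeenId: number): SequencedEvent[] {
  return log.filter((e) => e.id > lastSeenId);
}
```

Because the replay is driven by the client-reported ID, delivery is idempotent: replaying the same range twice gives the client events it can deduplicate by ID.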

Scaling WebSocket Connections: The Redis Backplane

A single server can handle 10,000 to 50,000 concurrent WebSocket connections depending on message frequency, payload size, and server resources. Beyond that, you need horizontal scaling, and WebSocket connections create a problem that stateless HTTP does not have: connection affinity. A client's WebSocket connection is to a specific server. If you want to send a message to that client from a different server, you need a mechanism for cross server communication.

The standard solution is a backplane, a shared communication channel that all servers subscribe to. When a server needs to send a message to a client that might be connected to any server in the cluster, it publishes the message to the backplane. All servers receive the message and the one holding the target connection delivers it. Redis Pub/Sub is the most common backplane for SignalR, and Azure SignalR Service provides a managed alternative that eliminates the need to manage the backplane infrastructure yourself.

Redis as a backplane has practical limits. Redis Pub/Sub delivers messages to all subscribers, which means every server in the cluster processes every message even if the target client is not connected to that server. For clusters up to 10 to 20 servers handling up to 200,000 concurrent connections, this works well. Beyond that, the fan out cost becomes significant and you need to consider partitioning strategies: sharding connections across multiple Redis instances based on user ID or group membership.
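The sharding step reduces to picking a stable shard index from the partition key (user ID, group name) so that every server agrees on which Redis instance carries a given user's traffic. A sketch, using FNV-1a purely as an example of a cheap stable hash:

```typescript
// Map a user ID to one of N backplane shards. Any stable hash works;
// FNV-1a is shown here because it is tiny and deterministic.
function shardFor(userId: string, shardCount: number): number {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < userId.length; i++) {
    h ^= userId.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (h >>> 0) % shardCount;
}
```

Simple modulo sharding like this reshuffles most keys when the shard count changes; if you expect to resize the shard set often, consistent hashing is the usual refinement.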

Azure SignalR Service handles this scaling transparently. It offloads connection management entirely: clients connect to the Azure service rather than directly to your servers. Your servers communicate with Azure SignalR through its SDK, and the service handles connection routing, message delivery, and scaling to millions of connections. For applications that expect more than 100,000 concurrent connections, the managed service is almost always more cost effective than operating the infrastructure yourself.

Monitoring and Observability for Real Time Systems

Real time systems require monitoring that is itself real time. Checking connection counts on a 5 minute dashboard interval is insufficient when connection drops can affect thousands of users within seconds. The observability stack for a real time system should include four dimensions: connection metrics, message metrics, infrastructure metrics, and client side metrics.

Connection metrics include the total number of active connections, connections per server, connection rate (new connections per second), disconnection rate, and reconnection rate. A sudden spike in disconnections followed by a spike in reconnections indicates a transient infrastructure issue. A steadily climbing reconnection rate suggests clients are being disconnected and re establishing connections in a loop, which wastes resources and degrades experience.

Message metrics track the volume, latency, and delivery success of real time events. Measure the time from event production to client delivery (end to end latency), the messages per second through each hub, the backplane message throughput, and the per connection outbound queue depth. Alert on queue depth growth, which is the earliest indicator of back pressure problems.

Client side metrics are often overlooked but essential. Instrument your client application to report connection state transitions (connected, reconnecting, disconnected), message receipt latency (how long between server send and client receive), and missed message counts (if the client detects sequence gaps). These metrics reveal problems that server side monitoring cannot see: network issues in the last mile, client side processing bottlenecks, and CDN or proxy interference with WebSocket connections. Ship these metrics through your standard telemetry pipeline (Application Insights, Datadog, or Prometheus) alongside your server metrics for correlated analysis.

  • Track active connections per server and alert on sudden drops exceeding 10%
  • Measure end to end message latency from production to client delivery
  • Monitor per connection outbound queue depth for back pressure detection
  • Instrument client side connection state transitions and missed message counts
  • Set up synthetic monitoring that maintains test connections and verifies message delivery
  • Log connection durations to identify abnormally short lived connections indicating instability
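The missed-message count in that checklist falls out of sequence tracking on the client. A minimal sketch (illustrative names; the sequence number would come from the server-assigned event ID discussed earlier):

```typescript
// Client-side gap detector: compare each event's sequence number to the
// previous one and accumulate how many were skipped. The `missed` counter
// is the metric to ship through your telemetry pipeline.
class GapDetector {
  private lastSeq: number | null = null;
  missed = 0;

  onMessage(seq: number): void {
    if (this.lastSeq !== null && seq > this.lastSeq + 1) {
      this.missed += seq - this.lastSeq - 1; // skipped sequence numbers
    }
    this.lastSeq = seq;
  }
}
```

A rising `missed` counter across many clients, combined with healthy server-side delivery metrics, points at last-mile or proxy problems that no server dashboard will show you.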

Looking for help with .NET development, system architecture, or modernization?

We build production systems using the patterns and technologies discussed in this article. Tell us about your project.

Get in Touch