Scaling SignalR is harder than scaling a stateless API. Because each connection is persistent and its state lives in a single server process, the client must stay 'glued' to the server that accepted it for the lifetime of that connection.
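When you have multiple web servers behind a load balancer, you must enable **Sticky Sessions** (also called session affinity, or "ARR affinity" in IIS Application Request Routing and Azure App Service). This ensures that the SignalR negotiate request and the subsequent transport connection (WebSocket or a fallback) land on the same server instance. Without it, the connection typically fails as soon as a follow-up request reaches Server B, because Server B does not know the connection ID that Server A issued. The one exception is a deployment where every client uses WebSockets only and skips negotiation, since a single WebSocket stays on one server once it is open.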
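To make the failure mode concrete, here is a minimal toy model of the two-step flow (negotiate, then connect) behind a load balancer. It does not use any SignalR API: `Server`, `round_robin`, `sticky`, and `open_connection` are hypothetical names invented for this sketch, and hashing a client key is only one assumed way a balancer might implement affinity (many use a cookie instead).

```python
import hashlib
import itertools
import uuid


class Server:
    """Toy stand-in for one web server with its own in-memory connection table."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def negotiate(self):
        # Handshake: issue a connection ID and remember it on this server only.
        connection_id = uuid.uuid4().hex
        self.connections.add(connection_id)
        return connection_id

    def connect(self, connection_id):
        # Follow-up request: succeeds only if this server issued the ID.
        return connection_id in self.connections


def round_robin(servers):
    """Routing without affinity: each request goes to the next server in turn."""
    cycle = itertools.cycle(servers)
    return lambda client_key: next(cycle)


def sticky(servers):
    """Routing with affinity: the same client key always maps to the same server."""
    def route(client_key):
        digest = hashlib.sha256(client_key.encode()).digest()
        return servers[digest[0] % len(servers)]
    return route


def open_connection(route, client_key):
    """Run negotiate and connect through the given routing function."""
    connection_id = route(client_key).negotiate()
    return route(client_key).connect(connection_id)


servers = [Server("A"), Server("B")]
print(open_connection(round_robin(servers), "client-1"))  # False
print(open_connection(sticky(servers), "client-1"))       # True
```

With round-robin routing the negotiate request lands on Server A and the follow-up on Server B, which has never seen the connection ID. With affinity, both requests reach the same server.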
Each SignalR connection uses memory on the server. If you have 50,000 users and each connection takes roughly 50 KB (an illustrative figure; measure your own workload), that's about 2.5 GB of RAM just for connection state. As an architect, you must monitor memory usage as closely as CPU when scaling real-time apps.
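The arithmetic is easy to rerun with your own numbers. This is a small sketch, assuming decimal units (1 GB = 1,000,000 KB) and a flat per-connection cost; `connection_memory_gb` is a hypothetical helper, and the 50 KB figure is the illustrative value from the text, not a measured one.

```python
def connection_memory_gb(connections, kb_per_connection):
    """Estimate total connection-state memory in decimal gigabytes."""
    return connections * kb_per_connection / 1_000_000


# The example from the text: 50,000 connections at 50 KB each.
print(connection_memory_gb(50_000, 50))  # 2.5

# The same load spread evenly across 4 servers (12,500 connections each).
print(connection_memory_gb(50_000 // 4, 50))  # 0.625
```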
Q: "Why can't SignalR servers be scaled out like stateless API servers?"
Architect Answer: "Because of **Hub broadcasts**. If User A is on Server 1 and User B is on Server 2, and User A sends a message to 'All Users', Server 1 only knows about its own connections, so User B never gets the message. Fixing this requires a 'Backplane' (for example, Redis) that relays messages across servers."
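The broadcast problem and the backplane fix can be sketched in a few lines. This is a toy in-process model, not SignalR or Redis code: `Backplane` and `HubServer` are hypothetical classes, and a real backplane would be an external pub/sub system rather than a Python list.

```python
class Backplane:
    """Toy message bus: fans every published message out to all subscribed servers."""

    def __init__(self):
        self.servers = []

    def subscribe(self, server):
        self.servers.append(server)

    def publish(self, message):
        for server in self.servers:
            server.deliver_local(message)


class HubServer:
    """Toy server that only knows about the users connected to it."""

    def __init__(self, name, backplane=None):
        self.name = name
        self.inboxes = {}  # user -> list of received messages
        self.backplane = backplane
        if backplane is not None:
            backplane.subscribe(self)

    def connect(self, user):
        self.inboxes[user] = []

    def deliver_local(self, message):
        for inbox in self.inboxes.values():
            inbox.append(message)

    def send_to_all(self, message):
        # With a backplane, publish so every server delivers to its own users;
        # without one, only this server's users are reachable.
        if self.backplane is not None:
            self.backplane.publish(message)
        else:
            self.deliver_local(message)


# Without a backplane, User B on Server 2 misses the broadcast.
s1, s2 = HubServer("Server 1"), HubServer("Server 2")
s1.connect("User A")
s2.connect("User B")
s1.send_to_all("hello")
print(s2.inboxes["User B"])  # []

# With a shared backplane, both servers deliver it.
bus = Backplane()
s1, s2 = HubServer("Server 1", bus), HubServer("Server 2", bus)
s1.connect("User A")
s2.connect("User B")
s1.send_to_all("hello")
print(s2.inboxes["User B"])  # ['hello']
```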