The Predecessors of WebSocket: The Past and Present of Short Polling and Long Polling


2024 Dec 02

In modern web application development, to achieve real-time communication features (such as chat applications, stock market updates, online multiplayer games, etc.), efficient and low-latency bidirectional communication between the server and client is essential.

However, before the emergence of WebSocket, developers primarily relied on short polling and long polling techniques to implement relatively "real-time" communication. This article will detail the implementation methods, advantages, and disadvantages of these two techniques, as well as their roles in web development.

Short Polling

Implementation

Short polling is the most basic implementation method, and its core idea is:
The client periodically sends HTTP requests to the server to ask if there is new data. If there is, the server returns the data; if not, it returns an empty response.

The implementation steps are as follows:

1. The client sends an HTTP request to the server at fixed intervals (e.g., every 1 second).
2. After processing the request, the server checks for new data:
   - If there is new data, it sends the data back in the response.
   - If there is no new data, it returns an empty response (usually HTTP status code 204 or an empty JSON response).
3. The client receives the server's response, processes the returned data, and waits for the next interval to send the request again.

Code Example

Server-side (Node.js):

import http from "http";

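const server = http.createServer((req, res) => {
  // Allow the example page to call this server from another origin (e.g. a local file)
  res.setHeader("Access-Control-Allow-Origin", "*");

  if (req.url === "/poll") {
    // Simulate new data arrival
    const randomNumber = Math.random();

    // Assume a 30% chance of returning new data
    if (randomNumber < 0.3) {
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ message: "New data from server" }));
    } else {
      // No new data
      res.writeHead(204); // No Content
      res.end();
    }
    return;
  }

  // Any other path: answer explicitly so the request does not hang
  res.writeHead(404);
  res.end();
});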

server.listen(3000, () => {
  console.log("Server is listening on http://localhost:3000");
});

Client-side (HTML+JavaScript):

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Short Polling Example</title>
  <script>
    function poll() {
      fetch("http://localhost:3000/poll")
        .then((response) => {
          if (response.status === 200) {
            return response.json(); // Parse the returned JSON data
          } else if (response.status === 204) {
            return null; // No new data
          }
        })
        .then((data) => {
          if (data && data.message) {
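            const messagesDiv = document.getElementById("messages");
            // Use textContent so the message is never interpreted as HTML
            const p = document.createElement("p");
            p.textContent = data.message;
            messagesDiv.appendChild(p);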
          }
        })
        .catch((error) => {
          console.error("Polling error:", error);
        })
        .finally(() => {
          // Poll again after a fixed interval
          setTimeout(poll, 1000); // Poll every 1 second
        });
    }

    window.onload = function () {
      poll(); // Start polling
    };
  </script>
</head>
<body>
  <h1>Short Polling Example</h1>
  <div id="messages"></div>
</body>
</html>

Advantages and Disadvantages

Advantages:

Simple implementation, good compatibility, suitable for all browsers and servers supporting HTTP.
No need to maintain long connections, suitable for low-frequency update scenarios.

Disadvantages:

Resource Waste: Even when the server has no new data, the client still periodically sends requests, wasting bandwidth and server resources.
Higher Latency: Because requests are sent at a fixed interval, new data can wait up to one full interval before the client sees it.
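
To put rough numbers on both drawbacks, the sketch below estimates the cost of short polling for a given interval. The function name `estimateShortPolling` and its parameters are illustrative and not part of any library. The model assumes updates arrive at uniformly random moments and ignores network round-trip time.

```javascript
// Hypothetical helper: a rough cost model for short polling.
// Assumes updates arrive at uniformly random times and ignores network latency.
function estimateShortPolling(intervalMs, clientCount) {
  const requestsPerSecond = (1000 / intervalMs) * clientCount;
  return {
    requestsPerSecond,
    requestsPerHour: requestsPerSecond * 3600,
    // On average, an update waits half an interval before the next poll sees it
    averageDelayMs: intervalMs / 2,
    // Worst case: the update arrives just after a poll has completed
    worstCaseDelayMs: intervalMs,
  };
}

// Example: 500 clients polling every second
const estimate = estimateShortPolling(1000, 500);
console.log(estimate);
```

Under these assumptions the request rate depends only on the interval and the number of clients, not on how often the data actually changes. Shortening the interval lowers the delay but raises the load in the same proportion.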

Long Polling

Implementation

Long polling is an improved polling method, and its core idea is:
After the client sends an HTTP request, if the server temporarily has no new data, it does not immediately return a response, but keeps the connection open until new data is available.

The implementation steps are as follows:

1. The client sends an HTTP request to the server.
2. The server checks for new data:
   - If there is, it returns the data immediately.
   - If not, it keeps the connection open until there is new data (or times out and returns an empty response).
3. The client receives the response, processes the data immediately, and sends a new request, forming a loop.

Code Example

Here’s a simple implementation of a long polling server and client.

Server-side (Node.js, TypeScript):

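import http, { IncomingMessage, ServerResponse } from "http";

// How long a poll may stay pending before the server answers with no data
// (30 seconds is an assumed value; tune it for your environment)
const POLL_TIMEOUT_MS = 30_000;

let clients: (ServerResponse<IncomingMessage> & { req: IncomingMessage })[] = [];
let message: null | string = null;

const server = http.createServer((req, res) => {
  // Set CORS headers
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Methods", "GET, POST");
  res.setHeader("Access-Control-Allow-Headers", "Content-Type");

  if (req.url === "/poll") {
    res.setHeader("Content-Type", "application/json");

    // Store the client's response object
    clients.push(res);

    // If nothing arrives in time, answer with an empty message so the
    // connection is not held open indefinitely
    const timer = setTimeout(() => {
      if (!res.writableEnded) res.end(JSON.stringify({ message: null }));
    }, POLL_TIMEOUT_MS);

    // Clean up when the response finishes or the client disconnects
    res.on("close", () => {
      clearTimeout(timer);
      clients = clients.filter((client) => client !== res);
    });
    return;
  }

  if (req.url === "/send" && req.method === "POST") {
    let body = "";

    req.on("data", (chunk) => {
      body += chunk.toString();
    });

    req.on("end", () => {
      message = body;

      // Broadcast the message to all waiting clients
      const waiting = clients;
      clients = [];
      waiting.forEach((client) => {
        client.end(JSON.stringify({ message }));
      });

      res.end("Message sent");
    });
    return;
  }

  // Any other request: answer explicitly so it does not hang
  res.writeHead(404);
  res.end();
});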

server.listen(3000, () => {
  console.log("Server is listening on http://localhost:3000");
});

Client-side (HTML+JavaScript):

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Long Polling Example</title>
  <script>
    function poll() {
      fetch("http://localhost:3000/poll")
        .then((response) => response.json())
        .then((data) => {
          if (data.message) {
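            const messagesDiv = document.getElementById("messages");
            // The message is user input, so insert it as text rather than HTML
            const p = document.createElement("p");
            p.textContent = data.message;
            messagesDiv.appendChild(p);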
          }
          // Continue polling
          poll();
        })
        .catch((error) => {
          console.error("Polling error:", error);
          setTimeout(poll, 5000); // Wait 5 seconds to retry on error
        });
    }

    function sendMessage() {
      const input = document.getElementById("messageInput");
      const message = input.value;
      fetch("http://localhost:3000/send", {
        method: "POST",
        body: message,
        headers: {
          "Content-Type": "text/plain",
        },
      })
        .then((response) => response.text())
        .then((data) => {
          console.log(data);
          input.value = ""; // Clear input
        })
        .catch((error) => {
          console.error("Send error:", error);
        });
    }

    window.onload = function () {
      poll(); // Start long polling
    };
  </script>
</head>
<body>
  <h1>Long Polling Example</h1>
  <div id="messages"></div>
  <input type="text" id="messageInput" placeholder="Enter your message" />
  <button onclick="sendMessage()">Send</button>
</body>
</html>

Advantages and Disadvantages

Advantages:

Compared to short polling, it reduces the number of ineffective requests, saving bandwidth and server resources.
Lower latency, making data arrival time closer to real-time.
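
The difference in request volume can be illustrated with a small simulation of a single client. This is a simplified sketch: the function names, the 30-second long-poll timeout, and the event timestamps are all made up for illustration, and network latency is ignored.

```javascript
// Hypothetical simulation: count the requests one client makes in a time window.

// Short polling: one request per interval, whether or not anything changed
function countShortPollRequests(windowMs, intervalMs) {
  return Math.floor(windowMs / intervalMs);
}

// Long polling: eventTimes are the moments (in ms) at which new data appears
function countLongPollRequests(windowMs, timeoutMs, eventTimes) {
  let requests = 0;
  let now = 0;
  const pending = [...eventTimes].sort((a, b) => a - b);

  while (now < windowMs) {
    requests += 1; // the client opens a request at time `now`
    const deadline = now + timeoutMs;
    if (pending.length > 0 && pending[0] <= deadline) {
      // The server answers as soon as the next event is available
      now = Math.max(now, pending.shift());
    } else {
      // No data before the timeout: the server sends an empty response
      now = deadline;
    }
  }
  return requests;
}

// One minute, three updates
const shortCount = countShortPollRequests(60000, 1000);
const longCount = countLongPollRequests(60000, 30000, [5000, 20000, 47000]);
console.log({ shortCount, longCount });
```

In this scenario short polling sends 60 requests to pick up three updates, while long polling opens 4 (one per update plus one that is still waiting when the window ends).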

Disadvantages:

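Each message still costs a full HTTP request/response cycle, including headers, because the client must send a new request after every response. If the connection is not kept alive, a new connection must be established each time as well.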
If the number of clients is large, the server must maintain many long connections, which can lead to resource exhaustion.

Why Long Polling Works

Long polling can be implemented primarily because the HTTP protocol does not mandate that the server must respond to a request within a fixed time. It does not define a specific timeout for requests, leaving the handling of timeouts to the client's or server's implementation.

On the frontend, we can manually implement request timeouts using fetch:

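function fetchWithTimeout(url, options = {}, timeout = 5000) {
    const controller = new AbortController();
    // Abort the underlying request once the timeout elapses;
    // fetch then rejects with an AbortError
    const timer = setTimeout(() => controller.abort(), timeout);

    return fetch(url, { ...options, signal: controller.signal })
        // Stop the timer as soon as the request settles
        .finally(() => clearTimeout(timer));
}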

// Usage example
fetchWithTimeout('https://api.example.com/data', { method: 'GET' }, 3000)
    .then(response => {
        if (!response.ok) {
            throw new Error('Network response was not ok');
        }
        return response.json();
    })
    .then(data => console.log(data))
    .catch(error => console.error('Fetch error:', error));

This flexibility allows the server, upon receiving a client's request, to choose not to return a response immediately. Instead, it can maintain the connection until new data is available.
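
The "hold the request until data arrives" idea can be sketched independently of HTTP as a queue of waiting callbacks, each with its own timeout. The names `createWaitQueue`, `wait`, and `publish` are hypothetical. In a real server each callback would write to one pending response.

```javascript
// Hypothetical sketch: a queue of pending waiters, each with its own timeout.
function createWaitQueue() {
  const waiters = new Set();

  return {
    // Register a callback; it receives the next message, or null on timeout
    wait(callback, timeoutMs) {
      const waiter = { callback, timer: null };
      waiter.timer = setTimeout(() => {
        waiters.delete(waiter);
        callback(null);
      }, timeoutMs);
      waiters.add(waiter);
    },

    // Deliver a message to everyone currently waiting, then forget them
    publish(message) {
      const current = [...waiters];
      waiters.clear();
      for (const waiter of current) {
        clearTimeout(waiter.timer);
        waiter.callback(message);
      }
      return current.length; // number of waiters that were notified
    },

    // Number of callbacks still waiting
    get size() {
      return waiters.size;
    },
  };
}

// Usage: two waiters are released by a single publish
const queue = createWaitQueue();
const received = [];
queue.wait((msg) => received.push(`A:${msg}`), 30000);
queue.wait((msg) => received.push(`B:${msg}`), 30000);
const delivered = queue.publish("hello");
console.log(delivered, received, queue.size);
```

Keeping the timer next to each waiter means a waiter leaves the queue exactly once, either when a message is published or when its timeout fires.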

This behavior relies on HTTP's flexibility and on the reliability of the underlying TCP protocol, which delivers data in order and retransmits lost segments for as long as the connection is alive. Holding a connection open does not in itself cause data loss, although intermediaries such as proxies or load balancers may close connections that stay idle for too long, which is one reason long polling servers usually apply a timeout.

Below TCP, the IP protocol routes packets to their destination on a best-effort basis, and TCP compensates for any packets lost along the way, so requests and responses arrive intact.

Therefore, long polling takes advantage of the flexibility of HTTP and the reliability of the underlying TCP/IP protocols to implement a pseudo-real-time communication method.