【Question Title】: Nginx high-volume traffic load balancing
【Posted】: 2012-08-27 09:01:03
【Question Description】:

For the past 3 weeks we have been testing Nginx as a load balancer. Currently we cannot successfully handle more than 1000 requests/sec and 18K active connections. When we reach those numbers, Nginx starts to hang and returns timeout codes. The only way to get a response is to drastically reduce the number of connections.

I should note that my servers can and do handle this much traffic every day; we currently use simple round-robin DNS balancing.

We are using dedicated servers with the following hardware:

  • Intel Xeon E5620 CPU
  • 16GB RAM
  • 2TB SATA hard drive
  • 1Gb/s connection
  • OS: CentOS 5.8

We need to load-balance 7 backend servers running Tomcat 6 that handle more than 2000 req/sec at peak, serving both HTTP and HTTPS requests.

While Nginx is running, CPU consumption is around 15% and RAM usage is around 100MB.

My questions are:

  1. Has anyone tried load balancing this kind of traffic with nginx?
  2. Do you think nginx can handle such traffic?
  3. Do you have any idea what is causing the hangs?
  4. Am I missing something in my configuration?

Below are my configuration files:

nginx.conf:

user  nginx;
worker_processes 10;

worker_rlimit_nofile 200000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  10000;
    use epoll;
    multi_accept on;
}


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  /var/log/nginx/access.log  main;
    access_log off;

    sendfile        on;
    tcp_nopush     on;

    keepalive_timeout  65;
    reset_timedout_connection on;

    gzip  on;
    gzip_comp_level 1;
    include /etc/nginx/conf.d/*.conf;
} 

servers.conf:

#Set the upstream (servers to load balance)
#HTTP stream
upstream adsbar {
  least_conn;
  server xx.xx.xx.34 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.36 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.37 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.39 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.40 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.42 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.43 max_fails=2 fail_timeout=15s;
}      

#HTTPS stream
upstream adsbar-ssl {
  least_conn;
  server xx.xx.xx.34:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.36:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.37:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.39:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.40:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.42:443 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.43:443 max_fails=2 fail_timeout=15s;
}

#HTTP
server {
  listen xxx.xxx.xxx.xxx:8080;
  server_name www.mycompany.com;
  location / {
      proxy_set_header Host $host;
      # So the original HTTP Host header is preserved
      proxy_set_header X-Real-IP $remote_addr;
      # The IP address of the client (which might be a proxy itself)
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_pass http://adsbar;
  }
}

#HTTPS
server {
  listen xxx.xxx.xxx.xxx:8443;
  server_name www.mycompany.com;
  ssl on;
  ssl_certificate /etc/pki/tls/certs/mycompany.crt;
  # Path to an SSL certificate;
  ssl_certificate_key /etc/pki/tls/private/mycompany.key;
  # Path to the key for the SSL certificate;
  location / {
      proxy_set_header Host $host;
      # So the original HTTP Host header is preserved
      proxy_set_header X-Real-IP $remote_addr;
      # The IP address of the client (which might be a proxy itself)
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_pass https://adsbar-ssl;
  }
}

server {
    listen xxx.xxx.xxx.xxx:61709;
    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }
} 
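
A side note on the proxy blocks above: by default nginx opens a fresh TCP connection to an upstream server for every proxied request, which at 1000+ req/sec can exhaust ephemeral ports and sockets on the balancer. A sketch of enabling upstream keepalive (assumes nginx ≥ 1.1.4, which postdates the nginx builds of this question's era; the value 64 is illustrative, not tuned):

```nginx
upstream adsbar {
  least_conn;
  server xx.xx.xx.34 max_fails=2 fail_timeout=15s;
  # ... remaining backends unchanged ...
  # Keep up to 64 idle connections to the backends open per worker,
  # instead of opening a new TCP connection per proxied request.
  keepalive 64;
}

server {
  location / {
      # Upstream keepalive requires HTTP/1.1 and a cleared Connection header.
      proxy_http_version 1.1;
      proxy_set_header Connection "";
      proxy_pass http://adsbar;
  }
}
```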

sysctl.conf:

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 1

# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536

# Controls the default maximum size of a message queue
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

fs.file-max = 120000
net.ipv4.ip_conntrack_max = 131072
net.ipv4.tcp_max_syn_backlog = 8196
net.ipv4.tcp_fin_timeout = 25
net.ipv4.tcp_keepalive_time = 3600
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_rmem = 4096 25165824 25165824
net.core.rmem_max = 25165824
net.core.rmem_default = 25165824
net.ipv4.tcp_wmem = 4096 65536 25165824
net.core.wmem_max = 25165824
net.core.wmem_default = 65536
net.core.optmem_max = 25165824
net.core.netdev_max_backlog = 2500
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1

Any help, guidance, or ideas would be greatly appreciated.

【Question Discussion】:

    Tags: nginx dns kernel load-balancing tomcat6


    【Solution 1】:

    Here are some good references:

    http://dak1n1.com/blog/12-nginx-performance-tuning

    Server Fault: https://serverfault.com/questions/221292/tips-for-maximizing-nginx-requests-sec

    A well-documented configuration from the dak1n1 link:

    # This number should be, at maximum, the number of CPU cores on your system. 
    # (since nginx doesn't benefit from more than one worker per CPU.)
    worker_processes 24;
    
    # Number of file descriptors used for Nginx. This is set in the OS with 'ulimit -n 200000'
    # or using /etc/security/limits.conf
    worker_rlimit_nofile 200000;
    
    
    # only log critical errors
    error_log /var/log/nginx/error.log crit;
    
    
    # Determines how many clients will be served by each worker process.
    # (Max clients = worker_connections * worker_processes)
    # "Max clients" is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;
    
    
    # essential for linux, optimized to serve many clients with each thread
    use epoll;
    
    
    # Accept as many connections as possible, after nginx gets notification about a new connection.
    # May flood worker_connections, if that option is set too low.
    multi_accept on;
    
    
    # Caches information about open FDs, frequently accessed files.
    # Changing this setting, in my environment, brought performance up from 560k req/sec, to 904k req/sec.
    # I recommend using some variant of these options, though not the specific values listed below.
    open_file_cache max=200000 inactive=20s; 
    open_file_cache_valid 30s; 
    open_file_cache_min_uses 2;
    open_file_cache_errors on;
    
    
    # Buffer log writes to speed up IO, or disable them altogether
    #access_log /var/log/nginx/access.log main buffer=16k;
    access_log off;
    
    
    # Sendfile copies data between one FD and another from within the kernel.
    # More efficient than read() + write(), since those require transferring data to and from user space.
    sendfile on; 
    
    
    # Tcp_nopush causes nginx to attempt to send its HTTP response head in one packet, 
    # instead of using partial frames. This is useful for prepending headers before calling sendfile, 
    # or for throughput optimization.
    tcp_nopush on;
    
    
    # don't buffer data-sends (disable Nagle algorithm). Good for sending frequent small bursts of data in real time.
    tcp_nodelay on; 
    
    
    # Timeout for keep-alive connections. Server will close connections after this time.
    keepalive_timeout 30;
    
    
    # Number of requests a client can make over the keep-alive connection. This is set high for testing.
    keepalive_requests 100000;
    
    
    # allow the server to close the connection after a client stops responding. Frees up socket-associated memory.
    reset_timedout_connection on;
    
    
    # send the client a "request timed out" if the body is not loaded by this time. Default 60.
    client_body_timeout 10;
    
    
    # If the client stops reading data, free up the stale client connection after this much time. Default 60.
    send_timeout 2;
    
    
    # Compression. Reduces the amount of data that needs to be transferred over the network
    gzip on;
    gzip_min_length 10240;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
    gzip_disable "MSIE [1-6]\.";
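
    The worker_rlimit_nofile comment above mentions raising the OS descriptor limit via /etc/security/limits.conf; that might look like the following (a sketch, assuming the nginx workers run as the nginx user):

```
nginx  soft  nofile  200000
nginx  hard  nofile  200000
```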
    

    And some more on Linux system tuning via sysctl.conf:

    # Increase system IP port limits to allow for more connections
    
    net.ipv4.ip_local_port_range = 2000 65000
    
    
    net.ipv4.tcp_window_scaling = 1
    
    
    # number of packets to keep in backlog before the kernel starts dropping them 
    net.ipv4.tcp_max_syn_backlog = 3240000
    
    
    # increase socket listen backlog
    net.core.somaxconn = 3240000
    net.ipv4.tcp_max_tw_buckets = 1440000
    
    
    # Increase TCP buffer sizes
    net.core.rmem_default = 8388608
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.tcp_congestion_control = cubic
    

    【Discussion】:

    • Did you make these changes on the load-balancer server, on the backend servers, or on both?
    • Please give more details. With heavy traffic, every nginx server gets these tunings.
    • I assume that if I had a database load-balancing server (pgpool, rather than an nginx server), it should get these settings as well, given that every request uses a database connection. On the other hand, the connections between pgpool and postgres would not pick up these settings, since pgpool keeps persistent connections to postgres and no new TCP connection is opened per database request. Does that sound right?
    • @JosephPersie Not sure, since this is about nginx, not pgpool. I have also never used pgpool.
    【Solution 2】:

    nginx is definitely capable of handling more than 1000 req/sec (I get around 2800 req/sec from nginx on a cheap laptop with jmeter, using about one and a half of its two cores).

    As far as I understand, you are using epoll, which is the best choice on current Linux kernels.

    You have turned access_log off, so disk IO shouldn't be a bottleneck either (note: you could also set access_log to buffered mode with a large buffer, so it writes only after every x KB; that keeps the disk from being hammered constantly while preserving the logs for analysis).
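
    The buffered-logging option mentioned above would look something like this (a sketch; the buffer and flush values are illustrative, and the flush= parameter requires nginx ≥ 1.3.10):

```nginx
# Buffered access log: entries hit disk only when the 32k buffer fills,
# or at latest every 5 seconds, instead of on every single request.
access_log /var/log/nginx/access.log main buffer=32k flush=5s;
```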

    My understanding is that to maximize nginx performance you normally set worker_processes equal to the number of cores/CPUs, and then raise worker_connections to allow more concurrent connections (together with the OS limit on open files). Yet in the data you posted above you have a quad-core CPU with 10 worker processes, each allowed 10k connections. On the nginx side I would try something like:

    worker_processes 4;
    worker_rlimit_nofile 999999;
    events {
      worker_connections 32768;
      use epoll;
      multi_accept on;
    }
    

    On the kernel side, I would tune the TCP read and write buffers differently: you want a small minimum, a small default, and a large maximum.

    You have already widened the ephemeral port range.

    I would raise the open-files limit, since you will have a lot of open sockets.

    Add/change the following lines in /etc/sysctl.conf:

    net.ipv4.tcp_rmem = 4096 4096 25165824                                
    net.ipv4.tcp_wmem = 4096 4096 25165824
    fs.file-max=999999
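
    To confirm that socket pressure is actually the problem, it helps to count connections by TCP state on the balancer while it is under load (a sketch; assumes the classic netstat output layout found on CentOS 5, where the connection state is the sixth field):

```shell
# Tally TCP connections by state; a very large TIME_WAIT count suggests
# ephemeral-port/socket exhaustion on the proxy side.
netstat -ant | awk 'NR > 2 { print $6 }' | sort | uniq -c | sort -rn
```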
    

    Hope that helps.

    【Discussion】:

      【Solution 3】:

      I found the least-connections algorithm to be problematic. I switched to

      hash $remote_addr consistent;
      

      and the service responded faster.
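
      Applied to the question's upstream, that change would look like this (a sketch; the consistent parameter of the hash directive requires nginx ≥ 1.7.2, well after this question was asked):

```nginx
upstream adsbar {
  # Pin each client IP to the same backend via consistent (ketama) hashing,
  # replacing least_conn.
  hash $remote_addr consistent;
  server xx.xx.xx.34 max_fails=2 fail_timeout=15s;
  server xx.xx.xx.36 max_fails=2 fail_timeout=15s;
  # ... remaining backends unchanged ...
}
```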

      【Discussion】:
