【Question title】: Google App Engine - Intermittent 502 / connection reset by peer
【Posted】: 2021-09-05 15:27:26
【Question】:

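I'm running a Node.js server on a Google App Engine (GAE) flexible environment instance, and clients intermittently receive 502 errors from my application. These requests never reach my Node service, but they appear to line up with nginx log entries about the connection being reset by the peer: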

[error] 34#34: *25817 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *27919 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *28746 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *28747 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *24022 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *29214 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"

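What could cause this behavior? CPU and memory load are nowhere near the resource limits, although it seems to happen more often when the server is under some load.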

【Comments】:

    Tags: node.js google-app-engine


    【Answer 1】:

    When you deploy to Google App Engine, a load balancer is placed in front of your instances. This load balancer has an HTTP keep-alive timeout of 600 seconds.

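    The load balancer then connects to the nginx service on the instance, which uses a 650 second keep-alive timeout. The config even has a helpful comment saying it needs to be higher to prevent a race condition: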

    # GCLB uses a 10 minutes keep-alive timeout. Setting it to a bit more here
    # to avoid a race condition between the two timeouts.
    keepalive_timeout 650;
    

    Finally, nginx reverse-proxies to your Node application, which uses the default keep-alive timeout of... 5 seconds.
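
    To make the mismatch between the three timeouts concrete, here is a minimal sketch that compares each hop with the one behind it. The helper name `findKeepAliveRaces` and the hop objects are hypothetical and exist only for illustration; the numbers are the ones quoted in this answer. It assumes the simple rule that whatever sits behind a proxy should keep idle connections open longer than the proxy in front of it.

```javascript
// Hypothetical helper: hops are listed from the client side to the backend,
// each with its idle keep-alive timeout in seconds.
function findKeepAliveRaces(hops) {
  const risky = [];
  for (let i = 0; i + 1 < hops.length; i++) {
    const proxy = hops[i];
    const upstream = hops[i + 1];
    // A proxy may reuse an idle connection right up to its own timeout, so
    // the upstream should keep idle connections open longer than the proxy.
    if (upstream.timeoutSeconds <= proxy.timeoutSeconds) {
      risky.push(`${proxy.name} -> ${upstream.name}`);
    }
  }
  return risky;
}

// Values taken from the answer text; the names are labels only.
const gaeChain = [
  { name: 'load balancer', timeoutSeconds: 600 },
  { name: 'nginx', timeoutSeconds: 650 },
  { name: 'node', timeoutSeconds: 5 }, // Node.js default keep-alive timeout
];

console.log(findKeepAliveRaces(gaeChain)); // [ 'nginx -> node' ]
```

    Only the nginx to Node hop is flagged: nginx outlives the load balancer, but Node gives up on idle connections long before nginx does.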

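    This causes a race condition between the timeouts (duh): Node can close an idle connection just as nginx reuses it for the next request. You need to set the Node server's keep-alive timeout higher than 650 seconds. If you are using Express.js, it looks like this: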

    const express = require('express');
    
    const app = express();
    const server = app.listen(process.env.PORT);
    
    // nginx uses a 650 second keep-alive timeout on GAE. Set it a bit higher here
    // to avoid a race condition between the two timeouts.
    server.keepAliveTimeout = 700000; // milliseconds
    
    // Ensure headersTimeout is higher than keepAliveTimeout because of this
    // Node.js regression: https://github.com/nodejs/node/issues/27363
    server.headersTimeout = 701000; // milliseconds
    

    See "Analyze 'Connection reset' error in Nginx upstream with keep-alive enabled" for a technical (TCP-level) explanation of why the upstream server needs the larger timeout.
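
    If the server does not use Express, the same two properties exist on a plain `http.Server`. The sketch below is one possible way to wrap the setting; the helper name `outliveProxyKeepAlive` and the 50 second margin are assumptions chosen to reproduce the 700000 / 701000 values used in this answer, not anything prescribed by GAE or Node.js.

```javascript
const http = require('http');

// Hypothetical helper: make the Node server hold idle connections longer
// than the proxy in front of it (650 seconds for nginx on GAE flex,
// according to this answer).
function outliveProxyKeepAlive(server, proxyTimeoutSeconds, marginSeconds = 50) {
  // Both properties are expressed in milliseconds.
  server.keepAliveTimeout = (proxyTimeoutSeconds + marginSeconds) * 1000;
  // Keep headersTimeout above keepAliveTimeout, see
  // https://github.com/nodejs/node/issues/27363
  server.headersTimeout = server.keepAliveTimeout + 1000;
  return server;
}

const server = outliveProxyKeepAlive(
  http.createServer((req, res) => res.end('ok')),
  650
);

console.log(server.keepAliveTimeout, server.headersTimeout); // 700000 701000

// Start listening as usual afterwards, for example:
// server.listen(process.env.PORT);
```

    Deriving both values from the proxy timeout keeps them in the right order if the nginx setting ever changes.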

    【Comments】:
