Synopsis:

A tenant reports that the first request through NGINX returns 502 Bad Gateway.

Architecture:

Internet --> NGINX (reverse proxy, routing by hostname) --> K8s cluster, floating IP provided by MetalLB

NGINX log

2023/10/04 12:51:37 [error] 18641#18641: *416 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 111.98.254.197, server: wave.m-cloud.dev, request: "GET /argocd/login?return_url=https%3A%2F%2Fwave.m-cloud.dev%2Fargocd%2Fapplications HTTP/1.1", upstream: "http://10.0.50.128:80/argocd/login?return_url=https%3A%2F%2Fwave.m-cloud.dev%2Fargocd%2Fapplications", host: "wave.m-cloud.dev"
2023/10/04 13:06:53 [notice] 27584#27584: signal process started
2023/10/04 13:06:58 [notice] 27589#27589: signal process started
2023/10/04 13:21:49 [error] 27590#27590: *477 open() "/var/www/html/11656810" failed (2: No such file or directory), client: 181.214.164.109, server: _, request: "POST /11656810 HTTP/1.1", host: "116.82.14.206"
2023/10/04 13:41:35 [error] 27590#27590: *487 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 111.98.254.197, server: wave.m-cloud.dev, request: "GET /argocd/login?return_url=https%3A%2F%2Fwave.m-cloud.dev%2Fargocd%2Fapplications HTTP/1.1", upstream: "http://10.0.50.128:80/argocd/login?return_url=https%3A%2F%2Fwave.m-cloud.dev%2Fargocd%2Fapplications", host: "wave.m-cloud.dev"
2023/10/04 14:03:14 [error] 27590#27590: *536 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 133.200.33.192, server: wave.m-cloud.dev, request: "GET /argocd/applications?showFavorites=false&proj=&sync=&autoSync=&health=&namespace=&cluster=&labels= HTTP/1.1", upstream: "http://10.0.50.128:80/argocd/applications?showFavorites=false&proj=&sync=&autoSync=&health=&namespace=&cluster=&labels=", host: "wave.m-cloud.dev"
2023/10/04 14:23:24 [error] 27590#27590: *631 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 133.200.33.192, server: wave.m-cloud.dev, request: "GET /sandbox HTTP/1.1", upstream: "http://10.0.50.128:80/sandbox", host: "wave.m-cloud.dev"
2023/10/04 14:39:29 [error] 27590#27590: *644 open() "/var/www/html/login.html" failed (2: No such file or directory), client: 192.241.215.35, server: _, request: "GET /login.html HTTP/1.1", host: "116.82.14.206"
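The log mixes the upstream resets with unrelated scanner noise (the `open()` failures against the default server). To pull out only the resets, a small parser helps. This is a minimal sketch, assuming the NGINX error log layout seen above; `RESET_RE`, `upstream_resets` and `SAMPLE` are names I made up for illustration. The regex scans the whole text, so it also works when entries are run together on one line.

```python
import re
from datetime import datetime

# One "recv() failed (104: ...)" entry. "[^\[]*?" stops the match from
# crossing into the next "[error]"/"[notice]" entry.
RESET_RE = re.compile(
    r'(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] [^\[]*?'
    r'recv\(\) failed \(104: [^)]*\) while reading response header from upstream, '
    r'client: (?P<client>[\d.]+), .*?request: "(?P<request>[^"]*)", '
    r'upstream: "(?P<upstream>[^"]*)"'
)

def upstream_resets(log_text):
    """Return (timestamp, client, upstream) for every upstream connection reset."""
    hits = []
    for m in RESET_RE.finditer(log_text):
        ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S")
        hits.append((ts, m.group("client"), m.group("upstream")))
    return hits

# Two entries copied from the log above: one reset, one unrelated open() failure.
SAMPLE = (
    '2023/10/04 14:23:24 [error] 27590#27590: *631 recv() failed '
    '(104: Connection reset by peer) while reading response header from upstream, '
    'client: 133.200.33.192, server: wave.m-cloud.dev, '
    'request: "GET /sandbox HTTP/1.1", upstream: "http://10.0.50.128:80/sandbox", '
    'host: "wave.m-cloud.dev"'
    '2023/10/04 14:39:29 [error] 27590#27590: *644 open() "/var/www/html/login.html" '
    'failed (2: No such file or directory), client: 192.241.215.35, server: _, '
    'request: "GET /login.html HTTP/1.1", host: "116.82.14.206"'
)

for ts, client, upstream in upstream_resets(SAMPLE):
    print(ts.isoformat(), client, upstream)
```

Every reset in this log points at the same upstream, 10.0.50.128, which is the floating IP.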

DEBUG01:
Extended the timeout for waiting on the upstream response
-> still happens
DEBUG02:
After 2~3 hours, it always happens on the first request.
-> suspected container warm-up
-> tenant confirmed there is no warm-up process
-> tenant requested an OS upgrade
-> OS upgraded, still happens
DEBUG03:
Turned off m-cloud deep packet scan
-> still happens
DEBUG04:
Network investigation: launched a network inspector VM in the same VLAN as the tenant's K8s cluster
-> packet capture
-> something is wrong with the ARP requests and responses
-> MetalLB config missing ???
-> tenant working on it
-> tenant confirmed the MetalLB floating IP config was malfunctioning
-> tenant fixed the MetalLB config
-> resolved

DEBUG04: log
1>Check the firewall layer
10.0.50.128 does not broadcast itself; it only shows up in ARP when it receives a request from NGINX.
On the first request from NGINX, the ARP table shows no entry for this IP address.

root@home:~ # arp -a
<...>
? (10.0.50.152) at 9e:1e:7d:fd:8a:57 on bridge0 expires in 1077 seconds [bridge]
? (10.0.50.153) at fe:e0:85:6f:d4:5d on bridge0 expires in 1192 seconds [bridge]
? (10.0.50.154) at e6:a1:2d:7f:dd:d0 on bridge0 expires in 1187 seconds [bridge]
? (10.0.50.20) at 02:c1:ef:2e:1c:5d on bridge0 expires in 618 seconds [bridge]
? (10.0.50.150) at 46:5d:f8:fe:fb:ba on bridge0 expires in 1196 seconds [bridge]
? (10.0.50.151) at 2a:44:9d:89:7b:9c on bridge0 expires in 1012 seconds [bridge]
? (10.0.50.128) at fe:e0:85:6f:d4:5d on bridge0 expires in 1072 seconds [bridge]   ★★★★
? (10.0.50.1) at 58:9c:fc:10:97:26 on bridge0 permanent [bridge]
? (10.0.50.100) at 4a:99:a7:e6:03:23 on bridge0 expires in 703 seconds [bridge]
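In the table above, 10.0.50.128 resolves to the same MAC as 10.0.50.153, so the floating IP is being answered by another host on the segment (presumably a cluster node). Spotting that by eye in a long table is easy to miss. Here is a minimal sketch that groups the entries by MAC, assuming the FreeBSD-style `arp -a` format shown above; `ARP_RE` and `shared_macs` are hypothetical names, not part of any tool.

```python
import re
from collections import defaultdict

# Matches e.g. "? (10.0.50.128) at fe:e0:85:6f:d4:5d on bridge0 ..."
ARP_RE = re.compile(
    r'\((?P<ip>[\d.]+)\) at (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})',
    re.IGNORECASE,
)

def shared_macs(arp_output):
    """Return {mac: [ips]} for every MAC that appears under more than one IP."""
    by_mac = defaultdict(list)
    for m in ARP_RE.finditer(arp_output):
        by_mac[m.group("mac").lower()].append(m.group("ip"))
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}

# Three entries copied from the arp table above.
ARP_OUTPUT = """\
? (10.0.50.152) at 9e:1e:7d:fd:8a:57 on bridge0 expires in 1077 seconds [bridge]
? (10.0.50.153) at fe:e0:85:6f:d4:5d on bridge0 expires in 1192 seconds [bridge]
? (10.0.50.128) at fe:e0:85:6f:d4:5d on bridge0 expires in 1072 seconds [bridge]
"""

print(shared_macs(ARP_OUTPUT))
# {'fe:e0:85:6f:d4:5d': ['10.0.50.153', '10.0.50.128']}
```

A shared MAC is not an error by itself: as far as I know, in MetalLB layer 2 mode one node answers ARP for the service IP, so the floating IP is expected to carry a node's MAC. It just tells you which host currently claims the address.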

2>Packet capture (so many memories of this tool, I used it when I was a student)

10.0.50.128 gets multiple ARP responses: every node in the K8s cluster returns this MAC address for the ARP request :v
Compare with 10.0.50.150 below
