k8s: Service IP

Author: 分享放大价值  Updated: 2022-07-12  Category: Programming languages

This article uses the example below to trace what happens when a Service IP is accessed and how the iptables rules take effect.

Creating the Service

The YAML file below creates three pods, one client and two nginx pods (listening on port 80), plus a Service that maps port 9999 to port 80 of the nginx pods and load-balances across the two nginx backends.

[master-1 ~]# cat pod.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx1
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: pccc-203-10-worker-1
      containers:
        - name: hello
          image: nginx
          ports:
            - name: http
              containerPort: 80

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: pccc-203-10-worker-1
      containers:
        - name: hello
          image: nginx
          ports:
            - name: http
              containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: test
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 9999
    targetPort: 80

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client
spec:
  selector:
    matchLabels:
      app: test
  replicas: 1
  template:
    metadata:
      labels:
        app: test
    spec:
      nodeSelector:
        kubernetes.io/hostname: pccc-203-10-worker-2
      containers:
        - name: hello
          image: dgarros/tcpreplay:latest

[master-1 ~]# kubectl apply -f pod.yaml
deployment.apps/nginx1 created
deployment.apps/nginx2 created
service/test created
deployment.apps/client created

List the three pods that were created: both nginx pods run on worker1 and the client runs on worker2.

[master-1 ~]# kubectl get pod -o wide
NAME                         READY   STATUS      RESTARTS   AGE   IP             NODE                   NOMINATED NODE   READINESS GATES
client-6688779b7f-dzkwv      1/1     Running     0          93s   10.1.236.141   worker-2   <none>           <none>
nginx1-6fbbb6bf5c-5trp7      1/1     Running     0          45s   10.1.139.84    worker-1   <none>           <none>
nginx2-6fbbb6bf5c-t2b5b      1/1     Running     0          45s   10.1.139.93    worker-1   <none>           <none>

Inspect the Service that was created; it lists the two corresponding endpoints.

[master-1 ~]# kubectl describe svc test
Name:              test
Namespace:         default
Labels:            <none>
Annotations:       Selector:  app=nginx
Type:              ClusterIP
IP:                10.99.64.233
Port:              <unset>  9999/TCP
TargetPort:        80/TCP
Endpoints:         10.1.139.84:80,10.1.139.93:80
Session Affinity:  None
Events:            <none>
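
How the two endpoints come about can be sketched in a few lines of Python. This is a simplified model rather than what the endpoints controller actually runs: it ignores pod readiness and namespaces, and the function name endpoints is made up for illustration. The labels and IPs are copied from the manifest and the kubectl output above.

```python
# Pod labels and IPs as shown in pod.yaml and `kubectl get pod -o wide`.
pods = {
    "client": {"labels": {"app": "test"}, "ip": "10.1.236.141"},
    "nginx1": {"labels": {"app": "nginx"}, "ip": "10.1.139.84"},
    "nginx2": {"labels": {"app": "nginx"}, "ip": "10.1.139.93"},
}


def endpoints(selector, target_port, pods):
    """Return ip:port for every pod whose labels contain all selector pairs."""
    return sorted(
        f"{pod['ip']}:{target_port}"
        for pod in pods.values()
        if all(pod["labels"].get(k) == v for k, v in selector.items())
    )


# Selector and targetPort of the Service "test".
print(endpoints({"app": "nginx"}, 80, pods))
# ['10.1.139.84:80', '10.1.139.93:80']
```

The client pod carries the label app=test, so it is not selected and never becomes a backend of the Service.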

Once the Service has been created, kube-proxy (in iptables mode) adds the following iptables rules on every worker node.

# The following 8 lines are common to all Services
:KUBE-SERVICES - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-POSTROUTING - [0:0]

-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES

-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE

-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000

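# The rules below are added for each Service that is created
# If the Service IP is accessed from a worker node (source IP outside the pod network 10.1.0.0/16), the packet must be marked 0x4000/0x4000 so that SNAT is applied at POSTROUTING.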
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-SVC-IOIC7CRUMQYLZ32S
# These two rules do the load balancing
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LOGU6L2JYVKGEHXE
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -j KUBE-SEP-ZSSX2RT66T6MAGNI
# If the pod accessing the Service is itself the pod that ends up serving the request, the packet must also be marked 0x4000/0x4000 so that SNAT is applied at POSTROUTING.
-A KUBE-SEP-LOGU6L2JYVKGEHXE -s 10.1.139.84/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-LOGU6L2JYVKGEHXE -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.84:80
# Same for the second endpoint: a pod that reaches itself through the Service must be marked 0x4000/0x4000 so that SNAT is applied at POSTROUTING.
-A KUBE-SEP-ZSSX2RT66T6MAGNI -s 10.1.139.93/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZSSX2RT66T6MAGNI -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.93:80
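
With two endpoints the first KUBE-SVC rule carries --probability 0.5 and the last rule is unconditional. A sketch of how this generalizes, assuming the usual scheme in which rule i out of n gets probability 1/(n-i) (the function names here are illustrative, not kube-proxy identifiers):

```python
from fractions import Fraction


def rule_probabilities(n):
    """Per-rule --probability values for n endpoints; the last one is 1 (unconditional)."""
    return [Fraction(1, n - i) for i in range(n)]


def effective_share(n):
    """Chance that a new connection ends up at each endpoint."""
    shares, remaining = [], Fraction(1)
    for p in rule_probabilities(n):
        shares.append(remaining * p)   # reached this rule AND it matched
        remaining *= 1 - p             # fell through to the next rule
    return shares


print([float(p) for p in rule_probabilities(2)])  # [0.5, 1.0]
print([str(s) for s in effective_share(3)])       # ['1/3', '1/3', '1/3']
```

Although the per-rule probabilities differ, every endpoint receives the same overall share of new connections.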

Accessing the Service IP

The Service IP can be accessed in the following three scenarios. Each is verified below, together with an analysis of the iptables rules involved.
a. From inside the client pod
b. From a worker node
c. From an nginx pod listening on port 80

a. Access from inside the client pod

  1. The packet is sent from inside the pod. eth0 is a veth device, so veth_xmit is called to transmit it.
  2. eth0 is paired with calie8f03783e20 on the host, so a packet sent from eth0 arrives at calie8f03783e20. There, netif_rx_internal hands it to the softirq handler on the worker, and the worker's kernel network stack takes over.
  3. At the PREROUTING hook the packet first passes through conntrack, which creates a connection-tracking entry that records the state of the flow. The entry can be viewed with the conntrack command:

[worker-2 ~]# conntrack -L | grep 10.1.236.141
tcp      6 102 TIME_WAIT src=10.1.236.141 dst=10.99.64.233 sport=44462 dport=9999 src=10.1.139.84 dst=10.1.236.141 sport=80 dport=44462 [ASSURED] mark=0 use=1

The packet then enters the nat table, where the following rules are evaluated in order, starting from the PREROUTING chain:

# Jump to the KUBE-SERVICES chain
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
# The packet comes from inside a pod, so its source IP is within 10.1.0.0/16 and this rule does not match
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-MARK-MASQ
# Destination IP 10.99.64.233, layer-4 protocol TCP, destination port 9999: this rule matches and jumps to KUBE-SVC-IOIC7CRUMQYLZ32S
-A KUBE-SERVICES -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-SVC-IOIC7CRUMQYLZ32S
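# The next two rules implement load balancing. There are two backend pods, so there are two rules;
# more pods would mean more rules. --probability 0.50000000000 means the rule is hit with a
# probability of 50%. The choice is purely random and is made once per new connection: the nat table
# only sees the first packet, and conntrack keeps the rest of that connection on the same pod.
# Separate connections from the same client are not pinned to one pod.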
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LOGU6L2JYVKGEHXE
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -j KUBE-SEP-ZSSX2RT66T6MAGNI
# Suppose the first rule is hit; the next two rules are then evaluated.
# The source IP is not 10.1.139.84/32, so this rule does not match
-A KUBE-SEP-LOGU6L2JYVKGEHXE -s 10.1.139.84/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
# The last rule matches; its target is DNAT to 10.1.139.84:80
-A KUBE-SEP-LOGU6L2JYVKGEHXE -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.84:80

-A KUBE-SEP-ZSSX2RT66T6MAGNI -s 10.1.139.93/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZSSX2RT66T6MAGNI -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.93:80

The POSTROUTING chain also has rules, but none of them match,
so the net result of the nat table lookup is a DNAT.

  4. With the destination IP rewritten to 10.1.139.84, the route lookup finds the following entry:

10.1.139.64/26 via 192.168.2.2 dev tunl0
  5. At tunl0 the packet is encapsulated as IPIP with outer destination IP 192.168.2.2. The routing table is consulted again and the packet finally leaves through em1:

192.168.2.0/24 dev em1
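  6. On worker1, em1 receives the packet and hands it to the ipip module, which strips the outer IP header at tunl0. The route lookup then shows that the packet goes to calif67c1668c34:

10.1.139.84 dev calif67c1668c34
  7. calif67c1668c34 calls veth_xmit to deliver the packet to eth0 of the nginx pod, where the pod's own network stack processes it and finally hands it to the nginx server.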
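
The nat-table walk from step 3 can be condensed into a short Python model. This is only a sketch under stated assumptions, not kube-proxy code: the function name nat_walk is made up, and the pick argument stands in for the random statistic match. The addresses come from the rules listed earlier.

```python
import ipaddress

POD_CIDR = ipaddress.ip_network("10.1.0.0/16")
CLUSTER_IP, SERVICE_PORT = "10.99.64.233", 9999
# Index 0 is KUBE-SEP-LOGU6L2JYVKGEHXE, index 1 is KUBE-SEP-ZSSX2RT66T6MAGNI.
ENDPOINTS = ["10.1.139.84", "10.1.139.93"]


def nat_walk(src, dst, dport, pick):
    """Model KUBE-SERVICES -> KUBE-SVC -> KUBE-SEP for the first packet of a TCP connection.

    pick is the index of the endpoint the random rule happens to select.
    Returns (masquerade_mark, new_dst, new_dport).
    """
    if dst != CLUSTER_IP or dport != SERVICE_PORT:
        return False, dst, dport                      # no Service rule matches
    # "! -s 10.1.0.0/16 ... -j KUBE-MARK-MASQ"
    mark = ipaddress.ip_address(src) not in POD_CIDR
    endpoint = ENDPOINTS[pick]
    # "-s <endpoint>/32 ... -j KUBE-MARK-MASQ" (a pod reaching itself)
    if src == endpoint:
        mark = True
    # "-j DNAT --to-destination <endpoint>:80"
    return mark, endpoint, 80


# Client pod, first endpoint selected: DNAT only, no mark.
print(nat_walk("10.1.236.141", "10.99.64.233", 9999, 0))
# (False, '10.1.139.84', 80)
```

Calling the same function with a source outside 10.1.0.0/16, or with a source equal to the selected endpoint, returns True for the mark, which is what later triggers MASQUERADE at POSTROUTING.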

b. Access from a worker node

  1. When the Service IP is accessed from the worker node itself, the routing table is consulted first and the default route matches:

default via 192.168.2.254 dev em1
  2. At the OUTPUT hook the packet first passes through conntrack, which creates a connection-tracking entry that records the state of the flow. The entry can be viewed with the conntrack command:

[worker-2 ~]# curl 10.99.64.233:9999
[worker-2 ~]# conntrack -L | grep 10.99.64.233
tcp      6 118 TIME_WAIT src=192.168.2.3 dst=10.99.64.233 sport=36064 dport=9999 src=10.1.139.93 dst=10.1.236.128 sport=80 dport=36064 [ASSURED] mark=0 use=1

The packet then enters the nat table, where the following rules are evaluated in order:

# Jump to the KUBE-SERVICES chain
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
# The packet originates on the worker node, so its source IP is outside 10.1.0.0/16; this rule matches and jumps to KUBE-MARK-MASQ
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-MARK-MASQ
# This chain marks the packet
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
# Matching continues in KUBE-SERVICES, and this rule matches
-A KUBE-SERVICES -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-SVC-IOIC7CRUMQYLZ32S
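# The next two rules implement load balancing. There are two backend pods, so there are two rules;
# more pods would mean more rules. --probability 0.50000000000 means the rule is hit with a
# probability of 50%. The choice is purely random and is made once per new connection: the nat table
# only sees the first packet, and conntrack keeps the rest of that connection on the same pod.
# Separate connections from the same client are not pinned to one pod.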
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LOGU6L2JYVKGEHXE
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -j KUBE-SEP-ZSSX2RT66T6MAGNI
# Suppose the first rule is hit; the next two rules are then evaluated.
# The source IP is not 10.1.139.84/32, so this rule does not match
-A KUBE-SEP-LOGU6L2JYVKGEHXE -s 10.1.139.84/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
# The last rule matches; its target is DNAT to 10.1.139.84:80
-A KUBE-SEP-LOGU6L2JYVKGEHXE -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.84:80

-A KUBE-SEP-ZSSX2RT66T6MAGNI -s 10.1.139.93/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZSSX2RT66T6MAGNI -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.93:80

In the OUTPUT chain the DNAT rule matches, so the destination IP/port of the packet is rewritten to 10.1.139.84:80 or 10.1.139.93:80, and the packet now carries the mark 0x4000/0x4000.

  3. Because the destination IP has changed, the route is looked up again. As shown below, the next hop is 192.168.2.2 and the packet goes out through tunl0:

10.1.139.64/26 via 192.168.2.2 dev tunl0
  4. The POSTROUTING chain evaluates the following rules. Because the packet was marked earlier, they match:

-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
# The target is MASQUERADE, which means SNAT is required
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE

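When MASQUERADE performs the SNAT, the source address it selects is the preferred source of the outgoing route, which can be obtained with the command below. In this example the source IP is 10.1.236.128.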

[root@pccc-203-10-worker-2 ~]# ip route get 10.1.139.84
10.1.139.84 via 192.168.2.2 dev tunl0 src 10.1.236.128
    cache
  5. The packet is then handed to tunl0, which adds the outer IP header with destination 192.168.2.2. The routing table is consulted once more, and the packet has to leave through em1:

192.168.2.0/24 dev em1
  6. After the packet reaches worker1, it is processed exactly as in scenario a.
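
The repeated route lookups in these steps follow longest-prefix matching. A minimal sketch, assuming a table that contains only the three worker-2 routes quoted in this article (a real node has more entries, and the name lookup is illustrative):

```python
import ipaddress

# (prefix, gateway, device), taken from the route lines quoted in this article.
ROUTES = [
    ("0.0.0.0/0", "192.168.2.254", "em1"),       # default via 192.168.2.254 dev em1
    ("192.168.2.0/24", None, "em1"),             # directly connected
    ("10.1.139.64/26", "192.168.2.2", "tunl0"),  # pod block hosted on worker1
]


def lookup(dst):
    """Return the most specific route whose prefix contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in ipaddress.ip_network(r[0])]
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


print(lookup("10.99.64.233"))  # before DNAT: ('0.0.0.0/0', '192.168.2.254', 'em1')
print(lookup("10.1.139.84"))   # after DNAT:  ('10.1.139.64/26', '192.168.2.2', 'tunl0')
print(lookup("192.168.2.2"))   # outer IPIP header: ('192.168.2.0/24', None, 'em1')
```

The three calls mirror the three lookups in scenario b: the Service IP only matches the default route, the DNATed pod IP matches the tunl0 route, and the outer IPIP destination is directly reachable on em1.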

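c. Access from an nginx pod
There are two sub-cases here: the endpoint chosen by load balancing is either the pod that initiated the request or a different pod. For example, when nginx1 accesses the nginx Service from inside its pod, the chosen IP is either that of nginx1 itself or that of nginx2.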

Different pod
nginx1 accesses the nginx Service from inside its pod, and load balancing selects the IP of the nginx2 pod.


Suppose the Service is accessed from inside the nginx1 pod. The first part is the same as in scenario a: DNAT is performed in the nat table at PREROUTING.

# Jump to the KUBE-SERVICES chain
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
# The packet comes from inside a pod, so its source IP is within 10.1.0.0/16 and this rule does not match
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-MARK-MASQ
# Destination IP 10.99.64.233, layer-4 protocol TCP, destination port 9999: this rule matches and jumps to KUBE-SVC-IOIC7CRUMQYLZ32S
-A KUBE-SERVICES -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-SVC-IOIC7CRUMQYLZ32S
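# The next two rules implement load balancing. There are two backend pods, so there are two rules;
# more pods would mean more rules. --probability 0.50000000000 means the rule is hit with a
# probability of 50%. The choice is purely random and is made once per new connection: the nat table
# only sees the first packet, and conntrack keeps the rest of that connection on the same pod.
# Separate connections from the same client are not pinned to one pod.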
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LOGU6L2JYVKGEHXE
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -j KUBE-SEP-ZSSX2RT66T6MAGNI
# Suppose the second rule is hit; the next two rules are then skipped.
-A KUBE-SEP-LOGU6L2JYVKGEHXE -s 10.1.139.84/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-LOGU6L2JYVKGEHXE -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.84:80
# Load balancing selects the following two rules
# The source IP is 10.1.139.84, so this rule does not match
-A KUBE-SEP-ZSSX2RT66T6MAGNI -s 10.1.139.93/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
# This rule matches and performs the DNAT
-A KUBE-SEP-ZSSX2RT66T6MAGNI -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.93:80

With the destination IP rewritten to 10.1.139.93, the route lookup shows that the packet only needs to be delivered to calic7bfae5c264 on the same worker.

10.1.139.93 dev calic7bfae5c264

The connection-tracking entry for this scenario:

[worker-1 ~]# conntrack -L | grep 10.99.64.233
tcp      6 17 TIME_WAIT src=10.1.139.84 dst=10.99.64.233 sport=43328 dport=9999 src=10.1.139.93 dst=10.1.139.84 sport=80 dport=43328 [ASSURED] mark=0 use=1

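Same pod
nginx1 accesses the nginx Service from inside its pod, and load balancing selects the IP of nginx1 itself. Unlike the previous case, this requires SNAT in addition to DNAT. Without SNAT the pod would see a request from its own IP and answer itself directly, so the reply would never be translated back to the Service IP and the client side of the connection would not recognize it.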


Suppose the Service is accessed from inside the nginx1 pod. The first part is the same as in scenario a: DNAT is performed in the nat table at PREROUTING.

# Jump to the KUBE-SERVICES chain
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
# The packet comes from inside a pod, so its source IP is within 10.1.0.0/16 and this rule does not match
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-MARK-MASQ
# Destination IP 10.99.64.233, layer-4 protocol TCP, destination port 9999: this rule matches and jumps to KUBE-SVC-IOIC7CRUMQYLZ32S
-A KUBE-SERVICES -d 10.99.64.233/32 -p tcp -m comment --comment "default/test: cluster IP" -m tcp --dport 9999 -j KUBE-SVC-IOIC7CRUMQYLZ32S
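# The next two rules implement load balancing. There are two backend pods, so there are two rules;
# more pods would mean more rules. --probability 0.50000000000 means the rule is hit with a
# probability of 50%. The choice is purely random and is made once per new connection: the nat table
# only sees the first packet, and conntrack keeps the rest of that connection on the same pod.
# Separate connections from the same client are not pinned to one pod.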
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LOGU6L2JYVKGEHXE
-A KUBE-SVC-IOIC7CRUMQYLZ32S -m comment --comment "default/test:" -j KUBE-SEP-ZSSX2RT66T6MAGNI
# Suppose the first rule is hit; the next two rules are then evaluated.
# The source IP is 10.1.139.84, so this rule matches and jumps to KUBE-MARK-MASQ
-A KUBE-SEP-LOGU6L2JYVKGEHXE -s 10.1.139.84/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
# KUBE-MARK-MASQ marks the packet with 0x4000/0x4000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
# This rule matches and performs the DNAT
-A KUBE-SEP-LOGU6L2JYVKGEHXE -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.84:80
# The next two rules are not evaluated
-A KUBE-SEP-ZSSX2RT66T6MAGNI -s 10.1.139.93/32 -m comment --comment "default/test:" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZSSX2RT66T6MAGNI -p tcp -m comment --comment "default/test:" -m tcp -j DNAT --to-destination 10.1.139.93:80

With the destination IP rewritten to 10.1.139.84, the route lookup shows that the packet only needs to be delivered to calif67c1668c34 on the same worker.

10.1.139.84 dev calif67c1668c34

At POSTROUTING, however, the following two rules are also evaluated. Because the packet already carries the mark 0x4000/0x4000, MASQUERADE (that is, SNAT) is performed here.

-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
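
The 0x4000/0x4000 notation in these rules is value/mask. A small sketch of the bit arithmetic, following the semantics described in the iptables-extensions manual (set-xmark clears the mask bits and XORs in the value; the mark match ANDs with the mask before comparing). The Python function names are illustrative only.

```python
MASQ_VALUE, MASQ_MASK = 0x4000, 0x4000


def set_xmark(mark, value, mask):
    """MARK --set-xmark value/mask: zero the bits in mask, then XOR value in."""
    return (mark & ~mask & 0xFFFFFFFF) ^ value


def mark_matches(mark, value, mask):
    """-m mark --mark value/mask: compare only the bits selected by mask."""
    return (mark & mask) == value


mark = 0x1                                             # some unrelated bit already set
print(mark_matches(mark, MASQ_VALUE, MASQ_MASK))       # False: not yet marked
mark = set_xmark(mark, MASQ_VALUE, MASQ_MASK)          # what KUBE-MARK-MASQ does
print(hex(mark))                                       # 0x4001: other bits untouched
print(mark_matches(mark, MASQ_VALUE, MASQ_MASK))       # True: KUBE-POSTROUTING matches
```

Because only bit 0x4000 is touched and tested, other users of the packet mark on the same node are not disturbed.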

The source IP after SNAT can be obtained with the following command:

[root@pccc-203-10-worker-1 ~]# ip route get 10.1.139.84
10.1.139.84 dev calif67c1668c34 src 192.168.2.2

Finally, after both DNAT and SNAT, the packet is delivered to calif67c1668c34 on the same worker node.

The connection-tracking entry for this scenario:

[worker-1 ~]# conntrack -L | grep 10.99.64.233
tcp      6 117 TIME_WAIT src=10.1.139.84 dst=10.99.64.233 sport=48544 dport=9999 src=10.1.139.84 dst=192.168.2.2 sport=80 dport=16495 [ASSURED] mark=0 use=1
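
Each conntrack line holds two tuples: the original direction first, then the expected reply. Comparing them shows which translations were applied. The helper below is a rough sketch that assumes the field order seen in the output above; parse_conntrack and nat_applied are made-up names, not part of conntrack-tools.

```python
import re


def parse_conntrack(line):
    """Split a `conntrack -L` line into its original and reply tuples."""
    fields = re.findall(r"(src|dst|sport|dport)=(\S+)", line)
    return dict(fields[:4]), dict(fields[4:8])


def nat_applied(line):
    """Infer DNAT/SNAT by comparing the original tuple with the reply tuple."""
    original, reply = parse_conntrack(line)
    return {
        "dnat": original["dst"] != reply["src"],  # destination was rewritten
        "snat": original["src"] != reply["dst"],  # source was rewritten
    }


same_pod = ("tcp 6 117 TIME_WAIT src=10.1.139.84 dst=10.99.64.233 sport=48544 dport=9999 "
            "src=10.1.139.84 dst=192.168.2.2 sport=80 dport=16495 [ASSURED] mark=0 use=1")
print(nat_applied(same_pod))  # {'dnat': True, 'snat': True}
```

Applied to the entries from the other scenarios, the same check reports DNAT only for the client pod and the different-pod case, and DNAT plus SNAT for access from the worker node.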

See also: "k8s 之 service ip" on Jianshu (jianshu.com)

Original article: https://blog.csdn.net/fengcai_ke/article/details/125717038
