Synchronization issues in global translation table #11

Open
T-X opened this issue Feb 19, 2018 · 0 comments

Scenario:

batman-adv-legacy-tt-issue.png

(+300 other nodes connected to the mesh-vpn)

Description:

An L3 ping from a host behind nml-wdr4300 to de:ad:ca:fe:46:1d (10.130.0.254, muehlentor) fails.

13:30:48.216945 0e:85:8c:0f:63:fe > de:ad:ca:fe:46:1d, ethertype IPv4 (0x0800), length 98: 10.130.11.93 > 10.130.0.254: ICMP echo request, id 12263, seq 521, length 64

The ICMP echo request never reached muehlentor.

The frame is sent to the wrong originator:

Expected originator: 26:9c:57:9b:5c:b2 (muehlentor)
Got: d6:89:49:08:f6:9d (holstentor)

The issue might have started after a reboot of muehlentor.

A reboot of nml-wdr4300 fixed the issue for now.
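
The expected/got comparison above can be checked mechanically. The following is a minimal sketch, assuming the legacy `batctl tg` line format exactly as it appears in the console output of this report, and assuming that the `*` marker denotes the entry batman-adv currently selects while `+` marks an alternative. The function name `selected_originator` is hypothetical and not part of batctl.

```python
import re

# Matches legacy `batctl tg` lines as they appear in this report, e.g.
# " * de:ad:ca:fe:46:1d  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]"
MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
TG_LINE = re.compile(rf"^\s*([*+])\s+({MAC})\s+\(\s*\d+\)\s+via\s+({MAC})")


def selected_originator(tg_output, client):
    """Return the originator of the entry marked '*' for `client`, or None."""
    for line in tg_output.splitlines():
        m = TG_LINE.match(line)
        if m and m.group(1) == "*" and m.group(2) == client:
            return m.group(3)
    return None


# Lines copied from the nml-wdr4300 output captured during the issue.
SAMPLE = """\
 * de:ad:ca:fe:46:1d  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 + de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   [...]
"""

EXPECTED = "26:9c:57:9b:5c:b2"  # muehlentor
got = selected_originator(SAMPLE, "de:ad:ca:fe:46:1d")
print("expected", EXPECTED, "got", got, "OK" if got == EXPECTED else "MISMATCH")
```

On a live node the same function could be fed the text of `batctl tg` instead of `SAMPLE`. Newer batctl versions print a different column layout, so the regular expression would need adjusting there.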


Console Output:

Output before the reboot of nml-wdr4300, while the issue was present:


root@nml-wdr4300:~# batctl tg | grep de:ad:ca:fe:46:1d
 * de:ad:ca:fe:46:1d  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 + de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   [...]
root@nml-wdr4300:~# batctl tg | grep 26:9c:57:9b:5c:b2
 + de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   [...]
 * 26:a8:54:c9:1d:a1  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
root@nml-wdr4300:~# batctl tg | grep d6:89:49:08:f6:9d
 * 16:d0:f3:0e:72:a5  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * fe:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * 52:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * de:ad:ca:fe:46:1d  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
root@nml-wdr4300:~# batctl o | grep 26:9c:57:9b:5c:b2
26:9c:57:9b:5c:b2    0.140s   (198) 00:0d:b9:20:8f:05 [    br-wan]: 00:0d:b9:20:8f:05 (198)
root@nml-wdr4300:~# batctl o | grep d6:89:49:08:f6:9d
d6:89:49:08:f6:9d    0.360s   (224) 00:0d:b9:20:8f:05 [    br-wan]: 00:0d:b9:20:8f:05 (224)

root@nm-alix:~# batctl tg | grep de:ad:ca:fe:46:1d
 * de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
root@nm-alix:~# batctl tg | grep 26:9c:57:9b:5c:b2
 * de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
 * 26:a8:54:c9:1d:a1  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
root@nm-alix:~# batctl tg | grep d6:89:49:08:f6:9d
 * 16:d0:f3:0e:72:a5  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * fe:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * 52:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
root@nm-alix:~# batctl o | grep 26:9c:57:9b:5c:b2
26:9c:57:9b:5c:b2    0.784s   (225) ce:69:95:f0:a9:53 [ffhl-mesh-vpn]: ce:69:95:f0:a9:53 (225)
root@nm-alix:~# batctl o | grep d6:89:49:08:f6:9d
d6:89:49:08:f6:9d    0.744s   (255) ce:69:95:f0:a9:53 [ffhl-mesh-vpn]: ce:69:95:f0:a9:53 (255)

tux@holstentor ~ % ip -oneline link | grep de:ad:ca:fe:46:1d
tux@holstentor ~ % ip -oneline link | grep 26:9c:57:9b:5c:b2
tux@holstentor ~ % ip -oneline link | grep d6:89:49:08:f6:9d
17: ffhl-mesh-vpn:  mtu 1426 qdisc fq_codel master mesh-hl state UNKNOWN mode DEFAULT group default qlen 1000\    link/ether d6:89:49:08:f6:9d brd ff:ff:ff:ff:ff:ff
tux@holstentor ~ % sudo batctl -m mesh-hl tl
[B.A.T.M.A.N. adv 2013.4.0, MainIF/MAC: ffhl-mesh-vpn/d6:89:49:08:f6:9d (mesh-hl/16:d0:f3:0e:72:a5 BATMAN_IV), TTVN: 3]
Client             VID Flags    Last seen (CRC       )
16:d0:f3:0e:72:a5   -1 [.P....]   0.000   (0x0000146d)
fe:54:00:0c:bb:eb   -1 [......]   0.610   (0x0000146d)
52:54:00:0c:bb:eb   -1 [......]   0.000   (0x0000146d)
tux@holstentor ~ % sudo batctl -m mesh-hl tg | grep de:ad:ca:fe:46:1d
 * de:ad:ca:fe:46:1d   -1 [....] (  2) 26:9c:57:9b:5c:b2 (  2) (0x00001f7d)
tux@holstentor ~ % sudo batctl -m mesh-hl tg | grep 26:9c:57:9b:5c:b2
 * de:ad:ca:fe:46:1d   -1 [....] (  2) 26:9c:57:9b:5c:b2 (  2) (0x00001f7d)
 * 26:a8:54:c9:1d:a1   -1 [....] (  2) 26:9c:57:9b:5c:b2 (  2) (0x00001f7d)
tux@holstentor ~ % sudo batctl -m mesh-hl tg | grep d6:89:49:08:f6:9d
[B.A.T.M.A.N. adv 2013.4.0, MainIF/MAC: ffhl-mesh-vpn/d6:89:49:08:f6:9d (mesh-hl/16:d0:f3:0e:72:a5 BATMAN_IV)]

root@muehlentor ~ # ip -oneline link | grep de:ad:ca:fe:46:1d
root@muehlentor ~ # ip -oneline link | grep 26:9c:57:9b:5c:b2
9: ffhl-mesh-vpn:  mtu 1426 qdisc fq_codel master mesh-hl state UNKNOWN mode DEFAULT group default qlen 1000\    link/ether 26:9c:57:9b:5c:b2 brd ff:ff:ff:ff:ff:ff
root@muehlentor ~ # ip -oneline link | grep d6:89:49:08:f6:9d
4: freifunk-hl:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000\    link/ether de:ad:ca:fe:46:1d brd ff:ff:ff:ff:ff:ff
root@muehlentor ~ # batctl -m mesh-hl tl
[B.A.T.M.A.N. adv 2013.4.0, MainIF/MAC: ffhl-mesh-vpn/26:9c:57:9b:5c:b2 (mesh-hl/26:a8:54:c9:1d:a1 BATMAN_IV), TTVN: 2]
Client             VID Flags    Last seen (CRC       )
de:ad:ca:fe:46:1d   -1 [......]   0.010   (0x00001f7d)
26:a8:54:c9:1d:a1   -1 [.P....]   0.000   (0x00001f7d)
root@muehlentor ~ # batctl -m mesh-hl tg | grep de:ad:ca:fe:46:1d
root@muehlentor ~ # batctl -m mesh-hl tg | grep 26:9c:57:9b:5c:b2
[B.A.T.M.A.N. adv 2013.4.0, MainIF/MAC: ffhl-mesh-vpn/26:9c:57:9b:5c:b2 (mesh-hl/26:a8:54:c9:1d:a1 BATMAN_IV)]
root@muehlentor ~ # batctl -m mesh-hl tg | grep d6:89:49:08:f6:9d
 * 16:d0:f3:0e:72:a5   -1 [....] (  3) d6:89:49:08:f6:9d (  3) (0x0000146d)
 * fe:54:00:0c:bb:eb   -1 [....] (  3) d6:89:49:08:f6:9d (  3) (0x0000146d)
 * 52:54:00:0c:bb:eb   -1 [....] (  3) d6:89:49:08:f6:9d (  3) (0x0000146d)

Output after the reboot of nml-wdr4300, with the issue no longer present:


root@nml-wdr4300:~# batctl tg | grep de:ad:ca:fe:46:1d
 * de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
root@nml-wdr4300:~# batctl tg | grep 26:9c:57:9b:5c:b2
 * de:ad:ca:fe:46:1d  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
 * 26:a8:54:c9:1d:a1  (  2) via 26:9c:57:9b:5c:b2     (  2)   (0x1f7d) [...]
root@nml-wdr4300:~# batctl tg | grep d6:89:49:08:f6:9d
 * 16:d0:f3:0e:72:a5  (  3) via d6:89:49:08:f6:9d     (  5)   (0x146d) [...]
 * fe:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  5)   (0x146d) [...]
 * 52:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  5)   (0x146d) [...]

Observations from console output:

  • TT global on nml-wdr4300 does not match TT local on holstentor.
  • New OGMs did not resolve the issue.
  • It is unclear how the entry "de:ad:ca:fe:46:1d via holstentor" ended up in the global TT of nml-wdr4300, as holstentor does not have this address anywhere.
  • On nml-wdr4300, batctl apparently does not display the CRC of the correct entry for de:ad:ca:fe:46:1d via muehlentor.
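
The first observation (TT global on one node not matching TT local on the announcing node) can be reproduced from the captured output. This is a rough sketch, assuming the two line formats seen in this report: the `via` style of `batctl tg` on nml-wdr4300 and the `Client VID Flags` style of `batctl tl` on holstentor. The helper names `global_clients_via` and `local_clients` are hypothetical.

```python
import re

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
# " * <client>  (  3) via <originator>  ..." as printed on nml-wdr4300
TG_LINE = re.compile(rf"^\s*[*+]\s+({MAC})\s+\(\s*\d+\)\s+via\s+({MAC})")
# "<client>   -1 [.P....] ..." as printed by `batctl tl` on holstentor
TL_LINE = re.compile(rf"^\s*({MAC})\s+-?\d+\s+\[")


def global_clients_via(tg_output, originator):
    """Clients that a global TT dump attributes to `originator`."""
    clients = set()
    for line in tg_output.splitlines():
        m = TG_LINE.match(line)
        if m and m.group(2) == originator:
            clients.add(m.group(1))
    return clients


def local_clients(tl_output):
    """Clients listed in a local TT dump; header lines do not match."""
    return {m.group(1) for m in map(TL_LINE.match, tl_output.splitlines()) if m}


HOLSTENTOR = "d6:89:49:08:f6:9d"

# Global TT on nml-wdr4300 during the issue (copied from this report).
TG_SAMPLE = """\
 * 16:d0:f3:0e:72:a5  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * fe:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * 52:54:00:0c:bb:eb  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
 * de:ad:ca:fe:46:1d  (  3) via d6:89:49:08:f6:9d     (  3)   (0x146d) [...]
"""

# Local TT on holstentor (copied from this report).
TL_SAMPLE = """\
Client             VID Flags    Last seen (CRC       )
16:d0:f3:0e:72:a5   -1 [.P....]   0.000   (0x0000146d)
fe:54:00:0c:bb:eb   -1 [......]   0.610   (0x0000146d)
52:54:00:0c:bb:eb   -1 [......]   0.000   (0x0000146d)
"""

# Entries the remote node attributes to holstentor that holstentor does not announce.
stale = sorted(global_clients_via(TG_SAMPLE, HOLSTENTOR) - local_clients(TL_SAMPLE))
print(stale)
```

With the samples above, the only address left over is de:ad:ca:fe:46:1d, which is the entry that misdirects the ping.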

General Notes:

  • If this were due to a CRC16 collision, the issue would be a lot less likely on a recent batman-adv, which uses CRC32.
  • There might already have been fixes for this in non-legacy batman-adv. (I remember some restructuring/fixing around list operations, for instance.)
  • I stumble over this about once a year, so it does not happen frequently and might therefore be difficult to reproduce. Migrating to a recent batman-adv is probably less effort than hunting this bug in batman-adv-legacy, and would probably fix the issue.
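
To put a rough number on the CRC16 versus CRC32 remark: the sketch below assumes, as an idealization, that a checksum of a given width behaves like a uniformly random value, so two different tables collide with probability 1 / 2^bits. Real CRCs over structured input such as MAC addresses need not follow this exactly, so treat the result as an order-of-magnitude estimate only.

```python
from fractions import Fraction


def collision_probability(bits):
    """Idealized chance that two different tables share a `bits`-wide checksum."""
    return Fraction(1, 2 ** bits)


crc16 = collision_probability(16)
crc32 = collision_probability(32)

print(f"CRC16: 1 in {crc16.denominator}")  # 1 in 65536
print(f"CRC32: 1 in {crc32.denominator}")  # 1 in 4294967296
print(f"CRC32 is {crc16 / crc32} times less likely to collide")  # 65536
```

Under this assumption a collision-induced mismatch would be about 65536 times rarer with CRC32, which fits the expectation that a recent batman-adv would make this failure mode much less likely if a collision is indeed the cause.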