IPv6 Neighbor Discovery Responder for KVM VPS

edited April 28 in Technical

This article was originally published on the yoursunny.com blog: https://yoursunny.com/t/2021/ndpresponder/

I Want IPv6 for Docker

I'm playing with Docker these days, and I want IPv6 in my Docker containers.
The best guide for enabling IPv6 in Docker is "How to enable IPv6 for Docker containers on Ubuntu 18.04".
The first method in that article assigns private IPv6 addresses to containers, and uses IPv6 NAT similar to how Docker handles IPv4 NAT.
I quickly got it working, but I noticed an undesirable behavior: Network Address Translation (NAT) changes the source port number of outgoing UDP datagrams, even if there's a port forwarding rule for inbound traffic.
Consequently, a UDP flow with the same source and destination ports is recognized as two separate flows.

$ docker exec nfd nfdc face show 262
    faceid=262
    remote=udp6://[2001:db8:f440:2:eb26:f0a9:4dc3:1]:6363
     local=udp6://[fd00:2001:db8:4d55:0:242:ac11:4]:6363
congestion={base-marking-interval=100ms default-threshold=65536B}
       mtu=1337
  counters={in={25i 4603d 2n 1179907B} out={11921i 14d 0n 1506905B}}
     flags={non-local permanent point-to-point congestion-marking}
$ docker exec nfd nfdc face show 270
    faceid=270
    remote=udp6://[2001:db8:f440:2:eb26:f0a9:4dc3:1]:1024
     local=udp6://[fd00:2001:db8:4d55:0:242:ac11:4]:6363
   expires=0s
congestion={base-marking-interval=100ms default-threshold=65536B}
       mtu=1337
  counters={in={11880i 0d 0n 1498032B} out={0i 4594d 0n 1175786B}}
     flags={non-local on-demand point-to-point congestion-marking}

The second method in that article allows every container to have a public IPv6 address.
It avoids NAT and the problems that come with it, but requires the host to have a routed IPv6 subnet.
However, routed IPv6 is hard to come by on KVM servers, because virtualization platforms such as Virtualizor do not support routed IPv6 subnets and can only provide on-link IPv6.

On-Link IPv6 vs Routed IPv6

So what's the difference between on-link IPv6 and routed IPv6, anyway?
They differ in how the router at the previous hop is configured to reach a destination IP address.

Let me explain in IPv4 terms first:

|--------| 192.0.2.1/24       |--------| 198.51.100.1/24    |-----------|
| router |--------------------| server |--------------------| container |
|--------|       192.0.2.2/24 |--------|    198.51.100.2/24 |-----------|
            (192.0.2.16-23/24)    |
                                  | 192.0.2.17/28           |-----------|
                                  \-------------------------| container |
                                              192.0.2.18/28 |-----------|
  • The server has on-link IP address 192.0.2.2.

    • The router knows this IP address is on-link because it is in the 192.0.2.0/24 subnet that is configured on the router interface.
    • To deliver a packet to 192.0.2.2, the router sends an ARP query for 192.0.2.2 to learn the server's MAC address, and the server should answer it.
  • The server has routed IP subnet 198.51.100.0/24.

    • The router must be configured to know: 198.51.100.0/24 is reachable via 192.0.2.2.
    • To deliver a packet to 198.51.100.2, the router first queries its routing table and finds the above entry, then sends an ARP query to learn the MAC address of 192.0.2.2, which the server should answer, and finally delivers the packet to the learned MAC address.
  • The main difference is what IP address is enclosed in the ARP query:

    • If the destination IP address is an on-link IP address, the ARP query contains the destination IP address itself.
    • If the destination IP address is in a routed subnet, the ARP query contains the nexthop IP address, as determined by the routing table.
  • If I want to assign an on-link IPv4 address (e.g. 192.0.2.18/28) to a container, the server should be made to answer ARP queries for that IP address, so that the router delivers packets to the server, which then forwards these packets to the container.

    • This technique is called ARP proxy, in which the server responds to ARP queries on behalf of the container.
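
The difference in ARP targets can be illustrated with a short Python sketch. It is only a model of the router's decision, not real router code; the function name arp_target and the two data structures are made up for this illustration, while the addresses come from the diagram above.

```python
import ipaddress

def arp_target(dst, connected, routes):
    """Return the IP address the router would put in its ARP query for dst."""
    dst = ipaddress.ip_address(dst)
    # On-link: dst falls inside a subnet configured on the router interface.
    if any(dst in ipaddress.ip_network(net) for net in connected):
        return dst
    # Routed: the longest matching prefix in the routing table gives the nexthop.
    matches = [(ipaddress.ip_network(net), hop) for net, hop in routes.items()
               if dst in ipaddress.ip_network(net)]
    if not matches:
        raise LookupError(f"no route to {dst}")
    _, nexthop = max(matches, key=lambda m: m[0].prefixlen)
    return ipaddress.ip_address(nexthop)

connected = ["192.0.2.0/24"]               # subnet on the router interface
routes = {"198.51.100.0/24": "192.0.2.2"}  # routed subnet via the server

print(arp_target("192.0.2.18", connected, routes))    # on-link container address
print(arp_target("198.51.100.2", connected, routes))  # address in the routed subnet
```

For the on-link container address the ARP query asks for 192.0.2.18 itself, which is why the server must answer on the container's behalf; for the routed address the query asks for the nexthop 192.0.2.2.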

The situation is a bit more complex in IPv6 because each network interface can have multiple IPv6 addresses, but the same concept applies.
Instead of the Address Resolution Protocol (ARP), IPv6 uses the Neighbor Discovery Protocol (NDP), which is part of ICMPv6.
Some of the terminology differs:

IPv4         IPv6
ARP          Neighbor Discovery Protocol (NDP)
ARP query    ICMPv6 Neighbor Solicitation
ARP reply    ICMPv6 Neighbor Advertisement
ARP proxy    NDP proxy

If I want to assign an on-link IPv6 address to a container, the server should respond to neighbor solicitations for that IP address, so that the router would deliver packets to the server.
After that, the server's Linux kernel could route the packet to the container's bridge, as if the destination IPv6 address were in a routed subnet.
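
Unlike a broadcast ARP query, a neighbor solicitation is normally sent to the solicited-node multicast address of the target, which is formed from the prefix ff02::1:ff00:0/104 and the low 24 bits of the target address. A minimal Python sketch of that mapping follows; the function name is made up for this illustration.

```python
import ipaddress

def solicited_node_multicast(target):
    """Map a unicast IPv6 address to its solicited-node multicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

# Container address used later in this article.
print(solicited_node_multicast("2001:db8:fbc0:2:646f:636b:6572:d002"))
# ff02::1:ff72:d002
```

A responder therefore has to receive packets sent to this multicast address, not just packets addressed to the server itself.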

NDP Proxy Daemon to the Rescue, I Hope?

ndppd, or NDP Proxy Daemon, is a program that listens for neighbor solicitations on a network interface and responds with neighbor advertisements.
It is often recommended for dealing with the scenario when the server has only on-link IPv6 but we need a routed IPv6 subnet.

I installed ndppd on one of my servers, and it worked as expected with this configuration:

proxy uplink {
  rule 2001:db8:fbc0:2:646f:636b:6572::/112 {
    auto
  }
}

I can start up a Docker container with a public IPv6 address.
It can reach the IPv6 Internet, and can be pinged from outside.

$ docker network create --ipv6 --subnet=172.26.0.0/16 \
  --subnet=2001:db8:fbc0:2:646f:636b:6572::/112 ipv6exposed
118c3a9e00595262e41b8cb839a55d1bc7bc54979a1ff76b5993273d82eea1f4

$ docker run -it --rm --network ipv6exposed \
  --ip6 2001:db8:fbc0:2:646f:636b:6572:d002 alpine

# wget -q -O- https://www.cloudflare.com/cdn-cgi/trace | grep ip
ip=2001:db8:fbc0:2:646f:636b:6572:d002

However, when I repeated the same setup on another KVM server, things didn't go well: the container could not reach the IPv6 Internet at all.

$ docker run -it --rm --network ipv6exposed \
  --ip6 2001:db8:f440:2:646f:636b:6572:d003 alpine

/ # ping -c 4 ipv6.google.com
PING ipv6.google.com (2607:f8b0:400a:809::200e): 56 data bytes

--- ipv6.google.com ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

What's Wrong with ndppd?

Why does ndppd work on the first server, but not on the second server?
What's the difference?
We need to go deeper, so I turned to tcpdump.

On the first server, I see:

$ sudo tcpdump -pi uplink icmp6
19:13:17.958191 IP6 2001:db8:fbc0::1 > ff02::1:ff72:d002:
    ICMP6, neighbor solicitation, who has 2001:db8:fbc0:2:646f:636b:6572:d002, length 32
19:13:17.958472 IP6 2001:db8:fbc0:2::2 > 2001:db8:fbc0::1:
    ICMP6, neighbor advertisement, tgt is 2001:db8:fbc0:2:646f:636b:6572:d002, length 32
  • The neighbor solicitation from the router comes from a global IPv6 address.
  • The server responds with a neighbor advertisement from its global IPv6 address.
    Note that this address differs from the container's address.

  • IPv6 works in the container.

On the second server, I see:

$ sudo tcpdump -pi uplink icmp6
00:07:53.617438 IP6 fe80::669d:99ff:feb1:55b8 > ff02::1:ff72:d003:
    ICMP6, neighbor solicitation, who has 2001:db8:f440:2:646f:636b:6572:d003, length 32
00:07:53.617714 IP6 fe80::216:3eff:fedd:7c83 > fe80::669d:99ff:feb1:55b8:
    ICMP6, neighbor advertisement, tgt is 2001:db8:f440:2:646f:636b:6572:d003, length 32
  • The neighbor solicitation from the router comes from a link-local IPv6 address.
  • The server responds with a neighbor advertisement from its link-local IPv6 address.
  • IPv6 does not work in the container.

Since IPv6 has been working on the second server for IPv6 addresses assigned to the server itself, I added a new IPv6 address and captured its NDP exchange:

$ sudo tcpdump -pi uplink icmp6
00:29:39.378544 IP6 fe80::669d:99ff:feb1:55b8 > ff02::1:ff00:a006:
    ICMP6, neighbor solicitation, who has 2001:db8:f440:2::a006, length 32
00:29:39.378581 IP6 2001:db8:f440:2::a006 > fe80::669d:99ff:feb1:55b8:
    ICMP6, neighbor advertisement, tgt is 2001:db8:f440:2::a006, length 32
  • The neighbor solicitation from the router comes from a link-local IPv6 address, same as above.
  • The server responds with a neighbor advertisement from the target global IPv6 address.
  • IPv6 works on the server from this address.

In IPv6, each network interface can have multiple IPv6 addresses.
When the Linux kernel responds to a neighbor solicitation in which the target address is assigned to the same network interface, it uses that particular address as the source address.
On the other hand, ndppd transmits neighbor advertisements via a PF_INET6 socket and does not specify the source address.
In this case, some complicated rules for default address selection come into play.

One of these rules prefers a source address that has the same scope as the destination address (here, the router's address).
On my first server, the router uses a global address, and the server selects a global address as the source address on its neighbor advertisement.
On my second server, the router uses a link-local address, and the server selects a link-local address, too.
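
The effect of this rule can be sketched in Python. This is a deliberate simplification that only models scope matching and ignores every other rule of default address selection; the function names are made up, and the link-local address fe80::2 for the first server is a hypothetical placeholder, since its real link-local address does not appear in the captures.

```python
import ipaddress

def scope(addr):
    """Classify an IPv6 address as link-local or global (simplified)."""
    return "link-local" if ipaddress.IPv6Address(addr).is_link_local else "global"

def pick_source(candidates, destination):
    """Prefer the first candidate whose scope matches the destination's scope."""
    wanted = scope(destination)
    for candidate in candidates:
        if scope(candidate) == wanted:
            return candidate
    return candidates[0]  # no scope match: fall back to the first candidate

# First server: the router solicits from a global address.
print(pick_source(["2001:db8:fbc0:2::2", "fe80::2"], "2001:db8:fbc0::1"))
# Second server: the router solicits from a link-local address.
print(pick_source(["2001:db8:f440:2::a006", "fe80::216:3eff:fedd:7c83"],
                  "fe80::669d:99ff:feb1:55b8"))
```

In neither case is the selected source the container's own address, which is harmless on the first server but matters on the second.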

In an unfiltered network, the router wouldn't care where the neighbor advertisements come from.
However, when it comes to a KVM server on Virtualizor, the hypervisor would treat such packets as attempted IP spoofing attacks, and drop them via ebtables rules.
Consequently, the neighbor advertisement never reaches the router, and the router has no way to know how to reach the container's IPv6 address.

ndpresponder: NDP Responder for KVM VPS

I tried a few tricks such as deprecating the link-local address, but none of them worked.
Thus, I made my own NDP responder that sends neighbor advertisements from the target address.

ndpresponder is a Go program using the GoPacket library.

  1. The program opens an AF_PACKET socket, with a BPF filter for ICMPv6 neighbor solicitation messages.
  2. When a neighbor solicitation arrives, it checks the target address against a user-supplied IP range.
  3. If the target address is in the range used for Docker containers, the program constructs an ICMPv6 neighbor advertisement message and transmits it through the same AF_PACKET socket.

A major difference from ndppd is that the source IPv6 address on a neighbor advertisement message is always set to the same value as the target address of the neighbor solicitation, so that the message wouldn't be dropped by the hypervisor.
This is made possible because I'm sending the message via an AF_PACKET socket, instead of the AF_INET6 socket used by ndppd.
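
The range check and the "source equals target" idea can be sketched in Python. This is not the actual ndpresponder code, which is written in Go; it only builds the ICMPv6 part of the message and omits the Ethernet and IPv6 headers as well as the AF_PACKET socket. The function names are made up, the /112 range for the second server is assumed by analogy with the first server, the MAC address is an assumption derived from the server's link-local address in the capture, and setting the Solicited and Override flags is likewise an assumption.

```python
import ipaddress
import struct

def in_range(target, subnet):
    """Check the solicited target against the user-supplied range."""
    return ipaddress.IPv6Address(target) in ipaddress.IPv6Network(subnet)

def checksum(data):
    """16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src, dst, length):
    # source + destination + upper-layer length + 3 zero bytes + next header 58
    return src.packed + dst.packed + struct.pack("!I3xB", length, 58)

def build_neighbor_advertisement(target, solicitor, mac):
    """Return (source address, ICMPv6 message) advertising the target address."""
    src = ipaddress.IPv6Address(target)       # source == target: the key point
    dst = ipaddress.IPv6Address(solicitor)
    flags = 0x60000000                        # Solicited + Override (assumption)
    option = struct.pack("!BB6s", 2, 1, mac)  # Target Link-Layer Address option
    body = struct.pack("!BBHI", 136, 0, 0, flags) + src.packed + option
    csum = checksum(pseudo_header(src, dst, len(body)) + body)
    return src, body[:2] + struct.pack("!H", csum) + body[4:]

subnet = "2001:db8:f440:2:646f:636b:6572::/112"  # assumed container range
target = "2001:db8:f440:2:646f:636b:6572:d003"
router = "fe80::669d:99ff:feb1:55b8"
mac = bytes.fromhex("00163edd7c83")              # assumed server MAC address

if in_range(target, subnet):
    src, msg = build_neighbor_advertisement(target, router, mac)
    print(src, len(msg))
```

The message is 32 bytes long, the same length tcpdump reports for the neighbor advertisements in the captures above, and the IPv6 header around it would carry the container's address as its source.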

ndpresponder operates similarly to ndppd in "static" mode.
It does not forward neighbor advertisements to the destination subnet like ndppd does in its "auto" mode, but this feature isn't important on a KVM server.

If ndppd doesn't seem to work on your KVM VPS, give ndpresponder a try!
Head to my GitHub repository for installation and usage instructions:
https://github.com/yoursunny/ndpresponder

The end is nigh for Ubuntu 16.04. Providers still offering Ubuntu 16.04 past EOL will be ashamed.

Comments

  • Nice. Candidate for the Blog?

    Keith

  • ehab (Content Writer)

    Interesting. I will read more about it.
    One question: what tool did you use to draw the network (under "Let me explain in IPv4 terms first:")?

  • @lleddewk said:
    Nice. Candidate for the Blog?

    I don't read the LES blog myself.
    I'd like the Google Search result to index my blog instead.


    @ehab said:
    One question: what tool did you use to draw the network (under "Let me explain in IPv4 terms first:")?

    You do not need a tool to type ASCII art.
    The Overtype mode for Visual Studio Code is very helpful.

    Occasionally I type SVG source code to create more complicated artwork.
    The article 'What is a "Face" in Named Data Networking?' has two samples.

  • ehab (Content Writer)

    @yoursunny said:
    You do not need a tool to type ASCII art.
    The Overtype mode for Visual Studio Code is very helpful.

    If you made the above by hand, then I salute you. Hopefully you can join the content writers group; I look forward to new info from you.

  • I spent the whole day getting ndpresponder to work on Webhosting24 Cloud.
    @tomazu gives everyone a /48, but the router does not deliver neighbor solicitation packets to the server if I ping one of the addresses in my subnet from a client machine.
    Nevertheless, adding an IPv6 address with the ip addr add command works.

    After carefully comparing every address, every packet, and every bit, I found the difference.
    Their router expects the KVM server to transmit a neighbor solicitation from the newly added IPv6 address targeting the router; the router then delivers a neighbor solicitation to my new IPv6 address, and after that the address becomes reachable.

    I had to rewrite half of ndpresponder to adjust to this procedure: the program now hooks onto Docker event stream.
    This allows the program to know when a new container is connected, so that it can transmit a neighbor solicitation packet on behalf of the container and let the router know the new address.


    Next year, I'll ask for routed IPv6.

  • @yoursunny said:
    Next year, I'll ask for routed IPv6.

  • Routed IPv6 Hall of Fame

    Include routed IPv6, at least /64 subnet, to get listed.

  • tomazu (Hosting Provider)
    edited April 27

    @yoursunny said:
    I spent the whole day getting ndpresponder to work on Webhosting24 Cloud.
    @tomazu gives everyone a /48, but the router does not deliver neighbor solicitation packets to the server if I ping one of the addresses in my subnet from a client machine.

    well thank you for trying without opening a ticket, but you could have asked :-)

    Just to be sure, is this in Munich or in Singapore?

    After carefully comparing every address, every packet, and every bit, I found the difference.
    Their router expects the KVM server to transmit a neighbor solicitation from the newly added IPv6 address targeting the router; the router then delivers a neighbor solicitation to my new IPv6 address, and after that the address becomes reachable.

    isn't that the way it is supposed to be? Otherwise you would need a single IPv6 address assigned out of an IPv6 subnet and your /48 IPv6 subnet routed to that!?

    Webhosting24 Munich Cloud Servers - Including IPv4 and /48 IPv6 subnet, NVMe and Unmetered Bandwidth
    Webhosting24 Singapore Launch Thread - Premium Connectivity, IPv4 and /48 IPv6 subnet, Ryzen NVMe

  • @tomazu said:

    @yoursunny said:
    I spent the whole day getting ndpresponder to work on Webhosting24 Cloud.
    @tomazu gives everyone a /48, but the router does not deliver neighbor solicitation packets to the server if I ping one of the addresses in my subnet from a client machine.

    well thank you for trying without opening a ticket, but you could have asked :-)

    If I open a ticket, you fiddle with some settings without telling me which ones, and then I run into the same problem in another network and the cycle repeats.
    That's why I try to poke around first, and hope the identified solutions can work in more places.

    Unless, there's a hardware fault:
    https://talk.lowendspirit.com/discussion/comment/62635/#Comment_62635

    Just to be sure, is this in Munich or in Singapore?

    Munich, wh24-1617893523.local in billing panel.

    After carefully comparing every address, every packet, and every bit, I found the difference.
    Their router expects the KVM server to transmit a neighbor solicitation from the newly added IPv6 address targeting the router; the router then delivers a neighbor solicitation to my new IPv6 address, and after that the address becomes reachable.

    isn't that the way it is supposed to be?

    For on-link IPv6, this setup differs from other providers, such as Nexril and Evolution Host and WebHorizon:

    1. If any IPv6 address is accessed from outside, the router transmits a neighbor solicitation packet to the KVM server.
    2. If the KVM server responds with a neighbor advertisement packet, the incoming packets start arriving, although the first few might be lost.
    3. If the KVM server does not respond, the router assumes the IPv6 address does not exist, and will not deliver incoming packets.

    The setup at Webhosting24 is that the KVM server must actively declare the existence of an IPv6 address by transmitting:

    1. a gratuitous neighbor solicitation targeting the new IPv6 address
    2. a neighbor solicitation from the new IPv6 address targeting the router

    Without those, no incoming packets or neighbor solicitation will be delivered.

    I'm not well versed in IPv6 related protocols.
    If your Cisco router expects these packets, I suppose some RFC requires the host system to transmit them upon address assignment.
    ndppd would not transmit them because it's a hack, not a fully compliant implementation.

    Otherwise you would need a single IPv6 address assigned out of an IPv6 subnet and your /48 IPv6 subnet routed to that!?

    For routed IPv6, what TunnelBroker does is giving each client two prefixes in separate ranges:

    • /64 on-link prefix, in which ::1 is the router and ::2 is the client; other addresses are not usable.
    • /64 or /48 routed prefix, entirely controlled by the client.

    However, I'm told that Virtualizor lacks support for routed prefixes if live migration is needed:
    https://www.lowendtalk.com/discussion/comment/3209268/#Comment_3209268

  • Mr_Tom (Hosting Provider, OG)

    Interesting read.

    @yoursunny said: For routed IPv6, what TunnelBroker does is giving each client two prefixes in separate ranges:

    My ISP does a similar thing, a /64 for ND with a separate /48 for PD.

    We assign a random IPv6 address from a /96 when the VM is deployed, with a /64 routed to be available to the VM - although the /96 and each /64 come out of the same "larger" range (except the Helsinki Storage as each /64 comes from a separate /56). We're working on adding the ability to route large prefixes if required.

    VM Specialist - Custom, managed and storage VM solutions. | Latest Offers

  • Abdullah (Hosting Provider, OG)

    @Mr_Tom said:
    Interesting read.

    @yoursunny said: For routed IPv6, what TunnelBroker does is giving each client two prefixes in separate ranges:

    My ISP does a similar thing, a /64 for ND with a separate /48 for PD.

    We assign a random IPv6 address from a /96 when the VM is deployed, with a /64 routed to be available to the VM - although the /96 and each /64 come out of the same "larger" range (except the Helsinki Storage as each /64 comes from a separate /56). We're working on adding the ability to route large prefixes if required.

    Hey, you use virtualizor? I'm curious how :)

  • Mr_Tom (Hosting Provider, OG)

    We add aaaa:bbbb:cccc:dddd:eeee:ffff:gggg:1/96 as an IP onto an interface on the host. Create an IP Pool and fill in the first part of the "Generate IPv6" section with the details from the /96 and set the gateway/etc.
    We always leave the "Generate subnets" bit blank as the /64 side per client is all done manually. It adds a bit of work but it was the best way with Virtualizor to give each VM its own /64.

    Some people are happy enough with just having IPv6 connectivity so they don't use the /64 but it's there if they do.

    I've been doing some testing of a setup like what Hetzner does, using fe80::1 as the gateway so there's no need for the VM to have an IP address and a separate /64, but I'm not sure how this would play with Virtualizor either.

  • tomazu (Hosting Provider)

    @yoursunny said:

    @tomazu said:
    Just to be sure, is this in Munich or in Singapore?

    Munich, wh24-1617893523.local in billing panel.

    OK, could you please confirm that your IPv6 gateway is ending in ::fffe ?

    I'm not well versed in IPv6 related protocols.
    If your Cisco router expects these packets, I suppose some RFC requires the host system to transmit them upon address assignment.
    ndppd would not transmit them because it's a hack, not a fully compliant implementation.

    if you have your "main" IPv6 address up & running and you are not using any "strange" IPv6 gateway, then this should work. Otherwise I would have to use another configuration (like the ones mentioned with a manual /64 IPv6 + subnet routed over that and/or using the "universal" gateway fe80::1).

    Please let me know about the IPv6 gateway, thank you for providing feedback regarding this!

  • Abdullah (Hosting Provider, OG)

    @Mr_Tom said:
    We add aaaa:bbbb:cccc:dddd:eeee:ffff:gggg:1/96 as an IP onto an interface on the host. Create an IP Pool and fill in the first part of the "Generate IPv6" section with the details from the /96 and set the gateway/etc.
    We always leave the "Generate subnets" bit blank as the /64 side per client is all done manually. It adds a bit of work but it was the best way with Virtualizor to give each VM its own /64.

    Some people are happy enough with just having IPv6 connectivity so they don't use the /64 but it's there if they do.

    I've been doing some testing of a setup like what Hetzner does, using fe80::1 as the gateway so there's no need for the VM to have an IP address and a separate /64, but I'm not sure how this would play with Virtualizor either.

    Nice hack, thanks for sharing. I can imagine it may be a pain to manage manually for monthly committed orders, but it works! :)

  • @tomazu said:

    @yoursunny said:

    @tomazu said:
    Just to be sure, is this in Munich or in Singapore?

    Munich, wh24-1617893523.local in billing panel.

    OK, could you please confirm that your IPv6 gateway is ending in ::fffe ?

    The gateway is xxxx:xxxx:0:100::1, which is not in my subnet.
    When the server was delivered, it was already installed from a template (most likely Debian 10), and I copied the settings from there.

    I tried xxxx:xxxx:yyyy::fffe, and it is also usable as a gateway.

    Once again, we blame Virtualizor for:

    • Putting a strange gateway in the initial template installation.
    • Not displaying any information about IPv4+IPv6 gateway, and letting people guess.

    if you have your "main" IPv6 address up & running and you are not using any "strange" IPv6 gateway, then this should work.

    Please let me know about the IPv6 gateway, thank you for providing feedback regarding this!

    Even if I have xxxx:xxxx:yyyy::1/48 assigned to the KVM server, the container is still unreachable unless I actively declare its presence by transmitting a gratuitous neighbor solicitation targeting the new IPv6 address and a neighbor solicitation from the new IPv6 address targeting the router.
    This condition is the same regardless of whether xxxx:xxxx:0:100::1 or xxxx:xxxx:yyyy::fffe is used as gateway.

    Otherwise I would have to use another configuration (like the ones mentioned with a manual /64 IPv6 + subnet routed over that and/or using the "universal" gateway fe80::1).

    You should offer routed IPv6, which would earn you a spot in the routed IPv6 Hall of Fame.

  • tomazu (Hosting Provider)

    @yoursunny said:
    You should offer routed IPv6, which would earn you a spot in the routed IPv6 Hall of Fame.

    I know this will not comfort you, but I set this up as routed IPv6 in Virtualizor, so I was sure this was tested and working and honestly I never had any problem with the configuration when using ::fffe as default gateway.

    Will create a Debian VM myself and perform some additional tests.
