Using network namespaces for IPv4 only

With an assist from capabilities

The problem

I use the Ookla Speedtest CLI in a cron job to get an idea of the speed of my internet connection (Verizon FIOS), and spot if there are problems. Why? Because why not :-)

It lets me draw graphs like this.

Speedtest

However, recently I started getting error messages saying the command wasn’t able to reach speedtest.net to get its configuration. It wasn’t happening every time; sometimes it would go hours without issue, other times it would fail 3 or 4 times in succession.

When I looked into this, it turned out that the code was preferring IPv6 over IPv4. My home network is dual stack, with IPv6 provided by a HEnet tunnel. Because tunnels aren’t necessarily as performant as native networking, I configure my machines to prefer IPv4, e.g. by setting this in gai.conf:

precedence ::ffff:0:0/96  100
scopev4 ::ffff:0.0.0.0/96       14

However, it seems this program is ignoring that. When I ran speedtest manually I could see it using the tunnel!

$ speedtest

   Speedtest by Ookla

      Server: Greenlight Networks - Binghamton, NY (id: 3156)
         ISP: Hurricane Electric IPv6 Tunnel Broker
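
For comparison, here is a minimal sketch of what a program that does go through getaddrinfo(3) would see. The helper name first_family is hypothetical, and the sketch says nothing about how speedtest itself resolves names. With AF_UNSPEC the first result reflects the system's sort order (which is what gai.conf influences on glibc), while AF_INET restricts the lookup to IPv4 outright.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Return the address family (AF_INET or AF_INET6) of the first result
 * getaddrinfo() gives for `host`, or -1 if resolution fails.
 * Pass AF_UNSPEC to see the system's preferred order, or AF_INET to
 * restrict the lookup to IPv4. (Hypothetical helper for illustration.) */
int first_family(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int result;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;   /* avoid one entry per socket type */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    result = res->ai_family;           /* family of the preferred address */
    freeaddrinfo(res);
    return result;
}
```

On a dual-stack machine where the gai.conf preference is honoured, first_family("speedtest.net", AF_UNSPEC) would be expected to return AF_INET. A program that does its own resolution or address selection would not be affected by this setting.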

Solution attempt 1

So my first attempt to solve this was to use LD_PRELOAD to intercept the name resolver calls and force it to only return IPv4 addresses. This is what gai.conf is meant to do (getaddrinfo(3)) but let’s force it.

Fortunately someone had already done the hard work so I just took their code. And it worked well with telnet.

e.g. LD_PRELOAD=$PWD/force_ipv6.so telnet speedtest.net 80 would force IPv6 and LD_PRELOAD=$PWD/force_ipv4.so telnet speedtest.net 80 would force IPv4.

Except this was wasted time; the speedtest binary is statically compiled (was it written in Go?) and so LD_PRELOAD has no effect. Oops!
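
One way to check this up front is to look for a PT_INTERP program header, which is how an ELF binary asks for the dynamic loader (and it is the dynamic loader that honours LD_PRELOAD). The following is a rough sketch only: the function name has_interp is made up, and it assumes a 64-bit ELF file in the host's byte order.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the 64-bit ELF file at `path` has a PT_INTERP program
 * header (it requests a dynamic loader, so LD_PRELOAD can apply),
 * 0 if it has none (typically a static binary), and -1 on any error
 * or if the file is not a 64-bit ELF. Assumes host byte order. */
int has_interp(const char *path)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int found = 0;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;

    /* Read and sanity-check the ELF header. */
    if (pread(fd, &eh, sizeof eh, 0) != (ssize_t)sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        close(fd);
        return -1;
    }

    /* Walk the program header table looking for PT_INTERP. */
    for (int i = 0; i < eh.e_phnum; i++) {
        off_t off = (off_t)eh.e_phoff + (off_t)i * eh.e_phentsize;

        if (pread(fd, &ph, sizeof ph, off) != (ssize_t)sizeof ph) {
            close(fd);
            return -1;
        }
        if (ph.p_type == PT_INTERP) {
            found = 1;
            break;
        }
    }

    close(fd);
    return found;
}
```

Calling has_interp() on the path of the speedtest binary and getting 0 would have predicted that the LD_PRELOAD approach could not work.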

Solution attempt 2

So if we can’t make the application work how we want, maybe we can change the run time environment.

This is exactly what Linux namespaces do. Namespaces are one of the core underlying technologies that allow solutions like Docker to run; they help isolate applications from each other, and there are multiple different namespace types (cgroup, IPC, network, mount, PID, user, UTS).

For this problem, a network namespace is all we need.
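
The kernel exposes the namespaces a process belongs to as symbolic links under /proc/PID/ns, and the link text has the form type:[inode]. Two processes share a network namespace exactly when their net links read the same. A small sketch (ns_link is a hypothetical helper name) for reading one of these links:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the target of a namespace symlink such as /proc/self/ns/net
 * into buf as a NUL-terminated string.
 * Returns 0 on success, -1 on failure. */
int ns_link(const char *path, char *buf, size_t len)
{
    ssize_t n;

    if (len == 0)
        return -1;

    /* readlink() does not NUL-terminate, so leave room for it. */
    n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return 0;
}
```

Printing the result of ns_link("/proc/self/ns/net", ...) from a normal shell and again from a process started inside a different network namespace should give two different strings.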

Network namespaces can be created and entered with the ip netns command. I’m going to call this namespace ip4only; it’s nicely descriptive!

ip netns del ip4only
ip netns add ip4only

In order to let the namespace talk to the rest of the network it needs a veth endpoint inside the namespace. This basically creates a virtual “point to point” connection, and you can place one end inside the namespace and leave the other in the “root” namespace. Let’s call the two network points ip4only-root and ip4only-ns.

ip link add ip4only-root type veth peer name ip4only-ns
ip link set ip4only-ns netns ip4only

Now each interface needs an IP address. Let’s use the 192.168.200.0/24 network. While we’re at it, we can bring up the loopback interface and add a default route inside the namespace.

ip addr add 192.168.200.1/24 dev ip4only-root
ip link set ip4only-root up

ip netns exec ip4only ip addr add 192.168.200.2/24 dev ip4only-ns
ip netns exec ip4only ip link set ip4only-ns up
ip netns exec ip4only ip link set lo up
ip netns exec ip4only ip route add default via 192.168.200.1

So how does this look?

$  ip addr show dev ip4only-root
22: ip4only-root@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ee:f6:02:dc:35:90 brd ff:ff:ff:ff:ff:ff link-netns ip4only
    inet 192.168.200.1/24 scope global ip4only-root
       valid_lft forever preferred_lft forever

$ sudo ip netns exec ip4only sh
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
21: ip4only-ns@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 86:58:d8:c4:b7:47 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.200.2/24 scope global ip4only-ns
       valid_lft forever preferred_lft forever
    inet6 fe80::8458:d8ff:fec4:b747/64 scope link 
       valid_lft forever preferred_lft forever

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.200.1   0.0.0.0         UG    0      0        0 ip4only-ns
192.168.200.0   0.0.0.0         255.255.255.0   U     0      0        0 ip4only-ns

# ping -c 1 192.168.200.1
PING 192.168.200.1 (192.168.200.1) 56(84) bytes of data.
64 bytes from 192.168.200.1: icmp_seq=1 ttl=64 time=0.036 ms

--- 192.168.200.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.036/0.036/0.036/0.000 ms

# exit

It looks like it has an IPv6 address, but this is just a “link local” one and isn’t used for access to the wider network.
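
To make the "link local" point concrete, here is a small sketch (both function names are hypothetical). The first helper tests whether an address falls in fe80::/10. The second builds the modified EUI-64 link-local address from a MAC address; the fe80:: address in the output above is consistent with that derivation from the interface's link/ether value.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Return 1 if `text` is an IPv6 link-local address (fe80::/10),
 * 0 if it is some other IPv6 address, -1 if it does not parse. */
int is_link_local(const char *text)
{
    struct in6_addr a;

    if (inet_pton(AF_INET6, text, &a) != 1)
        return -1;
    return IN6_IS_ADDR_LINKLOCAL(&a) ? 1 : 0;
}

/* Build the modified EUI-64 link-local address for a 6-byte MAC:
 * fe80::/64 prefix, the MAC split around ff:fe, and the
 * universal/local bit (0x02 of the first byte) flipped. */
void mac_to_link_local(const unsigned char mac[6], struct in6_addr *out)
{
    memset(out, 0, sizeof *out);
    out->s6_addr[0]  = 0xfe;
    out->s6_addr[1]  = 0x80;
    out->s6_addr[8]  = mac[0] ^ 0x02;
    out->s6_addr[9]  = mac[1];
    out->s6_addr[10] = mac[2];
    out->s6_addr[11] = 0xff;
    out->s6_addr[12] = 0xfe;
    out->s6_addr[13] = mac[3];
    out->s6_addr[14] = mac[4];
    out->s6_addr[15] = mac[5];
}
```

Feeding the MAC 86:58:d8:c4:b7:47 into mac_to_link_local() and formatting the result with inet_ntop() gives fe80::8458:d8ff:fec4:b747, the same address shown on ip4only-ns.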

Of course this network can’t reach the internet because nothing outside this machine knows how to get to the 192.168.200.2 address. We can solve that with some NATting in the root namespace.

Since I have no iptables rules on this machine I can build a simple set. In my case br-lan is the bridge I have connected to my LAN network, so the rules would look something like this:

echo 1 > /proc/sys/net/ipv4/ip_forward

# Flush forward rules, policy DROP by default.
iptables -P FORWARD DROP
iptables -F FORWARD

# Flush nat rules.
iptables -t nat -F

# Enable masquerading of 192.168.200.0
iptables -t nat -A POSTROUTING -s 192.168.200.0/255.255.255.0 -o br-lan -j MASQUERADE

# Allow forwarding between br-lan and ip4only-root
iptables -A FORWARD -i br-lan -o ip4only-root -j ACCEPT
iptables -A FORWARD -o br-lan -i ip4only-root -j ACCEPT

And now, with all of that, the namespace can reach the internet

$ sudo ip netns exec ip4only ping -c 1 www.google.com
PING www.google.com (142.251.40.132) 56(84) bytes of data.
64 bytes from lga25s80-in-f4.1e100.net (142.251.40.132): icmp_seq=1 ttl=117 time=4.47 ms

--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.474/4.474/4.474/0.000 ms

If we try to use IPv6 then we get an error:

$ sudo ip netns exec ip4only ping -6 -c 1 www.google.com
connect: Network is unreachable

Great, this is working!
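
The same check can be made from code. The sketch below (probe_ipv6_route is a made-up name) relies on the fact that connect() on a UDP socket sends no packets and only asks the kernel for a route, so in a namespace with no global IPv6 route it should fail with ENETUNREACH.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel whether it has a route to the IPv6 address `text`.
 * connect() on a UDP socket sends nothing; it only performs the
 * route lookup. Returns 0 if a route exists, a positive errno value
 * (e.g. ENETUNREACH) if not, or -1 if `text` is not an IPv6 address. */
int probe_ipv6_route(const char *text)
{
    struct sockaddr_in6 sa;
    int fd, rc;

    memset(&sa, 0, sizeof sa);
    sa.sin6_family = AF_INET6;
    sa.sin6_port = htons(80);          /* port is arbitrary here */
    if (inet_pton(AF_INET6, text, &sa.sin6_addr) != 1)
        return -1;

    fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd == -1)
        return errno;                  /* e.g. IPv6 not available at all */

    rc = connect(fd, (struct sockaddr *)&sa, sizeof sa) == 0 ? 0 : errno;
    close(fd);
    return rc;
}
```

Run inside the ip4only namespace against any global IPv6 address, this would be expected to return ENETUNREACH, matching the "Network is unreachable" message from ping.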

Reducing permissions

Unfortunately, entering a namespace with ip netns exec requires root, and the command we run inside the namespace also runs as root (we can see that from the prompt changing from $ to # in the earlier output).

Fortunately we don’t need root if we have the cap_sys_admin capability.

This is easy to do with a simple C program:

// ip4only
//   gcc -o ip4only ip4only.c
//   sudo setcap cap_sys_admin+ep ./ip4only
//
// Now we can do "ip4only command" (eg "ip4only ip addr")
// and it will run in the ip4only network namespace

// We need this for setns()
#define _GNU_SOURCE

#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#define errExit(msg) { perror(msg); exit(EXIT_FAILURE); }

int main(int argc, char *argv[])
{ 
  int fd;

  if (argc < 2)
  {
      fprintf(stderr, "%s cmd args...\n", argv[0]);
      exit(EXIT_FAILURE);
  }

  // To set a namespace we need to have an open file handle.
  // Network namespaces live in /var/run/netns so that's easy
  
  fd = open("/var/run/netns/ip4only", O_RDONLY);
  if (fd == -1)
    errExit("open");

  // Join the namespace; CLONE_NEWNET makes setns() fail unless
  // fd really refers to a network namespace
  if (setns(fd, CLONE_NEWNET) == -1)
    errExit("setns");

  // Run the specified command
  execvp(argv[1], &argv[1]);

  // If we got here, there's an error!
  errExit("execvp");
}

And that’s all there is to it. We can see it works by trying to access IPv6; we expect it to fail.

$ ip4only telnet -6 www.google.com 80
Trying 2607:f8b0:4006:820::2004...
telnet: connect to address 2607:f8b0:4006:820::2004: Network is unreachable

And it works.

I can now do ip4only speedtest and it connects every time using IPv4.

$ ip4only speedtest

   Speedtest by Ookla

      Server: DediPath - Secaucus, NJ (id: 22774)
         ISP: Verizon Fios
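
It can be useful to confirm which capabilities a process actually holds. The kernel reports the effective set as a hex bitmask on the CapEff line of /proc/PID/status. The following is a sketch with hypothetical names; the constant 21 is the value of CAP_SYS_ADMIN in linux/capability.h, defined locally here so the example does not depend on kernel headers.

```c
#include <stdio.h>

/* Numeric value of CAP_SYS_ADMIN in <linux/capability.h>. */
#define SKETCH_CAP_SYS_ADMIN 21

/* Parse a "CapEff:" line as found in /proc/PID/status.
 * Returns 0 and stores the bitmask on success, -1 otherwise. */
int parse_capeff(const char *line, unsigned long long *mask)
{
    return sscanf(line, "CapEff: %llx", mask) == 1 ? 0 : -1;
}

/* Return 1 if capability number `cap` is set in `mask`, else 0. */
int cap_in_mask(unsigned long long mask, int cap)
{
    return (int)((mask >> cap) & 1ULL);
}

/* Return 1 if the calling process has `cap` in its effective set,
 * 0 if not, -1 if /proc/self/status could not be read. */
int have_effective_cap(int cap)
{
    char line[256];
    unsigned long long mask;
    int result = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;

    /* Scan for the CapEff line and test the requested bit. */
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_capeff(line, &mask) == 0) {
            result = cap_in_mask(mask, cap);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Called from inside the ip4only wrapper before execvp, have_effective_cap(SKETCH_CAP_SYS_ADMIN) should return 1. In the command it launches it should return 0 when run as a normal user, because a file capability set on the wrapper is not passed on to an ordinary program across exec.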

Conclusion

There is a side effect: “ping” no longer has the permissions it needs when run this way, and it gives a socket: Operation not permitted error. However, if an application doesn’t need the ICMP DGRAM or RAW sockets that ping relies on, then this type of configuration is a nice way of using namespaces and capabilities to force the application into an IPv4 only setup.
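
To see which kind of socket is being refused, a tiny probe like the one below can be run through the wrapper. This is only a sketch (try_icmp_socket is a hypothetical name) and it reports what socket() returns without explaining why.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to create an IPv4 ICMP socket of the given type (SOCK_DGRAM or
 * SOCK_RAW). Returns 0 on success or the errno value from socket(). */
int try_icmp_socket(int type)
{
    int fd = socket(AF_INET, type, IPPROTO_ICMP);

    if (fd == -1)
        return errno;      /* e.g. EPERM or EACCES when not permitted */

    close(fd);
    return 0;
}
```

Comparing try_icmp_socket(SOCK_DGRAM) and try_icmp_socket(SOCK_RAW) inside and outside the ip4only namespace shows which of the two socket types is unavailable there.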