Infrastructure failure resolved, downloads.vyos.io is back now

We have managed to resolve the infrastructure problems and bring the affected machines back intact. Now downloads.vyos.io and ci.vyos.net are back online, and our build hosts with all the build dependencies setup and uncommited code are back online too, so luckily we can resume the work from the point where it was stopped by the RAID issue.

We are not ready to give a complete post mortem on the actual issue yet (may not even be able to at all) since we have limited data and the service provider support was not exactly helpful. What we can say now is that what was thought to be RAID5 actually was a RAID1 with an additional hot spare drive, and the failure mode included a RAID controller glitch apart from drive failure. The hot spare drive is what allowed us to bring the data back intact.

While this issue was resolved without any data loss, it definitely prompts us to reconsider many things about our infrastructure, including backup strategy, deployment mechanisms, and service provider choice.

We are going to write new posts as we decide upon the options and roll out improvements. Infrastructure deployment is also a good area for contributions, and some people already offered help.

Infrastructure failure

Hi everyone,

We've had a catastrophic failure on one of our hosts: two drives in a RAID5 failed simultaneously. A number of VMs related to the development infrastructure are permanently lost and need to be restored since we didn't have backups of them.

The only piece public infrastructure affected by it is downloads.vyos.io. You can use the old packages.vyos.net server or one of the mirrors to download the stable (1.1.8) or historical VyOS releases.

The other pieces of infrastructure that need to be restored are the Jenkins server, the build machines, and the repositories server. We'll have to restore them before we can continue development. I hope we'll get it done by Monday.

Building VyOS images with custom packages just got simpler

While the new build scripts first introduced when we migrated the development branch to jessie made things much simpler for developers and for people who just want to build the latest VyOS image from source, building an image even simply with a package available from Debian Jessie repos but not present in the VyOS package set by default was still quite an ordeal for a person not familiar with live-build and the structure of our build scripts.

Well, until now. Yesterday I've added ./configure script options that should allow everyone to build a custom image without ever touching the plumbing of the build scripts.

The simplest example, building an image with packages available in Debian Jessie:

./configure --custom-packages "bsdgames robotfindskitten"
sudo make iso

A more interesting example, adding a package from a third party repo signed with its own key. In this case, salt-minion:

wget https://repo.saltstack.com/apt/debian/8/amd64/2017.7/SALTSTACK-GPG-KEY.pub
./configure --custom-apt-entry "deb http://repo.saltstack.com/apt/debian/8/amd64/2017.7 jessie main" --custom-packages "salt-minion" --custom-apt-key ./SALTSTACK-GPG-KEY.pub
sudo make iso

Of course it doesn't guarantee that your image will build or work, but at least it will get you to the debugging phase faster,

VyOS 1.2.0 development news in June

If we haven't written any new blog posts in a while, that's because we've all been busy with development. Here is what happened since the last status update.

RADIUS authentication and authorization

We've had nominal support for RADIUS authentication for system login for a long time, but it was essentially useless because it required that the user account to already exist in VyOS to actually work, it was just a password checking method.

Kim Hagen found a way to implement real support for it and it appears to work fine (but of course needs testing). Apart from authentication, his implementation also supports authorization through the priv-lvl attribute. Privilege level 15 is mapped to admin users, anything below it is operator. This seems to be the most common way to do that, after Cisco.

Correct naming of PPTP and L2TP interfaces works again

The idea to switch to the upstream PPPD turned out to be premature — while it does have support for pre-up scripts, it still makes an assumption that the interface name is not changed by pre-up script, so remote access VPN interfaces were named pppX, which broke the "run show vpn remote-access" commands, and would break zone-based firewalls of people who rely on that naming.

We've re-applied the old patches for it to the latest upstream PPPD and opened pull requests for them, so if they are merged, we can finally stop building our own, and if not, we have clean patches that should be easy to apply to future releases.

Writing migration scripts (and manipulating VyOS config files outside VyOS) just got easier

Long story short

VyOS 1.2.0-rolling (starting with the next nightly build) includes a library for parsing and manipulating config files without loading them into the system config. It can be used for automatically converting configs from old versions in case an incompatible change was made, and for standalone utilities. Motivation and history are discussed below.

Here is an example of interacting with the new library:
>>> from vyos import configtree

>>> c = configtree.ConfigTree("system { host-name vyos \n } interfaces { dummy dum0 { address 192.0.2.1/24 \n address 192.0.2.20/24 \n disable \n } }  /* version: 1.2.0 */")

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.1/24
        address 192.0.2.20/24
        disable { }
    }
}

 /* version: 1.2.0 */

>>> c.set(['interfaces', 'dummy', 'dum0', 'address'], value='293.0.113.3/32', replace=False)

>>> c.delete_value(['interfaces', 'dummy', 'dum0', 'address'], '192.0.2.1/24')

>>> c.delete(['interfaces', 'dummy', 'dum0', 'disable'])

>>> c.is_tag(['interfaces', 'dummy'])
True

>>> c.exists(['interfaces', 'dummy', 'dum0', 'disable'])
False

>>> c.list_nodes(['interfaces', 'dummy'])
['dum0']

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.20/24
        address 293.0.113.3/32
    }
}

 /* version: 1.2.0 */

As you can see, it largely mimics the API you get for the running config. The only notable differences are that the "set" method requires that you specify the path and the value separately, and to have nodes formatted as tag nodes (i.e. "ethernet eth0 { ..." as opposed to "ethernet { eth0 { ..." you need to mark them as such with "set_tag", unless they were originally formatted that way in the config you parsed.

Things the new style configuration mode definitions intentionally do not support

I've made three important changes to the design of the configuration command definitions, and later I realized that I never wrote down a complete explanation of the changes and the motivation behind them.

So, let's make it clear: these changes are intentional and they shouldn't be reintroduced. Here's the details:

The "type" option

In the old definitions, you can see the "type:" option in the node.def files very often. In the new style XML definitions, there's no equivalent of it, and the type is always set to "txt" in autogenerated node.def's for tag and leaf nodes (which means "anything" to the configuration backend).

I always felt that the "type" option suffers from two problems: scope creep and redundancy.

The scope creep is in the fact that "type" was used for both value validation and generating completion help in "val_help:" option. Also, the "u32" type (32-bit unsigned integer) has a little known undocumented feature: it could be used for range validation in form of "type: u32:$lower-$upper" (e.g. u32:0-65535). It has never been used consistently even by the original Vyatta Core authors, plenty of node.def's use additional validation statements instead.

Now to the redundancy: there are two parallel mechanisms for validations in the old style definitions. Or three, depending on the way you count them. There are "syntax:expression:" statements that are used for validating values at set time, and "commit:expression:" that are checked at commit time.

My feeling from working with the system for scary amount of time was that the "type" option alone is almost never suffucient, and thus useless, since actual, detailed validation is almost always done elsewhere, in those "syntax/commit:expression:" statements or in the configuration scripts. Sometimes a "commit:expression:" is used where "syntax:expression:" would be more appropriate (i.e. validation is delayed) but let's focus on set-time validation only.

But without data to back it up, a feeling is just a feeling, so I made up a quick and dirty script to do some analysis. You can repeat what I've done easily with "find /opt/vyatta/share/vyatta-cfg/templates/ -type f -maxdepth 100 -name 'node.def' | nodecheck.py".

On VyOS 1.1.8 (which doesn't include any rewritten code) the output is:
Has type: 2737
Has type txt: 1387
Has type other than txt: 1350
Has commit or syntax expression: 1700
Has commit or syntax expression and type txt: 740
Has commit or syntax expression and type other than txt: 960

While irrelevant to the problem on hand, the total count of node.def's is 4293). In other words, of all nodes that have the type option, 50% have it set to "txt". Some of them are genuinely "anything goes" nodes such as "description" options, but most use it as a placeholder.

68% of all nodes that have a type are also using either "syntax:expression:" or "commit:expression:". Of all nodes that have a type more specific than "txt", 73% have additional validation. This means that even for supposedly specific types, type alone is enough only in 23% cases. This raises the question whether we need types at all.

Sure, we could introduce more types and add support for something of a sum type, but is it worth the trouble if validation can be easily delegated to external scripts? Besides, right now types are built in the config backend, which means adding a new one requires modifying it starting from the node.def parser.

In the new style definitions, I felt like the only special case that is special enough is regular expression. This is how value constraints checked at set time are defined:

<leafNode name="foo">
  <properties>
    <constraint>
      <regex>(baz|baz)</regex>
      <validator>quux</validator>
    </constraint>
  </properties>
<leafNode>

Here the "validator" tag contains a reference to a script stored in /usr/libexec/vyos/validators/. Since adding a new validator is easy, there's no reason to hesitate to add new ones for common (and even not so common) cases. Note that "regex" option is automatically wrapped in ^$, so there's no need to do it by hand every time.

Default values

The old definitions used to support "default:" option for setting default values for nodes. It looks innocous on the surface, but things get complicated when you look deeper into its behaviour.

You may think a node either exists, or it does not. What is the value of a node that doesn't exist? Sounds rather like a Zen koan, but here's cheap enlightenment for you: it depends on whether it has a default value or not.

Thus, nodes effectively have three states: "doesn't exist", "exists", and "default". As you can already guess, it's hard to tell the latter two apart. It's also very hard to see if a node was deleted from a config or just reset to a default value. It also means that every node lookup cannot operate on the running config tree alone and has to consult the command definitions as well, which is very bad if you plan to use the same code for the CLI and for standalone config handling programs such as migration scripts.

Last time people tried to introduce rollback without reboot, the difficulties of handling the third virtual "default" state was one of the biggest problems, and it's still one of the reasons we don't have a real rollback. VyConf has no support for default values for this reason, so we should eliminate them as we rewrite the code.

Defaults should be handled by config scripts now. Sure, we lose "show -all" and the ability to view defaults, but the complications that come with it hardly make it worth the trouble. There are also many implicit defaults that come from underlying software options anyway.

Embedded shell scripts

That's just a big "no". Have you ever tried to debug code that is spread across multiple node.def's in nested directories and that cannot be executed separately or stepped through?

While it's tempting to allow that for "trivial" scripts, the code tends to grow and things get ugly. Look the the implementation of PPPoE or tunnel interfaces in VyOS 1.1.8.

If it's more than one command, make it an external script, and you'll never regret the decision when it begins to grow.

New-style operational mode command definitions are here

We've had a convertor from the new style configuration command definitions in XML to the old style "templates" for a while in VyOS 1.2.0. As I predicted, a number of issues were only discovered and fixed as we have rewritten more old scripts, but by now they should be fully functional. However, until very recently, there was no equivalent of it for the operational mode commands. Now there is.

The new style

In case you've missed our earlier posts, here's a quick review. The configuration backend currently used by VyOS keeps command definitions in a very cumbesome format with a lot of nested directories where a directory represents a nesting level (e.g. "system options reboot-on-panic" is in "templates/system/options/reboot-on-panic"), a file with command properties must be present in every directory, and command definition files are allowed to have embedded shell scripts.

This makes command definitions very hard to read, and even harder to write. We cannot easily change that format until a new configuration backend is complete, but we can abstract it away, and that's what we did.

The new command definitions are in XML, and a RelaxNG schema is provided, so everyone can see its complete grammar and automatically check if their definitions are correctly written. Automated validation is already integrated in the build process.

Rewriting the command definitions goes along with rewriting the code they call. New style code goes to the vyos-1x package.

VyOS 1.2.0 status update

While VyOS 1.2.0 nightly builds have been fairly usable for a while already, there are still some things to be done because we can make a named release candidate from it. These are the things that have been done lately:

EC2 AMI scripts retargeting and clean up

The original AMI build scripts had been virtually unchanged since their original implementation in 2014, and by this time they've had ansible warnings at every other step, which prompted us to question everything they do, and we did. This resulted in a big spring cleanup of those scripts, and now they are way shorter, faster, and robust.

Other than the fact that they now work with VyOS 1.2.0 properly, one of the biggest improvements from the user point of view is that it's now easy to build an AMI with a custom config file simply by editing the file at playbooks/templates/config.boot.default.ec2

The primary motivation for it was to replace cumbersome in-place editing of the config.boot.default from the image with a single template, but in the end it's a win-win solution for both developers and users.

The original scripts were also notorious for their long execution time and fragility. What's worse is that when they failed (and it's usually "when" rather than "if"), they would leave behind a lot of gargabe they couldn't automatically clean up, since they used to create a temporary VPC complete with an internet gateway, subnet, and route table, all just for a single build instance. They also used a t2.medium instance that was clearly oversized for the task and could be expensive to leave running if clean up failed.

Now they create the build instance in the first available subnet of the default VPC, so even if they fail, you only need to delete a t2.micro instance, a key pair, and a security group.

It is no longer possible to build VyOS 1.1.x images with those scripts from the baseline code, but I've created a tag named 1.1.x from the last commit where it was possible, so you can do it if you want to — without these recent improvements of course.

Package upgrades and new drivers

We've upgraded StrongSWAN to 5.6.2, which hopefully will fix a few longstanding issues. Some enthusiastic testers are already testing it, but everyone is invited to test it as well.

SR-IOV is now basically a requirement for high performance virtualized networking, and it needs appropriate drivers. Recent nightly builds include a newer version of Intel's ixgbe and Mellanox OFED drivers, so the support for recent 10gig cards and SR-IOV in particular has improved.

A step towards using the master branch again

The "current" git branch we use throughout the project where everyone else uses "master" was never intended to be a permanent setup: it always was a workaround for the master branch in packages inherited from Vyatta Core being messed up beyond any repair. It will take quite some time to get rid of the "current" branch completely and we'll only be able to do it when we finally consolidate all the code under vyos-1x, but we've made jenkins builds correctly put the packages built from the "master" branch in our development repository, so we'll be able to do it for packages that do not include any legacy code at least.

IPv6 VRRP status

This is the most interesting part. IPv6 VRRP is perhaps a single most awaited feature. Originally it was blocked by lack of support for it in keepalived. Now keepalived supports it, but integrating it will need some backwards-incompatible changes.

Originally, keepalived allowed mixing IPv4 and IPv6 in the same group, but it no longer allows it (curiously, the protocol standard does allow IPv4 advertisments over IPv6 transport, but I can see why they may want to keep these separate). This means to take advantage of the improvements it made, we also have to disallow it, thus breaking the configs of people who attempted to use it. We've been thinking about keeping the old syntax while generating different configs from it, or automated migration, but it's not clear if automated migration is really feasible.

An incompatible syntax change is definitely needed because, for example, if we want to support setting hello source address or unicast VRRP peer address for both IPv4 and IPv6, we obviously need separate options.

Soon IPv6 addresses in IPv4 VRRP groups will be disallowed and syntax for IPv6-only VRRP groups will be added alongside the old vrrp-group syntax. If you have ideas for the new syntax, the possible automated migration, or generally how to make the transition smooth, please comment on the relevant task.

PowerDNS recursor instead of dnsmasq

The old dnsmasq (which I, frankly, always viewed as something of a spork, with its limited DHCP server functionality built into what's mainly a caching DNS resolver), has been replaced with PowerDNS recursor, a much cleaner implementation.

On security of GRE/IPsec scenarios

As we've already discussed, there are many ways to setup GRE (or something else) over IPsec and they all have their advantages and disadvantages. Recently an issue was brought to my attention: which ones are safe against unencrypted GRE traffic being sent?

The reason this issue can appear at all is that GRE and IPsec are related to each other more like routing and NAT: in some setups their configuration has to be carefully coordinated, but in general they can easily be used without each other. Lack of tight coupling between features allows greater flexibility, but it may also create situations when the setup stops working as intended without a clear indication as to why it happened.

Let's review the knowingly safe scenarios:

VTI

This one is least flexible, but also foolproof by design: the VTI interface (which is secretly simply IPIP) is brought up only when an IPsec tunnel associated with it is up, and goes down when the tunnel goes down. No traffic will ever be sent over a VTI interface until IKE succeeds.

Tunnel sourced from a loopback address

If you have missed it, the basic idea of this setup is the following:

set interfaces dummy dum0 address 192.168.1.100/32

set interfaces tunnel tun0 local-ip 192.168.1.100/32
set interfaces tunnel tun0 remote-ip 192.168.1.101/32 # assigned to dum0 on the remote side

set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 local prefix 192.168.1.100/32
set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 remote prefix 192.168.1.101/32

Most often it's used when the routers are behind NAT, or one side lacks a static address, which makes selecting traffic for encryptions by protocol alone impossible. However, it also introduces tight coupling between IPsec and GRE: since the remote end of the GRE tunnel can only be reached via an IPsec tunnel, no communication between the routers over GRE is possible unless the IPsec tunnel is up. If you fear that any packets may be sent via the default route, you can nullroute the IPsec tunnel network to be sure.

The complicated case

Now let's examine the simplest kind of setup:

set interfaces tunnel tun0 local-ip 192.0.2.100 # WAN address
set interfaces tunnel tun0 remote-ip 203.0.113.200

set vpn ipsec site-to-site peer 203.0.113.200 tunnel 1 protocol gre

In this case IPsec is setup to encrypt the GRE traffic to 203.0.113.200, but the GRE tunnel itself can work without IPsec. In fact, it will work without IPsec, just without encryption, and that is the concern for some people. If the IPsec tunnel goes down due to misconfiguration, it will fall back to the common, unencrypted GRE.

What can you do about it?

As a user, if your requirement is to prevent unencrypted traffic from ever being sent, you should use VTI or use loopback addresses for tunnel endpoints.

For developers this question is more complicated.

What should be done about it?

The opinions are divided. I'll summarize the arguments here.

Arguments for fixing it:

  • Cisco does it that way (attempts to detect that GRE and IPsec are related — at least in some implementations and at least when it's referenced as IPsec profile in the GRE tunnel)
  • The current behaviour is against user's intentions

Arguments against fixing it:

  • Attempts to guess user's intentions are doomed to fail at least some of the time (for example, what if a user intentionally brings an IPsec tunnel down to isolate GRE setup issues?)
  • The only way to guarantee that unencrypted traffic is never sent is checking for a live SA matching protocol and source before forwarding every packet — that's not good for performance).

Practical considerations:

  • Since IKE is in the userspace, the kernel can't even know that an SA is supposed to exist until IKE succeeds: automatic detection would be a big change that is unlikely to be accepted in the mainline kernel.
  • Configuration changes required to avoid the issue are simple
If you have any thoughts on the issue, please share with us!

When VyOS CLI isn't enough

Sometimes a particular configuration option is supported by the software that VyOS uses, but the CLI does not expose it. Since VyOS is open source, you can always fix that, but sometimes you need it by tomorrow, and there's simply no time to do it.

In a number of places, we've left an escape hatch for it that allows bypassing the CLI and including a raw snippet into the generated config. Of course, you give up the sanity checks of the CLI and take full responsibility for the correctness of the resulting config, but sometimes it's necessary.

OpenVPN

In openvpn, we have an option called "openvpn-option". You can pass any options to OpenVPN process with it, but note that in the current versions, it has to follow the command line rather than config file option, i.e. prepend it with "--". See this example:

set interfaces openvpn vtun10 openvpn-option "--connect-freq 10 60"

Note that the "push" option em is supported. I see OpenVPN configs with openvpn-option heavily overused once in a while — before including an option, make sure what you need to do is really not supported.

DHCP server

In the DHCP server, there is not one, but too escape hatches. One is the "subnet-parameters" option under "subnet". Another one is a "global-parameters" under "shared-network-name".

See an example:

set service dhcp-server shared-network-name LAN subnet 10.0.0.0/24 subnet-parameters "ping-timeout 5;"

Since dhcpd.conf syntax is more complex than just a list of options, it's important to make sure that generated config will be valid. It's easy to make your DHCP server stop loading and spend some time reading the log to see what is wrong, so be careful here.

Note that these options are not supported by the DHCPv6 server. Anyone thinks we should support it?

Dynamic DNS

In dynamic DNS, you can use the generic HTTP method if your provider and protocol is not supported.

set service dns dynamic interface eth0 use-web url http://dyndns.example.com/?update

Since no one can possibly support all providers, I believe it will remain a necessary option forever.

When all else fails: the postconfig script

If something is not supported and doesn't have a handy escape hatch, you still can implement it with the postconfig script. That script is found at /config/scripts/vyatta-postconfig-bootup.script and runs after config.boot loading is complete, so it's particularly conductive to manipulating things like raw iptables rules.

VyOS doesn't delete or overwrite anything in the global netfilter tables after boot, so it's safe to put your commands there, for example "/sbin/iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu" for global MSS clamping.

You still need to be careful not to conflict with any of the rules inserted by VyOS though, in general, so always make sure to check what exactly VyOS generates before using the postconfig script.

Looking forward

VyOS 1.2.0 brings improvements to the postconfig script execution and adds some more escape hatches — stay tuned for updates.