VyOS development news in August and September

Most importantly: all but one of the blockers for the 1.2.0 release candidate are now resolved. Quite obviously, for the release candidate, we want all features that worked in 1.1.8 to work fully.

New release naming scheme

While we are at it, I'd like to announce a small cosmetic change. Until now, our release branches were named after chemical elements. This naming scheme is getting a bit too common though (OpenDaylight is a well-known example, but there are more), so we decided to change it to something else to avoid confusion and be a bit more original.

The new branch theme is constellations sorted by area (in square degrees), from the smallest to the largest. The 1.2.0 release will be named Crux. Crux, also known as the Southern Cross, is a small but bright and iconic constellation that is depicted on flags of many countries of the southern hemisphere, such as Australia and New Zealand.

The 1.3.0 release will be named Equuleus, which is Latin for "little horse" (no relation to My Little Pony).

Migration to FRR from Quagga

We have resolved most of the migration problems, and the latest nightly builds already use FRR instead of our aged Quagga.

It will open a path to implementing many new protocols and features, such as BFD, PIM-SM, and more. What kept us from migrating was lack of support for multiple routing tables, which we need for PBR. FRR added it recently, and by now the last known issue that blocked migration (routes from the default table unintentionally leaking into non-default tables) has been resolved, so we finally can migrate without losing any features.

While I do feel somewhat uneasy about the licensing of certain daemons that are included in the source tree but use a permissive open source license even though they are linked against GPL libraries, we do not believe there's a GPL violation in it as long as the license of the binary package is GPL. Not sharing the modified source code of those daemons with users of the binary package would be a GPL violation, but we keep all source code of every VyOS component public.

New BGP address-family syntax

This is still in the works, but it will make it to the nightly builds soon.

Originally, VyOS had IPv6-specific BGP options under "address-family ipv6-unicast", but IPv4 options were directly under neighbor. The historical reason is that at first IPv6 BGP was not supported at all. This syntax was rather inconsistent and made it hard to quickly see which options are address family specific. We stuck with that inconsistent syntax just because it was always done that way.

One behaviour change in FRR made us reconsider that. As you may know, in BGP, routing information exchange is completely orthogonal to the session transport: IPv4 routes can be exchanged over a TCP connection established between IPv6 addresses and vice versa. The default behaviour of most, if not all, BGP implementations is to enable both address families regardless of the session transport.

That behaviour can be changed by an option; in VyOS, that's "set protocols bgp ... parameters default no-ipv4-unicast". The old behaviour of Quagga was to apply that only to sessions whose transport is IPv6, which is just as inconsistent. FRR takes that option literally and disables IPv4 route advertisements for all peers if it's active, unless peers are explicitly activated for the IPv4 address family.
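To make the difference easier to see, here is a minimal Python sketch of the decision as described above. It is a simplified model based only on this description, not code from Quagga or FRR; the function name and its parameters are made up for illustration, and it assumes that explicit activation always wins in both implementations.

```python
def ipv4_unicast_active(transport, no_ipv4_unicast, explicitly_activated, implementation):
    """Return True if IPv4 unicast routes are exchanged with a peer.

    transport: "ipv4" or "ipv6", the address family of the TCP session
    no_ipv4_unicast: True if "default no-ipv4-unicast" is set
    explicitly_activated: True if the peer is activated for IPv4 unicast
    implementation: "quagga" or "frr" (hypothetical labels for this sketch)
    """
    if explicitly_activated:
        # Assumption: explicit activation always enables the address family
        return True
    if not no_ipv4_unicast:
        # Default behaviour: IPv4 unicast is enabled regardless of transport
        return True
    if implementation == "quagga":
        # Old behaviour: the option only affected sessions with IPv6 transport
        return transport != "ipv6"
    # FRR: the option applies to all peers
    return False


for impl in ("quagga", "frr"):
    for transport in ("ipv4", "ipv6"):
        print(impl, transport, ipv4_unicast_active(transport, True, False, impl))
```

The only case where the two differ is a peer with IPv4 transport that is not explicitly activated while the option is set, which is exactly the case existing configs may rely on.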

Making VyOS play well with that development requires an option to do that, and "address-family ipv4-unicast" is an obvious candidate, but introducing a special case doesn't feel right. I think moving the original options to that subtree is a cleaner solution. Yes, it does require reprogramming your fingers, but when we start adding support for more address families, the original syntax will only start looking even more like an atavism.

This is what the new syntax will look like:
dmbaturin@vyos# show protocols bgp 
 bgp 64444 {
     address-family {
         ipv4-unicast {
             network 192.168.2.0/24 {
             }
         }
     }
     neighbor 10.91.19.1 {
         address-family {
             ipv4-unicast {
                 allowas-in {
                     number 3
                 }
                 as-override
                 default-originate {
                     route-map Foo
                 }
                 maximum-prefix 50
                 route-map {
                     export Bar
                     import Baz
                 }
                 weight 10
             }
         }
         ebgp-multihop 255
         remote-as 64793
     }
}

Node renaming in migration scripts

Renaming nodes is a very common task in config syntax migration, but until now it could only be done very indirectly. The old XorpConfigParser simply could not separate names from values and renaming nodes was usually done by regex replace. In the new vyos.configtree you'd need to delete the old node and recreate it from scratch.

Until now. Lately we introduced a function that does it in one step. If you, for whatever reason, wanted to rename the "service ssh" subtree to "service secure-shell", you could do it like this:

import vyos.configtree

with open("/config/config.boot") as f:
    config_text = f.read()
config = vyos.configtree.ConfigTree(config_text)

config.rename(["service", "ssh"], "secure-shell")

print(config.to_string())

One of the reasons for introducing it is to make it easier to clean up the DHCP server syntax.
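If you want to picture what a rename does to the tree, here is a small sketch on a plain nested dictionary. This is not how vyos.configtree is implemented (the real library works on a parsed config, not on dicts); the rename_node function and the sample data are hypothetical and only illustrate that the subtree is kept intact while its name changes.

```python
def rename_node(tree, path, new_name):
    """Rename the node at `path` to `new_name`, keeping its subtree intact."""
    parent = tree
    for key in path[:-1]:
        parent = parent[key]
    old_name = path[-1]
    if old_name not in parent:
        raise KeyError("no such node: " + old_name)
    if new_name in parent:
        raise KeyError("node already exists: " + new_name)
    # Rebuild the parent so the renamed node keeps its position among siblings
    renamed = {(new_name if k == old_name else k): v for k, v in parent.items()}
    parent.clear()
    parent.update(renamed)


# Hypothetical sample data standing in for a parsed config
config = {"service": {"dhcp-server": {}, "ssh": {"port": "22"}, "snmp": {}}}
rename_node(config, ["service", "ssh"], "secure-shell")
print(config)
```

Compare that with the delete-and-recreate approach, where the migration script has to copy every child of the old node by hand.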

DHCP server rewrite

While we are waiting for the FRR fixes, we (Christian Poessinger and I mainly) decided to eliminate one more bit of the legacy code and give DHCP server scripts a rewrite. We also decided to clean up its syntax.

One of the things that always annoyed me was nested nodes for address ranges: "subnet 192.0.2.0/24 start 192.0.2.100 stop 192.0.2.200". Now start and stop will be separate nodes, so that they are easy to change independently: "subnet 192.0.2.0/24 range Foo start 192.0.2.100; ... stop 192.0.2.200".

We will also rename the unwieldy "shared-network-name" to "pool". Operational mode commands always used the "pool" terminology, so it will also improve command consistency.
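Here is a rough sketch of what that syntax change means for the config structure, again on plain nested dictionaries rather than the real vyos.configtree API. The migrate_dhcp function, the dict layout, and the numeric range names are assumptions made for this illustration; the actual migration script may name ranges differently.

```python
def migrate_dhcp(old):
    """Convert the old nested start/stop layout to named ranges under "pool"."""
    new = {"pool": {}}
    for net_name, net in old.get("shared-network-name", {}).items():
        # Keep all other settings of the shared network unchanged
        new_net = {k: v for k, v in net.items() if k != "subnet"}
        new_net["subnet"] = {}
        for prefix, subnet in net.get("subnet", {}).items():
            # Copy everything except the old "start" node unchanged
            new_subnet = {k: v for k, v in subnet.items() if k != "start"}
            ranges = {}
            for index, (start, params) in enumerate(subnet.get("start", {}).items()):
                # Range names are an assumption made for this sketch
                ranges[str(index)] = {"start": start, "stop": params["stop"]}
            if ranges:
                new_subnet["range"] = ranges
            new_net["subnet"][prefix] = new_subnet
        new["pool"][net_name] = new_net
    return new


# Hypothetical old-style config: "stop" is nested under the "start" value
old_config = {
    "shared-network-name": {
        "LAN": {
            "subnet": {
                "192.0.2.0/24": {
                    "default-router": "192.0.2.1",
                    "start": {"192.0.2.100": {"stop": "192.0.2.200"}},
                }
            }
        }
    }
}

print(migrate_dhcp(old_config))
```

With start and stop as siblings under a named range, changing one of them no longer means deleting and recreating the other.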

Wireguard support

Thanks to our contributor who goes by hagbard, VyOS now supports wireguard. The work on it is nearly complete, and will be covered in a separate post.

TFTP server support

Thanks to Christian Poessinger, VyOS now has a TFTP server. It was a frequently requested feature, and I think it makes sense for people who keep DHCP on the router and do not want to set up another machine for provisioning phones, thin clients and so on.

This is an example of a TFTP server config with all options set:

service {
 tftp-server {
     allow-upload
     directory /config/tftp
     listen-address 192.0.2.10
     port 69
 }
}

DMVPN works again

Thanks to our contributor Runar Borge, we have identified the cause and fixed the issues that broke DMVPN after upgrading to the latest upstream StrongSWAN. It should now work as expected.

L2TP/IPsec works again

One of the blockers introduced by the upgrade to StrongSWAN 5.6 was broken L2TP/IPsec. We've adjusted the config to use the new syntax and now it works again.

More to come

We are actively working on getting the codebase ready for the release candidate. Stay tuned for new updates!

VyOS 1.2.0 development news in July

Despite the slow news season and the RAID incident that luckily slowed us down only for a couple of days, I think we've made good progress in July.

First, Kim Hagen got cloud-init to work, even though it didn't make it to the mainline image, and the WAAgent required for Azure is not working yet. Some more work, and VyOS will get much wider cloud platform support. He's also working on Wireguard integration and it's expected to be merged into current soon.

The new VRRP CLI and IPv6 support is another big change, but it's got its own blog post, so I won't dwell on it and will instead cover things that did not get their own blog posts.

IPsec and VTI

While I regard VTI as the most leaky abstraction ever created and always suggest using honest GRE/IPsec instead, I know many people don't really have any choice because their partners or service providers are using it. In older StrongSWAN versions it used to just work.

Updating StrongSWAN to the latest version had an unforeseen and very unpleasant side effect: VTI tunnels stopped working. A workaround in the form of "install_routes = no" in /etc/strongswan.d/charon.conf was discovered, but it has an equally bad side effect: site to site tunnels stop working when it's applied.

The root cause of the problem is that for VTI tunnels to work, their traffic selectors have to be set to 0.0.0.0/0 for traffic to match the tunnel, even though the actual routing decision is made according to netfilter marks. Unless route insertion is disabled entirely, StrongSWAN thus mistakenly inserts a default route through the VTI peer address, which makes all traffic routed to nowhere.

This is a hard problem without an easy and effective workaround. It's an architectural problem in the new StrongSWAN: according to our investigation of its source code and its developers' responses, there is simply no way to control route insertion per peer. One developer responded to it with "why, site to site and VTI tunnels are never used on the same machine anyway". Yeah, people are reporting bugs just out of curiosity.

While there is no clean solution within StrongSWAN, this definitely has been a blocker for the release candidate. Reimplementing route insertion with an up/down script proved to be a hard problem since there are lots of cases to handle and complete information about the intended SA may not always be available to scripts. Switching to another IKE implementation seems like an attractive option, but needs a serious evaluation of the alternatives and a complete rewrite of the IPsec config scripts, which is planned but will take a while because the legacy scripts are an unmaintainable mess.

I think I've found a workable (even if far from perfect) workaround: instead of inserting missing routes, delete the bad routes. I've made a test setup and it seems to work reasonably well. The obvious issue is that it doesn't prevent bad things from happening, but rather undoes the damage, so there may still be a brief traffic disruption when VTI tunnels go up. Another problem is a possible race condition between StrongSWAN inserting routes and the script deleting them, though I haven't seen it in practice yet and I hope it doesn't exist. But at least you can now use both VTI and site to site tunnels on the same machine.
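The idea of picking out the bad routes can be sketched roughly like this. This is not the actual VyOS script: the find_bad_routes function is hypothetical, the input is assumed to be lines in the style of "ip route show" output, and the addresses are documentation examples. It only shows the selection step (default routes through a VTI peer address), not the deletion itself.

```python
def find_bad_routes(route_lines, vti_peers):
    """Return the route lines that are default routes via a VTI peer address."""
    bad = []
    for line in route_lines:
        fields = line.split()
        # Expected shape (assumption): "default via <gateway> dev <interface> ..."
        if (len(fields) >= 3 and fields[0] == "default"
                and fields[1] == "via" and fields[2] in vti_peers):
            bad.append(line)
    return bad


# Hypothetical sample data
routes = [
    "default via 203.0.113.1 dev eth0 proto static",
    "default via 198.51.100.2 dev eth0 proto static src 192.0.2.1",
    "10.0.0.0/24 via 198.51.100.2 dev eth0 proto static",
]
vti_peers = {"198.51.100.2"}

for route in find_bad_routes(routes, vti_peers):
    print("would delete:", route)
```

Routes to specific prefixes through the same peer are left alone, since those are what site to site tunnels need.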

For people who want to use VTI exclusively, there is now a "set vpn ipsec options disable-route-autoinstall" option that disables route insertion globally, thus removing the possible disruption, at the cost of making site to site tunnels impossible to use. That option is disabled by default.

I hope it will be good enough until we find a better solution. Your testing is needed to confirm that it is!

On "run reset vrrp master"

I've just exorcised a ghost of the old VRRP CLI implementation — the "reset vrrp master" command. I thought it would go away with the vyatta-vrrp package, but in fact it was in vyatta-op. It made me remember that I was going to write about it in the original blog post, but somehow I forgot about it.

Here is why that command was not reimplemented. First, it never worked with preempt to begin with, and with preempt being the default, its usefulness was already limited.

A more serious reason, however, is that it was a rather horrible (even if ingenious) kludge. This is how it worked: first it tried to locate the VRRP group in keepalived.conf, then it would remove it from the config, restart keepalived, insert it again, and restart keepalived again. It sort of worked, but you can see how fragile this approach is. If anything went wrong at any stage, it would leave VRRP in an inconsistent state.

A much cleaner and more general way to do it is to just disable the VRRP group in conf mode (set high-availability vrrp group Foo disable) and commit the change.



New VRRP CLI is here (with IPv6 support)

Ever since I started with Vyatta, I've had a problem with commands for features unrelated to interfaces being defined inside interfaces. I'm sure the person who came up with that arrangement meant well and thought it would be familiar for Cisco and Juniper users, but the more I lived with it, the more I thought it creates more problems than it solves.

From the user perspective, it's hard to easily view the complete configuration of those features. It's also much harder to clone a feature config to another machine. And if you ever want to move some connection to a different NIC, things get even more fun.

For developers, however, it's even worse. First, it means commands for those features need to be duplicated for every interface type, which makes adding new interfaces much harder. Second, configuration scripts end up more complex due to paths that can be nested quite deeply. Third, with the current config backend specifically, lack of nested end nodes can lead to very interesting tracking of the state to avoid repeated service restarts.

Until recently there was a token excuse for leaving unfortunate UI decisions alone: the difficulty of writing migration scripts. Luckily, it's no longer the case, so we can start cleaning it up. OK, it is still hard and you need to take care of many details, but at least you are not wrestling with a library that is simply inadequate for the task. Now we can go on a quest to remove excessive nesting and redesign the UI in an easier to use, more logical fashion.

VRRP looked like a good feature to start the cleanup with: we need to get IPv6 VRRP support to work in the end, its scripts have accumulated quite some cruft, and, well, it really has nothing to do with interface settings since it's a protocol of its own implemented by a userspace daemon.

Today I've rolled out the new implementation and it is already in the latest rolling release image, ready for your testing. Let's walk through the changes.

Making first boot scripts just got easier (but building vyos-1x got a bit harder)

As you probably know already, we are working on integrating cloud-init into VyOS, which will allow us to support multiple cloud platforms and get rid of the custom script for EC2. The hard part of this project is that just allowing cloud-init to do what it normally does in Debian would not produce the desired results: we need to make it modify the config.

This raises the question of when this should occur and how it should be done. Since modifying the running config with scripts has its difficulties in the current backend, and even if it didn't, it could still potentially clash with the user's commits, we thought we may want to modify the config.boot file before it's loaded instead.

One advantage is that once we have common functionality implemented, it can be reused not only in cloud-init, but also in the installer, and in custom first boot scripts if someone wants them.

To test this concept, I've added a library named vyos.initialsetup that includes a collection of functions for common settings such as user passwords and keys, host name, default route, name servers, and interface addresses.

Here's an example of a script you can run on your system for demonstration (adjust user name and do ssh-keygen if necessary):

#!/usr/bin/env python3

import vyos.configtree as vct
import vyos.initialsetup as vis

with open('/opt/vyatta/etc/config.boot.default') as f:
    config_string = f.read()

with open('/home/dmbaturin/.ssh/id_rsa.pub') as f:
    key_string = f.read()

config = vct.ConfigTree(config_string)

vis.set_user_password(config, 'vyos', 'qwerty')
vis.set_user_ssh_key(config, 'vyos', key_string)

# Default level is admin
vis.create_user(config, 'dmbaturin', password=None, key=key_string)

# Default type is ethernet
vis.set_interface_address(config, 'eth0', '192.0.2.10/24')

vis.set_default_gateway(config, '192.0.2.1')

vis.set_name_servers(config, ['203.0.113.10', '203.0.113.20'])

vis.set_host_name(config, 'vyos-test')

print(str(config))

The script will print a customized config based on the default config.

Building vyos-1x

This is the good thing. The bad, or rather somewhat inconvenient, thing is that the vyos-1x package build now depends on the libvyosconfig0 package that provides the library behind the vyos.configtree module, and it's essential for running unit tests for those modules.

You should add the "deb http://dev.packages.vyos.net/repositories/current/vyos/ current main" repository to the sources.list on your build machine and install libvyosconfig0 with APT, or simply take the file from the repo and install it by hand with dpkg.

I hope the increased reliability we gain from those unit tests outweighs the inconvenience of additional setup.


Infrastructure failure resolved, downloads.vyos.io is back now

We have managed to resolve the infrastructure problems and bring the affected machines back intact. Now downloads.vyos.io and ci.vyos.net are back online, and our build hosts with all the build dependencies set up and uncommitted code are back online too, so luckily we can resume the work from the point where it was stopped by the RAID issue.

We are not ready to give a complete post mortem on the actual issue yet (may not even be able to at all) since we have limited data and the service provider support was not exactly helpful. What we can say now is that what was thought to be RAID5 actually was a RAID1 with an additional hot spare drive, and the failure mode included a RAID controller glitch apart from drive failure. The hot spare drive is what allowed us to bring the data back intact.

While this issue was resolved without any data loss, it definitely prompts us to reconsider many things about our infrastructure, including backup strategy, deployment mechanisms, and service provider choice.

We are going to write new posts as we decide upon the options and roll out improvements. Infrastructure deployment is also a good area for contributions, and some people already offered help.

Infrastructure failure

Hi everyone,

We've had a catastrophic failure on one of our hosts: two drives in a RAID5 failed simultaneously. A number of VMs related to the development infrastructure are permanently lost and need to be restored since we didn't have backups of them.

The only piece of public infrastructure affected by it is downloads.vyos.io. You can use the old packages.vyos.net server or one of the mirrors to download the stable (1.1.8) or historical VyOS releases.

The other pieces of infrastructure that need to be restored are the Jenkins server, the build machines, and the repositories server. We'll have to restore them before we can continue development. I hope we'll get it done by Monday.

Building VyOS images with custom packages just got simpler

The new build scripts, first introduced when we migrated the development branch to jessie, made things much simpler for developers and for people who just want to build the latest VyOS image from source. However, building an image with even a single extra package, one available from the Debian Jessie repos but not present in the default VyOS package set, was still quite an ordeal for anyone not familiar with live-build and the structure of our build scripts.

Well, until now. Yesterday I added ./configure script options that should allow everyone to build a custom image without ever touching the plumbing of the build scripts.

The simplest example, building an image with packages available in Debian Jessie:

./configure --custom-packages "bsdgames robotfindskitten"
sudo make iso

A more interesting example, adding a package from a third party repo signed with its own key. In this case, salt-minion:

wget https://repo.saltstack.com/apt/debian/8/amd64/2017.7/SALTSTACK-GPG-KEY.pub
./configure --custom-apt-entry "deb http://repo.saltstack.com/apt/debian/8/amd64/2017.7 jessie main" --custom-packages "salt-minion" --custom-apt-key ./SALTSTACK-GPG-KEY.pub
sudo make iso

Of course it doesn't guarantee that your image will build or work, but at least it will get you to the debugging phase faster.

Versions mystery revealed

Over the last few years, I have seen so many combinations and theories around VyOS, Vyatta, and EdgeOS that I decided to shed some light on this and also explain current and future VyOS versions.


First was Vyatta

You can read the detailed history here at Wikipedia; I will just say that it all started back in 2006 (12 years ago!), and with release v6.0 in 2010, Vyatta Community was renamed to Vyatta Core to highlight the fact that the Vyatta Subscription was no longer a separate product, but the open source version extended with proprietary add-ons. In retrospect, this is often seen as the beginning of the end. At the time the future of the open source Vyatta didn’t look as bleak though, and while since 6.0 Vyatta officially became “open core” (aka “freemium”), that release incorporated features formerly available only in the proprietary version, along with significant new features like image-based upgrade, and it seemed like a sustainable model for a time.


In 2011 Ubiquiti Networks launched their EdgeMax line of products. EdgeOS™ is the essential part of the product line and is a fork of Vyatta Core 6.3 that runs exclusively on Cavium-based hardware produced by Ubiquiti Networks (EdgeRouter Lite, PoE, Pro). Since then they have migrated to Debian 7 and replaced Quagga with the proprietary ZebOS™.


In 2012 Vyatta was acquired by Brocade, and in April 2013 they renamed Vyatta Subscription Edition (VSE) to the Brocade Vyatta 5400 vRouter (and later also 5600); those are no longer open source. By that time all community resources were wiped, and the future of Vyatta Core was obvious.


In late 2013 Daniil initiated a fork of Vyatta Core 6.6 under the VyOS name. That was the start of development of the so-called old stable 1.1.x series, which is based on Debian 6 (patched all the way). The latest version is 1.1.8, and it is now in extended support.



In the fall of 2015, we began an upgrade of the VyOS project in many senses. That is when 1.2.x started (with the decision to skip Debian 7 and go directly to Debian 8, which includes migration to systemd among other things). Enhancements and new development are happening in the rolling version that we introduced some time ago.


Now that you know the background, it’s a good time to describe the details of the new VyOS release model.


Rolling

You can already download binary images of the rolling release of VyOS 1.2.x here, which are snapshots of the current development state. All new features, refactoring of old code, improvements of the OS package base go there before they can become a part of a release candidate or a stable release.

Rolling release builds may break occasionally, and new features may not work as expected. They are meant for contributors and networking enthusiasts who are willing to help us test those features.


Release Candidate (RC)

About every six months, we will start a new release cycle. The cycle will begin with branching off the rolling release and preparing a release candidate. Release candidates are neither code frozen nor feature frozen: as we fix bugs or design issues, or as new features in rolling become stable and well tested, we may prepare a new RC that incorporates them.

Release candidates are supposed to be stable enough for non-critical production, but if you prefer stability over new features, you may want to stick with EPA or LTS.


Early Production Access (EPA)

After the RC cycle is over and the release is sufficiently stable, it will become the basis for the next LTS release.

EPA is for customers and community members who want new features before they are available in the LTS, and are willing to help us weed out last bugs in those new features.


Long Term Support (LTS)

LTS versions of VyOS will receive security patches and high priority fixes from rolling releases.

They are meant for enterprise and service provider users who value stability over everything.



We’d like to encourage everyone to install the least stable version to help us with testing.

As a bonus:

VyOS to Debian version map



All product names, logos, and brands are property of their respective owners.