Change is coming to VyOS project

People often ask us the same questions, such as if we know about Debian 6 EOL, or when 1.2.0 will be released, or when this or that feature will be implemented. The short answer, for all of those: it depends on you. Yes, you, the VyOS users.

Here’s what it takes to run an open source project of this type. There are multiple tasks, and they all have to be done:

  • Emergency fixes and security patches

  • Routine bug fixes, cleanups, and refactoring

  • Development of new features

  • Documentation writing

  • Testing (including writing automated tests)


All those tasks need hands (ideally, connected to a brain). Emergency bug fixes and security patches needs a team of committed people who can do this job on a short notice, which is attainable in two ways, either there are people for whom it’s their primary job, or the team of committed people is large enough to have people with spare time at any given moment.

Cleanups and refactoring are also things that need a team of committed people because those are things that no one benefits from in a short run, it’s about making life easier for contributors and improving the sustainability of the project, keeping it from becoming an unmanageable mess. Development of new features needs people who are personally interested in those features and have the expertise to integrate them in a right way. It’s perfect if they also maintain their code, but if they simply hand documented and maintainable code to the maintainers team, that’s good enough.

Now, the sad truth is that VyOS Project has none of those. The commitment to using it among its users greatly exceeds the commitment to contributing to it. While we don’t know for certain how many people are using VyOS, we have at least some data. At the moment, there are 600 users of the official AMI on AWS. There were 11k+ users last month on user guide page and it’s constantly growing since the time when I took up the role of the community manager of the VyOS project. We are also aware about companies that have around 1k VyOS instances and companies that rely on VyOS in their business operations in one way or another. But still, if we talk about consumers vs. contributors, we see 99% consumers vs 1% contributors relation.


My original idea was to raise awareness of the VyOS project by introducing a new website, refreshing the forum look, activating social media channels and introducing modern collaboration tools to make participation in the project easier, open new ways how users and companies can participate and contribute. Finally bigger user base means there’s a larger pool of people and companies who can contribute to the project. We also launched commercial support with idea that if companies that using VyOS for their businesses can’t or just don’t want to participate in the project directly, the may be willing to support the project by purchasing support subscriptions.




10 months later I can admit that I was partially wrong in my thoughts. While consumer user base growing rapidly, i just can’t tell the same about contributors and this is a pity. Sure, we got a few new contributors, some of them contribute occasionally, other are more active, and some old contributors are back (Thank you guys for joining/re-joining VyOS!). We are also working with several companies that are showing interest in VyOS as a platform and contribute to the project in commercial means and via human resources, and that is great, however, it’s not enough at this scale.


At this point, I started thinking that current situation is not something that can be considered as fair and not really make sense.


This are just some of questions that came to my mind frequently:


  • Why those who not contributing literally nothing to the project, getting the same as others who spend their time and resources?

  • Why companies like ALP group using VyOS in their business and claiming publicly that they will return improvements to upstream when they are not actually returning anything? Why do some people think that they can come to IRC/Chat and demand something without contributing anything?

  • Why are those cloud providers that using VyOS for their businesses not bothering to support the project in any way?


I would like to remind you of the basic principles of the VyOS philosophy established from its start:


VyOS is a community driven project!

VyOS always will be an open source project!

VyOS will not have any commercial or any other special versions!


However, if we all want VyOS to be a great project, we all need to adhere to those principles, otherwise, nothing will happen. Community driven means that the driving force behind improvements should be those interested in them. Open source means we can’t license a proprietary component from a third party if existing open source software does not provide the feature you need. Finally, free for everyone means we all share responsibility for the success or failure of the project.


I’m happy and proud to be part of VyOS community and I really consider as my duty to help the project and the community grow. I’m doing what I can, and I expect that if you also care about the project, you will participate too.


We all can contribute to the project, no matter if you are developer or network engineer or neither of this.


There are many tasks that can be done by individuals with zero programming involved:


  • Documentation (documenting new features, improving existing wiki pages, or rewriting old documentation for Vyatta Core)

  • Support community in forums/IRC/chat (we have English and localized forums, and you can request a channel in your native language like our Japanese community did)

  • Feature requests (well described use cases from which can benefit all our community: note that a good feature request should make it easier for developers to implement it, just saying you want MPLS is not quite the same as researching existing open source implementations, trying them out, making sample configs contributors with coding skills can use for the reference, drafting CLI design and so on!)

  • Testing & Bug reports

  • Design discussions, such as those in the VyOS 2.0 development digest


If you work at the company that uses VyOS for business needs please consider  talking with CEO/CTO about:

  • Providing full/part time worker(s) to accomplish tasks listed above

  • Provide paid accounts in common clouds for development needs

  • Provide HW and license for laboratory (we need quite a lot of HW to support all of the hypervisors, same is true about licenses for interworking testing)

  • Buy commercial support & services



In January, we’d like to have a meeting with all current contributors to discuss what we can do to increase participation in the project.

Meanwhile, I would like to ask you to share this blog post to all whom it may concern.

All of the VyOS users (especially those companies that use VyOS in their business) should be aware that is time to start participate in the project if you want to keep using VyOS and rely on it in the future.


Brace yourself.

Change is coming!

VyOS 2.0 development digest #5: doing 1.2.x and 2.0 development in parallel

There was a rather heated discussion about the 1.2.0 situation on the channel, and valid points were definitely expressed: while 2.0 is being written, 1.2.0 can't benefit from any of that work, and it's sad. We do need a working VyOS in any case, and we can't just stop doing anything about it and only work on 2.0. My original plan was to put 1.2.0 in maintenance mode once it's stabilized, but it would mean no updates at all for anyone, other than bugfixes. To make things worse, some things do need if not rewrite, but at least very deep refactoring bordering on rewrite just to make them work again, due to deep changes in the configs of e.g. StrongSWAN.

There are three main issues with reusing the old code,  as I already said: it's written in Perl, it mixes config reading and checking with logic, and it can't be tested outside VyOS. The fourth issue is that the logic for generating, writing, and applying configs is not separated in most scripts either so they don't fit the 2.0 model of more transactional commits. The question is if we can do anything about those issues to enable rewriting bits of 1.2.0 in a way that will allow reusing that code in 2.0 when the config backend and base system are ready, and what exactly should we do.

My conclusion so far is that we probably can, with some dirty hacks and extra care. Read on.

The language

I guess by now everyone agrees that Perl is a bad idea. There are few people who know it these days, and there is no justification for knowing it. The language is a minefield that lacks proper error reporting mechanism or means to convey the semantics.

If you are new to it, look at this examples:

All "error reporting" enabled, let's try to divide a string by an integer.

$ perl -e 'use strict; use warnings; print "foobar" / 42'
Argument "foobar" isn't numeric in division (/) at -e line 1.
0

A warning indeed... Didn't prevent program from producing a value though: garbage in, garbage out. And, my long time favorite: analogous issues bit me in real code a number of times!

$ perl -e 'print reverse "dog"'
dog

Even if you know that it has to do with "list context", good luck finding information about default context of this or that function in the docs. In short, if the language of VyOS 1.x wasn't Perl, a lot of bugs would be outright impossible.

Python looks like a good candidate for config scripts: it's strongly typed, the type and object system is fairly expressive, there are nice unit test libraries and template processors and other things, and it's reasonably fast. What I don't like about it and dynamically typed languages in general is that it needs a damn good test coverage because the set of errors it can detect at compile time is limited and a lot of errors make it to runtime, but there are always compromises.

But, we need bindings. VyConf will use sockets and protobuf messages for its API which makes writing bindings for pretty much any language trivial, but in 1.x.x it's more complicated. The C++/Perl library from VyOS backend is not really easy to follow, and not trivial to produce bindings for. However, we have cli-shell-api, which is already used in config scripts written in shell, and behaves as it should. It also produces fairly machine-friendly output, even though its error reporting is rudimantary (then again, error reporting of the C++ and Perl library isn't all that nice either). So, for a proof of concept, I decided to make a thin wrapper around cli-shell-api: later it can be rewritten as a real C++ binding if this approach shows its limitations. It will need some C++ library logic extraction and cleanup to replicate the behaviour (why the C++ library itself links against Perl interpreter library? Did you know it also links to specific version of the apt-pkg library that was never meant for end users and made no promise of API stability, for its version comparison function that it uses for soring names of nodes like eth0? That's another story though).

Anyway, I need to add the Python library to the vyatta-cfg package which I'll do soon, for the time being you can put the file to your VyOS (it works in 1.1.7 with python2.6) and play with it:  

Right now it exposes just a handful of functions: exists(), return_value(), return_values(), and list_nodes(). It also has is_leaf/is_tag/is_multi functions that it uses internally to produce somewhat better error reporting, though they are unnecessary in config scripts, since you already know that about nodes from templates. Those four functions are enough to write a config script for something like squid, dnsmasq, openvpn, or anything else that can reload its config on its own. It's programs that need fancy update logic that really need exists_orig or return_effective_value. Incidentally, a lot of components that need that rewrite to repair or could seriously benefit from serious overhaul are like that: for example. iptables is handled by manipulating individual rules right now even though iptables-restore is atomic, likewise openvpn is now managed by passing it the config in command line options while it's perfectly capable of reloading its config and this would make tunnel restarts a lot less disruptive, and strongswan, the holder of the least maintainable config script, is indeed capable of live reload too.

Which brings us to the next part...

The conventions

To avoid having to do two rewrites of the same code instead of just one, we need to make sure that at least substantial part of the code from VyOS 1.2.x can be reused in 2.0. For this we need to setup a set of conventions. I suggest the following, and let's discuss it.

Language version

Python 3 SHALL be used.

Rationale: well, how much longer can we all keep 2.x alive if 3.0 is just a cleaner and nicer implementation?

Coding standard

No single function shall SHOULD be longer than 100 lines.

Rationale: https://github.com/vyos/vyatta-cfg-vpn/blob/current/scripts/vpn-config.pl#L449-L1134 ;)

Logic separation and testability

This is the most important part. To be able to reuse anything, we need to separate assumptions about the environment from the core logic. To be able to test it in isolation and make sure most of the bugs are caught on developer workstations rather than test routers, we need to avoid dependendies on the global state whenever possible. Also, to fit the transactional commit model of VyOS 2.0 later, we need to keep consistency checking, generating configs, and restarting services separated.

For this, I suggest that we config scripts follow this blueprint:


def get_config():
    foo = vyos.config.return_value("foo bar")
    bar = vyos.config.return_value("baz quux")
    return {"foo": foo, "bar": bar} # Could be an object depending on size and complexity...

def verify(config):
    result do_some_checks(config)
    if checks_succees(result):
        return None
    else:
        raise ScaryException("Some error")

def generate(config):
    write_config_files(config)

def apply(config):
    restart_some_services(config)

if __name__ == '__main__':
    try:
       c = get_config()
       verify(c)
       generate(c)
       apply(c)
    except ScaryException:
        exit_gracefully()

This way the function that process the config can be tested outside of VyOS by creating the same stucture as get_config() would create by hand (or from file) and passing it as an argument. Likewise, in 2.0 we can call its verify(), update(), and apply() functions separately.

Let me know what you think.

VyOS 2.0 development digest #4: simple tasks for beginners, and the reasons to learn (and use) OCaml

Look, I lied again. This post is still not about the config and reference tree internals. People in the chat and elsewhere started showing some interest in learning OCaml and contributing to VyConf, and Hiroyuki Satou even already made a small pull request (it's regarding build docs rather than the code itself, but that's a good start and will help people with setting up the environment), so I decided to make a post to explain some details and address common concerns.

The easy tasks

There are a couple of tasks that can be done completely by analogy, so they are good for getting familiar with the code and making sure your build environment actually works.

The first one is about new properties of config tree nodes, "inactive" and "ephemeral", that will be used for JunOS-like activate/deactivate functionality, and for nodes that are meant to be temporary and won't make it to the saved config respectively.

The other one is about adding "hidden" and "secret" properties to the command definition schema and the reference tree, "hidden" is meant for hiding commands from completion (for feature toggles, or easter eggs ;), and "secret" is meant to hide sensitive data from unpriviliged users or for making public pastes.

Make sure you reference the task number in your commit description, as in "T225: add this and that" so that phabricator can automatically connect the commits with the task.

If you want to take them up and need any assistance, feel free to ask me in phabricator or elsewhere.

VyOS 2.0 development digest #3: questions for you, vyconf daemon config format, appliance directory structure, and external validators lookup

Ok, I changed my mind: before we jump into the abyss of datastructures and look how the config and reference trees work, I'll describe the changes I made in the last few days.

Also, I'd like to make it clear that if you don't respond to design questions I state in these posts, I, or whoever takes up the task, will just do what they think is right. ;)

I guess I'll start with the questions this time. First, in the comments to the first post, the person who goes by kglkgvkvsd544 suggested two features: commit-confirm by default, and an alternative solution to the partial commit where instead of loading the failsafe config, the system loads a previous revision instead, in the same way as commit-confirm does. I don't think any of those should be the only available option, but having them as configurable options may be a good idea. Let me know what you think about it.

Another very important question: we need to decide on the wire protocol that vyconf daemon will use for communication with its clients (the CLI tool, the interactive shell, and the HTTP bridge). I created a task for it, https://phabricator.vyos.net/T216, let me know what you think about it. I'm inclined towards protobuf myself.

Now to the recent changes.

vyconfd config

As I already said, VyConf is supposed to run as a daemon and keep the running config tree, the reference tree, and the user sessions in memory. Obviously, it needs to know where to get that data. It also needs to know where to write logs, where the PID file and socket file should be, and other things daemons are supposed to know. Here's what the vyconf config for VyOS may look like:

[appliance]
name = "VyOS"

data_dir = "/usr/share/vyos/"
program_dir = "/usr/libexec/vyos"
config_dir = "/etc/vyos"

# paths relative to config_dir
primary_config = "config.boot"
fallback_config = "config.failsafe"

[vyconf]
socket = "/var/run/vyconfd.sock"
pid_file = "/var/run/vyconfd.pid"
log_file = "/var/log/vyconfd.log"
That INI-like language is called TOML, it's pretty well specified and capable of some fancy stuff like arrays and hashes, apart from simple (string, string) pairs. The best part is that there are libraries for parsing it for many languages, and the one for OCaml is particularly nice and idiomatic (it uses lenses and option types to access a nested and possibly underdefined datastructure in a handy and typesafe way), like:
let pid_file = TomlLenses.(get conf (key "vyconf" |-- table |-- key "pid_file" |-- string)) in
match pid_file with
| Some f -> f
| None -> "/var/run/vyconfd.pid"
The config format and this example reflects some decisions. First, the directory structure if more FHS-friendly. What do we need /opt/vyos for, if FHS already has directories meant for exactly what we need: architecture-independent data (/usr/share), programs called by other programs and not directly by users (/usr/libexec), and config files (/etc)?
Second, all important parameters are configurable. The primary motivation for this is making VyConf usable for every appliance developer (most appliances will not be called VyOS for certain), but for ourselves and for every contributor to VyOS it's a reminder to avoid hardcoded paths anywhere: if changing it is just one line edit away, a hardcoded path is an especially bad idea.

Here's the complete directory structure I envision:
$data_dir/
  interfaces/ # interface definitions
  components/ # Component definitions
$program_dir/
  components/  # Scripts/programs that verify the appliance config, generate actual configs, and apply them
  migration/       # Migration scripts (convert appliance config if syntax changes)
  validators/       # Value validators
$config_dir/
  scripts/              # User scripts
  post-config.d/   # Post-config hooks, like vyatta-postconfig-bootup.script
  pre-commit.d/   # Pre-commit hooks
  post-commit.d/ # Post-commit hooks
  archive/             # Commit archive
This is not an exhaustive list of directories an appliance can have of course, it's just directories that have any significance for VyConf. I'm also wondering if we should introduce post-save hooks for those who want to do something beyond the built-in commit archive functionality.

External validators

As you remember, my idea is to get rid of the inflexible system of built-in "types" and make regex the only built-in constraint type, and use external validators for everything else.

External validators will be stored in the $program_dir/validators. Since many packages use the same types of values (IP addresses is a common example), and in VyOS 1.x we already have quite a lot of templates that reference the same validation scripts, making it a separate entity will simplify reusing them, since it's easy to see what validators exist, and you can also be sure that they behave like you expect (or if they don't, it's a bug).
Validator executable take two arguments, first argument is constraint string, and the second is the value to be validated (e.g. range "1-99" "42").The expected behaviour is to return 0 if the value is valid and a non-zero exit code otherwise. Validators should not produce any output, instead the user will get the message defined in the constraintError tag of the interface definition (this approach is more flexible since different things may want to use different messages, e.g. to clarify why exactly the value is not valid).

That's all for today. Feel free to comment and ask questions. The next post really will be about the config tree and the way set commands work, stay tuned.

VyOS 2.0 development digest #2

In the previous post we talked about the reasons for rewrite, design and implementation issues, and basic ideas. Now it's time to get to details. Today we'll mostly talk about command definitions, or rather interface definitions, since set commands is just one way to access the configuration interface.

Let's review the VyConf architecture (I included a few things in the diagram that we haven't discussed yet, ignore them for now):

At startup, VyConf will load the main config (or the fallback config, if that fails). But to know whether the config is valid, and to know what programs to call to actually configure the target applications, it needs additional data. We'll call that data "interface definitions" since it defines the configuration interface. Specifically, it defines:

  1. What config nodes (paths) are allowed (e.g. "interfaces ethernet", or "protocols ospf")
  2. What values are valid for that nodes (e.g. any IPv4 or IPv6 address for "system name-server")
  3. What script/program should be called when this or that part of the config is changed

The old way

Before we get to the new way, let's review the old way, the way it's one in the current VyOS implementation. In the current VyOS, those definitions are called "templates", no one remembers why.

This is a typical template file:
vyos@vyos# cat /opt/vyatta/share/vyatta-cfg/templates/interfaces/ethernet/node.tag/speed/node.def 
type: txt
help: Link speed
default: "auto"
syntax:expression: $VAR(@) in "auto", "10", "100", "1000", "2500", "10000"; "Speed must be auto, 10, 100, 1000, 2500, or 10000"
allowed: echo auto 10 100 1000 2500 10000

commit:expression: exec "\
	/opt/vyatta/sbin/vyatta-interfaces.pl --dev=$VAR(../@) \
	--check-speed $VAR(@) $VAR(../duplex/@)"

update: if [ ! -f /tmp/speed-duplex.$VAR(../@) ]; then
	   /opt/vyatta/sbin/vyatta-interfaces.pl --dev=$VAR(../@) \
	   	--speed-duplex $VAR(@) $VAR(../duplex/@)
	   touch /tmp/speed-duplex.$VAR(../@)
	fi

val_help: auto; Auto negotiation (default)
val_help: 10; 10 Mbit/sec
val_help: 100; 100 Mbit/sec
val_help: 1000; 1 Gbit/sec
val_help: 2500; 2.5 Gbit/sec
val_help: 10000; 10 Gbit/sec

We can spot a few issues with it already. First, the set of definitions is a huge directory tree where each directory represents a config node, e.g. "interfaces ethernet address" will be "interfaces/ethernet/node.tag/address/node.def". This makes them hard to navigate, you need to read a lot of small files to get the whole picture, and you need to do a lot of file/directory hopping to edit them. Now, try mass editing, or checking for mistakes before release...

Next, they use a custom syntax, which needs custom lexer and parser, and custom documentation (in practice, the source is your best bet, though I did write something of a syntax reference). No effortless parsing, no effortless analysis or transformation either.

Next, the value checking has its peculiarities. There is concept of type (it can be txt, u32, ipv4, ipv4net, ipv6, ipv6net, and macaddr), and there is "syntax:expression:" thing which partially duplicate each other. The types are hardcoded in the config backend and cannot be added without modifying it, even though they only define validation procedure and are not used in any other way. The "syntax:expression:" can be either "foo in bar baz quux", or "pattern $regex", or "exec $externalScript".

But, the "original sin" of those files is that they allow embedded shell scripts, as you can see. Mixing data with logic is rarely a good idea, and in this case it's especially annoying because a) you cannot test such code other than on a live system b) you have to read every single file in a package to get the complete picture, since any of them may have embedded shell.

Now to the new way.

Why services and not donations

This was originally intended as a reply to a comment in the VyOS 2.0 digest #1 post, but that would be a rather long reply to post in comments I guess. This question is brought up once every few months perhaps, but I never got to publishing my own position on it.

Dave Smarts asked there:

>Where / how do we donate? Is there an obvious personal vs. corp. donation system?

Contributions and commitment to development are best donations

Myself I think the best thing individuals can do for the project is contribute to the design and the code. For 2.0, we are especially in need of commited people who can join the core team, so we can work together and make it happen (given the amount of design work and CS heavy lifting it needs). The foundation needs to be at least somewhat complete before "casual" contributors can add small improvement, in the initial stage every developer needs to know the full context.

While the jessie migration work is more friendly to occasional contributors who can add a fix or two when they find a bug, it also needs a lot of dedicated and unrewarding work on cleaning up the mess we inherited and removing the assumptions that are no longer true in jessie.

While we are at it, there's an area that needs contributions especially badly: documentation. If you've got a minute, come to wiki.vyos.net and document some feature. A stab page with some config examples is better than no page at all. If you don't know what to write about, you can use the old Vyatta Core docs as a reference (just don't copy anything verbatim from there! They are not under an open source license, that would be a copyright violation).

Donations create legal issues and paperwork

To be fair, I have no idea how donations must be handled, in any country. It's only any obvious if we start a non-profit "VyOS Foundation", even then starting a non-profit is quite an undertake that comes with loads of paperwork and expenses. Remember PDPC (the foundation set up by Freenode)? It had to be dissolved because donations to Freenode couldn't cover the expenses of running it.

Additionally, whether legally required or not, I believe anyone who receives donations is morally required to publish a complete financial report and disclose what they received and what they spent it on. Non-profits are legally required to do that, in any case people who donated have the right to know what happened to their donations.

Donations create a moral problem

I'll be using Wikipedia as a model project that relies on donations.

For them, donations are unquestionably morally correct because the primary purpose of donations is to stay online, and what it takes to stay online is obvious (servers, hosting, bandwidth...). MediaWiki software (for the most part) and the content of Wikipedia are community effort, and people aren't donating towards better content, they are donating towards keeping the servers online, and as long as they are online, their expectations are met. Note that anything they do with cash left after the hosting bills is often criticized, such as the salaries of the paid foundation executives, and their research projects. But, at least the funds spent on hosting are unquestinably spent the intended way.

Donations towards development, however, are a lot like the part Wikimedia Foundation is criticized for. Donations create expectations (justifiably!), but in this case the expectations are left implicit, and open to interpretations. What exactly constitutes spending money towards development? If the donation is exactly to the VyOS project (even assuming VyOS Foundation did exist), who is eligible to receive shares of it? Is programming the only appropriate activity, or it's right to hire a graphics designer to improve the website? Myself, breaking the expectations of people who donate is the last thing I want, but without knowing their expectations it's impossible to avoid.

The other issue is common for donations and crowdfunding. What's our responsibility if we fail to do something in time, or at all?

For a small project, donations from individuals are not financially viable

VyOS is a small project, whether we like it or not. Making overly optimistic estimates, let's say we have 10 000 users, and if every hundredth user donates $10 a year, we are left with $1000. Just enough to pay the bills of one particularly modest maintainer for a month.

Why services are better

  • For companies, authorizing service purchase is normally a lot easier than authorizing a donation
  • You get real value for your money, rather than a vague promise
  • Since the obligations of each party are well defined, expectations will not be broken accidentally

What to do

If you want to contribute to VyOS as an individual, join the development. If you want to contribute financially, try to persuade your company to buy support from Sentrium: at the very least is will help me (dmbaturin) pay my bills, and if we get enough customers, build a team of fulltime (or at least part time) maintainers.

As an alternative way to get guaranteed developer time, if your company is using VyOS and you have people who already contributed to it or want to, consider allocating some of their paid time for it, we will be more than grateful if you do.

VyOS 2.0 development digest #1

I keep talking about the future VyOS 2.0 and how we all should be doing it, but I guess my biggest mistake is not being public enough, and not being structured enough.

In the early days of VyOS, I used to post development updates, which no one would read or comment upon, so I gave up on it. Now that I think of it, I shouldn't have expected much as the size of the community was very small at the time, and there were hardly many people to read it in the first place, even though it was a critical time for the project, and input from the readers would have been very valuable.

Well, this is a critical time for the project too, and we need your input and your contributions more than ever, so I need to get to fixing my mistakes and try to make it easy for everyone to see what's going on and what we need help with.

Getting a steady stream of contributions is a very important goal. While the commercial support thing we are doing may let the maintainers focus on VyOS and ensure that things like security fixes and release builds get guaranteed attention in time, without occasional contributors who add things they personally need (while maintainers may not, I think myself I'm using maybe 30% of all VyOS features any often) the project will never realize its full potential, and may go stale.

But to make the project easy to manage and easy to contribute to, we need to solve multiple hard problems. It can be hard to get oneself to do things that promise no immediate returns, but if you looks at it the other way, we have a chance to build a system of our dreams together. As of 1.1.x and 1.2.x (the jessie branch), we'll figure it out how to maintain it until we solve those problems, but that's for another post. Right now we are talking about VyOS 2.0, which gets to be a cleanroom rewrite.

Why VyOS isn't as good as it could be, and can't be improved

I considered using "Why VyOS sucks" to catch reader's attention. It's a harsh word, and it may not be all that true, given that VyOS in its current state is way ahead of many other systems that don't even have system-wide config consistency checks, or revisions, or safe upgrades, but there are multiple problems that are so fundamental that they are impossible to fix without rewriting at least a very large part of the code.

I'll state the design problems that cannot be fixed in the current system. They affect both end users and contributors, sometimes indirectly, but very seriously.

Design problem #1: partial commits

You've seen it. You commit, there's an error somewhere, and one part of the config is applied, while the other isn't. Most of the time it's just a nuisance, you fix the issue and commit again, but if you, say, change interface address and firewall rule that is supposed to allow SSH to it, you can get locked out of your system.

The worst case, however, is when commit fails at boot. While it's good to have SSH at least, debugging it can be very frustrating, when something doesn't work, and you have no idea why, until you inspect the running config and see that something is simply missing (if you run into it in VyOS 1.x, do "load /config/config.boot" and commit, this will either work or show you why it failed). It's made worse by lack of notifications about config load failure for remote users, you can only see that error on the console.

The feature that can't be implemented due to it is what goes by "commit check" in JunOS. You can't test if your configuration will apply cleanly without actually commiting it.

It's because in the scripts, the logic for consistency checking and generating real configs (and sometimes applying them too) is mixed together. Regardless of the backend issues, every script needs to be taken apart and rewritten to separate that logic. We'll talk more about it later.

Design problem #2: read and write operations disparity

Config reads and writes are implemented in completely different ways. There is no easy programmatic API for modifying the config, and it's very hard to implement because binaries that do it rely on specific environment setup. Not impossible, but very hard to do right, and to maintain afterwards.

This blocks many things: network API and thus an easy to implement GUI, modifying the config script scripts in sane ways (we do have the script-template which does the trick, kinda, but it could be a lot better).

Design problem #3: internal representation

Now we are getting to really bad stuff. The running config is represented as a directory tree in tmpfs. If you find it hard to believe, browse /opt/vyatta/config/active, e.g. /opt/vyatta/config/active/system/time-zone/node.val

Config levels are directories, and node values are in node.val files. For every config session, a copy of the active directory is made, and mounted together with the original directory in union mount through UnionFS.

There are lots of reasons why it's bad:

  • It relies on behaviour of UnionFS, OverlayFS or another filesystem won't do. We are at mercy of unionfs-fuse developers now, and if they stop maintaining it (and I can see why they may, OverlayFS has many advantages over it), things will get interesting for us
  • It requires watching file ownership and permissions. Scripts that modify the config need to run as vyattacfg group, and if you forget to sg, you end up with a system where no one but you (or root) can make any new commits, until you fix it by hand or reboot
  • It keeps us from implementing role-based access control, since config permissions are tied to UNIX permissions, and we'd have to map it to POSIX ACLs or SELinux and re-create those access rules at boot since the running config dir is populated by loading the config
  • For large configs, it creates a fair amount of system calls and context switches, which may make system run slower than it could

Design problem #3: rollback mechanism

Due to certain details (mostly handling of default values), and the way config scripts work too, rollback cannot be done without reboot. Same issue once made Vyatta developers revert activate/deactivate feature.

It makes confirmed commit a lot less useful than it should be, especially in telecom where routers cannot be rebooted at random even in maintenance windows.

Implementation problem #1: untestable logic

We already discussed it a bit. The logic for reading the config, validating it, and generating application configs is mixed in most of the scripts. It may not look like a big deal, but for the maintainers and contributors it is. It's also amplified by the fact that there is not way to create and manipulate configs separately, the only way you can test anything is to build a complete image, boot it, and painstakingly test everything by hand, or have expect-like tool emulate testing it by hand.

You never know if your changes may possibly work until you get them to a live system. This allows syntax errors in command definitions and compilation errors in scripts to make it into builds, and it make it into a release more than one time when it wasn't immediately apparent and only appread with certain combination of options.

This can be improved a lot by testing components in isolation, but this requires that the code is written in appropriate way. If you write a calculator and start with add(), sub(), mul() etc. functions, and use them in a GUI form, you can test the logic on its own automatically, e.g. does add(2,3) equal 5, and does mul(9, 0) equal 0, does sqrt(-3) raise an exception and so on. But if you embed that logic in button event handlers, you are out of luck. That's how VyOS is for the most part, even if you mock the config subsystem so that config read functions return the test data, you need to redo the script so that every function does exactly one thing testable in isolation.

This is one of the reasons 1.2.0 is taking so long, without tests, or even ability to add them, we don't even know what's not working until we stumble upon it in manual testing.

Implementation problem #2: command definitions

This is a design problem too, but it's not so fundamental. Now we use custom syntax for command definitions (aka "templates"), which have tags such as help: or type: and embedded shell scripts. There are multiple problem with it. For example, it's not so easy to automatically generate at least a command reference from them, and you need a complete live system for that, since part of the templates is autogenerated. The other issue is that right now some components feature very extensive use of embedded shell, and some things are implemented in embedded shell scripts inside templates entirely, which makes testing even harder than it already is.

We could talk about upgrade mechanism too, but I guess I'll leave it for another post. Right now I'd like to talk about proposed solutions, and what's being done already, and what kind of work you can join.

Sentrium Cyber Monday promotion: get some free hands on assistance hours with every VyOS commercial support plan

Hi everyone,

This summer,  Sentrium, an IT consulting company ran by some of long time VyOS developers and users, launched commercial support for VyOS, something that we think will make VyOS more attractive for enterprise and service provider networks and give the VyOS project some funding and ensure its sustainable development and growth.

We also would like to say huge thanks to all of our existing customers for the commitment to the VyOS project and for your trust in us! You are contributing to the VyOS development and we are really happy to see such interest from companies around the world. Thank you!

Based on these few months of experience with VyOS support, we decided to adjust our support plans based on customer feedback You can view the new plans at this web page: https://sentrium.io/vyos-commercial-support/

We tried to cover common use cases which we observing:

  • Small companies and nonprofit organizations with budget constraints
  • Companies with internal expertise which just need a “formal” support contract due to the business requirements.
  • Businesses who need phone support and hands-on assistance with strict SLAs for mission critical applications.

So, at a glance, how our support from vendors:

  • All support plans cover all routers in the company, no matter if this production instance, or you just spin up another VyOS in your lab. You don’t need get to pay more when your network grows, or wait to get support contracts for new routers.

  • Initial configuration and environment review is included in each plan, and this allows us to suggest config improvements and also shorten resolution times (we can go directly to issue, bypassing topology discovery)

  • We only employ people with real networking knowledge and do not do anything to artificially make response time appear shorter than it is (e.g. by sending a form reply). We do not offer guaranteed short response time for all plans, but you can be sure the first reply you get includes some useful information about your problem. For email support, relatively long response time also allows us to do some research about your problem, try a solution in a lab, or consult a maintainer or a contributor, if we cannot offer a useful response offhand.

From Cyber Monday and until end of the year we’ll be running a promotion: everyone who buys a support subscription also gets free hand on assistance hours. The basic plan comes with one free hour, the standard plan comes with four, and the production support plan comes with eight.

We know there are still a number of people running Vyatta Core and who may want to switch to VyOS. Switching to a new system is always a concern, so if you are using Vyatta Core, you can use the hands on assistance hours for migration to VyOS. But with all improvements and, most importantly, security fixes that have been added, and will be added by 1.1.8 and then 1.2.0, upgrade is very important.

If you are running Vyatta Core 6.5 or 6.6, it can be upgraded to VyOS as if it was a new Vyatta version, with a few minor caveats. If your version is 6.4 or earlier, there are some features that may require manual config rewrite, which shouldn’t take too long unless your config is particularly large. We have successfully upgraded versions as old as 6.0 to VyOS without any problems. If, no matter how unlikely, you are still using VC 5.0, things get interesting, since it didn’t have the image upgrade yet, but even that is doable and has been done before (though if you can, we suggest that you reinstall and we can help with migrating the config).