Compiling Quickjoin and file formats

Problems with building qjoin and getting it to read Stockholm files.

Quickjoin / qjoin is an excellent command-line program for rapid construction of neighbour-joining trees. However, while using it recently, I had a few problems getting it to read Stockholm files, the most accessible of the formats it can use.

The initial problem was that qjoin uses the popt library for command-line option parsing. This library is present on most Unix(-like) systems, but if it is not available, qjoin will still compile, just with a much reduced set of options and capabilities. Two of the missing capabilities are reading Stockholm-formatted files and running bootstraps. So, installing popt is advised. However, building it by hand can be a lengthy and intricate procedure, so installing it from a package manager like MacPorts is the easier route.

(In the authors' defence, the popt dependency is made clear in the documentation, although the implications of not having it aren't.)

The next problem is that sometimes the software will refuse to compile. The ./configure command can fail with an error like:

./config.status: line 426: syntax error near unexpected token `}'
./config.status: line 426: `} >&5'

This is not a problem with qjoin, but with the underlying configuration software, which can't handle being in a directory with an apostrophe in its name (e.g. "can't compile qjoin"). Changing the name of the directory (e.g. to "cant compile qjoin") fixed things.

A final issue occurred when it failed to read certain input sequence files, reporting that no sequences were present and:

Assertion failed: (seq), function read_alignment, file alignment.cc, line 234.

The issue here is that qjoin has a maximum allowed line length when reading Stockholm files, and if a line exceeds that length, it is simply skipped. This behaviour (and the length limit) can be changed by editing line 192 in the file alignment.cc:

#define MAX_LINE_LEN 4096

to the value required.
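
To see why a length limit like this ends in "no sequences present", here is a minimal sketch of the effect. It is not qjoin's actual parser: the function count_sequence_lines, the constant kExample and the limits used are all made up for illustration, and the only assumption is the behaviour described above (over-long lines are dropped without warning).

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Illustrative stand-in, NOT qjoin's code: count the sequence lines of a
// Stockholm-style alignment, silently skipping any line that is longer
// than max_line_len (the role MAX_LINE_LEN plays in alignment.cc).
std::size_t count_sequence_lines(const std::string& text,
                                 std::size_t max_line_len) {
    std::istringstream in(text);
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        // Over-long lines are dropped without any warning.
        if (line.size() > max_line_len) continue;
        // Skip blank lines, '#' markup lines and the "//" terminator.
        if (line.empty() || line[0] == '#' || line == "//") continue;
        ++count;
    }
    return count;
}

// A tiny, made-up two-sequence alignment; each sequence line is
// 26 characters long.
const std::string kExample =
    "# STOCKHOLM 1.0\n"
    "seq1  ACGTACGTACGTACGTACGT\n"
    "seq2  ACGTACGAACGTACGTACGT\n"
    "//\n";
```

With a limit of 20, both sequence lines are too long and the function finds no sequences at all. With a limit of 4096 it finds both. A real alignment with long sequences on single lines hits the same wall, just at a bigger number.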

One more thing

More recent versions of gcc have done some header cleanup (see http://gcc.gnu.org/gcc-4.3/porting_to.html), which means that certain functions that used to be included automatically or incidentally no longer are. The result is a world of pain: you have to edit the source to get it to compile, adding explicit includes of standard headers to a number of files. These are all of the form:

#include <foo>

So, profile.cc and profile2.cc need:

#include <cstring>

alignment-test.cc needs:

#include <cstdio>

qjoin.cc needs:

#include <memory>
#include <cstring>

result-tree.cc needs:

#include <algorithm>

matrix.cc needs:

#include <cstdio>
#include <memory>

alignment.cc needs:

#include <memory>
#include <cstring>
#include <cstdio>

(For reasons that aren't clear to me, a different installation of qjoin required only a few of these to be fixed. Subtly different versions of gcc?)
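
As a reminder of what each of these headers provides, here is a small sketch. It is not taken from the qjoin source: longest_name_label is a hypothetical helper, and the particular functions are only examples of what lives in each header (and assume a C++11 or later compiler). Leaving out any of the commented includes may still compile on some setups because another header happens to pull it in, which could be why different installations needed different fixes.

```cpp
#include <algorithm>  // std::sort
#include <cstddef>    // std::size_t
#include <cstdio>     // std::snprintf
#include <cstring>    // std::strlen
#include <memory>     // std::unique_ptr
#include <string>
#include <vector>

// Hypothetical helper (not from qjoin): report the length of the
// longest name as a short text label.
std::string longest_name_label(const std::vector<const char*>& names) {
    std::vector<std::size_t> lengths;
    for (const char* name : names) {
        lengths.push_back(std::strlen(name));       // needs <cstring>
    }
    std::sort(lengths.begin(), lengths.end());      // needs <algorithm>
    const std::size_t longest = lengths.empty() ? 0 : lengths.back();

    std::unique_ptr<char[]> buf(new char[32]);      // needs <memory>
    std::snprintf(buf.get(), 32, "longest=%zu", longest);  // needs <cstdio>
    return std::string(buf.get());
}
```

The point is that each standard function names its own header, so when the compiler complains that something like strlen or sort "was not declared in this scope", the fix is to include the matching header in that file directly.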