Installing Galaxy
This presents one way to create an optimized production Galaxy instance. Variations are certainly possible, and some of the choices presented were dictated by local culture. Certain settings may suit production better than development, or vice versa. Nonetheless, this presents a start-to-finish process for installation and setup.
Note: this is a living document, will change across time, and is occasionally terse or cryptic where I have yet to fill it out.
Assumptions
- Running on Unix: Fedora or a similar system
- Galaxy will be running off a suburl (e.g. http://foobar/galaxy)
- Superuser privileges may be required at some points
- Apache is used as a frontend server.
Prepping environment
Create a user for Galaxy to run as and under. Galaxy will be installed into this user's account:
% /usr/sbin/useradd galaxy
% passwd galaxy
Note that we don't install Galaxy "inside" Apache, as this would expose all of Galaxy (including datasets) to anyone on the web.
Install virtualenv, so we can later create a sandboxed python interpreter:
% yum install python-virtualenv.noarch
See http://virtualenv.openplanning.org
If needed, install mercurial, so the Galaxy repository can be used for installation (and later updating):
% yum install mercurial
Change to the galaxy user and into its home directory. Clone the Galaxy repository:
% hg clone https://bitbucket.org/galaxy/galaxy-dist
This will create galaxy-dist in the home directory.
Create a local sand-boxed Python interpreter for the galaxy user. We'll install all local data in the "local" dir:
% virtualenv --no-site-packages local
Then activate this interpreter, which will modify $PATH so the sand-boxed python is used by galaxy:
% source ./local/bin/activate
Alternatively, an entirely separate Python interpreter could be installed for Galaxy. You could use the system interpreter, but either of these two schemes avoids library collisions or dueling versions.
Check this installation by running Galaxy:
% cd galaxy-dist
% sh run.sh
It may report that numerous "eggs" (Python libraries) are being updated, before saying that the server is starting. At that point, it may also report a network error (socket error 98, "address already in use") if another application is using the default port (8080) or Galaxy is blocked from connecting to it.
Adjusting network settings
Galaxy settings are edited in the file universe_wsgi.ini. Edit the port Galaxy will use:
port = 7070
And the addresses it will listen to. With the default settings, Galaxy only listens to localhost and is not accessible over the network:
host = 0.0.0.0
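The two edits above can also be scripted. The following is a minimal sketch, assuming each setting appears in the file as a `key = value` line, possibly commented out with a leading `#`. The helper name `set_ini_option` and the sample file content are illustrative and not part of Galaxy.

```shell
#!/bin/sh
# set_ini_option FILE KEY VALUE (hypothetical helper, not part of Galaxy)
# Rewrites a "KEY = ..." or "#KEY = ..." line to "KEY = VALUE".
set_ini_option() {
    file=$1 key=$2 value=$3
    # Write to a temporary file first so a failed sed does not truncate the original
    sed "s|^#\{0,1\} *${key} *=.*|${key} = ${value}|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Demonstration on a throwaway file; the sample content is an assumption
# about what the relevant lines look like.
demo_ini=$(mktemp)
printf '%s\n' '[server:main]' '#port = 8080' '#host = 127.0.0.1' > "$demo_ini"

set_ini_option "$demo_ini" port 7070
set_ini_option "$demo_ini" host 0.0.0.0
cat "$demo_ini"
```

Try it on a copy of universe_wsgi.ini first, and compare the result against the original before replacing it.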
Connecting to a real database
Using Postgres, or an equivalent real db, create a database for Galaxy's use:
CREATE DATABASE galaxy_prod;
Create a database user and give it permissions to create tables and so on:
CREATE USER galaxy_prod_user WITH PASSWORD 'foobar';
GRANT ALL PRIVILEGES ON DATABASE galaxy_prod TO galaxy_prod_user;
Edit the database connection in universe_wsgi.ini to give the connection as an SQLAlchemy URI string:
database_connection = postgres://galaxy_prod_user:foobar@123.45.67.89:5432/galaxy_prod
Note that the example given in the Galaxy documentation is wrong, or at least opaque.
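Since the URI layout is easy to get wrong, here is a small sketch that assembles it from its parts in the order user, password, host, port, database name. The helper name `galaxy_db_uri` is hypothetical, and the values are the placeholder ones used above.

```shell
#!/bin/sh
# galaxy_db_uri USER PASSWORD HOST PORT DBNAME (hypothetical helper)
# Prints the parts joined in the layout:
#   postgres://USER:PASSWORD@HOST:PORT/DBNAME
galaxy_db_uri() {
    printf 'postgres://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

galaxy_db_uri galaxy_prod_user foobar 123.45.67.89 5432 galaxy_prod
```

This sketch does no escaping, so a password containing characters such as `@`, `:` or `/` would need to be percent-encoded before being placed in the URI.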
Start up the system and see that it works. There will be an extended period of migrating tables.
Optimizing database use
Many database settings can be tweaked to speed up Galaxy. Some are:
Reduce connection overhead by using only one connection to the database per thread:
database_engine_option_strategy = threadlocal
Large queries or datasets may cause issues, so Postgres should use server-side cursors, which avoids loading whole result sets into memory at once:
database_engine_option_server_side_cursors = True
If plagued by errors of insufficient database pool connections, uncomment and increase these:
#database_engine_option_pool_size = 5
#database_engine_option_max_overflow = 10
Setting up a proxy
Galaxy can run off its own internal webserver, but in production it is far preferable to use a proper server as a proxy. These instructions assume this is Apache and the server is to be accessed at http://fobarbaz.com/galaxy-inst. See https://bitbucket.org/galaxy/galaxy-central/wiki/Config/ApacheProxy and http://docs.uabgrid.uab.edu/wiki/Galaxy#Apache_and_Postgres_Setup
Edit the httpd.conf file:
% vi /etc/httpd/conf/httpd.conf
Rewrite requests on the standard port and the suburl to Galaxy:
<VirtualHost *:80>
    ServerName 158.119.147.41
    RewriteEngine on
    #RewriteLog "/etc/httpd/logs/rewrite_log"
    #RewriteLogLevel 9
    RewriteRule ^/galaxy$ /galaxy/ [R]
    #RewriteRule ^/galaxy/static/style/(.*) /home/galaxy/galaxy_dist/static/june_2007_style/blue/$1 [L]
    #RewriteRule ^/galaxy/static/scripts/(.*) /home/galaxy/galaxy_dist/static/scripts/packed/$1 [L]
    #RewriteRule ^/galaxy/static/(.*) /home/galaxy/galaxy_dist/static/$1 [L]
    #RewriteRule ^/galaxy/favicon.ico /home/galaxy/galaxy_dist/static/favicon.ico [L]
    #RewriteRule ^/galaxy/robots.txt /home/galaxy/galaxy_dist/static/robots.txt [L]
    RewriteRule ^/galaxy(.*) http://localhost:7070$1 [P]
</VirtualHost>
RewriteLog commands can be used to debug the rewrites. See below for other commented out lines.
Restart the apache server:
% /etc/init.d/httpd restart
Ideally you'd like to serve static content (images, style sheets etc.) straight through Apache to take the load off Galaxy. The commented lines above show failed attempts to do so. The error seems to lie in Apache rather than in Galaxy: none of the static content shows up, and the error log shows “(13) permission denied” errors. Things tried to diagnose and correct this:
- Logged the rewrite calls to see they rewrite to the correct paths for the static content
- Checked the proxy-filter and filter-with declarations
- Checked the unix permissions on the static dir
- Tested for non-existent or incorrect paths (which generate a different error)
- Inserted directory declarations to “Allow from all” for the static dir
- Checked for and tried .htaccess files
- Checked SELinux is disabled
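One further check that fits the list above: Apache runs as a different user, so it needs execute (traverse) permission on every directory leading to the static dir, not only on the static dir itself. The following is a sketch of that check and not a confirmed diagnosis of the error described here. The helper name `report_untraversable` is hypothetical, and the path is the one used in the commented rewrite rules.

```shell
#!/bin/sh
# report_untraversable PATH (hypothetical helper)
# Walks from PATH up towards / and prints each directory that does not
# grant execute (traverse) permission to "other" users.
report_untraversable() {
    dir=$1
    while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -d "$dir" ]; then
            # The 10th character of the ls mode string is the "other" execute bit
            bit=$(ls -ld "$dir" | cut -c10)
            case $bit in
                x|t) ;;                 # traversable (t = sticky bit plus execute)
                *) echo "$dir" ;;       # not traversable by other users
            esac
        fi
        dir=$(dirname "$dir")
    done
}

# Prints nothing if every existing directory on the way is traversable
# (or if the path does not exist on this machine).
report_untraversable /home/galaxy/galaxy_dist/static
```

A home directory created by useradd is often not traversable by other users, so the galaxy user's home is a likely candidate to show up in the output.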
Branding
In universe_wsgi.ini, edit the name of the site:
brand = HPA Bioinformatics
and the URL linked to by the logo:
logo_url = http://www.hpa-bioinfotools.org.uk
and the "email comments" address:
bugs_email = mailto:paul-michael.agapow@hpa.org.uk
Running Galaxy
You can run Galaxy as a detached daemon:
% sh ./run.sh --daemon
% sh ./run.sh --stop-daemon
% sh ./run.sh --status
% sh ./run.sh --monitor-restart # restart if stopped
Create a startup (init) script:
% vi /etc/init.d/galaxy
and write it as something like this:
#!/bin/bash
#
# /etc/rc.d/init.d/galaxy
#
# Manages the Galaxy webserver
# Based on http://www.sensi.org/~alec/unix/redhat/sysvinit.html
#
# chkconfig: 2345 80 20
# description: Manages the Galaxy webserver

# The chkconfig is levels, start priority, stop priority. Last two should add to 100.
# You get an error/failure if you try to restart a stopped service.

# Source function library.
. /etc/rc.d/init.d/functions

GALAXY_USER=galaxy
GALAXY_DIST_HOME=/home/galaxy/galaxy_dist
GALAXY_RUN="${GALAXY_DIST_HOME}/run.sh"
GALAXY_PID="${GALAXY_DIST_HOME}/paster.pid"

case "$1" in
    start)
        echo -n "Starting galaxy services: "
        daemon --user $GALAXY_USER "${GALAXY_RUN} --daemon --pid-file=${GALAXY_PID}"
        touch /var/lock/subsys/galaxy
        ;;
    stop)
        echo -n "Shutting down galaxy services: "
        daemon --user $GALAXY_USER "${GALAXY_RUN} --stop-daemon"
        rm -f /var/lock/subsys/galaxy
        ;;
    status)
        daemon --user $GALAXY_USER "${GALAXY_RUN} --status"
        ;;
    restart)
        $0 stop; $0 start
        ;;
    reload)
        $0 stop; $0 start
        ;;
    *)
        echo "Usage: galaxy {start|stop|status|reload|restart}"
        exit 1
        ;;
esac
Set the permissions as 755. Check the owner:
% chmod 755 /etc/init.d/galaxy
% ls -la /etc/init.d/galaxy
Add to the system services and check:
% /sbin/chkconfig --add galaxy
% /sbin/chkconfig --list galaxy
Start the service with:
% /etc/init.d/galaxy start
Misc
Set Galaxy to use a local area as temporary storage:
% vi ~/.bash_profile
then:
TEMP=$HOME/galaxy_dist/database/tmp
export TEMP
Don't forget:
% source ~/.bash_profile
The front page can be customized by editing static/welcome.html. You should at least edit out the "customize this page" message ...
Other style customizations are possible. Note that some may be cached by the system and take some time to show up.
Notes
Some documentation refers to your installation dir as "galaxy_dist", others as "galaxy-dist". Look out for this causing errors.
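One way to guard scripts against this naming difference is to look for the directory under both names. The following is a minimal sketch; the helper name `find_galaxy_dir` is hypothetical, and it assumes Galaxy sits directly under the given base directory.

```shell
#!/bin/sh
# find_galaxy_dir BASE (hypothetical helper)
# Prints the first of BASE/galaxy-dist or BASE/galaxy_dist that exists,
# and returns non-zero if neither does.
find_galaxy_dir() {
    for name in galaxy-dist galaxy_dist; do
        if [ -d "$1/$name" ]; then
            echo "$1/$name"
            return 0
        fi
    done
    return 1
}

GALAXY_DIST_HOME=$(find_galaxy_dir "$HOME") ||
    echo "No Galaxy directory found under $HOME" >&2
echo "Using: ${GALAXY_DIST_HOME:-<none>}"
```

The resulting value could then be used wherever a path to the installation is hard-coded, such as in the init script or the TEMP setting.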