Create G-OnRamp Production Image
===================================
Requirements
------------
1. Download and install Virtual Box => https://www.virtualbox.org
2. Download Ubuntu Server 16.04.1 LTS => https://www.ubuntu.com/download/server
Step by step instruction
-------------------------
Install Ubuntu server to the VirtualBox
******************************************
See `How to Install Ubuntu on VirtualBox `_
Additional settings when installing the Ubuntu
- Hostname: ubuntus
- Full name: galaxyadmin
- User name: galaxyadmin
- password: 1234
- Use entire disk and set up LVM
- No proxy configured
- Install security updates automatically
- Install LAMP server, PostgreSQL database, OpenSSH server
- Create new GRUB boot loader
Initial Ubuntu setup
***********************
Start the Ubuntu and login in to update the system.
Update packages::
$ sudo apt-get update
$ apt-get --with-new-pkgs upgrade
Install dependencies::
$ sudo apt-get install build-essential
$ sudo apt-get install cmake
$ sudo apt-get install zlib1g-dev
$ reboot
Set up internet host-only adapter (used to connect guest VM from host by ssh)
********************************************************************************
Shutdown the virtual machine and add vboxnet0 to Adapter 2 as Host-only Adapter. Then restart the virtual machine.
.. image:: network.png
:width: 500px
:align: center
:alt: add vboxnet0 to Adapter 2
On the host, type command::
$ ifconfig
Find the vboxnet0 ip::
vboxnet0: flags=8943 mtu 1500
ether 0a:00:27:00:00:00
inet 192.168.56.1 netmask 0xffffff00 broadcast 192.168.56.255
On the Ubuntu guest, list interfaces by typing the command::
$ ip addr
You should see three interfaces like lo, enp0s3, enp0s8. We will use the third.
Edit the interfaces file by::
$ cd /etc/network/interfaces
Add following enp0s8 configuration to the file::
auto enp0s8
iface enp0s8 inet static
address 192.168.56.11
netmask 255.255.255.0
Then activate the interface::
$ sudo ifup enp0s8
Check if enp0s8 got correct address. You should see your ip by typing::
$ ip addr show enp0s8
...
inet 192.168.56.11/24 brd 192.168.56.255 scope global secondary enp0s8
If not correct, you may run::
$ sudo ifdown enp0s8
$ sudo ifup enp0s8
$ reboot
Now you can access to Ubuntu guest through host by::
$ ssh galaxyadmin@192.168.56.11
Install Galaxy
*****************
Running as an existing user will cause problems down the line when you want to grant or restrict access to data.
Create a NON-ROOT user called galaxy::
$ sudo adduser galaxy
password: 2016
Add galaxy to sudo
$ usermod -aG sudo galaxy
Then login with galaxy
$ su - galaxy
Make sure Galaxy is using a clean Python interpreter. Conflicts in $PYTHONPATH or the interpreter's site-packages/ directory could cause problems. Galaxy manages its own dependencies for the framework, so you do not need to worry about these.
The easiest way to do this is with a virtualenv::
$ sudo apt-get update
$ sudo apt-get install python-pip
$ pip install --upgrade pip
$ sudo pip install virtualenv
$ virtualenv gonramp
$ source gonramp/bin/activate
Galaxy requires a few things to run: a virtualenv, configuration files and dependent python modules. Starting the server at the first time will set these thing up.
Download Galaxy 17.01 and rename galaxy folder::
$ git clone -b release_17.01 https://github.com/galaxyproject/galaxy.git
$ mv galaxy/ galaxy-dist
Basic configure Galaxy (galaxy.ini)::
[server:main]
# The address on which to listen. By default, only listen to localhost (Galaxy
# will not be accessible over the network). Use '0.0.0.0' to listen on all
# available network interfaces.
host = 192.168.56.11
debug = False
use_interactive = False
# filter-with = gzip
cleanup_job = onsuccess
Set up PostgreSQL database
**************************
Install PostgreSQL::
$ sudo apt-get update
$ sudo apt-get install postgresql postgresql-contrib
Once installed, create a new database user and new database which the new user is the owner of. No further setup is required, since Galaxy manages its own schema. If you are using a UNIX socket to connect the application to the database (this is the standard case if Galaxy and the database are on the same system), you'll want to name the database user the same as the system user under which you run the Galaxy process.
Create a database and a new user for the database::
$ sudo -u postgres createuser --superuser galaxy
$ sudo -u galaxy createdb galaxy
$ psql -U galaxy
galaxy=# \password
Enter new password: 1234
In galaxy.ini, set::
database_connection = postgresql://galaxy:1234@localhost/galaxy
Run Galaxy::
$ cd galaxy
$ sh run.sh
Set up proxy nginx
******************
`reference of how to install nginx `_
Install nginx
##############
Install nginx from pre-build package::
$ sudo systemctl stop apache2.service
$ sudo apt-get update
$ sudo apt-get install nginx
$ sudo apt-get install nginx-extras
..
# Connecting to VM via FTP
# Check the network adapter 1. change it to "Bridged"
# $ reboot
# Common used commands
# ref: http://nginx.org/en/docs/beginners_guide.html
# nginx -s signal
# Where signal may be one of the following:
# stop — fast shutdown
# quit — graceful shutdown
# reload — reloading the configuration file
# reopen — reopening the log files
# example: nginx -s reload
Add galaxy server block
########################
add server block::
$ cd /etc/nginx/sites-available/
$ sudo vim galaxy
add following server block to galaxy file::
server {
listen 80;
root /var/www/html;
# maximum file upload size
client_max_body_size 10G;
# pass most requests to the proxied Galaxy application
location /gonramp {
proxy_pass http://192.168.56.11:8080;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
# directly serve static content in nginx
location /gonramp/static {
alias /home/galaxy/galaxy-dist/static;
expires 24h;
}
location /gonramp/static/style {
alias /home/galaxy/galaxy-dist/static/style/blue;
expires 24h;
}
location /gonramp/static/scripts {
alias /home/galaxy/galaxy-dist/static/scripts;
expires 24h;
}
location /gonramp/favicon.ico {
alias /home/galaxy/galaxy-dist/static/favicon.ico;
expires 24h;
}
location /gonramp/robots.txt {
alias /home/galaxy/galaxy-dist/static/robots.txt;
expires 24h;
}
}
enable galaxy server::
$ cd /etc/nginx/sites-enabled/
$ sudo ln -s /etc/nginx/sites-available/galaxy /etc/nginx/sites-enabled/galaxy
Make sure that you either comment out or modify line containing default configuration for enabled sites. in /etc/nginx/nginx.conf, include /etc/nginx/sites-enabled/::
$ cd /etc/nginx/site-enabled
$ rm default
$ sudo service nginx restart
Galaxy application needs to be aware that it is running with a prefix (for generating URLs in dynamic pages). This is accomplished by configuring a Paste proxy-prefix filter in the [app:main] section of config/galaxy.ini and restarting Galaxy::
[server:main]
host = 192.168.56.11
[filter:proxy-prefix]
use = egg:PasteDeploy#prefix
prefix = /gonramp
[app:main]
filter-with = proxy-prefix
cookie_path = /gonramp
..
Change the root of nginx to /var/www/html/nginx, in order to distinct from apache page::
$ cd /etc/nginx/site-available
$ sudo vim galaxy #change root to /var/www/html/nginx
$ cd /var/www/html
$ sudo mkdir /var/www/html/nginx
$ mv index.nginx-debian.html nginx/
$ sudo systemctl restart nginx
Compression and caching
#######################
`nginx_ref `_
All of Galaxy's static content can be cached on the client side, and everything (including dynamic content) can be compressed on the fly. This will decrease download and page load times for your clients, as well as decrease server load and bandwidth usage. To enable, you'll need nginx gzip support (which is standard unless compiled with --without-http_gzip_module), and the following in your nginx.conf::
#!highlight nginx
http {
gzip on;
gzip_disable "msie6";
gzip_vary on;
gzip_proxied any;
gzip_comp_level 4;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
}
For caching, you'll need to add an expires directive to the location /static { } blocks (already added, see server block)
Sending files using nginx
#########################
Add following to galaxy server block::
server {
location gonramp/_x_accel_redirect/ {
internal;
alias /;
}
}
And the following to the [app:main] section of config/galaxy.ini::
nginx_x_accel_redirect_base = /_x_accel_redirect
Receiving files using nginx
############################
To enable it, you must first download, compile and install nginx_upload_module. This means recompiling nginx.
TODO: need to install nginx from the source and recompile
Rotate log files
****************
To use logrotate to rotate Galaxy log files, add a new file named "galaxy" to /etc/logrotate.d/ directory with something like::
PATH_TO_GALAXY_LOG_FILES {
weekly
rotate 8
copytruncate
compress
missingok
notifempty
}
FTP server
**********
`Reference Enabling upload to Galaxy via FTP _`
in the config file, galaxy.ini, set::
ftp_upload_dir = /home/galaxy/galaxy-dist/database/ftp/
ftp_upload_site = 192.168.56.11
..
Allow your FTP server to read Galaxy's database
################################################
You'll need to grant a user access to read emails and passwords from the Galaxy database. Although the user Galaxy connects with could be used, I prefer to use a least-privilege setup wherein a separate user is created for the FTP server which has permission to SELECT from the galaxy_user table and nothing else. In postgres this is accomplished with:
(TODO: I used user galaxy instead of galaxyftp. Try to figure out why galaxyftp won't work)
postgres@dbserver% createuser -SDR galaxyftp
postgres@dbserver% psql galaxydb::
Welcome to psql 8.X.Y, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
galaxydb=# ALTER ROLE galaxyftp PASSWORD 'dbpassword';
ALTER ROLE
galaxydb=# GRANT SELECT ON galaxy_user TO galaxyftp;
GRANT
set up FTP server
#################
`FTP server `_
Install ProFTPD::
$ sudo apt-get install proftpd
$ sudo apt-get install proftpd-mod-pgsql
$ sudo nano /etc/proftpd/proftpd.conf
Configure /etc/proftpd/proftpd.conf::
# Includes DSO modules
Include /etc/proftpd/modules.conf
# Basics, some site-specific
ServerName "Public Galaxy FTP"
ServerType standalone
DefaultServer on
Port 21
Umask 022
SyslogFacility DAEMON
SyslogLevel debug
MaxInstances 30
User galaxy
Group galaxy
# Passive port range for the firewall
PassivePorts 30000 40000
# Cause every FTP user to be "jailed" (chrooted) into their home directory
DefaultRoot ~
# Automatically create home directory if it doesn't exist
CreateHome on dirmode 700
# Allow users to overwrite their files
AllowOverwrite on
# Allow users to resume interrupted uploads
AllowStoreRestart on
# Bar use of SITE CHMOD
DenyAll
# Bar use of RETR (download) since this is not a public file drop
DenyAll
# Do not authenticate against real (system) users
AuthPAM off
# By default, Galaxy stores passwords using PBKDF2.
# Configuration that handles PBKDF2 encryption
SQLPasswordEngine on
SQLPasswordEncoding base64
SQLPasswordPBKDF2 SHA256 10000 24
SQLPasswordUserSalt sql:/GetUserSalt
# Set up mod_sql to authenticate against the Galaxy database
SQLEngine on
SQLBackend postgres
SQLConnectInfo galaxy@/var/run/postgresql galaxy 1234
SQLAuthTypes PBKDF2
SQLAuthenticate users
# An empty directory in case chroot fails
SQLDefaultHomedir /var/opt/local/proftpd
# Define a custom query for lookup that returns a passwd-like entry. Replace 1001s with the UID and GID of the user running the Galaxy server (to find out: $ id galaxy)
SQLUserInfo custom:/LookupGalaxyUser
SQLNamedQuery LookupGalaxyUser SELECT "email, (CASE WHEN substring(password from 1 for 6) = 'PBKDF2' THEN substring(password from 38 for 69) ELSE password END) AS password2,1001,1001,'/home/galaxy/galaxy-dist/database/ftp/%U','/bin/bash' FROM galaxy_user WHERE email='%U'"
SQLNamedQuery GetUserSalt SELECT "(CASE WHEN SUBSTRING (password from 1 for 6) = 'PBKDF2' THEN SUBSTRING (password from 21 for 16) END) AS salt FROM galaxy_user WHERE email='%U'"
Configure /etc/proftpd/modules.conf, add::
LoadModule mod_sql.c
LoadModule mod_sql_passwd.c
LoadModule mod_sql_postgres.c
LoadModule mod_sftp_sql.c
When we are ready with the configuration we can start up the ProFTPD server::
$ sudo /etc/init.d/proftpd start
or $ sudo service proftpd restart
..
Make sure that the default FTP port 21 is opened on the server::
$ sudo iptables -A INPUT -p tcp --dport 21 -j ACCEPT
$ sudo iptables-save
# Start Up ProFTPD automatically on server boot
$ sudo update-rc.d proftpd defaults
Scaling and Load Balancing
**************************
`reference1 `_
`reference2 `_
Enable multiple cores in the virtual environment
#. Stop the VM and go to Settings -> System -> Process
#. Change the processors number. I changed to 4.
#. Check "Enable PAE/NX", as "Some operating systems (such as Ubuntu Server) require PAE support from the CPU and cannot be run in a virtual machine without it." ( `reference `_)
Set up uWSGI
############
In galaxy.ini, define one or more [server:...] sections::
#Two are shown, you should create as many as are suitable for your usage and hardware.
[server:web0]
use = egg:Paste#http
port = 8080
host = 192.168.56.11
use_threadpool = True
threadpool_workers = 7
[server:web1]
use = egg:Paste#http
port = 8081
host = 192.168.56.11
use_threadpool = True
threadpool_workers = 7
In galaxy.ini, define a [uwsgi] section::
[uwsgi]
processes = 4
stats = 192.168.56.11:9191
socket = 192.168.56.11:4001
pythonpath = lib
threads = 4
logto = /home/galaxy/gonramp/logs/uwsgi.log #anywhere you like
master = True
Port numbers for stats and socket can be adjusted as desired. Moreover, in the [app:main] section, you must set::
static_enabled = False
track_jobs_in_database = True
Install wusgi::
# use pip install in the virtual environment
$ source gonramp/bin/activate
$ pip install uwsgi
The web processes can then be started under uWSGI using::
$ cd /path/to/galaxy-dist
$ PYTHONPATH=eggs/PasteDeploy-1.5.0-py2.7.egg uwsgi --ini-paste config/galaxy.ini
#Once started, a proxy server (typically Apache or nginx) must be configured to proxy requests to uWSGI (using uWSGI's native protocol). Configuration details for these can be found in Proxy section.
Job Handler(s)
##############
In galaxy.ini, define one or more additional [server:...] sections::
[server:handler0]
use = egg:Paste#http
port = 8090
host = 192.168.56.11
use_threadpool = true
threadpool_workers = 5
[server:handler1]
use = egg:Paste#http
port = 8091
host = 192.168.56.11
use_threadpool = true
threadpool_workers = 5
Configure job_conf.xml
######################
`Configure job reference `_
Uncomment in galaxy.ini::
job_config_file = config/job_conf.xml
Configure galaxy-dist/conf/job_conf.xml::
..
# Sample for using gridengine
2
1
24:00:00
..
TODO: Install and configure Grid Engine
(Maybe not need for localsystem)
#######################################
Install Grid engine::
$ sudo apt-get install gridengine-master gridengine-exec gridengine-client gridengine-drmaa1.0
#You'll be asked a series of configuration questions for Postfix and Grid Engine at this point. The following answers are suitable for this workshop:
General type of mail configuration: Local only
System mail name: galaxy
Configure SGE automatically?: Yes
SGE cell name: default
SGE master hostname: galaxy
Start and Stop with supervisord
###############################
Since you need to run multiple processes, the typical run.sh method for starting and stopping Galaxy won't work. The current recommended way to manage these multiple processes is with Supervisord.
Install supervisor in virtualenv::
$ pip install supervisor
Creating a Configuration File::
echo_supervisord_conf > /etc/supervisord.conf
# Configure
$ vim /etc/supervisord.conf
# add program galaxy_uwsgi, handler and group section
[program:galaxy_uwsgi]
command = /home/galaxy/gonramp/bin/uwsgi --virtualenv /home/galaxy/galaxy-dist/.venv --ini-paste /home/galaxy/galaxy-dist/config/galaxy.ini
directory = /home/galaxy/galaxy-dist
umask = 022
autostart = true
autorestart = true
startsecs = 20
user = galaxy
environment = PATH=/home/galaxy/galaxy-dist/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin, PYTHONHOME=/home/galaxy/galaxy-dist/.venv
numprocs = 1
stopsignal = INT
startretries = 15
[program:handler]
command = /home/galaxy/galaxy-dist/.venv/bin/python ./lib/galaxy/main.py -c ./config/galaxy.ini --server-name=handler%(process_num)s --log-file=/home/galaxy/galaxy-dist/handler%(process_num)s.log
directory = /home/galaxy/galaxy-dist
process_name = handler%(process_num)s
numprocs = 2
unmask = 022
autostart = true
autorestart = true
startsecs = 20
user = galaxy
environment = PYTHONHOME=/home/galaxy/galaxy-dist/.venv
[group:galaxy]
programs = handler, galaxy_uwsgi
Start the program::
$ supervisorctl start galaxy:*
# Check status
$ supervisorctl status
# Log file
$ tail -f /tmp/supervisord.log
Proxy uWSGI with nginx
######################
Add server block below in /etc/nginx/nginx.conf and comment out #include /etc/nginx/sites-enabled/ ::
server {
listen 80;
root /var/www/html;
# maximum file upload size
client_max_body_size 10G;
uwsgi_read_timeout 180;
location /gonramp {
include uwsgi_params;
uwsgi_pass 192.168.56.11:4001;
uwsgi_param UWSGI_SCHEME $scheme;
}
# directly serve static content in nginx
location /gonramp/static {
alias /home/galaxy/galaxy-dist/static;
expires 24h;
}
location /gonramp/static/style {
alias /home/galaxy/galaxy-dist/static/style/blue;
expires 24h;
}
location /gonramp/static/scripts {
alias /home/galaxy/galaxy-dist/static/scripts;
expires 24h;
}
location /gonramp/favicon.ico {
alias /home/galaxy/galaxy-dist/static/favicon.ico;
expires 24h;
}
location /gonramp/robots.txt {
alias /home/galaxy/galaxy-dist/static/robots.txt;
expires 24h;
}
}
Restart nginx::
$ sudo service nginx restart
Access to gonramp
#################
Start then access through browser::
$ supervisord
# Check status
$ supervisorctl status
# Restart after changing configuration
$ supervisorctl restart galaxy:*
# Stop Galaxy
$ supervisorctl stop galaxy:*
Goto http://192.168.56.11/gonramp
6. Set up G-OnRamp
******************
Become an Admin
###############
In order to install tools, you have to become administrator for your Galaxy instance. First start the server, go to http://192.168.56.11:8080/gonramp, and register as a new user with your email address::
username: galaxyadmin@gonramp.org
public name: galaxyadmin
password: 12341234
Go to galaxy folder and find a sub-folder called config. Add a new file named galaxy.ini in the config folder. You can copy the content of galaxy.ini.sample into galaxy.ini. In galaxy.ini, search for the line containing “admin_users”. Add your user email address to admin users. (Replace None to your email address). You can add multiple admin users by appending another email and separating them with a comma::
# this should be a comma-separated list of valid Galaxy users
admin_users = galaxyadmin@gonramp.org
Set up conda
############
In galaxy.ini, uncomment and edit the following conda configuration::
conda_ensure_channels = conda-forge,r,bioconda,iuc
conda_auto_install = True
conda_auto_init = True
..
Connect your Galaxy to the test tool shed
#########################################
Galaxy is connected to the Main Tool Shed by default. Since some tools in G-OnRamp workflow are in the Test Tool Shed, you need to connect to the Test Tool Shed by modifying the “tool_sheds_conf.xml” in config folder::
# Copy the “tool_sheds_conf.xml.sample” and rename it to “tool_sheds_conf.xml”
$ cd /home/galaxy/config
$ cp tool_sheds_conf.xml.sample tool_sheds_conf.xml
# Open the file
$ vim tool_sheds_conf.xml
Uncomment the lines for the Test Tool Shed::
To::
..
Add necessary datatypes
#######################
Copy the “datatypes_conf.xml.sample” and rename it to “datatypes_conf.xml”.
Add the line below in between ::
..
Other files should be ready (copy from .sample)
################################################
::
$ cp dependency_resolvers_conf.xml.sample dependency_resolvers_conf.xml
Restart the server after you modified the configuration files. You can hit Ctrl-c to stop the server and then start again.
7. Install G-OnRamp tools
*************************
Go to Admin page, and click on Search Tool Shed. Click on the Tool Sheds to search and install. You can add all G-OnRamp tools in a separate panel section by adding a new tool panel section when you install the first tool and then add all the rest tools in the same panel.
Click on Galaxy Main Tool Shed to install
#########################################
- ncbi_blast_plus (by devteam)
- augustus
- hisat2
- stringtie
- blastXmlToPsl (by yating-l)
- trfbig (by yating-l)
- pslToBed
- bamtobigwig (by yating-l)
- hubarchivecreator
- multi_fasta_glimmer_hmm (by yating-l)
- snap
- psltobigpsl (by yating-l)
- jbrowsearchivecreator
- gbtofasta
- regtools_junctions_extract (by yating-l)
- rename_scaffolds
- ucsc_blat
- ucsc_pslcdnafilter
- uscs_pslpostarget
- uscs_pslcheck
Tools need advanced configuration
#################################
1. multi_fasta_glimmer_hmm
Make a Dependencies folder in "/home/galaxy" and download Glimmer3 inside the folder by::
$ mkdir Dependencies
$ cd ~/Dependencies
$ wget ftp://ccb.jhu.edu/pub/software/glimmerhmm/GlimmerHMM-3.0.4.tar.gz
You need to use a trained organism by adding them as reference data in Galaxy. Add the glimmer_hmm_trained_dir data table to tool_data_table_conf.xml in $GALAXY_ROOT/config/::
Configure the glimmer_hmm.loc file referencing your trained organism, in tool-data. Uncomment the species and add the path to trained_dir, for example::
#TAB separated
human Human /home/galaxy/Dependencies/GlimmerHMM/trained_dir/human
celegans Celegan /home/galaxy/Dependencies/GlimmerHMM/trained_dir/celegans
arabidopsis Arabidopsis /home/galaxy/Dependencies/GlimmerHMM/trained_dir/arabidopsis
rice Rice /home/galaxy/Dependencies/GlimmerHMM/trained_dir/rice
zebrafish Zebrafish /home/galaxy/Dependencies/GlimmerHMM/trained_dir/zebrafish
2. jbrowsearchivecreator
Install JBrowse-1.12.1 at /var/www/html::
$ cd /var/www/html
$ wget --trust-server-names http://jbrowse.org/wordpress/wp-content/plugins/download-monitor/download.php?id=105
$ sudo apt-get install unzip
$ sudo unzip JBrowse-1.12.1.zip
$ cd JBrowse-1.12.1
$ sudo ./setup.sh
# add a subdir to store hub data
$ sudo mkdir data
$ sudo chown -R galaxy:galaxy data
Add G-OnRamp plugins::
$ cd JBrowse-1.12.1/plugins
$ sudo git clone https://github.com/Yating-L/JBrowse_plugins.git G-OnRamp_plugin
Add a plugins configuration variable in your jbrowse_conf.json file in the top-level JBrowse directory, and add an entry telling JBrowse where the plugin is. Example::
{
"plugins": [ 'G-OnRamp_plugin' ]
}