SequenceServer: BLAST search made easy!
SequenceServer lets you rapidly set up a BLAST+ server with an intuitive user interface for use on your local machine and for sharing with colleagues locally or over the web. We designed SequenceServer to do this more elegantly and efficiently than other solutions (e.g., GMOD, Galaxy) or traditional publicly available BLAST front ends.
Easy to use:
- Minimalistic, clutter-free design so you can focus on the biology.
- Smart user interface automagically figures out the appropriate BLAST method to use based on your input and selected databases.
- Common mistakes prevented. For example, SequenceServer warns if you mix nucleotide and protein sequences.
- Use advanced parameters (e.g. -task blastn-short -evalue 1.0e-20) as you would in the command line.
- Simple and intuitive overview of results.
- Elegant, streamlined BLAST report.
- View sequence and download FASTA of up to 30 hits.
- Download tab-delimited reports (standard 12 column or all 44 possible columns) or XML BLAST report.
- Input query sequences by pasting FASTA sequence or drag-and-dropping a FASTA file.
- Guided interactive setup in very few steps — only requires your sequence files.
- Automatic detection of BLAST software; if absent the correct version is automatically downloaded.
- Automatic detection of existing BLAST databases.
- Automatic detection of FASTA files not yet BLAST-formatted. SequenceServer detects the sequence type and interactively helps to create BLAST databases.
- Web server is built in. Apache or Nginx are unnecessary (but can be used for specific uses).
- Use on your personal computer; share with colleagues locally or worldwide.
- Easily customizable hyperlinks to search hits (e.g. to your genome browser or custom database).
To get the latest release of SequenceServer, run the following from a command line.
$ gem install sequenceserver
If that doesn't work, try sudo gem instead of gem.
To configure and launch SequenceServer, run the following from a command line.
SequenceServer will automatically guide you through an interactive setup process to help locate or download BLAST+ binaries, ask for the location of FASTA files (or BLAST+ databases), and offer to create BLAST+ databases from FASTA files.
That's it! Open http://localhost:4567/ and start BLAST-ing!
Linux or Mac OSX. While we would like to also support Windows, our resources are limited and we prefer to first concentrate on making SequenceServer great on fewer platforms. It should be possible to run SequenceServer on Windows using Cygwin. Note that you can connect to SequenceServer running on other machines from your browser in Windows.
Creating BLAST+ databases
SequenceServer provides commands to find unformatted FASTA files and guide you through a formatting process. The program detects whether they are protein or nucleotide and suggests appropriate names.
To set up BLAST databases from a directory of FASTA sequence files, use:
$ sequenceserver -m
By default this will use the database directory from the configuration file. An alternative directory can be explicitly provided:
$ sequenceserver -m -d=path/to/directory_with_fasta_files
Alternatively, use BLAST's makeblastdb on a single FASTA sequence file. For example:
$ makeblastdb -dbtype <db type> -title <db title> -in <db> -parse_seqids
- <db type> is either prot or nucl, depending on the type of sequence
- <db title> is what users will see
- <db> is the path to the FASTA file
- -parse_seqids is required to generate links for downloading search hits (yes, it's a bit slow on large files).
Additional options at
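The makeblastdb invocation described above can also be assembled from a script. The following is a minimal Python sketch, not SequenceServer's own code: the helper names `guess_dbtype` and `makeblastdb_command`, the 90% threshold, and the file name `ants.fa` are assumptions made for illustration. Only the flags `-dbtype`, `-title`, `-in` and `-parse_seqids` come from the text above.

```python
import shlex

NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_dbtype(sequence, threshold=0.9):
    """Guess 'nucl' or 'prot' from the fraction of nucleotide letters.

    Hypothetical heuristic for illustration only; SequenceServer's own
    detection may work differently.
    """
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    return "nucl" if fraction >= threshold else "prot"


def makeblastdb_command(fasta_path, dbtype, title):
    """Return the makeblastdb argument list (this does not run anything)."""
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    return ["makeblastdb", "-dbtype", dbtype, "-title", title,
            "-in", fasta_path, "-parse_seqids"]


# Example: a short nucleotide sequence leads to -dbtype nucl.
cmd = makeblastdb_command("ants.fa", guess_dbtype("ATGGCGTTAGCTAGGCTA"), "Ant genome")
print(shlex.join(cmd))
```

The returned list can be passed as-is to `subprocess.run` once BLAST+ is installed; building a list rather than a string avoids quoting problems with titles that contain spaces.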
To see the current list of BLAST databases found by SequenceServer, run
$ sequenceserver -l
SequenceServer provides a powerful yet flexible plugin system for creating custom hyperlinks for your BLAST hits. You can use the information available for each hit (for example, database details and which database the hit came from) to generate more complex and highly specific links. You can also personalize your interface by controlling the order or appearance of links.
Please have a look at our
to understand how to write your own plugin.
Once you are done writing the code, load your plugin using the -r=require_file command line switch.
Your methods will be loaded automatically and used to generate appropriate links for each hit. The source file includes extensive inline documentation which explains how to do this.
Command line interface
In most cases, it is sufficient to run SequenceServer using parameters specified in its single configuration file.
For other scenarios, SequenceServer provides an advanced command line interface through which you can not only customize the startup of the program but also use specific utilities provided by it.
Below we give a complete list of commands with a brief description of each one.
| -c=config_file | Provide path to your custom configuration file |
| -b=bin | Provide path to your BLAST+ binaries* |
| -d=database_dir | Provide path to your BLAST+ databases* |
| -n=num_threads | Number of threads to run BLAST search* |
| -H=host | Host to run SequenceServer on* |
| -p=port | Port to run SequenceServer on* |
| -r=require_file | Load extension from this file |
| -s | Set configuration value in default or given config file |
| -m | Create BLAST databases |
| -l | List found BLAST databases |
| -u | List unformatted FASTA files |
| -i | Run SequenceServer in interactive mode |
| -D | Run SequenceServer in development (debug) mode |
| -v or --version | Print version number of SequenceServer that will be loaded |
| -h or --help | Display this help message |
* Command line values take precedence over the configuration file.
An example startup command will look like:
$ sequenceserver -d ~/home/biodb -n 4 -p 7777
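The precedence rule noted for the starred options (command line values override the configuration file) can be pictured as a layered merge. This is only a Python sketch of the idea, not SequenceServer's implementation: the key names are borrowed from the option placeholders in the table and may not match the real configuration keys, the `/data/blastdb` path is made up, and the default of one thread is an assumption. The default port 4567 is the one this page mentions.

```python
# Assumed defaults for illustration; only the port is taken from this page.
DEFAULTS = {"port": 4567, "num_threads": 1}


def effective_settings(config_file, command_line):
    """Merge settings so that later sources override earlier ones.

    Order of precedence (lowest to highest):
    defaults, configuration file, command line.
    """
    settings = dict(DEFAULTS)
    settings.update(config_file)
    # Options that were not given on the command line are passed as None
    # and must not overwrite values from the configuration file.
    settings.update({k: v for k, v in command_line.items() if v is not None})
    return settings


from_file = {"database_dir": "/data/blastdb", "num_threads": 2}
from_cli = {"num_threads": 4, "port": 7777, "host": None}
print(effective_settings(from_file, from_cli))
```

Here the database directory comes from the configuration file, while the thread count and port come from the command line, mirroring the example startup command above.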
Setup on Apache with Phusion Passenger (Optional)
Most of the time, running SequenceServer as above is more than enough. But if you need to use SequenceServer as part of a standard website, it can be deployed on Apache with Phusion Passenger™ (a.k.a. mod_rails or mod_rack).
Set up Phusion Passenger™
Install passenger gem:
$ gem install passenger
Install passenger module for Apache
Follow the instructions on screen. The installer will let you know if any software is still required to set up Passenger, and how to install it. After the installer has finished building the modules corresponding to your webserver, it will ask you to edit your webserver configuration files so that your webserver knows how to load Passenger.
For Apache, you can do something like this:
- Create a new file called passenger.load in /etc/apache2/mods-available, and add the lines suggested by the installer there.
- Run a2enmod passenger. This will ask you to restart apache2, with instructions on how to do so. On my machine, I had to run /etc/init.d/apache2 restart. Running touch tmp/restart.txt may be required when changing configuration.
Apache Root URI (e.g. antgenomes.org or localhost).
<VirtualHost *:80>
    ServerName localhost
    DocumentRoot /home/yeban/src/sequenceserver/public
    <Directory /home/yeban/src/sequenceserver/public>
        Allow from all
        Options -MultiViews
    </Directory>
</VirtualHost>
Apache Sub URI (e.g. antgenomes.org/blast or localhost/sequenceserver).
<VirtualHost *:80>
    ServerName localhost
    DocumentRoot /var/www/
    <Directory /var/www>
        Allow from all
    </Directory>
    RackBaseURI /sequenceserver
    <Directory /var/www/sequenceserver>
        Options -MultiViews
    </Directory>
</VirtualHost>
Have an issue deploying SequenceServer? Something is not working as expected? Have a tip? A feature request? Or just want to encourage further development? Post it to the SequenceServer Google Group and we will work something out.
We are also available for consulting which can range from custom support to server deployment and administration to implementing specific features. Get in touch.
Frequently Asked Questions (FAQ)
- Q. What are the alternatives to SequenceServer?
- There are several. They can look and behave a bit too vintage, and their installation and configuration is challenging.
- Q. I swear I installed BLAST but SequenceServer can't find it!
Make sure you downloaded NCBI's BLAST+ (BLAST plus!) and not legacy BLAST. The old legacy BLAST had programs such as blastall and formatdb. The programs in BLAST+ have names such as makeblastdb, blastn, tblastx, etc.
Also, check that your SequenceServer configuration file points at the bin directory.
- Q. How can I add custom links to hits?
- Easily. See the details in plugin system.
- Q. What is interactive mode? How do I use it?
- SequenceServer's interactive mode (sequenceserver -i) is handy when you are debugging SequenceServer or writing a custom module such as a custom link generator. Interactive mode lets you access all methods, call them and inspect their output.
- Q. How do I tell SequenceServer to read a different configuration file?
- Use the -c command line switch. For example:
$ sequenceserver -c sequenceserver_ants.conf
- Q. How do I change the server port?
By default, SequenceServer is accessible on port 4567. Change the port by editing ~/.sequenceserver.conf. Also, have a look at our command line interface.
- Q. Can my colleagues access the server I set up on my computer?
- Yes, if they can access your machine. This usually requires being in the same subnetwork, or asking IT services to open your machine to the outside world. You'll have to tell your colleagues your IP address (find it in the Network or Sharing section of System Preferences) and the port. The complete URL they will have to use might be http://192.168.134.13:4567/ or http://myblastserver.local:4567/. It will be easier for everyone if you can get a fixed IP.
- Q. Do I need to install Apache or Nginx?
- SequenceServer has a web server built in. There is no need to install Apache or Nginx. Nevertheless, we provide Apache instructions as part of the advanced setup guidelines.
- Q. Can I integrate SequenceServer into Apache?
- Yes. We provide instructions above.
- Q. Can I use multiple cores/threads?
- Yes, either in the configuration file or via the CLI. You should: everything will go much faster if you do.
- Q. Should I activate hyper-threading to give BLAST more virtual cores?
- Q. View sequence link is disabled for some hits.
- A. SequenceServer disables the view sequence link if the length of the hit exceeds 10,000 residues. This is fine if the target sequences are proteins or contigs. We feel this mode of analysing sequences is not optimal for very long sequences (e.g. scaffolds).
- Q. Download FASTA of all hits is disabled.
- A. Download FASTA of all hits, or bulk FASTA download of selected hits, works only for 30 or fewer hits at a time. This is due to a technical limitation: the length of URLs should not exceed 2083 characters. We are trying to remove this limitation.
- Q. Can I have the source code? Can I contribute?
- SequenceServer's source code is on GitHub. Downloading, forking, pulling, filing issues and contributing is most welcome!
- Q. What is a BLAST database? Can I make a custom BLAST database?
The BLAST search algorithms don't directly understand FASTA files. But BLAST includes the makeblastdb tool that reformats FASTA into the optimized BLAST-friendly format.
SequenceServer additionally provides the sequenceserver -m command to facilitate the conversion from FASTA to BLAST database.
- Q. Can I use a preformatted BLAST database? Can I use alias?
Preformatted BLAST databases (e.g., nr) can also be used with SequenceServer. Most preformatted databases are split between multiple files (like .00, .01, ...). Thanks to Mark Anthony Gibbins' contributions, SequenceServer can correctly understand these.
If a custom BLAST database alias file (.pal for protein, .nal for nucleotide) as well as the linked databases are in SequenceServer's db directory, SequenceServer will display both the alias and the linked databases. To display only the alias, move the linked databases from db to a different place, then edit the DBLIST line in the alias file so that it contains the complete path of each of the database files. For example,
DBLIST species1_custom_blast_db species2_custom_blast_database
may become:
DBLIST /Users/blastuser/mydbs/species1_custom_blast_db /Users/blastuser/mydbs/species2_custom_blast_database
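Editing the DBLIST line by hand is easy for two databases but tedious for many. The following Python sketch shows one way to rewrite it. The function name `absolutize_dblist` is hypothetical, the sketch assumes database names without spaces or quotes, and it only returns the new text rather than touching any file.

```python
import posixpath


def absolutize_dblist(alias_text, new_db_dir):
    """Return alias file text with each DBLIST entry moved under new_db_dir.

    Lines other than DBLIST are passed through unchanged.
    Assumes database names contain no spaces or quotes.
    """
    rewritten = []
    for line in alias_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "DBLIST":
            paths = [posixpath.join(new_db_dir, posixpath.basename(name))
                     for name in fields[1:]]
            line = " ".join(["DBLIST"] + paths)
        rewritten.append(line)
    return "\n".join(rewritten) + "\n"


alias = "DBLIST species1_custom_blast_db species2_custom_blast_database\n"
print(absolutize_dblist(alias, "/Users/blastuser/mydbs"), end="")
```

Run on the example above, this produces the same DBLIST line with complete paths as shown in the text.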
- Q. How does BLAST identify similar regions? Why don't the ends match? What is the bitscore? What is the e-value that BLAST returns? What do these numbers mean?
Many detailed explanations exist regarding such questions, including in the original BLAST articles. Here's our executive summary:
BLAST is a heuristic, i.e., it is fast and approximate instead of being slow and perfect. It starts by looking for a minimal 100% match (e.g., 11 consecutive nucleotides with 100% identity between your query and the database sequence). If it finds none, it's over (thus if exactly every 10th base is different, BLAST finds no results). If it does find a match, it extends that in both directions: identical (or similar) bases add points; differences are negative points. If too many points are lost, it stops aligning. BLAST might not stop at the exact best place, so alignment ends might be wrong.
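The seed-and-extend idea described above can be illustrated with a toy Python sketch. This is a deliberately simplified, ungapped version and not BLAST's actual implementation: the function names, the scoring values (+1 per match, -3 per mismatch) and the drop-off threshold of 5 points are assumptions chosen for illustration.

```python
def find_seeds(query, subject, word_size=11):
    """Return (query_pos, subject_pos) for every exact word match."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


def extend(a, b, match=1, mismatch=-3, x_drop=5):
    """Extend an ungapped alignment along a and b from their first base.

    Returns (best_score, length_at_best_score). Extension stops once the
    running score has fallen more than x_drop below the best score seen.
    """
    score = best = best_len = 0
    for k, (x, y) in enumerate(zip(a, b), start=1):
        score += match if x == y else mismatch
        if score > best:
            best, best_len = score, k
        elif best - score > x_drop:
            break
    return best, best_len


def align_from_seed(query, subject, i, j, word_size=11, match=1):
    """Extend a seed at (i, j) in both directions; return (score, query_span)."""
    left_score, left_len = extend(query[:i][::-1], subject[:j][::-1])
    right_score, right_len = extend(query[i + word_size:], subject[j + word_size:])
    score = left_score + word_size * match + right_score
    return score, (i - left_len, i + word_size + right_len)


subject = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTTAACGGTCA"
# Replace every 10th base: no run of 11 identical bases survives.
mutated = "".join("N" if (k + 1) % 10 == 0 else c for k, c in enumerate(subject))
print(find_seeds(mutated, subject))  # prints []
print(align_from_seed(subject[5:25], subject, 0, 5))
```

The first call finds no seed even though 90% of the bases are identical, which is the "every 10th base" case mentioned above. The second extends a perfect 20-base match over its full length.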
The bitscore is the total number of points for the aligning region. The bigger it is, the stronger the alignment. But the bitscore takes into account neither sequence length nor database size. The E-value does take these into account, so it is better to look at E-values than bitscores. The E-value represents the number of times the observed alignment would be expected to occur by chance (it is not a p-value!); it depends on the bitscore, the length of the query sequence, and the cumulative length of all sequences in the database. It is easier to talk about strong E-values (e.g. 1e-100 = 10^-100 = almost zero; impossible to obtain by chance) vs weak E-values (e.g. 0.1; for similarity that may be due to chance) than small vs large (which is always a bit confusing).
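To make the dependence on bitscore, query length and database size concrete, here is a small Python sketch based on the standard Karlin-Altschul relation E = m × n × 2^(-S'), where S' is the bitscore, m the query length and n the total database length. The relation is not stated on this page and is brought in here as background. BLAST itself uses length-corrected ("effective") values for m and n, so the numbers it reports will differ somewhat from this approximation. The function name and example figures are made up for illustration.

```python
def expected_chance_hits(bitscore, query_length, database_length):
    """Approximate E-value: E = m * n * 2 ** (-bitscore).

    Uses raw lengths; BLAST applies effective-length corrections,
    so treat the result as an order-of-magnitude estimate.
    """
    return query_length * database_length * 2.0 ** (-bitscore)


# A 1,000-base query against a database of one billion bases.
for bits in (30, 50, 100):
    print(bits, expected_chance_hits(bits, 1_000, 10**9))
```

Each additional bit halves the E-value, while doubling the database doubles it. This is why the same bitscore can be convincing against a small custom database and unremarkable against a very large one.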
- Q. What is the difference between BLAST and BLAST+? What about WU-BLAST or AB-BLAST?
- BLAST was rewritten several times, most recently by NCBI as BLAST+. NCBI now use and recommend using BLAST+. The BLAST+ publication explains why BLAST+ is easier to use and faster than the old legacy BLAST. WU-BLAST is now commercial and called AB-BLAST. There is probably no good reason to use either alternative. Note that the output formats change slightly from one BLAST implementation to the next.
- NCBI's BLAST+ is most actively developed, just use it. SequenceServer only supports BLAST+.
SequenceServer is distributed under the GNU AGPL version 3. SequenceServer is free to use by any individual or organization for all purposes. If you modify SequenceServer's source code or use it for custom application development, you must release the source code of the derivative product under the AGPL license to comply with the AGPL. You are obligated to release the source code of the derivative product even if you are distributing it over a network.
Users & Citations (apologies - list is outdated)
- Brandl et al (2015) PlanMine – a mineable resource of planarian biology and biodiversity Nucleic Acids Research
- Kirmitzoglou I (2015) LCR-eXXXplorer: a web platform to search, visualize and share data for low complexity regions in protein sequences Bioinformatics.
- Elphick MR (2015) Reconstructing SALMFamide neuropeptide precursor evolution in the phylum Echinodermata: ophiuroid and crinoid sequence data provide new insights Frontiers in Endocrinology.
- Gupta Y et al (2015) De novo assembly and characterization of transcriptomes of early-stage fruit from two genotypes of Annona squamosa L. with contrast in seed number BMC Genomics 2015, 16:86.
- Rodrigues M (2014) Molecular biology approaches in bioadhesion research Beilstein J. Nanotechnol. 2014, 5, 983–993.
- Sharma P (2014) WImpiBLAST: Web Interface for mpiBLAST to Help Biologists Perform Large-Scale Annotation Using High Performance Computing PLoS ONE.
- Mondav R et al (2014) Discovery of a novel methanogen prevalent in thawing permafrost Nature Communications 5, 3212.
- Rowe ML et al (2014) Neuropeptides and polypeptide hormones in echinoderms: New insights from analysis of the transcriptome of the sea cucumber Apostichopus japonicus General and Comparative Endocrinology 194, 43-55.
- Chiara M et al (2013) De Novo Assembly of the Transcriptome of the Non-Model Plant Streptocarpus rexii Employing a Novel Heuristic to Recover Locus-Specific Transcript Clusters PLoS ONE.
- Semmens DC et al (2013) Discovery of a novel neurophysin-associated neuropeptide that triggers cardiac stomach contraction and retraction in starfish Journal of Experimental Biology 216, 4047-4053.
- Chiu JC et al (2013) Genome of Drosophila suzukii, the Spotted Wing Drosophila G3 g3.113.008185
- Shreve J et al (2013) A genome-wide survey of small interfering RNA and microRNA pathway genes in a galling insect. Journal of Insect Physiology.
- Berlamino et al (2013) SymGRASS: a database of sugarcane orthologous genes involved in arbuscular mycorrhiza and root nodule symbiosis. BMC Bioinformatics.
- Elphick MR et al (2013) The Evolution and Diversity of SALMFamide Neuropeptides. PLoS ONE.
- Elphick MR (2012) The Protein Precursors of Peptides That Affect the Mechanics of Connective Tissue and/or Muscle in the Echinoderm Apostichopus japonicus. PLoS ONE.
- King Abdullah University of Science and Technology
- King Mongkut's University of Technology Thonburi. Pythium insidiosum (bacteria). (private)
- Laboratório Nacional de Ciência e Tecnologia do Bioetanol. Sugar cane (plant).
- Max Planck Institute for Developmental Biology. Pristionchus (animal).
- Max Planck Institute of Molecular Cell Biology and Genetics. Planarians (animal).
- Michigan State University. Electric fish (animal).
- Moscow State University. Pleurobrachia bachei (animal).
- New York University. Flies (animal).
- Oregon State University. Drosophila suzukii (animal).
- Peking University. Macaca mulatta (animal).
- Universidad de los Andes, Colombia. Fungi.
- Universidade do Algarve. Ruditapes decussatus (animal).
- Universita degli Studi di Milano. Streptocarpus rexii.
- University of Athens. Fungal mitochondria.
- University of Cambridge. Protists.
- University of Cyprus. UniProt.
- University of Exeter. Blastocladiella emersonii.
- University of Maryland. Astyanax (animal).
- University of Prince Edward Island. Flounder (animal).
- AvianGenomes.org. (private)
- iPlantCollaborative Saccharum Genome Database
- Dr. Wolfgang Rumpf @ U Maryland
- Center for Ecological Research, Kyoto University
- European clam database @ CCMAR at University of Algarve
- Spotted Drosophila Flybase @ Oregon State
- Amborella Genome at University of Georgia
- Jekely Lab at Max Planck Institute for Developmental Biology, Tuebingen
- Universitat Greifswald (private)
- Department of Plant Biology and Forest Genetics, Swedish University of Agricultural Sciences (private)
- Norwich BioScience Institutes (private)
- Australia's Commonwealth Scientific and Industrial Research Organisation (private)
and many additional private installations we don't know about...