Installing NCBI blast

The complete C++ sources are available, but it is easier to download compiled executables appropriate to your computer. E.g. for 64-bit CentOS, download ncbi-blast-2.2.25+-x64-linux.tar.gz. Then decompress and extract the contents of the .tar file:

gunzip -c ncbi-blast-2.2.25+-x64-linux.tar.gz | tar -xvf -
Save the blastn executable to a convenient directory. The examples below assume it is in your home directory, i.e. ~/blastn.
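The copy step can be sketched as a small shell function. This is only an illustration: install_blastn is a hypothetical helper name, and it assumes the archive unpacks into a directory containing bin/blastn, which you should check against what tar actually extracted.

```shell
#!/bin/sh
# Copy blastn out of an extracted BLAST+ tree into a chosen directory.
# install_blastn is a hypothetical helper, not part of the BLAST+ package.
# Assumption: the extracted tree keeps its executables in a bin/ directory.
install_blastn() {
    src="$1/bin/blastn"
    dest="$2"
    # Stop early with a message if the expected executable is missing.
    [ -f "$src" ] || { echo "blastn not found in $1/bin" >&2; return 1; }
    # Copy it and make sure the copy is executable.
    cp "$src" "$dest/blastn" && chmod +x "$dest/blastn"
}
```

For example, install_blastn ncbi-blast-2.2.25+ "$HOME" followed by ~/blastn -version should print the version of the installed program.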

Copying NCBI's Human Genome

Use NCBI's update_blastdb.pl Perl script to download the already formatted Human Genome database from ftp://ftp.ncbi.nlm.nih.gov/blast/db/. (Confusingly, the NCBI documentation refers to these files as "pre-formatted". The raw sequence is also available, but it must be formatted as a database before tools like blast can use it.)

update_blastdb.pl human_genomic
Copying either version (pre-formatted or raw) to your computer takes several hours.
update_blastdb.pl copies nine tar files using FTP. Each needs to be uncompressed and its files extracted. In a suitable directory, this can be done by:
gunzip -c human_genomic.00.tar.gz | tar -xvf -
gunzip -c human_genomic.01.tar.gz | tar -xvf -
gunzip -c human_genomic.02.tar.gz | tar -xvf -
gunzip -c human_genomic.03.tar.gz | tar -xvf -
gunzip -c human_genomic.04.tar.gz | tar -xvf -
gunzip -c human_genomic.05.tar.gz | tar -xvf -
gunzip -c human_genomic.06.tar.gz | tar -xvf -
gunzip -c human_genomic.07.tar.gz | tar -xvf -
gunzip -c human_genomic.08.tar.gz | tar -xvf -
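The nine commands above can also be written as a loop, which keeps working if NCBI changes the number of volumes. This is a minimal sketch: extract_volumes is a hypothetical helper name, and it assumes the downloaded archives are in the current directory and are named as shown above.

```shell
#!/bin/sh
# Extract every downloaded volume of a BLAST database in the current directory.
# extract_volumes is a hypothetical helper, not part of the BLAST+ package.
extract_volumes() {
    db="$1"
    for archive in "$db".*.tar.gz; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -e "$archive" ] || continue
        echo "Extracting $archive"
        # Same decompress-and-extract pipeline as the individual commands.
        gunzip -c "$archive" | tar -xvf - || return 1
    done
}

extract_volumes human_genomic
```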

Problems Running Blast

Error message: Error: (106.18) NCBI C++ Exception: Error: (CArgException::eConstraint) Argument "task". Illegal value, expected Permissible values: 'blastn' 'blastn-short' 'dc-megablast' 'megablast' 'vecscreen' : `blastn-sort' Error: (CArgException::eConstraint) Application's initialization failed

Error reported by the command line:
~/blastn -task blastn-sort -db human_genomic -query ~/AF241217.FA

Workaround

The task name was mistyped. Replace blastn-sort with blastn-short.
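A script that builds the blastn command line could catch this kind of typing error before blastn is started. The sketch below checks a task name against the permissible values listed in the error message above. valid_task is a hypothetical helper name, and the list is simply copied from that message, so it may differ in other BLAST versions.

```shell
#!/bin/sh
# Succeed only for task names that the blastn error message above lists as
# permissible. valid_task is a hypothetical helper, not part of BLAST+.
valid_task() {
    case "$1" in
        blastn|blastn-short|dc-megablast|megablast|vecscreen)
            return 0 ;;
        *)
            echo "unknown blastn task: $1" >&2
            return 1 ;;
    esac
}
```

For example, valid_task blastn-sort fails with a message, whereas valid_task blastn-short succeeds silently.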

Error message: # 0 hits found

Error reported by the command lines:
~/blastn -task dc-megablast -db human_genomic -query ~/AF241217.FA -outfmt 7 -out new.out
~/blastn -db human_genomic -query ~/AF241217.FA -outfmt 7 -out new.out2

Workaround

Neither -task dc-megablast nor the default blastn task works with query sequences shorter than 50 base pairs (see SEQanswers, rglover, 08-06-2010). Use -task blastn-short instead, which for this query reports # 1382 hits found:
~/blastn -task blastn-short -db human_genomic -query ~/AF241217.FA -outfmt 7 -out ~/AF241217.blastn-short
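The choice of task could be made automatically from the length of the query. The following is a rough sketch under stated assumptions: fasta_length and choose_task are hypothetical helper names, the FASTA file is assumed to hold a single sequence, the 50 base pair threshold is the one quoted above rather than a value read from BLAST, and dc-megablast for longer queries is an arbitrary choice among the permitted tasks.

```shell
#!/bin/sh
# Count the bases in a FASTA file that holds a single sequence.
# Header lines (starting with '>') and all whitespace are ignored.
fasta_length() {
    grep -v '^>' "$1" | tr -d ' \t\r\n' | wc -c | tr -d ' '
}

# Print a blastn task name chosen from the query length.
# Assumption: 50 bp is the cut-off, as quoted in the text above.
choose_task() {
    if [ "$(fasta_length "$1")" -lt 50 ]; then
        echo blastn-short
    else
        # Any of the other permitted tasks could be used here instead.
        echo dc-megablast
    fi
}
```

It might then be used as: ~/blastn -task "$(choose_task ~/AF241217.FA)" -db human_genomic -query ~/AF241217.FA -outfmt 7 -out new.out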


W.B.Langdon 23 August 2011