Bioinformatics tools and tips

From Darwin

Installed Software

A description of the bioinformatics software installed on the cluster can be found in the list of Available software.

BLAST

BLAST databases

The ROCKS BioRoll installs the default bioinformatics applications and sets the default environment for all users. For example, BLASTDB is set to /share/bio/ncbi/db, which is a link to the real data in /data/genomics/ncbi/blast/db.

The db directory always holds the most recent download. Previous downloads are kept in /data/genomics/ncbi/blast/"date_of_download" for users who need to keep working against the same version of a database. The BLAST taxonomic databases are located in /data/genomics/ncbi/blast/tax.
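As an illustration of the layout described above, here is a minimal Python sketch that picks the database directory for a job. The function name blast_db_dir and its arguments are hypothetical, and the exact naming of the dated directories is not specified on this page, so the snapshot name is passed in as-is rather than guessed.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping, Optional

# Paths as documented on this page.
BLAST_ROOT = PurePosixPath("/data/genomics/ncbi/blast")
DEFAULT_BLASTDB = "/share/bio/ncbi/db"


def blast_db_dir(snapshot: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None) -> PurePosixPath:
    """Return the directory in which to look for BLAST databases.

    Without a snapshot, use $BLASTDB (the most recent download), falling
    back to the documented default if the variable is unset.
    With a snapshot, return the dated directory under BLAST_ROOT.
    The snapshot string is an assumption: use whatever directory name
    actually exists for the download you want to pin.
    """
    env = os.environ if env is None else env
    if snapshot is None:
        return PurePosixPath(env.get("BLASTDB", DEFAULT_BLASTDB))
    return BLAST_ROOT / snapshot


if __name__ == "__main__":
    # Prints the directory holding the most recent download.
    print(blast_db_dir())
```

Pinning a dated directory this way keeps results reproducible when the db directory is refreshed by a later download.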

Done:

  1. The following NCBI databases are now stored in the $BLASTDB directory. Darwin admin staff will update them quarterly, or sooner on request:
    • nt - NCBI non-redundant nucleotide database.
    • nr - NCBI non-redundant protein database.
    • env_nt - NCBI environmental nucleotide database.
    • env_nr - NCBI environmental protein database.
    • gss - NCBI Genome Survey Sequence database.

Still ToDo:

    • GS_ALL - GOS reads.
    • GS_orf - GOS open reading frames.
    • GS_pro - GOS peptides.
    • GS_scaf - GOS scaffolds.
    • Note: the GOS sequence databases are not updated often, although more site releases are expected.

BLAST programs

  1. Visit NCBI BLAST Getting Started and NCBI BLAST Help for tutorials on which BLAST program to use.
  2. Visit here for BLAST command-line options.
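As a rough guide to choosing a program, the sketch below maps a query type and one of the installed databases to the standard BLAST program name. The helper pick_program and the two lookup tables are hypothetical names for illustration; the database types follow the descriptions in the list of installed databases.

```python
# Sequence type of each database listed on this page.
DB_TYPE = {
    "nt": "nucleotide",
    "nr": "protein",
    "env_nt": "nucleotide",
    "env_nr": "protein",
    "gss": "nucleotide",
}

# (query type, database type) -> BLAST program.
# tblastx (translated nucleotide query against a translated nucleotide
# database) is left out because it shares its key with blastn.
PROGRAM = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated
    ("protein", "nucleotide"): "tblastn",  # database is translated
}


def pick_program(query_type: str, db_name: str) -> str:
    """Return the BLAST program matching the query type and database."""
    return PROGRAM[(query_type, DB_TYPE[db_name])]


if __name__ == "__main__":
    print(pick_program("nucleotide", "nr"))
```

The NCBI tutorials linked above remain the authoritative guide; this table only covers the most common combinations.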

Distributing Jobs Using SGE

Click here to learn more about SGE (Sun Grid Engine).
Click here to learn more about simple job arrays.
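One common way to distribute a large BLAST search is to give each task of an SGE job array its own share of the query sequences. The Python sketch below shows that idea only; it is not the cluster's own procedure. SGE sets the SGE_TASK_ID environment variable for array tasks (and to "undefined" for non-array jobs); the function names and the round-robin split are assumptions made for illustration.

```python
import os
from typing import Iterable, Iterator, List, Mapping, Optional, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def records_for_task(records: Iterable[Record], task_id: int,
                     n_tasks: int) -> List[Record]:
    """Round-robin split: task k (1-based) gets records k-1, k-1+n_tasks, ..."""
    return [rec for i, rec in enumerate(records) if i % n_tasks == task_id - 1]


def current_task(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the 1-based task number from SGE_TASK_ID, defaulting to 1."""
    env = os.environ if env is None else env
    raw = env.get("SGE_TASK_ID", "1")
    return int(raw) if raw.isdigit() else 1
```

Each task would then write its records to a temporary query file and run the chosen BLAST program on it. The number of tasks has to match the range requested when the array job is submitted.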

Other topics
