Bioinformatics Applications and Databases
This page provides information about the existing bioinformatics applications available on the NGS and the bioinformatics databases available at RAL, Oxford, Leeds and Glasgow-Scotgrid.
The NGS at RAL would like to thank the researchers at the Institute of Grassland and Environmental Research (IGER) at Aberystwyth and at the NERC Environmental Bioinformatics Centre at Oxford University, notably Dr Bela Tiwari, for their guidance in setting up these databases on the NGS.
These discussions were partly funded by a BBSRC grant, "Supporting Bioinformatics Research on the NGS", which ended in September 2007.
Applications
For more information and examples on how to run the applications below, click on the name of the relevant application.
Databases
The following databases are hosted at STFC/RAL, Oxford OERC, Leeds NGS and Glasgow Scotgrid:
| Database | Location | Description |
|---|---|---|
| EBI EMBL NUCLEOTIDE | See the release notes file, titled relnotes.txt, for a complete description. | |
| -- in | ||
| FASTA format | ${DB}/EBI_NUCLEOTIDE_DB/fasta_DB/ | The files retain the names they have in the mirror site (i.e. according to data class and taxonomic division), with the suffix "em_rel" for the quarterly release and the suffix "em_cum" for the updates. More details can be found in: EMBL Nucleotide Sequence data files in FASTA format. |
| BLAST format | ${DB}/EBI_NUCLEOTIDE_DB/blast_DB/ | The files are named after data class and taxonomic division (e.g. "est_env"). The updates have the suffix "_upd" (e.g. "est_inv_upd"). For more details, please read EMBL Nucleotide Sequence data files in BLAST format. |
| MPI-BLAST format (not available currently) | ${DB}/EBI_NUCLEOTIDE_DB/mpi-blast_DB/ | The files are split by data class. For more details, please read EMBL Nucleotide Sequence data files in MPI-BLAST format. |
| EBI Uniprot Knowledgebase PROTEIN (latest update) | ||
| -- in | ||
| FASTA format | ${DB}/EBI_PROTEIN_DB/fasta_DB/ | For a list of the Protein Sequences files, please refer to EBI Uniprot Protein files |
| BLAST format | ${DB}/EBI_PROTEIN_DB/blast_DB/ | |
| MPI-BLAST format | ${DB}/EBI_PROTEIN_DB/mpi-blast_DB/ | |
| PROSITE (latest update) | ||
| PROSITE uncompressed files | ${DB}/PROSITE_DB/ | Retrieved from this site. |
| PROSITE integrated in EMBOSS | ${EMBOSS}/PROSITE/ | |
| PRINTS (latest update) | ||
| PRINTS uncompressed files | ${DB}/PRINTS_DB/ | Retrieved from this site. |
| PRINTS integrated in EMBOSS | ${EMBOSS}/PRINTS/ | |
| REBASE (latest update) | ||
| REBASE uncompressed files | ${DB}/NEB_REBASE_DB/ | |
| REBASE integrated in EMBOSS | ${EMBOSS}/REBASE/ | Retrieved from this site. |
In the table above, ${DB} stands for the local location of the database files, e.g. /var/data/bioinformatics/db/ at RAL.

The following instructions can be applied to a data file containing one or more sequences.
- To convert an EMBL (or a non-EMBL) format data file into a 'FASTA' formatted file:
/usr/ngs/EMBOSS seqret "Pathname of a data file" "Pathname of output file" -osformat fasta
[N.B. Make sure that your sequence data file can be read by 'seqret'.]
- To convert a 'FASTA' formatted file containing one or many sequences into a 'BLAST 2' database:
  - Create a file called '.formatdbrc' in the current directory, containing:
    [NCBI]
    Data=/usr/local/applications/bioinformatics/ncbi/data
  - Use the command:
    /usr/ngs/BLAST-TOOLBOX-NCBI formatdb -i "Pathname of a FASTA file" \
      -o T -p F -n -t "Title for database file" \
      -v "Size of database (in millions of letters)" \
      -l "Pathname of a log file"
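The two conversion steps above could be wrapped in a small shell script along the following lines. This is only a sketch under stated assumptions: the function names to_fasta and make_blast_db, and the override variables EMBOSS_WRAPPER, BLAST_WRAPPER and NCBI_DATA, are hypothetical and not part of the NGS installation. The wrapper paths and the NCBI data directory default to the values quoted on this page. The sketch passes an explicit base name to -n (an assumption, since the command above shows -n without a value) and leaves -v at its default.

```shell
#!/bin/sh
# Sketch only: the wrapper paths default to those quoted on this page and can
# be overridden through the hypothetical EMBOSS_WRAPPER / BLAST_WRAPPER variables.

# to_fasta INPUT OUTPUT
# Convert a sequence file that seqret can read into FASTA format.
to_fasta() {
    if [ ! -r "$1" ]; then
        echo "to_fasta: cannot read input file: $1" >&2
        return 1
    fi
    "${EMBOSS_WRAPPER:-/usr/ngs/EMBOSS}" seqret "$1" "$2" -osformat fasta
}

# make_blast_db FASTA NAME TITLE LOGFILE
# Build a nucleotide BLAST database from a FASTA file.
make_blast_db() {
    # The instructions above ask for a .formatdbrc in the current directory;
    # create one only if it does not exist yet.
    if [ ! -f .formatdbrc ]; then
        printf '[NCBI]\nData=%s\n' \
            "${NCBI_DATA:-/usr/local/applications/bioinformatics/ncbi/data}" \
            > .formatdbrc
    fi
    # -o T  parse sequence IDs and build indexes
    # -p F  input is nucleotide (use T for protein)
    # -n    base name of the generated database files (assumed argument)
    # -t    title for the database, -l log file
    "${BLAST_WRAPPER:-/usr/ngs/BLAST-TOOLBOX-NCBI}" formatdb \
        -i "$1" -o T -p F -n "$2" -t "$3" -l "$4"
}
```

After sourcing the script, a call such as to_fasta my_seqs.embl my_seqs.fasta followed by make_blast_db my_seqs.fasta my_seqs "My sequences" formatdb.log would run both steps in the current directory; the file names here are placeholders.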
Contact
If you have any difficulties using the above software and databases, would like more information, or want to tell us about other software applications you would like to use, please contact the NGS Helpdesk.
Applications Support
The NGS cannot offer scientific support for applications. However, if you require further information or believe there is something wrong with the installation, please contact the NGS support centre.
Acknowledgements
Please note: When publishing work based on use of the NGS, users should acknowledge both the authors of any programs used (see the individual program web sites, or contact the authors directly) and the NGS directly using the following line:
"The authors would like to acknowledge the use of the UK National Grid Service in carrying out this work"
This line must also accompany any use of the NGS logos.


