Administration Documentation
Setting up databases is covered in detail in the Admin Guide.
There is also a short guide in German that shows how to install Kaptain and EMBOSS.
The EMBOSS distribution includes a set of small test databases, which the EMBOSS developers use to test database indexing and sequence reading.
See directories:
test/data (emrod (DNA) and swnew (protein) are in blast format)
test/embl (*.dat for EMBL format, *.ref and *.seq for gcg format)
test/pir (*.ref and *.seq for nbrf format)
test/swiss (*.dat for swissprot format, 1 file)
test/swnew (*.dat for swissprot format, 3 files)
test/wormpep (wormpep is in fasta and blast format)
If you use the emboss/emboss.default.template file to create your own emboss.default file, change the definition of emboss_tempdata at the top to point to your test directory and you can use the test databases as "tembl", "tsw" and so on. The databases contain the sequences in the program examples (see the web pages, or run the "tfm" program to see the documentation).
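The edit to emboss_tempdata described above could be scripted roughly as follows. This is only a sketch: it assumes the template defines the variable on a line of the form "SET emboss_tempdata <path>", which you should confirm against your own copy of the template. The function name set_tempdata and the sample paths are hypothetical.

```python
import re


def set_tempdata(template_text: str, test_dir: str) -> str:
    """Return template_text with the emboss_tempdata value replaced by test_dir.

    Assumption (not confirmed by this page): the template contains a line
    of the form 'SET emboss_tempdata <path>'. All other lines are kept as-is.
    """
    pattern = re.compile(
        r"^([ \t]*SET[ \t]+emboss_tempdata[ \t]+)\S.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    # A function replacement avoids backslash-escaping issues in test_dir.
    new_text, count = pattern.subn(lambda m: m.group(1) + test_dir, template_text)
    if count == 0:
        raise ValueError("no emboss_tempdata definition found in template")
    return new_text


if __name__ == "__main__":
    # Hypothetical template excerpt and test directory, for illustration only.
    sample = "# template excerpt\nSET emboss_tempdata /path/to/test\n"
    print(set_tempdata(sample, "/usr/local/emboss/test"))
```

In practice you would read emboss/emboss.default.template, pass its text through a function like this, and write the result to your own emboss.default file.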
You can also reindex these files yourself to test the dbi* programs and to test writing your own DB definitions for emboss.default.
The following data sets are available at ftp://iubio.bio.indiana.edu/biomirror-gcg/
Mar 6 22:36 Readme
May 17 12:40 emboss.default.gz
May 18 02:19 gcgdbconfigure
May 18 02:19 gcgembl (release 70, non-redundant w/ genbank)
May 18 02:18 gcggenbank1 (core genbank, release 129)
May 18 22:37 gcggenbank2 (est,gss of rel 129)
May 17 22:13 gcggenpept (release 129)
May 17 20:38 gcgpir (release 71)
May 17 20:32 gcgswissprot (release 40)
These are gzip compressed, but should otherwise drop into an EMBOSS system with minor editing of the path in the emboss.default file. EMBOSS package indices are included with each data set (total size about 60 GB uncompressed, 20 GB compressed).
This is a trial to see if those of you who support EMBOSS want such a pre-digested set of data and indices. Let Don Gilbert know if you find it useful.
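Since the downloaded data sets are gzip compressed, they need to be unpacked before use. One possible way to do that is sketched below, assuming the files have already been mirrored to a local directory. The function name gunzip_tree and the directory /data/biomirror-gcg are hypothetical, and the layout of files inside each data set is not described on this page, so the sketch simply walks the whole tree.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_tree(root: str) -> list:
    """Decompress every *.gz file under root, writing each result next to
    its compressed original. Returns the list of files written.

    The compressed originals are kept, so allow for the combined size
    (roughly 60 GB uncompressed plus 20 GB compressed for the full set).
    """
    written = []
    for gz_path in sorted(Path(root).rglob("*.gz")):
        out_path = gz_path.with_suffix("")  # drop the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream in chunks; files may be large
        written.append(out_path)
    return written


if __name__ == "__main__":
    # Hypothetical local mirror directory.
    for path in gunzip_tree("/data/biomirror-gcg"):
        print(path)
```

After unpacking, the path in the supplied emboss.default still has to be edited to match the local directory, as noted above.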