Release to BRC
RobEdwards (talk | contribs) |
Latest revision as of 09:43, 14 November 2006
How to Create and Release GFF3 files to the BRCs
We regularly release our data to BRC-central via GFF3 files. This page describes the steps to release the data.
Creating the files
Choose a machine that is up to date, and create an empty directory. For this example, we'll use the directory NMPDR.
Run the command
nmpdr2gff NMPDR
This looks through all genomes for the NMPDR flag and, where the flag is found, creates a GFF3 file using the seed2gff command. If files are not created for some genomes that should have them, check whether the NMPDR flag has been set for those genomes. To create a GFF3 file for a single organism, you can run seed2gff on just that organism.
The creation takes about 30-40 seconds per genome, so you can expect it to run for some time.
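As a rough planning aid, the per-genome figure above can be turned into a total run-time estimate. This is only a sketch: the function name and the example count of 50 genomes are illustrative assumptions, not figures from the NMPDR.

```python
def estimate_runtime_minutes(n_genomes, low_s=30, high_s=40):
    """Return (low, high) run-time estimates in minutes.

    Uses the 30-40 seconds per genome figure quoted above; n_genomes is
    however many genomes you expect to carry the NMPDR flag.
    """
    return (n_genomes * low_s / 60, n_genomes * high_s / 60)


# Hypothetical example: 50 flagged genomes.
low, high = estimate_runtime_minutes(50)
print(f"Expect roughly {low:.0f} to {high:.0f} minutes")
```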
Once complete you will have a directory structure that looks something like this (only the first two genomes are shown for each organism):
- NMPDR
  - Campylobacter
    - Campylobacter.coli.RM2228.gff3
    - Campylobacter.jejuni.subsp.jejuni.84-25.gff3
    - ...
  - Listeria
    - Listeria.innocua.Clip11262.gff3
    - Listeria.monocytogenes.EGD-e.gff3
    - ...
  - Staphylococcus
    - Staphylococcus.aureus.RF122.gff3
    - Staphylococcus.aureus.subsp.aureus.MRSA252.gff3
    - ...
  - Streptococcus
    - Streptococcus.pneumoniae.R6.gff3
    - Streptococcus.pyogenes.MGAS10270.gff3
    - ...
  - Vibrio
    - Vibrio.cholerae.MO10.gff3
    - Vibrio.cholerae.O395.gff3
    - ...
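If you suspect that some genomes were skipped, a quick per-genus count of the generated files can help you spot gaps. The following is a minimal sketch, assuming the output directory holds one subdirectory per genus with the .gff3 files directly inside it, as the listing above suggests. The function name is hypothetical and is not part of nmpdr2gff or seed2gff.

```python
from collections import Counter
from pathlib import Path


def count_gff3_by_genus(root):
    """Count the .gff3 files in each genus subdirectory of root."""
    counts = Counter()
    for path in Path(root).glob("*/*.gff3"):
        counts[path.parent.name] += 1
    return dict(counts)


if __name__ == "__main__":
    # "NMPDR" is the example output directory used on this page.
    for genus, n in sorted(count_gff3_by_genus("NMPDR").items()):
        print(f"{genus}: {n} file(s)")
```

A genus with fewer files than expected is a hint to check the NMPDR flag on the missing genomes.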
Uploading the files
Once the creation of the GFF3 files is complete, you need to use the brc-central validator to validate and upload the data to the site. The one tricky thing about this is that it requires the GO::Parser Perl module. This should be part of the standard install everywhere now, but you may run into problems if it is missing. If so, please contact Bob for help.
Use this command to validate and upload our data:
gff3_validator.pl -b NMPDR -d /path/to/directory/NMPDR -p CDS
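If the release is run from a script, the same command can be assembled programmatically. This is a hedged sketch: it uses only the -b, -d, and -p options shown above, the parameter names `brc` and `feature` are my own labels for those options, and it assumes gff3_validator.pl is on your PATH and reports failure through a non-zero exit status.

```python
import subprocess


def build_validator_command(directory, brc="NMPDR", feature="CDS"):
    """Build the argument list for the command shown above.

    Only the -b, -d and -p options that appear on this page are used.
    """
    return ["gff3_validator.pl", "-b", brc, "-d", str(directory), "-p", feature]


def run_validator(directory):
    """Run the validator; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_validator_command(directory), check=True)
```

Calling `run_validator("/path/to/directory/NMPDR")` would then be equivalent to typing the command by hand.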
Once this has completed, you should ftp to [1] and check that the files are correct. If there are problems with the validator or upload, you should email Todd Creasy at TIGR for help.
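When checking files by eye, a very basic format check can catch obvious damage such as truncated lines. The sketch below is not the brc-central validator and does not reproduce its rules. It tests only two properties from the GFF3 specification: the `##gff-version 3` header on the first line, and nine tab-separated columns on every feature line. The function name is hypothetical.

```python
def gff3_problems(lines):
    """Return a list of basic format problems found in GFF3 lines."""
    problems = []
    lines = [line.rstrip("\n") for line in lines]
    if not lines or not lines[0].startswith("##gff-version 3"):
        problems.append("missing '##gff-version 3' header on line 1")
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA") or line.startswith(">"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # blank lines, directives, and comments
        if len(line.split("\t")) != 9:
            problems.append(f"line {number}: expected 9 tab-separated columns")
    return problems
```

For a file on disk, pass the open file handle, for example `gff3_problems(open(path))`. An empty list means neither problem was found.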