Commit a112a9d2 authored by tcarver

add section on sequence and annotation file formats

parent ee5f74ad
......@@ -25,74 +25,10 @@ LINKEND="CONCEPTS-ACTIVEENTRY">) and will be the new default entry (see <XREF
LINKEND="CONCEPTS-DEFAULTENTRY">).
</PARA>
<PARA>
See <XREF LINKEND="FILETYPES">.
This function reads only the feature section of the input file; the sequence
(if any) is ignored.
</PARA>
<PARA>
&prog; can read these feature file formats:
</PARA>
<ITEMIZEDLIST>
<LISTITEM>
<PARA>
<ULINK URL="http://www.ebi.ac.uk/embl/Documentation/">EMBL or GenBank feature tables</ULINK>
</PARA>
</LISTITEM>
<LISTITEM>
<PARA>
<ULINK URL="http://www.sequenceontology.org/gff3.shtml">GFF Version 3 files</ULINK>
</PARA>
</LISTITEM>
<LISTITEM>
<PARA>
FASTA files
</PARA>
</LISTITEM>
<LISTITEM>
<PARA>
Indexed FASTA files can be read in. The files are indexed
using <ULINK URL="http://samtools.sourceforge.net/">SAMtools</ULINK>:
</PARA>
<SYNOPSIS>
samtools faidx ref.fasta
</SYNOPSIS>
</LISTITEM>
<LISTITEM>
<PARA>
The indexed FASTA can be used with an indexed GFF to overlay annotation on the sequence.
To index the GFF, first sort and bgzip the file, then run tabix with the <COMMAND>-p gff</COMMAND> option (see the
<ULINK URL="http://samtools.sourceforge.net/tabix.shtml">tabix manual</ULINK>):
</PARA>
<SYNOPSIS>
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;
tabix -p gff sorted.gff.gz
</SYNOPSIS>
<PARA>
A drop-down menu of the contig or chromosome sequences is provided in the Entry toolbar
to select the sequence. Using indexed FASTA and indexed GFF files improves memory management
and enables large genomes to be viewed. Note that because the files are indexed, the sequence and annotation are
read-only and cannot be edited. When there are many contigs to select from, it can be easier
to display the one of interest by typing its name into the drop-down list.
</PARA>
</LISTITEM>
<LISTITEM>
<PARA>
The output of <ULINK
URL="http://sonnhammer.sbc.su.se/download/software/MSPcrunch+Blixem/"><COMMAND>MSPcrunch</COMMAND></ULINK>.
<COMMAND>MSPcrunch</COMMAND> must be run with the <COMMAND>-x</COMMAND> or
<COMMAND>-d</COMMAND> flag.
</PARA>
</LISTITEM>
<LISTITEM>
<PARA>
The output of <ULINK URL="http://www.ncbi.nlm.nih.gov/blast/">blastall version
2.2.2</ULINK> or later. <COMMAND>blastall</COMMAND> must be run with the
<COMMAND>-m 8</COMMAND> flag, which generates one line of information per HSP.
Note that &prog; currently displays each BLAST HSP as a separate feature
rather than displaying each BLAST hit as a feature.
</PARA>
</LISTITEM>
</ITEMIZEDLIST>
</SECT2>
<SECT2 ID="FILEMENU-READ-ENTRY-INTO">
<TITLE>Read Entry Into</TITLE>
......