Skip to content
Snippets Groups Projects
next_gen.sgml 3.39 KiB
Newer Older
  • Learn to ignore specific revisions
  • tjc's avatar
    tjc committed
      <SECT2 ID="FILEMENU-READ-BAM">
        <TITLE>Read BAM / VCF ...</TITLE>
         <PARA>
    
    tjc's avatar
    tjc committed
    &prog; can read in and visualise BAM, VCF and BCF files. These files need
    to be indexed as described below. They require &prog; to be run with at least
    Java 1.6.
         </PARA>
         <PARA>
    BAM files need to be sorted and indexed using <ULINK
    URL="http://samtools.sourceforge.net/">samtools</ULINK>. The index file should be in
    the same directory as the BAM file. This provides an integrated
    <ULINK URL="http://bamview.sourceforge.net/">BamView</ULINK> panel in &prog;, displaying
    sequence alignment mappings to a reference sequence. Multiple BAM files can be loaded
    in from here either by selecting each file individually or by selecting a file of
    path names to the BAM files.
    
    tjc's avatar
    tjc committed
          </PARA>
          <PARA>
    Variant Call Format (<ULINK
    URL="http://1000genomes.org/wiki/doku.php?id=1000_genomes:analysis:vcf4.0">VCF</ULINK>)
    files can also be read. The VCF files need to be compressed and indexed using bgzip and
    tabix respectively (see the <ULINK URL="http://samtools.sourceforge.net/tabix.shtml">tabix manual</ULINK> and
    <ULINK URL="http://sourceforge.net/projects/samtools/files/">download page</ULINK>).
    The compressed file gets read in (e.g. file.vcf.gz) and below are the commands for
    generating this from a VCF file:
            </PARA>
            <SYNOPSIS>
            bgzip file.vcf
            tabix -p vcf file.vcf.gz
            </SYNOPSIS>
          <PARA>
    Alternatively a Binary VCF (<ULINK URL="http://samtools.sourceforge.net/mpileup.shtml">BCF</ULINK>) can
    be indexed with BCFtools and read into Artemis or ACT.
          </PARA>
            <PARA>
    As with reading in multiple BAM files, it is possible to read a number of (compressed and indexed)
    VCF files by listing their full paths in a single file. They then get displayed in separate rows
    in the VCF panel.
            </PARA>
            <MEDIAOBJECT>
            <IMAGEOBJECT>
               <IMAGEDATA FORMAT="png" FILEREF="vcf.png"></IMAGEOBJECT>
            </MEDIAOBJECT>
            <PARA>
    For single base changes the colour represents the base it is being changed to, i.e. T black,
    G blue, A green, C red. There are options available to filter the display by the different
    types of variants. Right clicking on the VCF panel will display a pop-up menu in which
    there is a 'Filter...' menu. This opens a window with check boxes for a number of varaint types and properties that can
    be used to filter on. This can be used to show and hide synonymous, non-synonymous, deletion (grey),
    insertion (yellow), and multiple allele (orange line with a circle at the top)
    variants.  In this window there is a check box to hide the variants that do not overlap CDS features.
    There is an option to mark variants that introduce stop codons (into the CDS
    features) with a circle in the middle of the line that represents the variant. There are also options
    to filter the variants by various properties such as their quality score (QUAL) or their depth across the
    samples (DP).
            </PARA>
            <PARA>
    Placing the mouse over a vertical line shows an overview  of the variation as a
    tooltip. Also right clicking over a line then gives an extra option in the pop-up
    menu to show the details for that variation in a separate window. There are also
    alternative colouring schemes. It is possible to colour the variants by whether they are
    synonymous or non-synonymous or by their quality score (the lower the quality the
    more faded the variant appears).
            </PARA>
      </SECT2>