admin.shtml

    tjc authored
    git-svn-id: svn+ssh://svn.internal.sanger.ac.uk/repos/svn/pathsoft/artemis/trunk@11153 ee4ac58c-ac51-4696-9907-e4b3aa274f04
    <!--#set var="banner" value="Artemis - Chado Admin/Developer Documentation"-->
    <!--#include virtual="/perl/header"-->
    
    
    <p>
    This page describes the development background and administration involved in
    setting up and using an Artemis or ACT connection to a Chado database.
    
    <ul>
    <li><a href="#DATABASEMANAGER">Opening the Database Manager</a>
    <li><a href="#OPENART">Opening the main Artemis/ACT window</a>
    <li><a href="#CONFIG">Artemis Chado Configuration</a>
    <li><a href="#GENEBUILDER">Opening the Standalone Gene Builder</a>
    </ul>
    
    
    
    <h3><a name="DATABASEMANAGER">Opening the Database Manager</a></h3>
    <p>
    The Artemis Database Manager is cached between sessions in the directory
    '.artemis/cache' in the user's home directory. There is an option under the File menu
    to clear this cache.
    <p>
    To open the Artemis Database Manager panel (from which the browser is launched), 
    Artemis first checks whether a cvterm with cvterm.name = 'top_level_seq' exists in the 
    cv with cv.name = 'genedb_misc'. If it does, Artemis follows method A:
    
    <ol type="A">
    <li>
    Call 'getTopLevelOrganisms' (in the Organism.xml mapping). This relies on the source 
    features (e.g. chromosome) having a featureprop with a type_id corresponding to 'top_level_seq'.
    <p>
    If 'top_level_seq' is not implemented in the database, Artemis instead follows method B:
    <p>
    </li>
    
    <li>
    Call 'getOrganismsContainingSrcFeatures' (in the Organism.xml mapping). This searches for 
    organisms that contain sequences that have residues and a type_id 
    corresponding to a cvterm name that matches one of:
    <p>
    *chromosome*, *sequence*, supercontig, ultra_scaffold, golden_path_region, or contig
    <p>
    </li>
    </ol>
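<p>
The choice between the two methods, and the name matching used by method B, can be
sketched as follows. This is only an illustration in Python, not the Artemis code: the
real lookups are SQL statements in Organism.xml, the function names here are hypothetical,
and it assumes that '*' in the list above stands for a wildcard.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list above; '*' is assumed to be a wildcard.
SOURCE_FEATURE_PATTERNS = [
    "*chromosome*", "*sequence*", "supercontig",
    "ultra_scaffold", "golden_path_region", "contig",
]


def choose_lookup(has_top_level_seq):
    """Return the statement ID to use: method A if the cvterm exists, else method B."""
    if has_top_level_seq:
        return "getTopLevelOrganisms"
    return "getOrganismsContainingSrcFeatures"


def is_source_feature_type(cvterm_name):
    """Return True if a cvterm name matches any of the method B patterns."""
    return any(fnmatchcase(cvterm_name, p) for p in SOURCE_FEATURE_PATTERNS)
```
<p>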
    Once the organisms with source features have been identified, they are displayed. When 
    a user clicks on an organism, Artemis opens that node and finds the types (e.g. chromosome, contig) of source 
    features and the underlying features that have residues (getResidueFeatures in Feature.xml).
    <p>
    <h3><a name="OPENART">Opening the main Artemis/ACT window</a></h3>
    <p>
    The organismprops are loaded lazily when a sequence is opened. If an organismprop 
    is of type 'translationTable', its value is used as the 
    translation table when Artemis opens a sequence from that organism.
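<p>
As a rough sketch of that lookup (hypothetical Python, not the Artemis implementation),
assume the organismprops arrive as (type, value) pairs and that table 1, the standard
genetic code, is used when no 'translationTable' property is present. Both the data
shape and the fallback are assumptions made for illustration.

```python
# Assumption: fall back to the standard genetic code when nothing is set.
DEFAULT_TRANSLATION_TABLE = 1


def translation_table(organismprops):
    """Return the translation table number from (type, value) organismprop pairs."""
    for prop_type, value in organismprops:
        if prop_type == "translationTable":
            return int(value)
    return DEFAULT_TRANSLATION_TABLE
```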
    <p>
    When a sequence is double-clicked to open it in Artemis, most of the data for that sequence
    is read from the database. The iBatis statements called when reading an entry are
    summarised below.
    <p>
    <table border=1 cellspacing=1 cellpadding=2>
    <tr><td>Statement ID</td><td>SQL Mapping File</td><td>Description</td></tr>
    <tr><td>getFeature</td><td>(Feature.xml)</td>
        <td>Retrieves all the features and their featurelocs, featureprops, feature_relationships and primary dbxref.</td></tr>
    <tr><td>getFeatureDbXRefsBySrcFeature</td><td>(FeatureDbXRef.xml)</td>
        <td>Retrieves all secondary dbxrefs.</td></tr>
    <tr><td>getFeatureSynonymsBySrcFeature</td><td>(FeatureSynonym.xml)</td>
        <td>Retrieves feature synonyms.</td></tr>
    <tr><td>getFeatureCvTermsBySrcFeature</td><td>(FeatureCvTerm.xml)</td>
        <td>Retrieves feature_cvterms and feature_cvtermprops (evidence code, extra qualifiers, date).</td></tr>
    <tr><td>getFeatureCvTermDbXRefBySrcFeature</td><td>(FeatureCvTermDbXRef.xml)</td>
        <td>Retrieves feature_cvterm_dbxrefs (WITH/FROM column).</td></tr>
    <tr><td>getFeatureCvTermPubBySrcFeature</td><td>(FeatureCvTermPub.xml)</td>
        <td>Retrieves feature_cvterm_pubs.</td></tr>
    </table>
    <p>
    Artemis constructs an internal GFF3 stream from these calls for the selected sequence. 
    The stream is then read in the same way as a GFF3 file, as an Artemis DatabaseDocumentEntry
    (which extends GFFDocumentEntry), creating GFFStreamFeatures.
    <p>
    If the lazy load option is selected from the Database Manager's File menu, then only
    getFeature is called. The resulting GFFStreamFeature object is marked as lazy loading
    and FeatureDbXRefs, FeatureSynonyms, FeatureCvTerms, FeatureCvTermDbXRefs
    and FeatureCvTermPubs are read from the database for a feature when the Gene Builder is opened.
    <p>
    The feature_relationship (from getFeature) is used to create the gene hierarchy; 'part_of'
    and 'derives_from' relationships become Parent and Derives_from in GFF3 terms. If the 
    feature_relationship type_id does not correspond to one of the terms derives_from, 
    part_of, proper_part_of, partof or producedby, then the object_id is recorded as a qualifier 
    value. This is used to read orthologous_to and paralogous_to relations. The qualifier 
    values for these are stored lazily (as ClusterLazyQualifierValue.java). When Artemis 
    displays these qualifiers in the Gene Builder, it queries the database further to 
    list the related genes.
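<p>
The mapping described above might be sketched like this. It is a hypothetical Python
illustration rather than the Artemis code. The text does not say how proper_part_of,
partof and producedby are treated, so grouping the first two with part_of and the last
with derives_from is an assumption, as is using the relationship name as the qualifier name.

```python
# Assumption: the part_of variants map to Parent and producedby to Derives_from.
PARENT_TYPES = {"part_of", "proper_part_of", "partof"}
DERIVES_FROM_TYPES = {"derives_from", "producedby"}


def gff3_attribute(relationship_type, object_name):
    """Map a feature_relationship to a (GFF3 attribute or qualifier name, value) pair."""
    if relationship_type in PARENT_TYPES:
        return ("Parent", object_name)
    if relationship_type in DERIVES_FROM_TYPES:
        return ("Derives_from", object_name)
    # Anything else (e.g. orthologous_to) is kept as an ordinary qualifier value.
    return (relationship_type, object_name)
```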
    <p>
    Other properties that have a featureloc association with a feature
    are found by calling getLazySimilarityMatches (Feature.xml). From the result, Artemis
    constructs lazy loading qualifiers (QualifierLazyLoading.java), which query
    the database further only when the qualifier is needed. This is used for
    blast/fasta similarity and polypeptide_domains.
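<p>
The general idea of a lazily loaded qualifier value can be shown with a small sketch.
This is a generic Python illustration of deferred loading, not a port of
QualifierLazyLoading.java; the class name and the loader callback are hypothetical.

```python
class LazyQualifierValue:
    """Holds a qualifier value that is fetched only on first access."""

    def __init__(self, loader):
        self._loader = loader   # callable that performs the (expensive) database query
        self._value = None
        self._loaded = False

    def get(self):
        # Run the query the first time the value is needed, then reuse the result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value
```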
    
    <p>
    The gene hierarchy is stored internally by the ChadoCanonicalGene.java object and is based
    on the Parent/Derives_from relationships. It stores the related children of the gene.
    The spliced features (exon, pseudogenic_exon) are combined into a single Artemis 
    Feature. The joined exons become an Artemis CDS feature (GFFStreamFeature), which stores 
    the uniquenames of the original exons in the database.
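<p>
A minimal sketch of how separate exons could be combined into one joined feature is
given below. It is hypothetical Python, not the ChadoCanonicalGene.java logic: it assumes
forward-strand exons with 1-based inclusive coordinates, and the Exon type and the
dictionary layout are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Exon:
    uniquename: str
    start: int  # 1-based, inclusive
    end: int


def combine_exons(exons):
    """Merge exons into one CDS-like record that remembers the exon uniquenames."""
    ordered = sorted(exons, key=lambda e: e.start)
    ranges = ",".join(f"{e.start}..{e.end}" for e in ordered)
    # A single exon needs no join(); several exons become one joined location.
    location = f"join({ranges})" if len(ordered) > 1 else ranges
    return {
        "key": "CDS",
        "location": location,
        "exon_uniquenames": [e.uniquename for e in ordered],
    }
```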
    
    <p>
    <h3><a name="CONFIG">Artemis Chado Configuration</a></h3>
    
    The following extract from the Artemis options file shows the Chado-related options:
    
    <p>
    <pre style="color: #0000FF;">
    #
    # CHADO DATABASE OPTIONS 
    #
    # chado gene model features default types
    chado_exon_model=CDS
    #chado_transcript=transcript
    
    # infer CDS and UTR features from gene model
    chado_infer_CDS_UTR=no
    
    # provide a list of available servers
    chado_servers = \
      workshop localhost:10101/workshop?user \
      GeneDB db.genedb.org:5432/snapshot?genedb_ro
    
    # define how product qualifiers are stored (as a cv or as a featureprop)
    product_cv=yes
    product_cvname = genedb_products
    # cv containing synonym names
    synonym_cvname = genedb_synonym_type
    
    # set default delete behaviour to make things obsolete, if
    # this is not provided the default is to permanently delete
    set_obsolete_on_delete=yes
    
    # list of features to record residues for in the database
    # - these are included when inserting or updating their featurelocs
    sequence_update_features = polypeptide mRNA rRNA tRNA snRNA snoRNA
    </pre>
    
    <p>
    By default, Artemis combines the exons stored in Chado and describes them as a single 'CDS' 
    feature. The <b>chado_exon_model</b> flag in the options file 
    allows this to be changed. 
    <p>
    When a gene model is created in Artemis, the transcript is created as an 'mRNA' 
    feature by default. The <b>chado_transcript</b> flag in the options file allows this 
    to be changed.
    <p>
    The default gene model representation for Artemis is described in the <a href="overview.shtml#GENE">
    overview</a>. In this representation the UTRs are explicitly created in the database.
    However, the GMOD loader (gmod_bulk_load_gff3.pl) does not create the UTRs; they can 
    instead be inferred from the exon and protein features. If the GMOD loader is used, Artemis
    can infer the CDS and UTR features when <b>chado_infer_CDS_UTR=yes</b> is set in the 
    options file.
    <p>
    A list of available databases can be configured in the options file with the <b>chado_servers</b> flag.
    For each database an alias is given, followed by its location (host:port/database?user). Each alias
    is displayed in a drop-down menu in the login box.
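<p>
To make the alias/location format concrete, here is a hedged sketch of how such a value
could be split up. It is illustrative Python only; Artemis itself reads this option in
Java, and the ChadoServer type and function names are hypothetical. It assumes the
continuation lines have already been joined into one whitespace-separated string.

```python
from typing import NamedTuple


class ChadoServer(NamedTuple):
    host: str
    port: int
    database: str
    user: str


def parse_location(location):
    """Split a 'host:port/database?user' string into its parts."""
    host_port, slash, rest = location.partition("/")
    host, colon, port = host_port.partition(":")
    database, qmark, user = rest.partition("?")
    if not (slash and colon and qmark and host and port.isdigit() and database):
        raise ValueError(f"not in host:port/database?user form: {location!r}")
    return ChadoServer(host, int(port), database, user)


def parse_chado_servers(value):
    """Map each alias to its parsed location from alternating alias/location tokens."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alias/location pairs")
    return {alias: parse_location(loc) for alias, loc in zip(tokens[0::2], tokens[1::2])}
```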
    <p>
    If product qualifiers are stored as an ontology (in cvterm), set <b>product_cv=yes</b>
    and set <b>product_cvname</b> to the name of the controlled vocabulary (cv) used in Chado.
    <p>
    When features are deleted in Artemis, setting <b>set_obsolete_on_delete=yes</b> makes these
    features obsolete rather than permanently deleting them from the database. If the option
    is not provided, the default is to delete them permanently.
    
    <a name="GENEBUILDER"></a><h3>Opening the Standalone Gene Builder</h3>
    <p>
    The Gene Builder can be launched on its own without opening up Artemis. The following
    opens up a window which lets you type in a gene name to be opened:
    
     <pre>
     java -mx500m -Dibatis -Dchado="localhost:5432/database?" \
          -Djdbc.drivers=org.postgresql.Driver -classpath artemis.jar \
          uk.ac.sanger.artemis.components.genebuilder.GeneEdit
     </pre>
    
    Alternatively the gene name can be given as an argument:
     <pre>
     java -mx500m -Dibatis -Dchado="db.genedb.org:5432/snapshot?genedb_ro" \
          -Djdbc.drivers=org.postgresql.Driver -Dshow_log -Dread_only \
          -classpath jar_build/artemis.jar:etc \
           uk.ac.sanger.artemis.components.genebuilder.GeneEdit PFA0010c
     </pre>