Classification hub
A classification hub is a kind of BioHub that is used to classify genes into several groups. This BioHub type is defined in the biouml.plugins.enrichment plugin and is used by two analysis methods: Functional classification and Enrichment analysis. Currently only Ensembl genes are accepted for classification via classification hubs. If you have another type of identifier, consider using the Convert table analysis first.
An example of a classification hub is FunctionalGOHub, which classifies the supplied genes by Gene Ontology categories.
Implementation details
To mark a BioHub as a classification hub, its BioHub.getPriority method should return a positive priority when the "FunctionalClassification" CollectionRecord is supplied. The easiest way to implement such a hub is to subclass the SqlCachedFunctionalHubSupport abstract class.
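The priority rule can be illustrated with a minimal sketch. It does not use the real BioUML types: DemoCollectionRecord and DemoClassificationHub are hypothetical stand-ins, the method signature is simplified, and both the value 10 and the fallback value 0 are assumptions chosen only to show a positive versus non-positive result.

```java
import java.util.List;

// Simplified stand-in for BioUML's CollectionRecord; only the name is modelled.
final class DemoCollectionRecord {
    final String name;

    DemoCollectionRecord(String name) {
        this.name = name;
    }
}

// Hypothetical hub that shows only the priority rule described above.
final class DemoClassificationHub {
    static final String FUNCTIONAL_CLASSIFICATION = "FunctionalClassification";

    // Returns a positive priority if the marker record is supplied, 0 otherwise.
    int getPriority(List<DemoCollectionRecord> suppliedRecords) {
        for (DemoCollectionRecord record : suppliedRecords) {
            if (FUNCTIONAL_CLASSIFICATION.equals(record.name)) {
                return 10; // assumed value; any positive number marks a classification hub
            }
        }
        return 0; // assumed value for "this hub does not handle the request"
    }
}
```

In a real hub the check would be made against the options object that BioUML passes to BioHub.getPriority, so the loop above should be read as an outline of the decision rather than as the actual API.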
When a classification hub is requested to provide the matching, it must also provide some additional information: the total number of Ensembl genes participating in the classification, the total number of genes in each individual group, and the number of input genes that participate in the classification. Additional group annotation fields can also be added; these fields will appear in the analysis output table.
When the user selects "Repository folder" as the classification hub, RepositoryHub is used.
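The three counts mentioned above can be derived from the group memberships and the input set, as in the following sketch. ClassificationCounts is a hypothetical helper, not a BioUML class, and it assumes that a gene "participates in the classification" when it belongs to at least one group.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that computes the counts a classification hub reports.
final class ClassificationCounts {
    final int totalGenes;       // distinct genes that belong to at least one group
    final int classifiedInput;  // input genes that belong to at least one group
    final Map<String, Integer> groupSizes = new LinkedHashMap<>(); // genes per group

    ClassificationCounts(Map<String, Set<String>> groups, Set<String> input) {
        Set<String> universe = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : groups.entrySet()) {
            universe.addAll(entry.getValue());
            groupSizes.put(entry.getKey(), entry.getValue().size());
        }
        totalGenes = universe.size();

        // Keep only the input genes that occur in some group.
        Set<String> classified = new HashSet<>(input);
        classified.retainAll(universe);
        classifiedInput = classified.size();
    }
}
```

Note that a gene shared by two groups is counted once in totalGenes but contributes to the size of both groups, so the group sizes do not necessarily sum to the total.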
Implementation via SqlCachedFunctionalHubSupport
The easiest way to implement your own classification hub is to subclass SqlCachedFunctionalHubSupport. A MySQL database must be available: on first use, this class automatically creates a table there with the classification information. You will have to implement the following methods:
- getTableName(): must return the name of the SQL table which will be used to store the classification information.
- getInputReferenceType(): must return a reference type object which represents the input identifiers for your classification. If this reference type differs from Ensembl genes, the available matching BioHubs will be used to convert your identifiers.
- getGroups(): must return a collection of Group objects. A Group object contains the group title, the group accession number and the list of group member identifiers, which must belong to the reference type returned by getInputReferenceType(). This method is called only once, when the SQL table has not been created yet. During subsequent runs the data stored in the SQL table is used. If you want to update the classification, you have to delete the table manually and restart BioUML.
- annotateElement(Element) (optional): adds custom annotation fields to the category element.
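The shape of such a subclass is sketched below. The example does not depend on BioUML: DemoCachedHubSupport stands in for SqlCachedFunctionalHubSupport and keeps the groups in memory instead of a MySQL table, DemoGroup stands in for Group, and the reference type is a plain String rather than a reference type object. The table name, accessions and gene identifiers are made up, and loadGroups() is a hypothetical method that exists only to show the "build once, then reuse" behaviour.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Stand-in for the Group object: accession number, title and member identifiers.
final class DemoGroup {
    final String accession;
    final String title;
    final List<String> members;

    DemoGroup(String accession, String title, List<String> members) {
        this.accession = accession;
        this.title = title;
        this.members = members;
    }
}

// Stand-in for SqlCachedFunctionalHubSupport. An in-memory field plays the
// role of the SQL table that the real class creates on first use.
abstract class DemoCachedHubSupport {
    private Collection<DemoGroup> cachedGroups;

    protected abstract String getTableName();

    // The real method returns a reference type object; a String is used here.
    protected abstract String getInputReferenceType();

    protected abstract Collection<DemoGroup> getGroups();

    // Calls getGroups() only while the "table" does not exist yet.
    final Collection<DemoGroup> loadGroups() {
        if (cachedGroups == null) {
            cachedGroups = getGroups();
        }
        return cachedGroups;
    }
}

// Hypothetical hub with two made-up groups.
final class DemoPathwayHub extends DemoCachedHubSupport {
    int getGroupsCalls = 0; // counts how often the groups are rebuilt

    @Override
    protected String getTableName() {
        return "demo_pathway_classification";
    }

    @Override
    protected String getInputReferenceType() {
        return "demo-ensembl-gene"; // placeholder, not a real reference type name
    }

    @Override
    protected Collection<DemoGroup> getGroups() {
        getGroupsCalls++;
        return Arrays.asList(
            new DemoGroup("DEMO:0001", "Demo group A", Arrays.asList("ENSG_DEMO_1", "ENSG_DEMO_2")),
            new DemoGroup("DEMO:0002", "Demo group B", Arrays.asList("ENSG_DEMO_2", "ENSG_DEMO_3")));
    }
}
```

Calling loadGroups() twice on a DemoPathwayHub invokes getGroups() only once, which mirrors why the real table has to be deleted before an updated classification is picked up. The optional annotateElement(Element) hook is not modelled here.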