Difference between revisions of "Gene expression prediction"
From BioUML platform
(Created page with " {| class="wikitable" !Method, code, references!!Input data!!Algorithm!!Comment |- |INVOKE (R script)<cite>Schmidt217</cite> https://github.com/SchulzLab/TEPIC/tree/master/Mac...") |
|||
(4 intermediate revisions by one user not shown) | |||
Line 1: | Line 1: | ||
{| class="wikitable" | {| class="wikitable" | ||
− | !Method, code, references!!Input data!!Algorithm!!Comment | + | !Method, code, references!!Input data!!Algorithm!!Accuracy!!Comment |
|- | |- | ||
− | |INVOKE (R script)<cite> | + | |INVOKE (R script)<cite>Schmidt2017</cite> |
https://github.com/SchulzLab/TEPIC/tree/master/MachineLearningPipelines/INVOKE | https://github.com/SchulzLab/TEPIC/tree/master/MachineLearningPipelines/INVOKE | ||
| | | | ||
Line 20: | Line 20: | ||
INVOKE offers linear regression with various regularisation techniques (Lasso, Ridge, Elastic net) to infer potentially important transcriptional regulators by predicting gene expression from TEPIC TF-gene scores. | INVOKE offers linear regression with various regularisation techniques (Lasso, Ridge, Elastic net) to infer potentially important transcriptional regulators by predicting gene expression from TEPIC TF-gene scores. | ||
| | | | ||
+ | HepG2 - r=0.68, | ||
+ | <br>K562 - r=0.68, | ||
+ | <br>GM12878 - r=0.58 | ||
+ | | | ||
+ | |||
+ | |||
+ | |- | ||
+ | |PECA - paired expression and chromatin accessibility (MATLAB)<cite>Duren2017</cite> | ||
+ | http://web.stanford.edu/~zduren/PECA/ | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |||
+ | |||
+ | |- | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |||
+ | |||
+ | |- | ||
+ | |2009 - an approach based on feature extraction of ChIP-Seq signals, principal component analysis, and regression-based component selection <cite>Ouyang2009</cite> | ||
+ | |Input: | ||
+ | * ChIP-seq data | ||
+ | * expression data (RNA-seq) | ||
+ | Output: | ||
+ | * log-linear regression model | ||
+ | * principal components with weights of corresponding TFs | ||
+ | | | ||
+ | * for each TF and each gene, compute a TF association strength (TFAS): the weighted sum of the corresponding ChIP-Seq signal strengths, where the weights reflect the proximity of the signal to the gene | ||
+ | * principal component analysis (PCA) extracts uncorrelated characteristic patterns in the TFAS vectors | ||
+ | * the centered and standardized TFAS matrix A is decomposed by singular value decomposition (SVD) | ||
+ | * regression-based component selection | ||
+ | * gene expression is modeled by a log-linear regression model | ||
+ | |mouse ESCs, r=0.806, R<sup>2</sup>=0.65, CV-R<sup>2</sup>=0.64 | ||
+ | | | ||
+ | |||
|} | |} | ||
Line 27: | Line 67: | ||
<biblio> | <biblio> | ||
− | # | + | #Schmidt2017 pmid=27899623 |
− | + | #Duren2017 pmid=28576882 | |
− | + | #Ouyang2009 pmid=19995984 | |
+ | |||
+ | </biblio> |
Latest revision as of 22:09, 1 April 2018
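Method, code, references | Input data | Algorithm | Accuracy | Comment |
---|---|---|---|---|
INVOKE (R script)[1] https://github.com/SchulzLab/TEPIC/tree/master/MachineLearningPipelines/INVOKE | Input: Output: | INVOKE offers linear regression with various regularisation techniques (Lasso, Ridge, Elastic net) to infer potentially important transcriptional regulators by predicting gene expression from TEPIC TF-gene scores. | HepG2 - r=0.68; K562 - r=0.68; GM12878 - r=0.58 | |
PECA - paired expression and chromatin accessibility (MATLAB)[2] http://web.stanford.edu/~zduren/PECA/ | | | | |
 | | | | |
2009 - an approach based on feature extraction of ChIP-Seq signals, principal component analysis, and regression-based component selection [3] | Input: ChIP-seq data; expression data (RNA-seq). Output: log-linear regression model; principal components with weights of corresponding TFs. | (1) For each TF and each gene, compute a TF association strength (TFAS): the weighted sum of the corresponding ChIP-Seq signal strengths, where the weights reflect the proximity of the signal to the gene. (2) Principal component analysis (PCA) extracts uncorrelated characteristic patterns in the TFAS vectors. (3) The centered and standardized TFAS matrix A is decomposed by singular value decomposition (SVD). (4) Regression-based component selection. (5) Gene expression is modeled by a log-linear regression model. | mouse ESCs, r=0.806, R²=0.65, CV-R²=0.64 | |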
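
The first step of the 2009 approach [3], the TF association strength, can be illustrated with a minimal sketch. The table only says that the weights reflect the proximity of the signal to the gene; the exponential decay used here, the decay length `d0`, the function name `tfas`, and all toy values are assumptions made for illustration, not the published implementation. The later steps (PCA/SVD, component selection, log-linear regression) are not shown.

```python
import math

def tfas(peaks, tss, d0=5000.0):
    """Weighted sum of ChIP-seq peak intensities near one gene.

    peaks: iterable of (position_bp, intensity) pairs for one TF.
    tss:   transcription start site of the gene, in bp.
    d0:    decay length in bp (hypothetical default, an assumption).
    """
    # Assumed weighting: each peak contributes less the farther it is from the TSS.
    return sum(g * math.exp(-abs(pos - tss) / d0) for pos, g in peaks)

# Toy data: two TFs, two genes (all values invented for illustration).
peaks_by_tf = {
    "TF_A": [(1000, 10.0), (6000, 4.0)],
    "TF_B": [(1000, 2.0)],
}
tss_by_gene = {"gene1": 1000, "gene2": 6000}

# TFAS matrix: one row per gene, one column per TF.
tfas_matrix = {
    gene: {tf: tfas(peaks, tss) for tf, peaks in peaks_by_tf.items()}
    for gene, tss in tss_by_gene.items()
}
```

Each row of `tfas_matrix` is the TFAS vector of one gene; in the described approach such a matrix would then be centered, standardized, and passed to PCA.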
References
[1] Schmidt2017, PMID 27899623
[2] Duren2017, PMID 28576882
[3] Ouyang2009, PMID 19995984