correlation analysis
3 months ago
rheab1230

Hello everyone, I have two datasets. One is normalised_counts.txt, which is the result of a DESeq2 analysis, and the other is a PrediXcan file, which has counts for each gene across all chromosomes. I have to do a Pearson correlation to figure out how many genes in our sample are present in the normalised_counts file. The problem is that the normalised_counts file was created from RNA-seq BAM files for British ancestry only, whereas our PrediXcan file was run on a VCF file covering all ancestries. So my question is: if I do the correlation analysis, won't this affect my results, since one file has only one ancestry and the other has all of them?

3 months ago

I have to do a Pearson correlation to figure out how many genes in our sample are present in the normalised_counts file.

Taking this statement in its literal sense, and ignoring anything to do with ancestry, you do not need a correlation analysis to determine this. Instead, you just need simple filtering criteria with base R functions such as which(), match(), merge(), and so on.
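As a rough sketch of that kind of filtering, assuming both tables have been read into data frames that share a gene identifier column: the column names (gene_id, sample_A), the gene IDs, and the values below are made-up placeholders, not taken from your files.

```r
# Hypothetical toy data; real tables would be loaded with read.table() or similar.
normalised_counts <- data.frame(
  gene_id  = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  sample_A = c(10.5, 0.0, 33.2, 7.1),
  stringsAsFactors = FALSE
)
predixcan <- data.frame(
  gene_id  = c("ENSG02", "ENSG04", "ENSG05"),
  sample_A = c(0.12, -0.40, 0.88),
  stringsAsFactors = FALSE
)

# %in% gives a logical vector: TRUE where a PrediXcan gene also occurs
# in the normalised counts table.
is_shared    <- predixcan$gene_id %in% normalised_counts$gene_id
shared_genes <- predixcan$gene_id[is_shared]
n_shared     <- sum(is_shared)  # number of genes present in both files

# match() returns the row positions of the shared genes in normalised_counts,
# so the subset comes out in the same order as shared_genes.
idx           <- match(shared_genes, normalised_counts$gene_id)
counts_shared <- normalised_counts[idx, ]

# merge() performs an inner join on gene_id by default; the suffixes
# distinguish the identically named value columns from the two tables.
merged <- merge(normalised_counts, predixcan,
                by = "gene_id",
                suffixes = c("_observed", "_predicted"))
```

Here n_shared answers the "how many genes" question directly, while merged keeps the two sets of values side by side in case you want to compare them afterwards.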

Yes, I was able to do this. I used the %in% operator to extract those samples. Thank you.

Very good

