Question

correlation analysis

2.5 years ago

rheab1230 ▴ 140

Hello everyone, I have two datasets. One is normalised_counts.txt, which contains the results of a DESeq2 analysis, and the other is a PrediXcan file, which has counts for each gene across all chromosomes. I have to do Pearson correlation to figure out how many genes in our sample are present in the normalised_counts file. The problem is that the normalised_counts file was created from RNA-seq BAM files for British ancestry only, whereas our PrediXcan file was run on a VCF file covering all ancestries. So my question is: if I do the correlation analysis, won't this affect my results, since one file has only one ancestry and the other has all of them?

correlation vcf genes. • 842 views

updated 2.5 years ago by Kevin Blighe 87k • written 2.5 years ago by rheab1230 ▴ 140

score 1 · Answer 1 · 2021-10-20

2.5 years ago

Kevin Blighe 87k

"I have to do Pearson correlation to figure out how many genes in our sample are present in the normalised_counts file."

Taking this statement in its literal sense (and ignoring anything to do with ancestry), you do not need to do a correlation analysis to determine this. Instead, you just need simple filtering with base R functions such as which(), match(), and merge().
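
As a minimal sketch of that kind of filtering, assuming both tables carry a gene identifier column: the data frames `observed` and `predicted`, the column name `gene`, and all values below are made-up stand-ins for the normalised counts and the PrediXcan output, not the real files.

```r
# Toy stand-ins for the two tables; names and values are hypothetical.
observed <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  count = c(10.5, 0.0, 33.2, 7.1),
  stringsAsFactors = FALSE
)
predicted <- data.frame(
  gene = c("geneC", "geneE", "geneA"),
  expression = c(0.12, -0.40, 0.85),
  stringsAsFactors = FALSE
)

# %in% gives a logical vector: is each predicted gene in the observed table?
is_shared <- predicted$gene %in% observed$gene
n_shared <- sum(is_shared)

# which() turns that logical vector into row indices of `predicted`.
shared_rows <- which(is_shared)

# match() gives the row of `observed` for each predicted gene (NA if absent).
observed_row <- match(predicted$gene, observed$gene)

# merge() keeps only the genes found in both tables (inner join by default).
both <- merge(observed, predicted, by = "gene")
```

Here `n_shared` answers the "how many genes" question directly, while `both` is the joined table to carry forward if the shared genes are needed for anything further.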

Yes, I was able to do this. I used the %in% operator to extract those samples. Thank you.
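
For anyone following along, a rough sketch of what that sample extraction might look like, followed by a per-gene Pearson correlation on the shared samples. It assumes both tables are genes-by-samples matrices with the same gene rows; the matrix names, sample IDs, and values are all hypothetical.

```r
# Toy genes-by-samples matrices; every name and value here is made up.
observed <- matrix(
  c(1, 2, 3, 4,
    4, 3, 2, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3", "s4"))
)
predicted <- matrix(
  c(9, 2, 4, 6,
    9, 1, 2, 3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s9", "s2", "s3", "s4"))
)

# Keep only the samples that appear in both matrices.
shared <- colnames(predicted)[colnames(predicted) %in% colnames(observed)]

# Index by name so the columns are in the same order in both matrices.
obs <- observed[, shared, drop = FALSE]
pred <- predicted[, shared, drop = FALSE]

# Pearson correlation between observed and predicted values, one per gene.
r <- sapply(rownames(obs), function(g) {
  cor(obs[g, ], pred[g, ], method = "pearson")
})
```

Restricting both matrices to `shared` means the correlation is computed only on individuals present in both datasets, which is one way to keep the comparison like-for-like.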