How can I compare my RNA-seq files to find overlapping reads?
6.1 years ago
star ▴ 350

I have 3 RNA-seq files (File1, File2 and File3) and I would like to compare them to find the overlapping and unique reads between them. I think File1 is the merge of File2 and File3, but I am not sure, so I decided to compare the three files to see whether any reads are unique to one of them.

For each sample I have the .fastq file (raw data), the .bam file (after alignment) and the count table derived from them. At which step is it best to do the comparison, and how can I do it?

I have also checked the number of reads before and after alignment, as well as the number of mapped reads, and found that File2 and File3 combined are slightly larger than File1.

        Number of reads   Number of mapped reads   File size
File1          10403419                 10294966      1.8 GB
File2           5539406                  5487472    944.4 MB
File3           5517327                  5466102    940.7 MB
RNA-Seq genome sequencing bam fastq • 883 views

You should compare them after aligning. Have a look at bedtools intersect and bedtools subtract.


File 1 reads =/= File 2 reads + File 3 reads, if the numbers above are correct: 5539406 + 5517327 = 11056733, which is 653314 more than the 10403419 reads in File 1. So, at a minimum, simple addition does not explain File 1.

If you suspect that the reads in File 2 and File 3 have somehow been combined into File 1, you can extract a subset of read headers from File 2 and File 3 and check whether they are present in File 1 (raw data). Comparing sequence or count data does not make much sense, since at that level a read is not assignable to a particular file.
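The header check described above could be sketched roughly as follows in Python. This is only an illustration under some assumptions: the FASTQ files use strict 4-line records (no wrapped sequence lines), the read ID is the first whitespace-delimited token of each header, and the function names `iter_read_ids` and `count_found` and the default subset size are made up for this example.

```python
import gzip
from itertools import islice


def iter_read_ids(fastq_path):
    """Yield read IDs from a FASTQ file (plain text or .gz).

    Assumes strict 4-line records, so every 4th line is a header.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0 and line.strip():
                # Drop the leading '@' and anything after the first
                # whitespace (e.g. an Illumina "1:N:0:..." comment).
                yield line[1:].split()[0]


def count_found(subset_path, reference_path, n=100_000):
    """Take the first n read IDs from subset_path and count how many
    of them also occur in reference_path.

    Returns (number_found, subset_size).
    """
    subset = set(islice(iter_read_ids(subset_path), n))
    # Stream the reference file so only the subset is held in memory.
    found = {rid for rid in iter_read_ids(reference_path) if rid in subset}
    return len(found), len(subset)
```

For example, `count_found("File2.fastq", "File1.fastq")` (file names assumed) returns how many of the first 100000 headers of File 2 also appear in File 1. If nearly all of them are found for both File 2 and File 3, File 1 probably contains their reads; if none are found, the files are likely unrelated. For paired-end data with `/1` and `/2` suffixes, the IDs would need to be normalised first.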
