RSEM error: The adjacent two lines do not represent the two mates of a paired-end read!
3.1 years ago
lluc.cabus ▴ 20

Hi, I'm running a script that uses STAR, umi_tools dedup and RSEM. RSEM runs fine when I skip the deduplication, but when I run it on the deduplicated output it fails with this error:

Read ST-E00114:1178:HFL75CCX2:8:2209:25652:18291_CGATTAATAT: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)

I tried using convert-sam-for-rsem, but the result is the same. Do you know a way to solve this?

Thanks

rna alignment aligner • 2.4k views
3.1 years ago

It looks like the dedup step wants the bam sorted by name, not by coordinate. Can you eyeball the file to see if it looks like it's sorted by name? Is it possible that you filtered away some reads before mapping, so you have orphans in there?
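Rather than eyeballing the file, the adjacency that the RSEM error message complains about can be checked with a short script. This is only a rough sketch, assuming every record is from a paired-end read and that the alignments are available as SAM text lines (for example the output of samtools view). The function name find_unpaired_adjacent is made up for illustration, and the check is a simplification of whatever RSEM does internally.

```python
from typing import Iterable, List, Tuple

FIRST_MATE = 0x40   # SAM FLAG bit: first segment in the template
SECOND_MATE = 0x80  # SAM FLAG bit: last segment in the template


def find_unpaired_adjacent(sam_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (record_index, read_name) for records not followed by their mate.

    Hypothetical helper: walks the records two at a time and reports every
    record whose neighbour is not the other mate of the same read name.
    """
    records = []
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        records.append((fields[0], int(fields[1])))

    problems = []
    i = 0
    while i < len(records):
        name, flag = records[i]
        if i + 1 >= len(records):
            # Last record has no neighbour at all.
            problems.append((i, name))
            break
        next_name, next_flag = records[i + 1]
        mate_bits = {flag & (FIRST_MATE | SECOND_MATE),
                     next_flag & (FIRST_MATE | SECOND_MATE)}
        if name == next_name and mate_bits == {FIRST_MATE, SECOND_MATE}:
            i += 2  # a proper adjacent pair, move past both records
        else:
            problems.append((i, name))
            i += 1
    return problems


example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100",
    "r1\t147\tchr1\t300",
    "r2\t99\tchr1\t500",   # mate missing
    "r3\t99\tchr1\t700",
    "r3\t147\tchr1\t900",
]
print(find_unpaired_adjacent(example))
```

If the list comes back empty for the file before deduplication but not for the file after it, that would point at the dedup step (or at the sorting done after it) rather than at the mapping.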


Yes, it is sorted by name, but I also tried sorting by coordinate and the result is the same. Before mapping, the FASTQ files had the same number of reads (so I assume there were no orphans), and RSEM works perfectly if I skip the deduplication step. So I suspect the problem is in the deduplication step, and that it generates some orphans. I ran umi_tools dedup with the following parameters:

--paired --unpaired-reads=discard --chimeric-pairs=discard

So in theory the output shouldn't contain unpaired reads.


It's not at all clear that those command options will affect reads that started out as properly paired if the deduplication itself breaks up a pair. It's hard to do anything but speculate if you won't provide the full command lines.
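One way to move from speculation to evidence would be to count, for each read name in the deduplicated output, how many first-mate and second-mate records survived. The following is a minimal sketch, assuming the alignments are supplied as SAM text lines and that mates are marked with the standard FLAG bits 0x40 and 0x80. The names mate_counts and orphan_names are hypothetical, not part of umi_tools or RSEM.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def mate_counts(sam_lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Map each read name to (number of first-mate, number of second-mate) records."""
    first: Counter = Counter()
    second: Counter = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x40:   # first segment in the template
            first[name] += 1
        if flag & 0x80:   # last segment in the template
            second[name] += 1
    return {name: (first[name], second[name]) for name in set(first) | set(second)}


def orphan_names(sam_lines: Iterable[str]) -> List[str]:
    """Read names whose first-mate and second-mate record counts differ."""
    counts = mate_counts(sam_lines)
    return sorted(name for name, (n1, n2) in counts.items() if n1 != n2)


dedup_example = [
    "r1\t99\tchr1\t100",
    "r1\t147\tchr1\t300",
    "r2\t99\tchr1\t500",   # second mate was dropped somewhere
]
print(orphan_names(dedup_example))
```

If the read named in the error message shows up in this list for the deduplicated file but not for the input file, that would support the idea that deduplication split the pair.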
