Question: Single-end strand specific Rna-Seq: how to separate alignments by strand
1
gravatar for termanini
3.1 years ago by
termanini10
termanini10 wrote:

Hello, I have a single-end strand-specific RNA-seq data (Illumina protocol), and I would like to split the bam file in two parts by strand. For the alignment I am using TopHat 2.

I see this post: C: How To Separate Illumina Based Strand Specific Rna-Seq Alignments By Strand

However I don't understand how to apply that script to my data since my SAM flags are different (filtering with samtools view -b -f 128 -F 16 etc. give me an empty file).

samtools view sample.bam | cut -f2 | sort | uniq

0x0     (0)
0x10    (16 r   read reverse strand)
0x100   (256    s   not primary alignment)
0x110   (272    rs  read reverse strand + not primary alignment)

Can anyone help me?

Thanks very much!

Best, Alberto

ADD COMMENTlink modified 3.1 years ago • written 3.1 years ago by termanini10
2
gravatar for Devon Ryan
3.1 years ago by
Devon Ryan88k
Freiburg, Germany
Devon Ryan88k wrote:

samtools view -f 16 ... will yield reads originating from the + strand.

samtools view -F 16 ... will do the same for reads originating from the - strand.

If you look at Istvan's script, just pay attention to the "first in pair" parts and ignore any bits > 16.

ADD COMMENTlink modified 3.1 years ago • written 3.1 years ago by Devon Ryan88k

Hi Devon, I'm trying to understand FLAG and strands of alignments and it's getting confusing. I asked a question two months ago here A: SAM flag and select reads that map uniquely and you answered:

if bit 16 is set, then then the read is mapped as a reverse complement (thus, if it's unset, then it's not mapped as a reverse complement). Whether "mapped as a reverse complement" can be taken to mean "maps to the minus strand" will be specific to how the library was prepared.

here you used -f 16 for reads originating from the + strand and it infers that -f 0 is will yield reads originating from the - strand, which is the same thing you mentioned in your comment in that post. This only applies to illumina TruSeq stranded RNA-seq protocol, right? besides stranded RNA-seq, is there any other application of sequencing technologies that use stranded library prep? And is there cases where -f 0 will yield reads originating from the + strand?

ADD REPLYlink written 2.6 years ago by epigene450
1

-f 0 will probably return every alignment in the file, since you're doing flag & 0 == 0, which should always be true. The opposite of -f 16 is -F 16. -F 16 is doing a flag | 16 > 0, which will return true only if bit 16 is set. These are bit operations, they're not directly comparing numbers like flag == 0.

BTW, BSseq also uses strand information, though it gets much more complicated than with RNAseq. You also sometimes see strand-specific ChIP experiments where some factor is binding a single strand of DNA (i.e., not your typical histone or transcription factor ChIP experiment). Those will work the same as RNAseq. Also, yes the flag stuff is specific to the current TruSeq and related protocols that are dUTP-based. For some older stuff, everything is swapped.

ADD REPLYlink written 2.6 years ago by Devon Ryan88k

I think I was confused with numeric comparison from bit operations.. -f 0 and -F 0 both return all the alignments in the bam file. Yeah, BS-seq strand is so confusing.. i need to spend more time to understand that when it comes up...

ADD REPLYlink written 2.6 years ago by epigene450
0
gravatar for termanini
3.1 years ago by
termanini10
termanini10 wrote:

Thanks very much Devon,

I will try as you suggested! Just a curiosity: do you know if the flag XS:A+ and XS:A:- assigned by TopHat can be used in order to separate by strands?

Thanks again! Alberto

ADD COMMENTlink modified 3.1 years ago • written 3.1 years ago by termanini10
1

If your aligner happens to add them (they're custom tags only used by tophat2 (and STAR if you ask it to)) then you could alternatively use them. Note that it's faster to do what I wrote than to use those tags.

ADD REPLYlink written 3.1 years ago by Devon Ryan88k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1021 users visited in the last hour