Identifying inserts (tranposable elements) using SAM Flags
1
0
Entering edit mode
4.4 years ago
William • 0

Hi,

I am interested in finding the sites of inserts (transposable elements) in a genome of interest. The genome have been sequenced using paired end sequencing, and thereafter an alignment was performed to an index created with the wildtype genome and the insert sequence.

    samtools flagstat alignment.sorted.bam

0 + 0 duplicates

2188174 + 0 mapped (90.60%:-nan%)

2415270 + 0 paired in sequencing

2187112 + 0 properly paired (90.55%:-nan%)

2187284 + 0 with itself and mate mapped

890 + 0 singletons (0.04%:-nan%)

158 + 0 with mate mapped to a different chr

12 + 0 with mate mapped to a different chr (mapQ>=5)


How should I go about finding the number and sites of inserts in this genome? I think I should focus on the reads with mates mapped to a different chromosome. Is it possible to filter by SAMFLAGS to obtain these read? If so, what are the SAMFLAGS I should use?

alignment sequencing next-gen • 1.4k views
0
Entering edit mode

? I think I should focus on the reads with mates mapped to a different chromosome.

0
Entering edit mode

Hi,

I have tried as you suggested the following

samtools view -c -F 1294 alignment.sorted.bam

samtools view -c -F 3854 alignment.sorted.bam


However, in both cases I obtained the number 172, which does not tally with my flagstat output.

158 + 0 with mate mapped to a different chr

12 + 0 with mate mapped to a different chr (mapQ>=5)

0
Entering edit mode
4.4 years ago
d-cameron ★ 2.6k

I strongly recommend against doing this yourself unless you know existing tools do not work for your data/organism. Given the repetitive nature of TEs, many of your read alignments are going be wrong anyway.

I would start by using an specialised tools such as RetroSeq. If specialised tools don't work for you I would try a generic SV caller such as GRIDSS or manta and doing post-processing to determine which of the breakends/insertion calls are unplaced TEs.

0
Entering edit mode

Hi,

Thank you for your reply. I am interested to view the "12 + 0 with mate mapped to a different chr (mapQ>=5)" using samtools.

I am thinking of using the command but i am unsure of the SAM flags that I should use. Do you have any idea on this?

samtools view -F or samtools view -f