Iterative trimming using bowtie
0
1
Entering edit mode
5.2 years ago

Hi

I want to use bowtie2 and align the sequenced reads(length=76) to hg38. I have trimmed to get 30 bp long reads and performed alignment. Reads that mapped to multiple regions in the genome were to be extended by 5 bp and then re-mapped. This process will be repeated until either all reads were uniquely mapped to the reference genome or until the reads were extended to their entirety (76 bp). I want to put all these steps

for sample in `ls ./*R1.fastq`
do
dir="./"
base=$(basename $sample "_R1.fastq")
for (i=46;i>=1;i-=5)
do
bowtie2 -x index -q ${base}_R1.fastq -5 0 -3 $i -p 64 --score-min L,-0.6,-0.2 --very-sensitive --no-unal --no-hd --no-sq -S ${base}.sam_$i_trimmed
grep -v XS ${base}.sam_$i_trimmed > Uniq_alignd_${base}.sam_$i_trimmed
grep XS ${base}.sam_$i_trimmed | cut -f1 >multi_aligned_ids_$i_trimmed
seqtk subseq $sample_R1.fastq multi_aligned_ids_$i_trimmed  > input_$i_for_bowtie.fastq
done

However Now I am not sure how to use fastq_output file (input_$i_for_bowtie.fastq) from seqtk to be used in bowtie

This iterative trimming method is been used for HI-C Data analysis (Imakaev, et al., 2012). They have a Python module to perform this task. I just want to try it myself using linux.

bowtie2 iterative trimming alignment ATAC-Seq • 1.9k views
ADD COMMENT
1
Entering edit mode

Not really helping, but fyi: Hi-C data are processed like that because the read may span a ligation junction and might therefore mess up the alignment, but this is not the case for ATAC-seq. Better make sure you properly trim the Nextera adapter CTGTCTCTTATACACATCT. More steps are typically not necessary.

ADD REPLY
0
Entering edit mode

The reason I chose to do iterative mapping is after adapter trimming only 56% of the reads are getting Uniquely aligned to the reference genome. Where as around 10-15% are getting aligned to multiple locations. I thought If I could somehow retain those as well that would add up to my alignment percentage.

ADD REPLY
0
Entering edit mode

Does your reference genome contain chrM?

ADD REPLY
0
Entering edit mode

Yes. Thought I would grep them out after alignment.

ADD REPLY
0
Entering edit mode

That is reasonable. Still, that mapping rate is poor and unusual for a good ATAC-seq library. Published data?

ADD REPLY
0
Entering edit mode

Yeah. GSE74912

ADD REPLY

Login before adding your answer.

Traffic: 2057 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6