Why Samtools view don't extract sequence out of the given region?
1
0
Entering edit mode
4.0 years ago
kamanovae ▴ 100

Hi!

I have nanopore sequencing results in fastq format. I need to know the coverage of specific genes. To do this, I aligned the sequence using minimap2 and converted the result to bam format. I want to use the Samtools view to extract the gene by coordinate. Then I plan to use the Samtools depth.

The command to run after alignment looks like this:

  1. samtools view -h myFile.sam -b -o fastq_pass/myFile.bam
  2. samtools sort myFile.bam -o myFile_sort.bam
  3. samtools index myFile_sort.bam
  4. samtools view myFile_sort.bam chr22:41865129-41924993 > myGene.bam
  5. samtools depth myGene.bam > myGene.bam_bepth

I want to extract the sequence from region chr22:41865129-41924993 . But when opening a file with a gene coverage depth, the start and end coordinates are shifted:

chr22   41863407    1
chr22   41863408    1
chr22   41863409    1
...
...
...
chr22   41932508    1
chr22   41932509    1
chr22   41932510    1
chr22   41932511    1

Could someone please tell me what I'm doing wrong.

samtools • 1.6k views
ADD COMMENT
1
Entering edit mode
4.0 years ago

Hello kamanovae,

samtools view returns all reads that overlap a given region. So it's expected to have information about the region outside you give.

You can use the -r parameter for samtools depth (or -b if you provide a bed file with regions) to get the coverage information for an exact region instead.

fin swimmer

ADD COMMENT
0
Entering edit mode

I appreciate your help! Can I use the Samtools view to get only the necessary region, if now I need to extrect the sequence?

ADD REPLY
0
Entering edit mode

samtools is already returning the reads with alignments overlapping your region of interest.

If a read is 100bp long and starts at position 1 with a perfect alignment, then it aligns to bases: [1-100] inclusive. If you ask for reads that overlap positions 100-200, samtools is going to return the read that starts at position 1 because it aligns to position 100 which is inside the [100-200] interval you specified.

ADD REPLY

Login before adding your answer.

Traffic: 1963 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6