How to Extract Specific Region on Bam file
0
0
Entering edit mode
6 weeks ago
santos48 ▴ 30

I aligned all my Nanopore Rna-Seq reads with Human Reference Genome and Transcript (Then I had 2 different bam files one of them is aligned with Reference Genome other is aligned with Transcript)

I want to extract chromosome X from the aligned BAM file. I tried extract region with samtools view command but it did not work I taken the error specifies an invalid region or unknown reference. Continue anyway.'

I used to check chromosome list with samtools view -H input.bam then the list for both bam file(transcript and genome)

@SQ SN:XR_928432.2 LN:810

Can you tell me what am I doing wrong ? and How to extract that region?

Nanopore HG38 Rna-Seq Genome • 579 views
0
Entering edit mode

. I tried extract region with samtools view command but it did not work

what was the command ???

0
Entering edit mode
samtools view -h input.bam X:123456-654321 -o output.bam

0
Entering edit mode

your parameter order is wrong, samtools expects:

samtools view [PARAMS] BAM REGION


so you need:

samtools view -h -o output.bam input.bam X:123456-654321

0
Entering edit mode

Thank you for the answer but it is not working, I have taken the same error

specifies an invalid region or unknown reference. Continue anyway.

0
Entering edit mode

check if chromosome X is labeled as X or chrX

0
Entering edit mode

I checked with samtools view -H input.bam with that command but I did not see any chromosome I downloaded ref gene from that link: https://www.ncbi.nlm.nih.gov/genome/guide/human/

I saw only like

@SQ SN:NT_187361.1 LN:175055

0
Entering edit mode

Check your reference that you aligned to for the chromosome names.

You can ussally use grep for it

grep ">" reference.fa

0
Entering edit mode

Thank you for learned to me grep command, I check it, and I see all chromosomes in there. But I can not extract from bam file I am exactly sure I aligned with the correct file.

0
Entering edit mode

Ok, and when you do

samtools view -h -o output.bam input.bam X:123456-654321


Does the X in the above command match what the X chromosome is called in the output of the grep command?

You need to make sure the samtools X: is the same as what X is called in the grep command output.

0
Entering edit mode

In grep command its output like(genomic ref);

> NT_187634.1 Homo sapiens chromosome X genomic scaffold, GRCh38.p13 alternate locus group ALT_REF_LOCI_1 HSCHRX_1_CTG3


You are right, I think it's output different from X:123456-654321 then and I didn't see any location as a number. In that case how to extract that region?

I am confused should I align with the reference file from the reading, I did but I can not visualize to look chrX.

0
Entering edit mode

samtools view -h -o output.bam input.bam NT_187634.1:123456:654321

0
Entering edit mode

It will not solve the problem because it has many different kinds of 'NT_ '

0
Entering edit mode

In your reference should be one NT_ sequence for the X chromosome, the other NT_ sequences are the other chromomosomes (1,2,3...22, Y, MT)

0
Entering edit mode

Does anyone have a different idea?

0
Entering edit mode