Question: How to remove Read Name (1st column) extension from a BAM file?
IrK wrote:

Hello everyone,

I have paired-end data. After alignment I noticed that R1 and R2 have different read names in the BAM file, for example:

SRR1032070.122660125.1
SRR1032070.122660125.2

So read R1 has the extension .1 and read R2 has the extension .2. This causes a problem when I try to convert the BAM file to BED with bedtools bamtobed -bedpe -i. The only solution I can think of is to remove these extensions. Could anyone please advise on a tool for this? I would rather not convert the data back to SAM, as it is extremely large.

Thank you

bam paired-end • 1.1k views
modified 2.1 years ago by Pierre Lindenbaum • written 2.1 years ago by IrK

I assume you got this data from SRA? You should have used the -F|--origfmt option ("Defline contains only original sequence name") to avoid getting these kinds of read names.

As for adding /1 and /2 to read names, you could use reformat.sh from the BBMap suite with the addslash=t or addcolon=t options.

Reply written 2.1 years ago by genomax

Oh, OK, so you mean that when I convert SRA to FASTQ with fastq-dump, I should use the -F option.

Thank you

Reply written 2.1 years ago by IrK
Pierre Lindenbaum wrote:
samtools view -h in.bam | sed '/^[^@]/s/^\(.*\)\.[12]\t/\1\t/' | samtools view -Sb -o out.bam -
Answer written 2.1 years ago by Pierre Lindenbaum
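
One thing to watch with the sed expression above: .* is greedy and is not tied to the first column, so on a record where a later tab-terminated field also ends in .1 or .2, the substitution lands on that field and leaves the read name untouched. A minimal sketch of a column-anchored variant follows. The sample record and its XX:Z:a.1 tag are made up for illustration, and a literal tab is passed in through a shell variable so the pattern does not rely on sed understanding \t.

```shell
# Literal tab character, so the patterns work in any POSIX sed.
tab=$(printf '\t')

# Hypothetical SAM-like record; the XX:Z:a.1 tag is invented for this demo.
rec="SRR1032070.122660125.1${tab}99${tab}chr1${tab}100${tab}XX:Z:a.1${tab}NM:i:0"

# Greedy pattern from the thread: .* can run past the first column.
greedy=$(printf '%s\n' "$rec" | sed "/^[^@]/s/^\(.*\)\.[12]${tab}/\1${tab}/")

# Column-anchored pattern: [^<tab>]* cannot leave the read-name field.
anchored=$(printf '%s\n' "$rec" | sed "/^[^@]/s/^\([^${tab}]*\)\.[12]${tab}/\1${tab}/")

# Show the read name (field 1) and the invented tag (field 5) for each result.
printf 'greedy:   %s\n' "$(printf '%s\n' "$greedy" | cut -f1,5)"
printf 'anchored: %s\n' "$(printf '%s\n' "$anchored" | cut -f1,5)"
```

In the real pipeline the anchored expression would take the place of the sed stage between the two samtools view calls. This is only a sketch and has not been run against the poster's data.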

Thank you so much, Pierre.

I changed the [12] in the sed part to [.1.2] and it works. So the final command that works for me is:

samtools view -h in.bam| sed '/^[^@]/s/^\(.*\)\.[.1.2]\t/\1\t/' | samtools view -Sb -o out.bam
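
For reference, a bracket expression is a set of characters, so [.1.2] denotes the same set as [.12]: a dot, a 1, or a 2. The sketch below, run on made-up read names, compares it with the original [12]. The only difference it shows is that the wider set also strips a name ending in two dots.

```shell
# Literal tab character, so the patterns work in any POSIX sed.
tab=$(printf '\t')

# Hypothetical read names, each followed by a FLAG column.
input=$(printf 'readA.1\t99\nreadA.2\t147\nreadB..\t4\nreadC.3\t4\n')

# Bracket set as written in the thread.
with_thread_set=$(printf '%s\n' "$input" | sed "s/^\(.*\)\.[.1.2]${tab}/\1${tab}/")

# Same set with the duplicate dot removed.
with_short_set=$(printf '%s\n' "$input" | sed "s/^\(.*\)\.[.12]${tab}/\1${tab}/")

# Original set from the answer.
with_original=$(printf '%s\n' "$input" | sed "s/^\(.*\)\.[12]${tab}/\1${tab}/")

printf '%s\n' "--- [.1.2]" "$with_thread_set" "--- [.12]" "$with_short_set" "--- [12]" "$with_original"
```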

For future reference, may I ask how paired-end reads come to be annotated with the extensions 1 and 2? What software does this?

Reply written 2.1 years ago by IrK