Question

Bowtie2-Samtools. Too many degenerate bases

2

8.2 years ago

saabalde ▴ 20

Hi all,

This is my first question on the forum, so I hope it's in the right place. I've looked through related threads, but I couldn't find one that deals with this specific issue.

I have a problem using Bowtie2 and Samtools. I've assembled paired-end reads using Trinity. To get longer and/or more complete contigs, I've mapped the reads against these contigs using Bowtie2.

The problem comes when I convert the resulting SAM file into a FASTQ file. I use this command line (from http://samtools.sourceforge.net/mpileup.shtml):

samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq

I convert from fastq to fasta using:

seqtk fq2fa file.fastq > file.fasta

And too many of my transcripts contain N's. Some transcripts have none, some are full of N's, and some have many N's scattered along the sequence. So when I try to translate them into protein sequences, I get sequences full of X's.
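To see how widespread the N's are before changing the pipeline, a per-transcript tally can help. This is only a minimal sketch, assuming a POSIX awk is available on the cluster; `count_ns` is a hypothetical helper name, not part of samtools or seqtk.

```shell
# Hypothetical helper: report name, length and number of N/n bases
# for every record in a FASTA file (multi-line records are handled).
count_ns() {
  awk '
    function emit() { if (name != "") printf "%s\t%d\t%d\n", name, len, n }
    /^>/ { emit(); name = substr($1, 2); len = 0; n = 0; next }
    {
      len += length($0)        # sequence length before the N are stripped
      n   += gsub(/[Nn]/, "")  # gsub returns how many N/n it replaced
    }
    END { emit() }
  ' "$1"
}

# Usage (file name taken from the post above):
#   count_ns file.fasta | sort -k3,3nr | head
```

Sorting on the third column lists the transcripts with the most N's first, which makes it easier to tell whether the problem affects a few contigs or most of them.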

I guess the reason is that bcftools and vcfutils detect all the variant sites and cannot decide which base is the right one.

How can I tell them to select the most frequent base in each case, since I don't want variant calls? Any other approach, such as not using bcftools or vcfutils at all (I can't think of other options...), is welcome.

The version of the software I'm using is:

bowtie2 -> BOWTIE/2.2.6
samtools -> SAMTOOLS/0.1.18

I hope the problem is well explained and thanks in advance,

Samu

degenerate-bases Assembly samtools Bowtie2 • 2.6k views

ADD COMMENT • link updated 21 months ago by Ram 43k • written 8.2 years ago by saabalde ▴ 20

0

unrelated: your version of samtools is old

ADD REPLY • link 8.2 years ago by Pierre Lindenbaum 161k

0

I know, but I work on a cluster without administrator permissions. Sometimes it's quite difficult to stay up to date.

ADD REPLY • link 8.2 years ago by saabalde ▴ 20

0

That's what the home directory is for :)

ADD REPLY • link 8.2 years ago by Devon Ryan 104k

0

You don't need to be an administrator to install samtools in your home directory.
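A home-directory install could look roughly like the sketch below. It is not tested on any particular cluster and makes several assumptions: `install_samtools_home` is a hypothetical helper name, the version passed in (for example 1.19) and the `$HOME/.local` prefix are illustrative choices, and `curl`, a C compiler and the zlib, bzip2 and lzma development headers are assumed to be available.

```shell
# Hypothetical helper: build a samtools release tarball and install it
# under a user-writable prefix. Runs in a subshell so the caller's
# working directory and shell options are left untouched.
install_samtools_home() (
  set -e
  version="$1"                     # e.g. 1.19 (assumed, pick a current release)
  prefix="${2:-$HOME/.local}"      # assumed install location
  workdir=$(mktemp -d)
  cd "$workdir"
  curl -L -O "https://github.com/samtools/samtools/releases/download/${version}/samtools-${version}.tar.bz2"
  tar -xjf "samtools-${version}.tar.bz2"
  cd "samtools-${version}"
  # --without-curses avoids needing the curses headers (only used by tview)
  ./configure --prefix="$prefix" --without-curses
  make
  make install
)

# Usage:
#   install_samtools_home 1.19
#   export PATH="$HOME/.local/bin:$PATH"
```

Note that from samtools 1.0 onwards bcftools is distributed as a separate package, so the mpileup/bcftools/vcfutils pipeline from the question would need adapting after an upgrade.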

ADD REPLY • link 8.2 years ago by Pierre Lindenbaum 161k

0

I know, there are no excuses for that.

ADD REPLY • link 8.2 years ago by saabalde ▴ 20

0

Try creating a new folder for your reference (ref.fa in the samtools command), put the file there on its own, index it there, and run your commands again with the new reference file, so that the only thing that changes is the location of the reference file. This might help.
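In shell terms, that suggestion amounts to something like the following sketch. `isolate_reference` is a hypothetical helper name and the directory name in the usage lines is an assumption; `samtools faidx` is the standard command for indexing a FASTA file.

```shell
# Hypothetical helper: copy the reference into a fresh directory,
# index it there, and print the path of the new copy.
isolate_reference() {
  src="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir" || return 1
  cp "$src" "$dest_dir/" || return 1
  new_ref="$dest_dir/$(basename "$src")"
  samtools faidx "$new_ref" || return 1   # writes $new_ref.fai next to the copy
  printf '%s\n' "$new_ref"
}

# Usage, re-running the pipeline from the question against the new copy:
#   ref=$(isolate_reference ref.fa clean_ref)
#   samtools mpileup -uf "$ref" aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq
```

Because the FASTA contents are unchanged, any difference in the output would point to a stale or mismatched index file rather than to the alignment itself.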

ADD REPLY • link updated 21 months ago by Ram 43k • written 8.2 years ago by Max Ivon ▴ 130

0

So, how could that output a different result, since all the files are the same? I'll try, I'm just trying to understand.

I'll update you. Many thanks,
Samu

ADD REPLY • link updated 21 months ago by Ram 43k • written 8.2 years ago by saabalde ▴ 20

0

There is a bug in several versions of samtools (I don't know exactly which ones). It results in an improper mpileup: the reference is not recognised properly and all or most of it is replaced with N. I don't know why this workaround helps, so I'm not sure whether it will help you or not. Maybe you've hit a different error, but it's not hard to test.

ADD REPLY • link updated 21 months ago by Ram 43k • written 8.2 years ago by Max Ivon ▴ 130

score 1 · Answer 1 · 2016-02-15

1

8.2 years ago

Devon Ryan 104k

There's no way to make samtools do what you want. Instead, give GATK's FastaAlternateReferenceMaker (or whatever it's called) a try. I don't know if that handles indels.
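If all that is needed is the most frequent base at each position, one possible workaround is to post-process the plain-text pileup (mpileup run without `-u`) outside samtools. The sketch below is only an illustration of that majority-vote idea, not a samtools feature: `majority_base` is a hypothetical helper name, indels are skipped rather than applied, ties go to the first base in A, C, G, T order, and positions with no usable base are reported as N.

```shell
# Hypothetical helper: majority-vote base per position from text pileup on stdin.
# Expected tab-separated columns: contig, position, ref base, depth, read bases, qualities.
majority_base() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    ref = toupper($3); s = $5; n = length(s); i = 1
    cnt["A"] = cnt["C"] = cnt["G"] = cnt["T"] = 0
    while (i <= n) {
      c = substr(s, i, 1)
      if (c == "^") { i += 2; continue }        # read start marker plus its MAPQ character
      if (c == "+" || c == "-") {               # indel: skip the length digits and the bases
        i++; len = ""
        while (substr(s, i, 1) ~ /[0-9]/) { len = len substr(s, i, 1); i++ }
        i += len; continue
      }
      if (c == "." || c == ",") c = ref         # match to the reference on either strand
      c = toupper(c)
      if (c ~ /^[ACGT]$/) cnt[c]++              # ignores $, *, N and reference skips
      i++
    }
    best = "N"; max = 0
    for (k = 1; k <= 4; k++) {
      b = substr("ACGT", k, 1)
      if (cnt[b] > max) { max = cnt[b]; best = b }
    }
    print $1, $2, best
  }'
}

# Usage (file names taken from the question):
#   samtools mpileup -f ref.fa aln.bam | majority_base > majority.tsv
```

The output is one line per covered position, so positions without any reads are absent and would have to be filled in (for example from the reference) when rebuilding the contig sequences.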

ADD COMMENT • link 8.2 years ago by Devon Ryan 104k