Bowtie2-Samtools. Too many degenerate bases
8.2 years ago
saabalde ▴ 20

Hi all,

This is my first question on the forum, so I hope it's in the right place. I've looked at related threads on similar topics, but I couldn't find one that covers this issue.

I have a problem using Bowtie2 and Samtools. I've assembled paired-end reads using Trinity. To get longer and/or more complete contigs, I've mapped the reads against these contigs using Bowtie2.

The problem comes when I convert the resulting SAM file into a FASTQ file. I use this command line (taken from http://samtools.sourceforge.net/mpileup.shtml):

samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq

I convert from fastq to fasta using:

seqtk fq2fa file.fastq > file.fasta

And too many of my transcripts have N's. Some transcripts have none, some consist entirely of N's, and some have many N's scattered along the sequence. So when I try to translate them into protein sequences, I get sequences full of X's.
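
To see how badly each transcript is affected before changing the pipeline, the N's can be counted per record. The following is a minimal sketch, assuming a plain uncompressed FASTA file; the function name `count_n_per_record` is made up for illustration and is not part of seqtk or samtools.

```shell
# Print one tab-separated line per FASTA record:
#   record name, total sequence length, number of N/n characters.
# Sketch only: assumes plain (uncompressed) FASTA given as the first argument.
count_n_per_record() {
    awk '
        function emit_record() {
            if (name != "") printf "%s\t%d\t%d\n", name, len, n
        }
        /^>/ {
            emit_record()               # finish the previous record, if any
            name = substr($1, 2)        # header text up to the first whitespace
            len = 0; n = 0
            next
        }
        {
            seq = $0
            len += length(seq)
            n += gsub(/[Nn]/, "", seq)  # gsub returns the number of replacements
        }
        END { emit_record() }
    ' "$1"
}
```

For example, `count_n_per_record file.fasta | sort -k3,3nr | head` would list the records with the most N's first, which helps tell apart transcripts with no read coverage from those with scattered ambiguous positions.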

I guess the reason is that bcftools and vcfutils detect all the variant calls and cannot decide which base is the right one.

How can I tell them to select the most frequent base in each case, since I don't want variant calls? Any other approach, such as not using bcftools or vcfutils at all (I can't think of other options...), is welcome.

The version of the software I'm using is:

  • bowtie2 -> BOWTIE/2.2.6
  • samtools -> SAMTOOLS/0.1.18

I hope the problem is well explained. Thanks in advance,

Samu

degenerate-bases Assembly samtools Bowtie2 • 2.6k views

Unrelated: your version of samtools is old.

I know, but I work on a cluster without administrator permissions. Sometimes it's quite difficult to stay up to date.

That's what the home directory is for :)

You don't need to be an administrator to install samtools in your home directory.

I know, there are no excuses for that.

Try creating a new folder for your reference (ref.fa in the samtools command), put it there on its own, index it there, and run your commands again with the new reference path, so that the only thing that changes is the location of the reference file. This might help.

So, how could that give a different result, since all the files are the same? I'll try it; I'm just trying to understand.

I'll update you. Many thanks,
Samu

There is a bug in several versions of samtools (I don't know exactly which). It results in an improper mpileup: the reference is not recognised properly and all or most of it is replaced with N. I don't know why moving the file helps, so I'm not sure whether it will help you or not; maybe you've hit a different error, but it's not hard to test.

8.2 years ago

There's no way to make samtools do what you want. Instead, give GATK's FastaAlternateReferenceMaker (or whatever it's called) a try. I don't know if that handles indels.
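
If GATK is not an option, one possible workaround outside samtools itself is to take the plain-text pileup (`samtools mpileup -f ref.fa aln.bam`, without `-u`) and pick the most frequent base at each position with a small script. The sketch below is an illustration only, not something proposed in this thread: the function name `pileup_majority` is made up, it ignores indels and base qualities, and it breaks ties in A, C, G, T order.

```shell
# Read samtools text pileup on stdin (chrom, pos, ref, depth, bases, quals)
# and print: chrom, pos, number of A/C/G/T observations, majority base.
# Sketch only: indels and base qualities are ignored; ties go to the
# first base in A, C, G, T order; positions with no countable base get N.
pileup_majority() {
    awk -F '\t' '
        {
            ref = toupper($3); bases = $5
            split("", cnt)                      # reset per-position counts
            i = 1; n = length(bases)
            while (i <= n) {
                c = substr(bases, i, 1)
                if (c == "^") { i += 2; continue }   # read start, followed by a MAPQ character
                if (c == "+" || c == "-") {          # indel: sign, length, then that many bases
                    i++; digits = ""
                    while (i <= n && substr(bases, i, 1) ~ /^[0-9]$/) {
                        digits = digits substr(bases, i, 1); i++
                    }
                    i += digits + 0                  # skip the inserted/deleted bases
                    continue
                }
                if (c == "." || c == ",") c = ref    # match to the reference base
                c = toupper(c)
                if (c == "A" || c == "C" || c == "G" || c == "T") cnt[c]++
                i++                                  # anything else ($, *, <, >, N) is skipped
            }
            best = "N"; bestn = 0; total = 0
            m = split("A C G T", order, " ")
            for (k = 1; k <= m; k++) {
                b = order[k]
                total += cnt[b]
                if (cnt[b] > bestn) { bestn = cnt[b]; best = b }
            }
            printf "%s\t%s\t%d\t%s\n", $1, $2, total, best
        }
    '
}
```

It would be used as `samtools mpileup -f ref.fa aln.bam | pileup_majority`. Positions without any read coverage may be missing from the pileup altogether, so the output still has to be stitched back into full-length sequences, with uncovered positions filled in from the contig or left as N.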
