Question: Tracking The Reads In A De Novo Assembly
7.7 years ago, rehma.ar wrote:

Hi all!

I have a question that sounds simple but may not be.

I extracted some reads from a BAM file and made a de novo assembly with SOAPdenovo. I then BLASTed the contigs against my reference genome and got some hits. However, the reads behind each contig come from different positions in the BAM file. So I know the position of each hit in the reference genome, but I also want to know the positions of the contributing reads in the BAM file. SOAPdenovo does not keep track of which reads went into the assembly.

If I could find out which read went into which contig during assembly, that would solve my problem.

Is there another assembler that keeps track of the reads in the assembly, or any other indirect solution?

7.7 years ago, lelle (Berlin) wrote:

Velvet can be configured to do read tracking.

See section 3.2.4 of the manual.

7.7 years ago, rehma.ar wrote:

Thanks a lot for the response, but I am facing a problem. I used Velvet to make the assembly with this command:

./velveth outputdirectory 25 -bam -shortPaired .bam

and then

./velvetg outputdirectory/ -min_contig_lgth 100 -ins_length_long 300 -read_trkg yes -amos_file yes

and I got the result. The read tracking should be in the file "LastGraph"; a very small portion of the file looks like this:

NR 29 113

1294 52 0

1299 49 0

which should be interpreted as

NR $NODE_ID $NUMBER_OF_SHORT_READS

$READ_ID $OFFSET_FROM_START_OF_NODE $START_COORD

$READ_ID2 etc.

But "1294" and "1299" are not the read IDs, and I don't have them in my .bam file. What do you think is going wrong here?
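
For reference, here is a minimal Python sketch of reading these NR blocks. It assumes the layout described above: whitespace-separated fields, with the count on the NR line giving the number of read lines that follow. The function name `parse_nr_blocks` is made up for illustration, and the read IDs it returns are simply the numbers that appear in LastGraph.

```python
from collections import defaultdict


def parse_nr_blocks(lines):
    """Map node ID -> list of (read_id, offset_from_node_start, start_coord).

    Assumes each block is an 'NR <node> <count>' line followed by
    <count> lines holding three integers each.
    """
    reads_by_node = defaultdict(list)
    node_id = None
    remaining = 0  # read lines still expected in the current NR block
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines between records
        if fields[0] == "NR":
            node_id, remaining = int(fields[1]), int(fields[2])
        elif remaining > 0 and len(fields) == 3:
            read_id, offset, start = map(int, fields)
            reads_by_node[node_id].append((read_id, offset, start))
            remaining -= 1
        else:
            remaining = 0  # any other record ends the current NR block
    return dict(reads_by_node)
```

It could be called as `parse_nr_blocks(open("outputdirectory/LastGraph"))`, which streams the file line by line instead of loading it whole.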


It has been a while since I worked with this, but I think these are the read IDs velvet is using internally.

In the output folder there should be a file called "Sequences". It is basically a fasta file with the input reads. The fasta header should look like this:

> M00298:13:000000000-A1BJE:1:1101:17340:1510 2:N:0:1    6       1

It starts with the original read ID, followed by Velvet's internal read ID (6 in this case). I do not know at the moment what the last number means.
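
A rough Python sketch of turning those headers into a lookup table is below. It assumes every header has the form shown above, with the internal ID and one further number as the last two whitespace-separated fields and everything before them being the original read name. `read_names_by_internal_id` is a hypothetical helper name, not part of Velvet.

```python
def read_names_by_internal_id(lines):
    """Map Velvet's internal read ID -> original read name.

    Only header lines (starting with '>') are used; sequence lines
    are skipped.
    """
    names = {}
    for line in lines:
        if not line.startswith(">"):
            continue
        # Split off the last two fields; the remainder is the read name,
        # which may itself contain spaces.
        name, internal_id, _last_number = line[1:].strip().rsplit(None, 2)
        names[int(internal_id)] = name
    return names
```

With this table, the numeric IDs listed in the NR blocks of LastGraph can be translated back to read names, for example `read_names_by_internal_id(open("outputdirectory/Sequences"))[1294]`. If the names in the .bam file carry no trailing comment, comparing only the first whitespace-separated token of each name may be needed.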

Reply written 7.7 years ago by lelle

That was really helpful. There was actually no description of the "Sequences" file in the manual.

Thanks a lot.

Reply written 7.7 years ago by rehma.ar