How Can I Align My 454 Reads To A Reference Sequence And Generate A Recruitment Plot?
3
1
Entering edit mode
12.8 years ago
User 6228 ▴ 110

I'd like to align my trimmed 454 metagenomic reads to a reference sequence and generate plots of the coverage / percent identity / recruitment.

Both the reads file and the reference file are in fasta format. I don't have access to the original .sff files from the 454 sequencer.

I think consed can align reads like this but I cannot get it to work to align 2 fasta files and the documentation is almost no help. If anyone knows how to get consed to do this that would be great.

Otherwise, I am completely open to suggestions of other pieces of software. I am currently aligning the reads to the reference sequence using fr-hit (http://weizhong-lab.ucsd.edu/frhit/) but I don't know how to plot the resulting data into a 'percent identity plot' or 'recruitment plot'. Any help would be greatly appreciated.

Thanks!

fasta • 4.6k views
ADD COMMENT
2
Entering edit mode
12.8 years ago
User 6228 ▴ 110

I ended up borrowing (with permission) the Javascript from this site: http://weizhong-lab.ucsd.edu/frv/, then I wrote a tiny php backend that parses Blast and FR-Hit files and sends the data to the Javascript plotter in the right format.

It works very well on my system, but you will have to apply a little elbow grease to get the backend working for yourself.

Here is the code I am using to parse the Blast & FR-Hit files and send it to the Javascript plotter:

    function get ($file)

{ if (preg_match('/blast/', $file)) { $type = 'blast'; } else { $type = 'frv'; }

$lines = file($SERVER['DOCUMENTROOT'].'/fr-view/resources/data/'.$file);

$graphdata = array(); $maxx = 0; $numberofmatches = 0; foreach ($lines as $l) { if ($l[0] == '#' || $l[0] == ' ') continue;

$l = explode("\t", $l); foreach ($l as &$each) { $each = trim($each, "\n\r\t ,%"); }

if ($type == 'frv') { $percentid = $l[7]; $beginning = $l[9]; $end = $l[10]; } else if ($type == 'blast') { $percentid = $l[2]; $beginning = $l[6]; $end = $l[7]; } else { die('Invalid type!'); }

if ($end > $maxx) $maxx = $end;

$graphdata[] = array($beginning, $percentid, $end); $numberofmatches++; } die(jsonencode(array( 'numberofmatches' => $numberofmatches, 'maxx' => $maxx, 'graphdata' => $graph_data ))); }

You can put that in a PHP file, and make an ajax call to that URL in your Javascript to retrieve the data.

If none of that makes any sense, just ignore this answer.

ADD COMMENT
1
Entering edit mode
12.8 years ago
Benm ▴ 710

You can use gsMapper(Newbler v2.5.3, it is not open source software) to align your 454 (sff) reads to reference sequence. But I don't find the suitable software to do 'recruitment plot', I don't know if the CAMERA 1.3.2 could help you, I think you can make it with R using the alignment infor. from gsMapper results. Sorry for the weak suggestion.

ADD COMMENT
0
Entering edit mode

I can align it with a reference sequence (blast, fr-hit, etc) but I need something to plot it. I will take a look at R.

ADD REPLY
1
Entering edit mode
12.8 years ago

have a look at this [http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0050077 paper, basically you would want to BLASTn you reads against a reference genome. then, parse the BLAST results to extract the hits' coordinates and percent ID. finally, for plotting you would need to create a csv file ("coordinate", "%ID") and use the likes of ggplot's geom_point in R for plotting, e.g:

foo = read.csv('your_results.csv', header=T)
ggplot(foo, aes(x=coord, y=X.id)) +  geom_point(aes(size=1.5, alpha=X.id,legend.position="none")) + scale_colour_manual(value = colours) + opts(theme_bw(), axis.text.x = theme_text(size=8, angle = 90, hjust=1), panel.background = theme_blank(), panel.grid.major = theme_blank(), title= "choose_title")

--OR-- another option is to upload your metagenomic data to the mg-rast server (http://metagenomics.anl.gov/) where a recruitment analysis option is available against genomes available on their server.

ADD COMMENT

Login before adding your answer.

Traffic: 1950 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6