Question: Bfast Match Argument : Regroup Multiple Reads File For Final Alignment
Francois Olivier Hébert (Quebec) wrote, 7.7 years ago:

I would like to align reads from multiple samples on the same contigs from a de novo assembly. I am working on a non-model species and I don't have a reference genome. Consequently, I made a de novo assembly with reads from a sequence capture chip. The chip contains approximately 3000 genes and this is what I have assembled. I have a little more than 50000 contigs on which I would like to align Illumina paired-end reads (100 bp). Those reads are split into 24 files containing approximately 30 million reads each. Each file represents one individual from one of 2 different populations, and they are all individually tagged.

I know the genome of my species contains a lot of repeated sequences, so I decided to use BFAST to align the reads on the contigs. I am currently at step 3, which is the "match" command (finding CALs, the candidate alignment locations). My problem is that at the end of the alignment process, I would like to have a file containing the contigs and the reads of ALL THE INDIVIDUALS aligned on these contigs.

My question is: how can I do this if I align the reads separately for each individual? (I know BFAST works better with files that are not too big, i.e., with a few million reads.)

Will I have a separate alignment file for each individual? Could I merge these files somewhere in the process, or at the end?

The final goal would be to find SNPs and conduct different population genomics analyses with the results.

Can anybody help me with this? I am just starting to work with these tools, and I am definitely not an expert in bioinformatics. :S

Thank you VERY MUCH!

modified 5.6 years ago by Biostar • written 7.7 years ago by Francois Olivier Hébert

I launched the task using a simple loop in bash. It has now been running for 2 days and 3 files out of 24 have been processed... but I don't have any output files in the folder yet. Is this normal? I'm thinking that maybe the output files will appear when everything is done... but it doesn't look good so far. :S

written 7.7 years ago by Francois Olivier Hébert

Alright, I solved the problem. I wasn't sure I had to do this, but I just redirected the output inside the loop into the correct output file. Everything's fine.

written 7.7 years ago by Francois Olivier Hébert
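
For anyone hitting the same "no output files" symptom, here is a rough sketch of what such a loop with redirection might look like. It assumes that bfast match writes its matches to standard output and takes the reference with -f and the reads with -r, which is how I understand the BFAST usage; please check it against your installed version. The file and folder names (contigs.fa, reads/, matches/, the .bmf extension) and the MATCH_CMD variable are placeholders made up for the illustration, not something taken from the thread.

```shell
# Run "bfast match" once per individual, one output file per input file.
# Assumptions (not from the thread): reads are in reads/*.fastq, the indexed
# reference is contigs.fa, and results go to matches/<sample>.bmf.
# MATCH_CMD is a hypothetical override used only for dry runs.
match_all() {
    ref=$1
    in_dir=$2
    out_dir=$3
    mkdir -p "$out_dir"
    for reads in "$in_dir"/*.fastq; do
        [ -f "$reads" ] || continue            # the glob matched nothing
        sample=$(basename "$reads" .fastq)
        # The matches go to standard output, so without this redirection
        # no file ever appears in the folder.
        ${MATCH_CMD:-bfast match} -f "$ref" -r "$reads" \
            > "$out_dir/$sample.bmf" || return 1
        echo "done: $sample" >&2
    done
}

# Example call (commented out so that sourcing this file runs nothing):
# match_all contigs.fa reads matches
```

Setting MATCH_CMD="echo stub" before the call gives a quick dry run: each output file then just contains the command line that would have been executed, which is a cheap way to check the loop before committing to two days of computation.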
Manu Prestat (Marseille, France) wrote, 7.7 years ago:

I am not sure I understand, but if you only want to concatenate your results, you could just use the GNU/Linux cat command.

cat * > myConcatResults

or, if your results share the same prefix (e.g. "myResults"):

cat myResults* > myConcatResults
written 7.7 years ago by Manu Prestat
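
One caveat with the cat commands above: if the merged file lives in the same folder and matches the same pattern, a second run would try to read the output as one of its own inputs. The sketch below is one possible way around that. It is only an illustration: the function name concat_results and the file names are invented here, and it assumes the per-individual files can safely be joined end to end. If the files being joined are SAM text files, each one starts with its own "@" header lines, which a plain cat would repeat, so a dedicated merge tool may be a better fit at that stage.

```shell
# Concatenate every file whose name starts with a prefix into one output.
# "myResults" and "myConcatResults" are placeholder names from the answer.
concat_results() {
    prefix=$1
    out=$2
    : > "$out"                           # create or empty the output file
    for f in "$prefix"*; do
        [ -f "$f" ] || continue          # the glob matched nothing
        [ "$f" -ef "$out" ] && continue  # never read the output itself
        cat "$f" >> "$out"
    done
}

# Example call (commented out so that sourcing this file runs nothing):
# concat_results myResults myConcatResults
```

The shell expands the pattern in sorted order, so the inputs are always appended in the same sequence, and rerunning the function rebuilds the merged file from scratch instead of growing it.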

Yeah, I thought of doing this, but I don't know what the files look like yet, so I wasn't sure it would work. I'll try this when it's finished.

written 7.7 years ago by Francois Olivier Hébert

Well, I'm not done, but it works fine. I wasn't sure it was possible to do this, but now I understand how the files are structured and what they look like. Thanks!

written 7.6 years ago by Francois Olivier Hébert