BFAST Match Step: Regrouping Multiple Read Files for the Final Alignment
12.3 years ago

I would like to align reads from multiple samples to the same contigs from a de novo assembly. I am working on a non-model species and I don't have a reference genome, so I built a de novo assembly from reads obtained with a sequence capture chip. The chip targets approximately 3000 genes, and that is what I have assembled. I have slightly more than 50,000 contigs, to which I would like to align Illumina paired-end reads (100 bp). The reads are split across 24 files containing approximately 30 million reads each. Each file represents one individual from one of 2 different populations, and all individuals are individually tagged.

I know the genome of my species contains a lot of repeated sequences, so I decided to use BFAST to align the reads to the contigs. I am currently at step 3, the "match" step (finding candidate alignment locations, or CALs). My problem is that, at the end of the alignment process, I would like to have a single file containing the contigs and the reads of ALL THE INDIVIDUALS aligned to these contigs.

My question is: how can I do this if I align the reads separately for each individual? (I know BFAST works better with files that are not too big, i.e., with a few million reads.)

Will I have a separate alignment file for each individual? Could I merge these files somewhere in the process, or at the end?

The final goal would be to find SNPs and conduct different population genomics analyses with the results.

Can anybody help me with this? I am just starting to work with these tools, and I am definitely not an expert in bioinformatics. :S

Thank you very much!

next-gen sequencing alignment multiple • 2.9k views

I launched the task using a simple loop in bash. It has now been running for 2 days and 3 files out of 24 have been processed, but I don't have any output files in the folder yet. Is this normal? Maybe the output files will appear once everything is done, but it doesn't look good so far. :S


Alright, I solved the problem. I wasn't sure I had to do this, but I simply redirected the output inside the loop to the correct output file for each input. Everything's fine now.
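For anyone who runs into the same thing, here is a minimal sketch of what such a loop could look like. It is not the exact loop used above. The directory layout, the file names, and the helper functions `run_match` and `match_all` are all assumptions made for illustration, and the `-f`/`-r` options are written as I understand them from the BFAST documentation, so please check them against `bfast match -h` on your installation.

```shell
# Sketch only: paths, file names, and function names are hypothetical.

# Thin wrapper around the aligner so the command is easy to adjust.
# Assumed usage: -f = reference FASTA, -r = reads file (verify with `bfast match -h`).
run_match() {
    bfast match -f "$1" -r "$2"
}

# Run the match step once per reads file, one output file per individual.
# Arguments: reference FASTA, directory of *.fastq files, output directory.
match_all() {
    ref=$1
    in_dir=$2
    out_dir=$3
    mkdir -p "$out_dir"
    for reads in "$in_dir"/*.fastq; do
        # Skip the literal pattern if the directory holds no .fastq files.
        [ -e "$reads" ] || continue
        sample=$(basename "$reads" .fastq)
        # The results are written to standard output, so without this
        # redirection nothing is saved to disk.
        run_match "$ref" "$reads" > "$out_dir/$sample.bmf"
    done
}

# Example call (paths are assumptions):
# match_all contigs.fa reads matches
```

Keeping one output file per individual also means that a failed or interrupted run only costs you the file that was being processed at the time.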

12.3 years ago

I am not sure I understand, but if you only want to concatenate your results, you could just use the GNU/Linux cat command.

cat * > myConcatResults

or, if your results share the same prefix (e.g. "myResults"):

cat myResults* > myConcatResults
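
One thing to watch out for: if the combined file is written into the same directory and its name also matches the pattern, a second run would pick it up as an input. A minimal sketch of a safer variant follows. The function name `merge_results` and the directory names are hypothetical, and it assumes that plain concatenation is valid for your result format.

```shell
# Sketch only: function and path names are hypothetical.
# Concatenate all input files (in the order given) into one output file.
# Arguments: output file, then one or more input files.
merge_results() {
    out=$1
    shift
    # Write to a temporary file first so the output is never one of the inputs.
    tmp_out=$(mktemp) || return 1
    if cat "$@" > "$tmp_out"; then
        mv "$tmp_out" "$out"
    else
        rm -f "$tmp_out"
        return 1
    fi
}

# Example call (paths are assumptions); the shell expands the pattern
# in sorted order, so the inputs are concatenated in that order:
# merge_results merged/myConcatResults matches/myResults*
```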

Yeah, I thought of doing this, but I don't know what the files look like, so I wasn't sure it would work. I'll try it when the run is finished.


Well, I'm not done yet, but it works fine. I wasn't sure it was possible to do this, but now I understand how the files are structured and what they look like. Thanks!

