Question: Merging Two Fastq Files From Exome Sequencing
7.7 years ago by
User 1933340
User 1933340 wrote:

One sample was whole-exome sequenced twice to reach the required coverage, as there were some issues with the sample. The sequencing setup was exactly the same for both runs (batch, machine, and vendor). Does it make sense to merge the two FASTQ files to reach adequate coverage and then do variant calling once? If so, how should I merge them? Are there any other suggestions for this case?

exome • 4.2k views
modified 7.7 years ago • written 7.7 years ago by User 1933340

Did you use the same library and the same machine for both runs? And was it the same person who did the experiments? :)

written 7.7 years ago by David Langenberger9.5k

Yes! I have updated the post.

written 7.7 years ago by User 1933340
7.7 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum131k wrote:

Align your reads with bwa, bowtie2, etc., and don't forget to create a read group to mark the origin of your FASTQs. For BWA:

-R STR     Complete read group header line. '\t' can be used in STR and will be converted to a TAB in the output SAM. The read group ID will be attached to every read in the output. An example is '@RG\tID:foo\tSM:bar'. [null]

Or you can use Picard to add the read groups.

At some point, you can merge the BAMs. You'll always be able to exclude the reads from one run, or see where a read came from, as each read will carry its read group ID.
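A minimal sketch of that workflow, assuming `bwa mem` and a recent `samtools` are installed. The file names (`ref.fa`, `run1.fastq`, `run2.fastq`), the sample name, and the helper functions `make_rg` and `align_run` are hypothetical and only there for illustration; single-end reads are assumed.

```shell
#!/bin/sh
# Hypothetical names; adjust to your own data.
REF=ref.fa
SAMPLE=sampleA

# Build a read group header line for one run.
# The "\t" is printed as two literal characters on purpose:
# bwa converts it to a TAB in the SAM output.
make_rg() {
    printf '@RG\\tID:%s\\tSM:%s' "$1" "$2"
}

# Align one run (expects <run>.fastq) and write a sorted <run>.bam.
align_run() {
    run=$1
    bwa mem -R "$(make_rg "$run" "$SAMPLE")" "$REF" "$run.fastq" |
        samtools sort -o "$run.bam" -
}

# Example usage (uncomment to run):
# align_run run1
# align_run run2
# samtools merge merged.bam run1.bam run2.bam
```

Both runs share the same `SM` value, so a variant caller should treat them as one sample, while the distinct `ID` values keep the two runs distinguishable.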

modified 13 months ago by RamRS30k • written 7.7 years ago by Pierre Lindenbaum131k

Since I only have the FASTQ files at the moment, isn't it possible to merge them first and do the alignment and everything else afterwards? Something like cat run2.fastq >> run1.fastq?

written 7.7 years ago by User 1933340

Of course it is possible to do that, but I think Pierre's solution is better. If you do it his way, you will be able to go back at any later stage and see which run a read came from. This might let you identify really strange artifacts such as run-specific variants (if any exist). If you just cat the files, you can't do this.
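As a rough illustration of going back to the run of origin, here is a sketch that counts reads per read group from SAM text. The function name `count_by_rg` and the file name `merged.bam` are hypothetical; it assumes the reads carry an `RG:Z:` tag, as they would after aligning with a read group header.

```shell
#!/bin/sh
# Count reads per read group from SAM records on stdin.
# Prints "<read group ID><TAB><count>", sorted by ID.
count_by_rg() {
    awk -F'\t' '
        /^@/ { next }                       # skip SAM header lines
        {
            rg = "none"                     # reads without an RG tag
            for (i = 12; i <= NF; i++)      # optional tags start at field 12
                if ($i ~ /^RG:Z:/) rg = substr($i, 6)
            n[rg]++
        }
        END { for (rg in n) print rg "\t" n[rg] }
    ' | sort
}

# Example usage (uncomment to run):
# samtools view -h merged.bam | count_by_rg
```

With samtools, something like `samtools view -c -r run1 merged.bam` should give the count for a single read group directly. After a plain cat of the FASTQ files there is no such tag to filter on.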

written 7.7 years ago by matted7.3k