Question

Merging Two Fastq Files From Exome Sequencing

0

12.3 years ago

User 1933 ▴ 360

One sample was whole-exome sequenced twice to reach the required coverage, as these samples had some issues. The sequencing of both runs was exactly the same (batch, machine, and vendor). I wonder whether it makes sense to merge these two FASTQ files to reach adequate coverage and then do variant calling once. If so, how should I merge them? Is there any other suggestion in this case?

exome • 5.4k views

ADD COMMENT • link 12.3 years ago by User 1933 ▴ 360

0

Did you use the same library and the same machine for both runs? And was it the same person who did the experiments? :)

ADD REPLY • link 12.3 years ago by David Langenberger 11k

0

Yes! I have updated the post.

ADD REPLY • link 12.3 years ago by User 1933 ▴ 360

Ram · Answer 1 · 2013-04-03

3

12.3 years ago

Pierre Lindenbaum 166k

Align your reads with bwa, bowtie2, etc., and don't forget to create a read group to mark the origin of each FASTQ file. For BWA:

-R STR     Complete read group header line. ’\t’ can be used in STR and will be converted to a TAB in the output SAM. The read group ID will be attached to every read in the output. An example is ’@RG\tID:foo\tSM:bar’. [null]

Alternatively, you can add the read group after alignment with Picard: http://picard.sourceforge.net/command-line-overview.shtml#AddOrReplaceReadGroups

At some point, you can merge the BAMs using http://picard.sourceforge.net/command-line-overview.shtml#MergeSamFiles . You will always be able to see or exclude reads by origin, as each read will carry a read group ID.
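The workflow described here (align each run with its own read group, then merge the BAMs) could be sketched roughly as follows. This is only an illustration, not the answerer's exact commands: the file names (ref.fa, run1.fastq, run2.fastq), the sample name, the single-end layout, the use of samtools sort, and the picard.jar path are all assumptions. The script only prints the command lines and does not run them.

```python
import shlex


def read_group(run_id, sample):
    # bwa converts the literal two-character sequence \t into a TAB,
    # so the backslash is kept as-is in the argument.
    return f"@RG\\tID:{run_id}\\tSM:{sample}"


def build_commands(reference, runs, sample, merged_bam="merged.bam"):
    """Return shell command lines; runs maps a read group ID to a FASTQ file.

    All names are hypothetical placeholders; single-end reads are assumed.
    """
    commands = []
    bams = []
    for run_id, fastq in runs.items():
        bam = f"{run_id}.sorted.bam"
        align = ["bwa", "mem", "-R", read_group(run_id, sample), reference, fastq]
        sort = ["samtools", "sort", "-o", bam, "-"]
        commands.append(shlex.join(align) + " | " + shlex.join(sort))
        bams.append(bam)
    # One I= argument per input BAM, one O= for the merged output.
    merge = ["java", "-jar", "picard.jar", "MergeSamFiles"]
    merge += [f"I={bam}" for bam in bams] + [f"O={merged_bam}"]
    commands.append(shlex.join(merge))
    return commands


if __name__ == "__main__":
    runs = {"run1": "run1.fastq", "run2": "run2.fastq"}
    for cmd in build_commands("ref.fa", runs, "sample1"):
        print(cmd)
```

Both runs share the same SM value because they come from the same sample, while the distinct ID values keep the two runs distinguishable after merging.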

ADD COMMENT • link updated 5.8 years ago by Ram 45k • written 12.3 years ago by Pierre Lindenbaum 166k

0

Since I only have the FASTQ files at the moment, isn't it possible to merge them first and do the alignment and other steps afterwards? Something like cat run2.fastq >> run1.fastq?

ADD REPLY • link 12.3 years ago by User 1933 ▴ 360

3

Of course it is possible to do that, but Pierre's solution is better I think. If you do it his way, you will be able to go back from any later stage and see which run a read came from. This might let you identify really strange artifacts like run-specific variants (if any of these exist). If you just cat the files, you can't do this.
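To illustrate what "going back to see which run a read came from" can look like, here is a minimal sketch that counts reads per read group in SAM-formatted text, for example the output of samtools view on the merged BAM. The function name and the toy records are made up for illustration; it relies only on the RG:Z: optional tag that aligners attach when a read group is supplied.

```python
from collections import Counter


def count_reads_per_read_group(sam_lines):
    """Count alignment records per RG:Z: tag in an iterable of SAM lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        rg = "no_RG"
        # Optional tags start after the 11 mandatory SAM columns.
        for tag in fields[11:]:
            if tag.startswith("RG:Z:"):
                rg = tag[len("RG:Z:"):]
                break
        counts[rg] += 1
    return counts


if __name__ == "__main__":
    # Toy records standing in for a merged alignment file.
    example = [
        "@RG\tID:run1\tSM:sample1",
        "@RG\tID:run2\tSM:sample1",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:run1",
        "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:run2",
        "r3\t0\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:run2",
    ]
    for rg, n in sorted(count_reads_per_read_group(example).items()):
        print(rg, n)
```

The same per-tag grouping could be applied to the reads supporting a particular variant to check whether it is supported by only one run. With plainly concatenated FASTQ files there is no such tag to group by.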

ADD REPLY • link 12.3 years ago by matted 7.8k