Question: I have the same samples in multiple lanes. What steps should be taken before downstream analysis?
5.3 years ago by
United States
nalandaatmi90 wrote:

Dear All,

I need some clarification on the following scenario (RNA-seq experiment):

- Lane 1 of the flowcell contains 4 samples (A, B, C, D), which were also loaded on lane 2 (A, B) and lane 3 (C, D).

- Before performing downstream analysis, do I need to merge the FASTQ reads from lane 1 with the FASTQ reads from lane 2 (and similarly the reads from lane 1 with the reads from lane 3)?

- If I do not merge the FASTQ files and instead align lane 1, lane 2, and lane 3 separately against the human reference, does that affect my analysis? I would then merge the two BAM files from lane 1 and lane 2 (and similarly from lane 1 and lane 3) using samtools.


ADD COMMENTlink modified 5.3 years ago by Devon Ryan98k • written 5.3 years ago by nalandaatmi90

Is your data barcoded? Multiplexed?

ADD REPLYlink written 5.3 years ago by apelin20480

Dear Pelin, yes, it is barcoded and demultiplexed. Please find below the necessary details






ADD REPLYlink written 5.3 years ago by nalandaatmi90
5.3 years ago by
Devon Ryan98k
Freiburg, Germany
Devon Ryan98k wrote:

I'm assuming that only a single library was made from each sample and then split on multiple lanes (in a somewhat weird way, I might add). If multiple libraries were made then you will need to give further details.

It doesn't matter much whether you concatenate the FASTQ files before alignment or merge the BAM files afterwards; you should get essentially the same results either way ("essentially" because there's always some randomness in alignment).
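A minimal sketch of the FASTQ-level option, assuming the lane layout from the question and a hypothetical file naming scheme (`<sample>_L<lane>.fastq.gz`; the real file names from the sequencing facility will differ). It relies on the fact that gzip files can be appended byte-for-byte and still form a valid gzip stream.

```python
import shutil
from pathlib import Path

# Lane layout described in the question: lane 1 carries all four samples,
# lane 2 repeats A and B, lane 3 repeats C and D.
LANES_BY_SAMPLE = {
    "A": [1, 2],
    "B": [1, 2],
    "C": [1, 3],
    "D": [1, 3],
}


def lane_fastq(fastq_dir, sample, lane):
    """Path of one per-lane file. The naming scheme is an assumption."""
    return Path(fastq_dir) / f"{sample}_L{lane}.fastq.gz"


def concatenate_lanes(fastq_dir, out_dir):
    """Write one merged FASTQ per sample by appending its per-lane files.

    The compressed files are copied as raw bytes, so nothing is
    decompressed or recompressed along the way.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for sample, lanes in LANES_BY_SAMPLE.items():
        target = out_dir / f"{sample}_merged.fastq.gz"
        with open(target, "wb") as out:
            for lane in lanes:
                with open(lane_fastq(fastq_dir, sample, lane), "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[sample] = target
    return merged
```

For paired-end data, the R1 and R2 files would each need to be concatenated in the same lane order so that mates stay in sync.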

ADD COMMENTlink written 5.3 years ago by Devon Ryan98k

Dear Devon Ryan,

They made a single library and then loaded it into different lanes.

Actually, samples from two projects were loaded on the flowcell. After loading, one of the lanes was still empty, so instead of leaving it blank they loaded all 4 samples into lane 1.

For project A they loaded samples in lanes 1, 2, and 3, and for project B in lanes 4, 5, 6, 7, and 8.

ADD REPLYlink modified 14 months ago by _r_am32k • written 5.3 years ago by nalandaatmi90

Cool, they can be concatenated at any point then.

ADD REPLYlink written 5.3 years ago by Devon Ryan98k

Thanks Devon Ryan. 

Instead of merging at the FASTQ level, I am going to merge the BAM files. I received an accepted_hits.bam file for each sample as output after running the TopHat command (which uses Bowtie2).

I am planning to merge those BAM files using the steps described in "Fastq Files From Different Flowcells". In that post they merge SAM files, so I am going to convert my BAM files to SAM, sort the SAM files, and finally merge them. Then I am going to use the merged SAM or BAM files for the Cufflinks step. Correct me if I am wrong.

ADD REPLYlink modified 5.3 years ago • written 5.3 years ago by nalandaatmi90

I would merge the BAM files directly and make sure to remove duplicates (using either "rmdup" from samtools or MarkDuplicates from Picard tools).

ADD REPLYlink written 5.3 years ago by Alternative240

Duplicates should typically not be marked or removed from RNAseq data. For highly expressed genes you end up capping your signal.
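Taking the two comments above together, a direct BAM merge with no SAM round trip and no duplicate removal could be sketched as follows. The file paths in the usage example are hypothetical. `samtools merge` expects all inputs to be sorted the same way; TopHat's accepted_hits.bam is normally coordinate-sorted, but that is worth checking in the BAM header before merging.

```python
import shutil
import subprocess


def samtools_merge_command(out_bam, lane_bams):
    """Build the argument list for `samtools merge OUT IN1 IN2 ...`."""
    if len(lane_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    return ["samtools", "merge", str(out_bam)] + [str(b) for b in lane_bams]


def merge_bams(out_bam, lane_bams):
    """Run the merge; raises if samtools is missing or exits non-zero."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    subprocess.run(samtools_merge_command(out_bam, lane_bams), check=True)


# Example for sample A (hypothetical paths, one TopHat output per lane):
# merge_bams("A_merged.bam",
#            ["lane1/A/accepted_hits.bam", "lane2/A/accepted_hits.bam"])
```

The merged BAM can then go to the Cufflinks step as is, with no duplicate marking in between.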

ADD REPLYlink written 5.3 years ago by Devon Ryan98k

Hi Devon. I have a similar question to nalandaatmi's. The only difference is that I have a sample that was sequenced twice using different adapters. I have trimmed and normalized the reads for both libraries and generated 2 BAM files for this sample. Can I use samtools merge to merge the 2 BAM files, or should I use Picard? Thanks for your help!

ADD REPLYlink written 4.2 years ago by ta200730