Small RNA seq analysis
3.3 years ago
kb_93 ▴ 10

Hello!

I am doing some small RNA analysis to identify a number of small RNAs in cancer data. The data was multiplexed and I have been given the demultiplexed reads. I was given 27 FastQ files for 9 samples (6 tumour and 3 normal) run on 3 different flow cells and lanes, where each sample has the same index for the 3 lanes. I am unsure whether to merge the SAM files after alignment and, if so, how to merge them: should it be by patient or by condition? And if a patient has provided samples of both conditions, how should I proceed?

Any help would be greatly appreciated!

RNA-Seq rna-seq sequencing alignment bowtie • 788 views
3.3 years ago
GenoMax 141k

run on 3 lanes where each sample has the same index for the 3 lanes.

That is technical sequence replication. You can merge those lane-specific files for each sample at any step (before or after alignment). It is possible to generate files that are not split by lane when a large pool runs on multiple lanes (which was not done in your case).
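As a rough illustration of merging lane-specific files before alignment, here is a minimal Python sketch. It assumes the files follow the usual bcl2fastq naming pattern (`<sample>_S<n>_L<lane>_R<read>_001.fastq.gz`) and come from a single flowcell; the function names `group_by_sample` and `merge_lanes` and the output naming are made up for this example. Gzip files can be concatenated byte for byte, so no decompression is needed.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed bcl2fastq-style name: <sample>_S<n>_L<lane>_R<read>_001.fastq.gz
LANE_FILE = re.compile(
    r"^(?P<sample>.+)_S\d+_L\d{3}_(?P<read>[RI]\d)_001\.fastq\.gz$"
)


def group_by_sample(filenames):
    """Group lane-specific FASTQ files by (sample, read), in lane order."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = LANE_FILE.match(Path(name).name)
        if match:  # files that do not fit the assumed pattern are skipped
            groups[(match["sample"], match["read"])].append(name)
    return dict(groups)


def merge_lanes(groups, out_dir):
    """Concatenate the gzip files of each group into one file per sample/read."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for (sample, read), files in groups.items():
        target = out_dir / f"{sample}_{read}.merged.fastq.gz"  # hypothetical name
        with open(target, "wb") as out:
            for lane_file in files:
                with open(lane_file, "rb") as src:
                    shutil.copyfileobj(src, out)
```

Calling `merge_lanes(group_by_sample(paths), "merged")` on the lane files of one flowcell would write one merged file per sample and read.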


Thank you for the reply. I should have clarified earlier, but the data I was given is a subset of a larger dataset where there are 19 pools with 45 samples within run on multiple flowcells and lanes. The data I was given was 27 FastQ files from within one pool. Is it still appropriate for me to merge specific lanes for each sample?


is a subset of a larger dataset where there are 19 pools with 45 samples within run on multiple flowcells and lanes.

There is not enough information there to comment. If a sample library with one index was run in different combinations, then it would still be technical sequence replication for that library. If there are multiple libraries for the same sample with different indexes, then it is a library prep replicate.

The data I was given was 27 FastQ files from within one pool.

As I said before, if a large pool ran on multiple lanes of a flowcell, then you are going to get lane-specific files for each sample (unless the --no-lane-splitting option is used for bcl2fastq). So for that particular pool, as long as it ran on one flowcell, it should be ok to merge lane-specific files for that one flowcell.


ok, just to double check, no merging across different flowcells?

Also, if the sample libraries with different indexes are in different pools are these still considered library prep replicates?


That may depend on whether it is the same library being run across many flowcells (and different pools) and on the ultimate aim of your analysis. If you are simply going for great depth (and don't care about potential batch effects), then you could merge across runs/flowcells. You could also use read groups to keep tabs on runs, if you choose to merge.
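To make the read group idea concrete, here is a small Python sketch that builds a SAM `@RG` header line from the first header line of a FASTQ file. It assumes Illumina-style read names (`@instrument:run:flowcell:lane:tile:x:y ...`); the function name `read_group_line`, the ID scheme, and the sample and library labels are hypothetical choices for this example, not something prescribed in the thread.

```python
def read_group_line(fastq_header, sample, library):
    """Build an @RG line that records which flowcell and lane the reads came from."""
    # Assumed Illumina read name: @instrument:run:flowcell:lane:tile:x:y <comment>
    fields = fastq_header.lstrip("@").split()[0].split(":")
    if len(fields) < 4:
        raise ValueError(f"unexpected read name format: {fastq_header!r}")
    flowcell, lane = fields[2], fields[3]
    tags = [
        "@RG",
        f"ID:{sample}.{flowcell}.{lane}",  # unique per sample, flowcell and lane
        f"SM:{sample}",                    # sample name
        f"LB:{library}",                   # library (one per index in this sketch)
        "PL:ILLUMINA",
        f"PU:{flowcell}.{lane}",           # platform unit
    ]
    return "\t".join(tags)
```

With one read group per lane and flowcell, the merged alignments can still be split or checked by run later if a batch effect is suspected.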

Also, if the sample libraries with different indexes are in different pools are these still considered library prep replicates?

That would mean that the libraries were independently made (starting with the same sample), so yes. This is sometimes done if there is a question about which index combination(s) work well for library prep/sequencing.
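The distinction drawn in this thread (same index across lanes or flowcells means technical sequencing replicates of one library, different indexes for the same sample mean separate library preps) can be sketched in Python as below. The function name `summarize_replicates` and the tuple layout of the run records are assumptions for the example, and treating one index as one library is a simplification.

```python
from collections import defaultdict


def summarize_replicates(runs):
    """Summarize replicate structure from (sample, index, flowcell, lane) records.

    Assumption for this sketch: each distinct index of a sample is one library.
    """
    by_sample = defaultdict(lambda: defaultdict(set))
    for sample, index, flowcell, lane in runs:
        by_sample[sample][index].add((flowcell, lane))

    summary = {}
    for sample, libraries in by_sample.items():
        summary[sample] = {
            # more than one library means library prep replicates
            "libraries": len(libraries),
            # more than one run of a library means technical sequencing replicates
            "runs_per_library": {idx: len(locs) for idx, locs in libraries.items()},
        }
    return summary
```

A sample with one index seen on three lanes would show one library with three runs, which is the case that can be merged safely within a flowcell.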


that's great, thank you!
