Question: Sequenced sample twice, can I merge the fastqs to analyze?
1
6 months ago by
Pin.Bioinf240
Malaga
Pin.Bioinf240 wrote:

Hello,

My colleague sequenced some samples (RNA-Seq) a month ago, and the quality of some of them was not very good. She has since sequenced some new samples and also re-sequenced some of the poor-quality ones. She is asking whether, to get more reads and better results, I can reuse the old poor-quality reads (.fastq files) and merge them with the new ones (only for the samples that were re-sequenced). Is this possible? Is it recommended, or is it bad practice? It does not sound right to me, but I have no experience with this.

If it is a good idea, should I do it with the 'cat' command, or some other way? Thank you.

fastq rnaseq • 222 views
modified 6 months ago by geek_y9.4k • written 6 months ago by Pin.Bioinf240
2

You don't want "more" reads; you want representative numbers of reads.

For genome assembly, sure, it would likely be fine. Not for an application where read numbers count, though.

written 6 months ago by jrj.healey12k
2
6 months ago by
Benn6.6k
Netherlands
Benn6.6k wrote:

In my opinion this is not a good idea. If the quality is bad, don't use the data. If you really want to include technical replicates in your RNA-seq design (and ignore the fact that some of them are of bad quality), you can use the duplicateCorrelation function in limma to account for the technical replicate information. Either way, don't use the cat command here.

written 6 months ago by Benn6.6k

Okay, thank you. To convince my colleague that it is not a good idea, could I check whether the replicates are consistent by plotting a PCA? It will probably show that they are not similar.

written 6 months ago by Pin.Bioinf240
1
6 months ago by
geek_y9.4k
Barcelona/CRG/London/Imperial
geek_y9.4k wrote:

Quantify the genes in both technical replicates and check the correlation between the two samples. PCA might not give a good picture for technical replicates. It's good for biological replicates, since it typically uses only the top 500 or 1000 most variable genes. "Bad quality samples" can mean that the initial RNA was of poor quality, that there was too little starting material, etc. In that case it's better not to use the replicate.
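
As a rough illustration of the correlation check described above, here is a minimal Python sketch. It assumes you already have per-gene raw counts for the old and the new run of the same sample, in the same gene order. The counts below are made up, and the helper names log_cpm and pearson are only illustrative. Counts are scaled to log2 counts per million first, so that the difference in sequencing depth between the two runs does not dominate the comparison.

```python
import math


def log_cpm(counts, pseudocount=1.0):
    """Scale raw counts to counts per million, then log2-transform."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + pseudocount) for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene raw counts for the same sample, old run vs. new run.
old_run = [120, 15, 0, 980, 43, 7, 310]
new_run = [260, 28, 1, 2105, 90, 11, 640]

r = pearson(log_cpm(old_run), log_cpm(new_run))
print(f"Pearson r on log2(CPM + 1): {r:.3f}")
```

With real data you would run this on all quantified genes, not a handful, and probably look at a scatter plot as well, since a single coefficient can hide disagreement among low-count genes.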

If "bad quality" instead means a drop in base qualities, low sequencing depth, etc., then you can use them, as the gene expression quantification is not going to change much (after quality trimming).
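
If you do decide that merging is justified in that second case, the sketch below shows one way to do it from Python; it is meant as an illustration, not a recommendation for every dataset. It assumes gzip-compressed FASTQ files that have already been quality-trimmed, and the function names concat_fastq_gz and count_reads are hypothetical. Concatenated gzip streams are still a valid gzip file, so the bytes are simply appended, which has the same effect as using 'cat' on the .fastq.gz files.

```python
import gzip
import shutil


def concat_fastq_gz(inputs, output):
    """Append gzip-compressed FASTQ files byte for byte into one file."""
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                shutil.copyfileobj(in_handle, out_handle)


def count_reads(path):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

A call might look like concat_fastq_gz(["old_R1.fastq.gz", "new_R1.fastq.gz"], "merged_R1.fastq.gz"), where the file names are placeholders. A simple sanity check is that count_reads on the merged file equals the sum over the inputs. For paired-end data, the R1 and R2 files need to be concatenated in the same order so that the mates stay in sync.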

modified 6 months ago • written 6 months ago by geek_y9.4k