Uneven amount of total sequences across individuals in RAD-seq dataset
6.2 years ago
CaffeSospeso ▴ 50

I have a RAD-seq genomic dataset with around 180 individuals, which I want to analyse for a phylogenomic project. However, while assessing data quality with FastQC, I realised that the total number of sequences is uneven across individuals, ranging from 6.5 million for some individuals to 0.2 million for others.

I'm afraid this could cause issues during the analyses. The only problem I see is that individuals will differ in how much missing data they have, which will obviously affect the total number of loci that I can use for phylogenetics.

What do you think? Do you have any suggestions? Should I exclude some individuals? Are there any established criteria, such as excluding the 5% of individuals with the lowest total number of sequences?
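To make the 5% idea concrete, here is a minimal sketch of that cutoff. It assumes the per-individual read counts (for example, the "Total Sequences" values reported by FastQC) have already been collected into a dictionary. The function name `flag_low_coverage`, the sample names, and the counts are all hypothetical, and the 5% cutoff is only the example from the question, not an established standard.

```python
import math


def flag_low_coverage(read_counts, bottom_fraction=0.05):
    """Return the individuals in the lowest `bottom_fraction` by read count.

    `read_counts` maps an individual's ID to its total number of reads.
    The result is ordered from the lowest count upwards.
    """
    if not 0 <= bottom_fraction <= 1:
        raise ValueError("bottom_fraction must be between 0 and 1")
    # Rank individuals from fewest to most reads.
    ranked = sorted(read_counts, key=read_counts.get)
    # Number of individuals to flag, rounded down.
    n_flag = math.floor(len(ranked) * bottom_fraction)
    return ranked[:n_flag]


# Hypothetical counts for 20 individuals (illustration only).
read_counts = {f"ind{i:02d}": 1_000_000 + i * 100_000 for i in range(1, 20)}
read_counts["ind20"] = 200_000

flagged = flag_low_coverage(read_counts)
print(flagged)
```

With 20 individuals, a 5% cutoff flags a single sample, here the one with 0.2 million reads. A rank-based cutoff always flags the same proportion regardless of how the counts are distributed, so it is worth inspecting the flagged individuals before actually dropping them.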

next-gen sequencing • 1.3k views

Were the libraries not QC'ed before pooling?


The samples were checked for a sufficiently high concentration, and they were quality controlled before pooling.

5.4 years ago
Gio12 ▴ 220

You may find this paper interesting. A second read can also be found here.


These are some nice papers that specifically relate to RAD-Seq!

If people have generally encountered problems with observed read counts not matching the expected ones, that is also something I would like to hear about (since unexpected differences between observed and expected reads can mean that reads need to be combined between runs). In other words, that relates to the proposed QC flag b) in this post: Calling Single-Barcode Samples from Mixed Runs as Dual-Barcode Samples | Possible Illumina Run QC Flags?

However, that post is not specifically related to RAD-Seq (in fact, there were 0 RAD-Seq samples among those runs).
