4.3 years ago

Ati

Hello,

The insert size distributions from RSeQC's inner_distance.py usually suggest that my fragments are the same size as the reads. From the lab, I know that they are not. The plot is also always right-skewed. Does anyone know the reason for this?

Here are the results from Picard CollectInsertSizeMetrics and bamtools stats:

```
Bamtools stats:
Total reads: 35421609
Mapped reads: 33181734 (93.6765%)
Forward strand: 18830738 (53.1617%)
Reverse strand: 16590871 (46.8383%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 35421609 (100%)
'Proper-pairs': 33181726 (93.6765%)
Both pairs mapped: 33181726 (93.6765%)
Read 1: 17710804
Read 2: 17710805
Singletons: 8 (2.25851e-05%)
Average insert size (absolute value): 1552.95
Median insert size (absolute value): 218
Picard CollectInsertSizeMetrics:
MEDIAN_INSERT_SIZE 227
MODE_INSERT_SIZE 134
MEDIAN_ABSOLUTE_DEVIATION 106
MIN_INSERT_SIZE 24
MAX_INSERT_SIZE 886604
MEAN_INSERT_SIZE 290.754462
STANDARD_DEVIATION 260.955561
READ_PAIRS 15414176
PAIR_ORIENTATION FR
WIDTH_OF_10_PERCENT 57
WIDTH_OF_20_PERCENT 105
WIDTH_OF_30_PERCENT 145
WIDTH_OF_40_PERCENT 179
WIDTH_OF_50_PERCENT 213
WIDTH_OF_60_PERCENT 269
WIDTH_OF_70_PERCENT 923
WIDTH_OF_80_PERCENT 2493
WIDTH_OF_90_PERCENT 5991
WIDTH_OF_95_PERCENT 11299
WIDTH_OF_99_PERCENT 35925
```

Thank you so much in advance for your help, Ati

Could you clarify what you mean by "from the lab"? What procedure did you use to estimate the size distribution of your libraries?

I ran inner_distance.py and CollectInsertSizeMetrics on some of my own data, and the two tools agreed with each other. Both indicated a right-skewed size distribution with consistent estimates of the median size. The skew may reflect a difference between the bench-level size analysis you did on your library and the fragments from that library that actually aligned.
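If you want to see how strongly a few very long pairs drive the skew (your bamtools output has a mean of 1552.95 against a median of 218), one option is to summarize the template lengths yourself. The following is only a rough sketch, not how any of the tools above compute their numbers: it reads SAM text records, takes the absolute value of the TLEN column (column 9) once per pair, and reports the median, the mean, and the fraction of pairs above a cutoff. The function names, the 1000 bp cutoff, and the demo records are made up for illustration.

```python
import statistics


def template_lengths(sam_lines):
    """Yield |TLEN| once per pair from SAM text records."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        tlen = int(fields[8])  # SAM column 9 is TLEN
        paired = flag & 0x1
        first_in_pair = flag & 0x40
        unmapped = flag & 0x4 or flag & 0x8  # read or mate unmapped
        # Count each pair once by keeping only the first read of the pair.
        if paired and first_in_pair and not unmapped and tlen != 0:
            yield abs(tlen)


def summarize(lengths, cutoff=1000):
    """Median, mean, and fraction of pairs longer than `cutoff` (arbitrary)."""
    lengths = sorted(lengths)
    return {
        "n": len(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "frac_above_cutoff": sum(x > cutoff for x in lengths) / len(lengths),
    }


def fake_record(name, flag, tlen):
    """Build a made-up SAM record; only FLAG and TLEN matter here."""
    return "\t".join(
        [name, str(flag), "chr1", "100", "60", "50M", "=", "300", str(tlen), "*", "*"]
    )


# Hypothetical demo data: three ordinary pairs and one very long one.
demo = [
    fake_record("r1", 99, 200), fake_record("r1", 147, -200),
    fake_record("r2", 99, 220), fake_record("r2", 147, -220),
    fake_record("r3", 99, 240), fake_record("r3", 147, -240),
    fake_record("r4", 99, 50000), fake_record("r4", 147, -50000),
]

stats = summarize(template_lengths(demo))
print(stats)
```

In the demo, a single long pair out of four is enough to push the mean far above the median while leaving the median close to the typical value. On real data you could feed the same functions the lines printed by `samtools view` (for example via `sys.stdin`) and see what fraction of your pairs sits in the long tail.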