Liftover from mm10 to CAST/PWK – Very Low Mapping Rate & Missing Chain File
7 weeks ago
daffodil ▴ 10

Hi, I'm working on a piRNA project and trying to map annotated piRNA cluster coordinates from the mm10 reference genome to the PWK and CAST/EiJ genomes.

So far, I’ve tested both CrossMap and UCSC liftOver using the available chain files:

mm10ToMm39.over.chain.gz

mm39ToGCA_001624775.1.over.chain.gz (for PWK)

However, I don't have a chain file for CAST/EiJ: nothing like mm39ToCAST_EiJ.over.chain.gz is publicly available from UCSC.

Here’s the issue I’m facing:

My input BED12 file contains 316 piRNA cluster regions in mm10.

After running the liftover (with either CrossMap or liftOver), only about 2 regions map successfully to PWK.

The rest end up in the .unmap output.

Many of my regions are large (50–600 kb), which might affect mapping.
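To see why regions are being rejected, the reason lines in the .unmap file can be tallied together with the sizes of the affected regions. The following is a minimal sketch, assuming the UCSC liftOver unmapped format, in which each failed record is preceded by a `#` line giving the reason (CrossMap writes its unmapped records differently, so this would need adapting). The function name `tally_unmapped` and the example records are made up for illustration.

```python
from collections import Counter

def tally_unmapped(lines):
    """Count liftOver failure reasons and collect the sizes of the affected regions.

    Assumes the UCSC liftOver unmapped format: a '#<reason>' line followed by
    the BED record it applies to.
    """
    reasons = Counter()
    sizes = {}
    current = "unknown"
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Reason line, e.g. '#Partially deleted in new'
            current = line.lstrip("#").strip()
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        reasons[current] += 1
        sizes.setdefault(current, []).append(end - start)
    return reasons, sizes

# Made-up records for illustration only; real input would be the .unmap file.
example = [
    "#Partially deleted in new",
    "chr1\t1000000\t1600000\tclusterA",
    "#Deleted in new",
    "chr2\t500\t50500\tclusterB",
    "#Partially deleted in new",
    "chr3\t0\t80000\tclusterC",
]
reasons, sizes = tally_unmapped(example)
for reason, count in reasons.most_common():
    print(reason, count, "largest region:", max(sizes[reason]))
```

On a real file, the same function can be called on the open file handle, e.g. `tally_unmapped(open("clusters.unmap"))`, where `clusters.unmap` is a placeholder name. If most failures are partial deletions or splits of large regions, region size is the more likely culprit than a missing chain.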

I’m trying to understand: Is the poor mapping due to region size or gaps in the PWK/CAST chain files?

Is there any way to obtain or generate a reliable CAST liftover chain file?

Would you recommend another strategy to compare piRNA clusters across mouse strains?

I would really appreciate your insight or any suggestions on how to move forward.

Best regards,


Mouse strains can differ substantially, especially when you compare a wild-derived strain with a classical inbred lab strain, so a simple liftover is unlikely to work well, which is what you seem to have found. GPT lists the following differences between PWK/PhJ and C57BL/6:

| Feature                        | PWK/PhJ                               | C57BL/6 (Black 6)                          |
| ------------------------------ | ------------------------------------- | ------------------------------------------ |
| **Origin**                     | Wild-derived (Mus musculus musculus)  | Classical inbred (Mus musculus domesticus) |
| **Genomic Diversity**          | Very high                             | Low                                        |
| **Number of SNPs vs. Black 6** | ~20+ million                          | N/A (reference genome)                     |
| **Subspecies**                 | musculus                              | domesticus                                 |
| **Mitochondrial Origin**       | musculus                              | domesticus                                 |
| **Phenotypic Differences**     | Many (immunity, behavior, metabolism) | Standard reference strain                  |
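
One way to probe this is to stop lifting each 50–600 kb cluster as a single interval. liftOver rejects a region when too small a fraction of its bases remap (its `-minMatch` option), so large clusters in a divergent strain can fail as a whole even when most of their sequence has a counterpart. A rough sketch, not a tested pipeline: tile each cluster into small windows, lift the windows, then check what fraction of each cluster came through and where it landed. The function names, the 1 kb window size, and the example coordinates are all assumptions made for illustration.

```python
from collections import defaultdict

def tile_regions(regions, window=1000):
    """Split (chrom, start, end, name) regions into fixed-size windows.

    Each window is named '<name>|<index>' so it can be traced back to its
    cluster after liftover (liftOver carries the BED name column through).
    """
    tiles = []
    for chrom, start, end, name in regions:
        for i, pos in enumerate(range(start, end, window)):
            tiles.append((chrom, pos, min(pos + window, end), f"{name}|{i}"))
    return tiles

def summarise_lifted(tiles, lifted):
    """Per cluster: fraction of windows that lifted, and the target chromosomes hit."""
    total = defaultdict(int)
    for _, _, _, tag in tiles:
        total[tag.rsplit("|", 1)[0]] += 1
    hit = defaultdict(int)
    chroms = defaultdict(set)
    for chrom, _, _, tag in lifted:
        name = tag.rsplit("|", 1)[0]
        hit[name] += 1
        chroms[name].add(chrom)
    return {name: (hit[name] / n, sorted(chroms[name])) for name, n in total.items()}

# Made-up example: one 2.5 kb cluster, of which two windows "survive" liftover.
regions = [("chr1", 0, 2500, "clusterA")]
tiles = tile_regions(regions)
lifted = [("chr1", 120, 1120, "clusterA|0"), ("chr1", 2150, 2650, "clusterA|2")]
summary = summarise_lifted(tiles, lifted)
print(summary)
```

In practice the tiles would be written out as a BED file, run through liftOver or CrossMap, and the mapped output read back in as `lifted`. Clusters whose windows mostly map to one chromosome in a consistent order are probably conserved, while clusters with few or scattered windows are the ones where the strains genuinely differ.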