Question: Can I use female methylation data annotated with hg38?
hansong798 wrote (19 months ago):

I downloaded TCGA breast cancer methylation data from 91 female individuals and noticed something odd: the data, which are annotated with 'hg38', include gene symbols from the Y chromosome.

So I searched for how to handle this and found a suggestion to use the 'hg38 canonical female' reference instead. The difference between hg38 and hg38 canonical female is described as follows:

(1) hg38 contains all chromosomes as well as all unplaced contigs.

(2) hg38 canonical female contains everything from the canonical set with the exception of chromosome Y.

So, is using that reference the same as removing the Y chromosome from the data annotated with hg38?


An idea: check the genomic locations of these probes. They may lie in the pseudo-autosomal regions (PARs), where chrX is homologous to chrY. Unless you are specifically interested in the sex chromosome probes, you could just remove these from your analysis from the start, stating this in your methods, of course.
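The check suggested here could be sketched roughly as follows. This is only an illustration: the probe records are hypothetical, the function names are made up for this example, and the chrY PAR coordinates are the commonly quoted GRCh38 values, which should be verified against the assembly documentation before use.

```python
# Commonly quoted GRCh38 chrY pseudo-autosomal regions (1-based, inclusive).
# Assumption: verify these coordinates against the assembly documentation.
PAR_CHRY_HG38 = {
    "PAR1": (10_001, 2_781_479),
    "PAR2": (56_887_903, 57_217_415),
}


def par_label(chrom, pos):
    """Return 'PAR1' or 'PAR2' if a chrY position lies in a PAR, else None."""
    if chrom != "chrY":
        return None
    for name, (start, end) in PAR_CHRY_HG38.items():
        if start <= pos <= end:
            return name
    return None


def split_probes(probes):
    """Split (probe_id, chrom, pos) records into three lists of probe IDs:
    non-chrY probes, chrY probes inside a PAR, chrY probes outside the PARs."""
    kept, in_par, outside_par = [], [], []
    for probe_id, chrom, pos in probes:
        if chrom != "chrY":
            kept.append(probe_id)
        elif par_label(chrom, pos) is not None:
            in_par.append(probe_id)
        else:
            outside_par.append(probe_id)
    return kept, in_par, outside_par


# Hypothetical probe records, for illustration only (not real array probes).
example = [
    ("probe_a", "chr1", 1_000_000),
    ("probe_b", "chrY", 1_500_000),   # inside PAR1
    ("probe_c", "chrY", 20_000_000),  # outside both PARs
]
kept, in_par, outside_par = split_probes(example)
print("kept:", kept)
print("chrY in PAR:", in_par)
print("chrY outside PAR:", outside_par)
```

Here `kept` is the probe set that remains after dropping everything annotated to chrY, while the other two lists show whether the dropped probes could be explained by X/Y homology in the PARs.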

written 18 months ago by Kevin Blighe

Thank you for your comment. I checked whether the chrY positions fall within the PARs; unfortunately, they do not.

written 18 months ago by hansong798
JC wrote (18 months ago):

There is no easy answer. In general it is fine to simply remove the reads from chrY, but there are some considerations that depend on the aligner used:

  • Check how many reads are aligned to chrY. If there are only a few (<1%?), it's fine to remove them.
  • If a read is mapped to chrY as its primary hit, check whether the same read is also reported with a secondary hit elsewhere (many aligners report only the primary alignment and ignore secondary ones unless certain parameters are set).
  • If you have only the primary hit, you can try aligning the read to the genome again to check where it comes from.
  • Otherwise, if a secondary hit is reported, you may want to adjust the alignment flags, changing that alignment from secondary to primary.
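The first two checks above could be sketched as below, assuming read-level alignments are available as SAM text (for example, the output of `samtools view -h`). The function name and the example records are hypothetical. Paired-end mates share a read name here, so treat the result as a rough guide only.

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment
UNMAPPED = 0x4         # SAM flag bit: read unmapped


def summarize_chry(sam_lines, chrom="chrY"):
    """Return (fraction, names): the fraction of primary mapped alignments
    on `chrom`, and the sorted read names whose primary hit is on `chrom`
    but which also have a secondary hit on another reference sequence."""
    n_primary = 0
    n_primary_on_chrom = 0
    primary_on_chrom = set()
    secondary_elsewhere = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & UNMAPPED or flag & SUPPLEMENTARY:
            continue
        if flag & SECONDARY:
            if rname != chrom:
                secondary_elsewhere.add(name)
        else:
            n_primary += 1
            if rname == chrom:
                n_primary_on_chrom += 1
                primary_on_chrom.add(name)
    fraction = n_primary_on_chrom / n_primary if n_primary else 0.0
    return fraction, sorted(primary_on_chrom & secondary_elsewhere)


# Hypothetical SAM records, for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchrY\t5000\t3\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t256\tchrX\t7000\t3\t4M\t*\t0\t0\t*\t*",   # secondary hit on chrX
    "r3\t0\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",       # unmapped
]
fraction, ambiguous = summarize_chry(example)
print(f"primary alignments on chrY: {fraction:.1%}")
print("chrY primaries with a secondary hit elsewhere:", ambiguous)
```

A small fraction supports simply dropping the chrY reads. Reads listed in `ambiguous` are the ones for which re-alignment or flag adjustment might be worth considering.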

Thanks for your help. Then, I will check their read counts!

written 18 months ago by hansong798