Hello:
I am using the Table Browser at https://genome.ucsc.edu/cgi-bin/hgTables with the following settings:
group ~Genes and Gene Predictions
region ~defined regions
For this part I uploaded a BED file. The first column is the chromosome, the second column is the start of the sequence, the third column is the end of the sequence, and the last column is the number of times the sequence appears in my file. For that reason, the fourth column is very important to me.
Output format ~selected fields from primary and related tables ~~~>get output.
~name ~chrom ~geneSymbol ~refseq ~description ~kgXref ~refSeqSummary ~~~>get output.
At the end of the process, genome.ucsc.edu gives me the chromosomes and the genes that correspond to the coordinates in the BED file, but the fourth column does not appear next to the resulting genes. Some coordinates have no annotation in the genome browser, so they are dropped from the output, and I cannot tell which genes correspond to which value in the fourth column.
I would like to know if there is a way to get the resulting genes together with the numbers from the fourth column of my BED file.
I would appreciate it if someone could help me. ####
Thank you.
Alex
You may have realized that using "~" and "####" in your post has made its display a little strange. BioStar's editor is probably trying to interpret some of these characters as code. Consider using other formatting tools (quotes, bold, italics etc) to improve the display.
If I understand your question correctly, what you are looking for is a "union" of those two BED files, so that you end up with a file that looks like this:
chr --> chr start --> chr end --> Gene name --> # of times present
Is that correct?
Hello genomemax2:
Thank you for answering me. I will try to reformulate my question:
I have a BED file with these columns:
Chromosome, Chrom start, Chrom end, # of times that the sequence appears (sequencing depth)
chr16 12745162 12745366 72
chr5 73280404 73280517 72
chr10 103823794 103823884 74
chr15 26882981 26882984 75
chr22 43955226 43955283 76
chr4 83113354 83113424 76
With the Table Browser at https://genome.ucsc.edu/cgi-bin/hgTables I got the gene names for the coordinates, but the "# of times that the sequence appears (sequencing depth)" column disappears, as in this example:
hg38.knownGene.chrom hg38.kgXref.geneSymbol
chr16 CPPED1
chr5 RP11-60A8.1
chr10 SH3PXD2A
chr15 GABRB3
chr15 GABRA5
chr22 PNPLA3
Some coordinates have no annotation in the genome browser, so they are dropped from the output, and I cannot tell which genes correspond to which value in the fourth column.
What I want is to get the gene names for the coordinates and, next to the gene names, the "# of times that the sequence appears (sequencing depth)", as in the following example:
hg38.knownGene.chrom hg38.kgXref.geneSymbol "# of times that the sequence appears (sequencing depth)"
chr16 CPPED1 72
chr5 RP11-60A8.1 72
chr10 SH3PXD2A 74
chr15 GABRB3 75
chr15 GABRA5 ?
chr22 PNPLA3 76
If someone knows a way of doing this through https://genome.ucsc.edu or any other bioinformatics tool, I would appreciate it.
I hope this is clear.
Alex
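One way to get the output described above is to do the overlap yourself, outside the Table Browser, so that the fourth column travels with each region. The Python sketch below is only an illustration under stated assumptions: it assumes you have exported a gene table with chromosome, start, end, and gene symbol, and the gene coordinates in it are placeholders made up for the example, not real annotations. The function name `annotate_with_depth` is hypothetical. Regions with no overlapping gene are kept with `None` as the gene name, so their depth value is not lost. (If you prefer a command-line tool, `bedtools intersect` with `-wa -wb`, or `-loj` to keep regions without hits, should give a similar result.)

```python
from collections import defaultdict


def annotate_with_depth(regions, genes):
    """Attach gene symbols to BED regions while keeping the depth column.

    regions: list of (chrom, start, end, depth) tuples from the BED file.
    genes:   list of (chrom, start, end, symbol) tuples from a gene table.
    Returns a list of (chrom, start, end, symbol_or_None, depth) tuples,
    one row per region/gene overlap, or one row with None if no gene overlaps.
    """
    # Group genes by chromosome so each region is only compared
    # against genes on the same chromosome.
    genes_by_chrom = defaultdict(list)
    for chrom, g_start, g_end, symbol in genes:
        genes_by_chrom[chrom].append((g_start, g_end, symbol))

    annotated = []
    for chrom, start, end, depth in regions:
        # BED intervals are 0-based and half-open, so two intervals
        # overlap when each one starts before the other ends.
        hits = [
            symbol
            for g_start, g_end, symbol in genes_by_chrom.get(chrom, [])
            if g_start < end and start < g_end
        ]
        if not hits:
            annotated.append((chrom, start, end, None, depth))
        for symbol in hits:
            annotated.append((chrom, start, end, symbol, depth))
    return annotated


# Three of the regions from the post: (chrom, start, end, depth).
regions = [
    ("chr16", 12745162, 12745366, 72),
    ("chr15", 26882981, 26882984, 75),
    ("chr4", 83113354, 83113424, 76),
]

# PLACEHOLDER gene coordinates, invented for illustration only.
# Replace them with the real table exported from your annotation source.
genes = [
    ("chr16", 12700000, 12800000, "CPPED1"),
    ("chr15", 26800000, 26900000, "GABRB3"),
    ("chr15", 26880000, 26950000, "GABRA5"),
]

result = annotate_with_depth(regions, genes)
for row in result:
    print("\t".join(str(field) for field in row))
```

With this approach a region that overlaps two genes produces two rows carrying the same depth, which would fill in the "?" in the example above, and a region with no gene still appears in the output.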