8.8 years ago
bioguy24
I am trying to add the gene_name to a BED file, but I think the method I used below reports the gene_id instead. Is it possible to get the gene_name, and maybe the exon_number as well? Thank you.
1. wget ftp://ftp.sanger.ac.uk/pub/gencode/release_21/gencode.v21.annotation.gtf.gz
2.
gunzip --stdout gencode.v21.annotation.gtf.gz \
| gtf2bed - \
| grep "exon" \
> gencode.exons.bed
3. bedmap --echo --echo-map-id-uniq epilepsy70_medex_edit.bed gencode.exons.bed > output.bed
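A side note on step 2: grep "exon" keeps every line that contains the text exon anywhere, so records for other feature types (for example CDS) that carry exon_number or exon_id attributes would also pass the filter. A minimal sketch of a stricter filter follows, assuming the feature type sits in column 8 of the tab-separated gtf2bed output, as it does in the example record. The file names and the two records are hypothetical stand-ins, not real annotation data.

```shell
# Hypothetical stand-in for gtf2bed output: one exon record and one CDS record.
printf 'chr1\t100\t200\tG1\t.\t+\tHAVANA\texon\t.\tgene_id "G1"; exon_number 1;\n'  > sample.all.bed
printf 'chr1\t120\t180\tG1\t.\t+\tHAVANA\tCDS\t0\tgene_id "G1"; exon_number 1;\n' >> sample.all.bed

# Keep only records whose feature column (column 8) is exactly "exon".
awk -F'\t' '$8 == "exon"' sample.all.bed > sample.exons.only.bed
```

In the pipeline from step 2, the same awk filter could take the place of the grep stage.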
gencode.exons.bed
chr1 11868 12227 ENSG00000223972.5 . + HAVANA exon . gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "DDX11L1-002"; exon_number 1; exon_id "ENSE00002234944.1"; level 2; tag "basic"; transcript_support_level "1"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
output.bed (as of now)
chr1 43408884 43409004|
chr1 43424253 43424373|ENSG00000198198.11
chr1 154540492 154540612|
chr1 154541927 154542093|ENSG00000163239.10
epilepsy70_medex_edit.bed
chr1 40539722 40539865
chr1 40542489 40542609
chr1 40544221 40544341
chr1 40546054 40546174
chr1 40555071 40555194
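Since --echo-map-id-uniq reports the ID column (column 4) of the map file, one possible approach is to rewrite that column before running bedmap. The sketch below assumes the attribute string is in column 10 of the tab-separated gtf2bed output, as in the example record above. The sample file names and the NAME_exonN label format are illustrative choices, not part of BEDOPS.

```shell
# Hypothetical one-record stand-in for gencode.exons.bed (tab-separated).
printf 'chr1\t11868\t12227\tENSG00000223972.5\t.\t+\tHAVANA\texon\t.\tgene_id "ENSG00000223972.5"; gene_name "DDX11L1"; exon_number 1;\n' > sample.exons.bed

# Replace column 4 (gene_id) with gene_name plus exon_number parsed from column 10.
awk 'BEGIN { FS = OFS = "\t" }
{
    name = ""; exon = ""
    n = split($10, attrs, ";")              # one attribute per array element
    for (i = 1; i <= n; i++) {
        a = attrs[i]
        sub(/^ +/, "", a)                   # trim leading spaces
        if (a ~ /^gene_name /)   { name = a; sub(/^gene_name /, "", name);   gsub(/"/, "", name) }
        if (a ~ /^exon_number /) { exon = a; sub(/^exon_number /, "", exon); gsub(/"/, "", exon) }
    }
    if (name != "") {                       # leave the gene_id in place if no gene_name is found
        id = name
        if (exon != "") id = id "_exon" exon
        $4 = id
    }
    print
}' sample.exons.bed > sample.exons.named.bed
```

Applied to gencode.exons.bed, the rewritten file could then be passed to the same bedmap command in place of the original map file. Only column 4 changes, so the coordinate sort order that bedmap relies on is preserved.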