How to annotate a BAM file using an existing GTF annotation
5.0 years ago
Alexandr • 0

Hi! I am stuck at the stage where I have a sorted BAM file, a VCF file with variants, and a GTF annotation file downloaded from Ensembl. Is it possible to use bedtools (or would other programs be better?) to annotate my BAM file using a ready-made genome annotation from Ensembl (GTF format)? The goal is to find out which genes have been sequenced. Later, the task is to find out whether there are substitutions, deletions, premature stop codons, etc. Thanks

snp • 2.3k views
5.0 years ago

What do you mean annotate your BAM file? BAM files are not human readable. What sort of data is this? Exome or whole-genome? If you want per-gene coverage, GATK has a decent guide on how to do this. If you just need to know where the variants are and their potential impact on gene products, you need to annotate the VCF file with something like VEP.
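To illustrate what "where the variants are" means in terms of the GTF, here is a minimal Python sketch that only reports which annotated genes contain a variant position. It does not predict consequences such as stop codons (that is what VEP is for). The function names are hypothetical, and the sketch assumes the GTF has `gene` feature lines with a `gene_id` attribute (as Ensembl GTFs do) and that chromosome names match between the two files (for example `1` in both, not `1` vs. `chr1`).

```python
import re
from collections import defaultdict


def parse_gtf_genes(gtf_lines):
    """Collect (start, end, gene_id) per chromosome from GTF 'gene' features."""
    genes = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only gene-level records
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match:
            # GTF coordinates are 1-based and inclusive.
            genes[fields[0]].append((int(fields[3]), int(fields[4]), match.group(1)))
    return genes


def genes_with_variants(gtf_lines, vcf_lines):
    """Map gene_id -> list of (chrom, pos, ref, alt) for variants inside that gene."""
    genes = parse_gtf_genes(gtf_lines)
    hits = defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip VCF meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])  # VCF POS is 1-based
        # Simple linear scan; fine for a sketch, slow for very large annotations.
        for start, end, gene_id in genes.get(chrom, []):
            if start <= pos <= end:
                hits[gene_id].append((chrom, pos, fields[3], fields[4]))
    return dict(hits)
```

With uncompressed files this could be called as `genes_with_variants(open("annotation.gtf"), open("variants.vcf"))`, where both file names are placeholders.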


Thanks for the answer! These are whole-genome sequencing data (as stated by the authors in NCBI), but they do not quite cover the entire genome and were originally used by the authors for other purposes. (The data are not from human.)

I know about GATK and VEP, but I have technical difficulties with how to actually start working with my files. So far I have only used: fastqc - trimmomatic - bowtie2 - samtools/bcftools - vcftools. As a result, I have a sorted file with aligned and deduplicated reads (BAM), as well as a file with variants (VCF).

So I wanted to know how to associate my files (BAM and VCF) with the Ensembl file (GTF). Or can GATK do this? Are there example commands? First, I want to find out which genes are covered by reads. Second, I want to know which of those genes have variants.

Otherwise, could you suggest how to proceed from here? Sorry if the question seems very general. Thanks.


Then, as the answer says, GATK and VEP can do what you need. GATK can tell you per-gene coverage, which you can use to determine which genes may not have the necessary coverage to identify low-frequency SNPs. VEP can annotate the SNPs from your VCF file to tell you potential consequences on gene products.
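To make "per-gene coverage" concrete, here is a rough Python sketch of the idea, not GATK's method. It assumes per-base depth in the three-column, tab-separated form that `samtools depth` prints for a single BAM (chromosome, 1-based position, depth) and a hypothetical `genes` dictionary mapping each chromosome to `(start, end, gene_id)` tuples taken from the GTF `gene` lines. It returns the fraction of each gene's bases covered at or above a chosen depth.

```python
from collections import defaultdict


def gene_breadth(genes, depth_lines, min_depth=1):
    """Return gene_id -> fraction of the gene's bases with depth >= min_depth.

    genes: dict of chrom -> list of (start, end, gene_id), 1-based inclusive.
    depth_lines: iterable of "chrom<TAB>pos<TAB>depth" lines.
    """
    covered = defaultdict(int)
    for line in depth_lines:
        if not line.strip():
            continue  # skip empty lines
        chrom, pos, depth = line.rstrip("\n").split("\t")[:3]
        pos, depth = int(pos), int(depth)
        if depth < min_depth:
            continue  # position not covered well enough to count
        # Linear scan over the genes on this chromosome; fine for a sketch.
        for start, end, gene_id in genes.get(chrom, []):
            if start <= pos <= end:
                covered[gene_id] += 1
    # Genes with no covered positions are reported with a fraction of 0.0.
    return {
        gene_id: covered[gene_id] / (end - start + 1)
        for intervals in genes.values()
        for start, end, gene_id in intervals
    }
```

Genes with a fraction near 0 were effectively not sequenced, and raising `min_depth` gives a stricter view of which genes have enough reads for variant calling. The fraction is computed over the whole gene span, introns included.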
