【tophat】 RNA-seq mapping reference genome
2
0
Entering edit mode
9.3 years ago
bio_zhangxl ▴ 10

Mapping the RNA-seq reads to reference genome:

tophat --GTF Mus_musculus/UCSC/mm9/Annotation/Genes/genes.gtf Mus_musculus/UCSC/mm9/Sequence/Bowtie2Index/genome SRR1508783_1.fastq SRR1508783_2.fastq

There are only three file in the tophat_out:

logs  prep_reads.info  tmp

I do not know where are the accepted_hits.bam ,junctions.bed, accepted_hits.bam files.

or I use tophat in the wrong way.

I want to do it again, and try to get the parameter -r. But I can not find the fragments length.

Just know this is a pair-end data(101bp,99bp).

Tophat • 4.4k views
ADD COMMENT
0
Entering edit mode

Hello I am experiencing the same issue as the OP. I have increased the memory as you have suggested. My command line looks like this:

qsub -cwd \
  -V \
  -l num_proc=8,h_data=32G,h_rt=24:00:00 \
  -N day0_tophat \
  -b y \
  "module load tophat;
   module load bowtie2;
   module load samtools;
   tophat -g 1 -p 6 -o tophat_output0 -G /path/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf /path/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome day0.fastq"

Unfortunately I am still experiencing what the OP was experiencing in terms of output files and lacking accepted_hits.bam files. There are no errors. It is still running and these are my current output results:

genes.1.bt2
genes.2.bt2
genes.3.bt2
genes.4.bt2
genes.bwt.samheader.sam
genes.fa
genes.fa.tlst
genes.juncs
genes.rev.1.bt2
genes.rev.2.bt2
genes.ver
genome_genome.bwt.samheader.sam
left_kept_reads.bam
left_kept_reads.bam.index
left_kept_reads.m2g.bam
left_kept_reads.m2g.bam.index
left_kept_reads.m2g_um.bam
left_kept_reads.m2g_um.bam.index
left_kept_reads.m2g_um.mapped.bam
left_kept_reads.m2g_um.mapped.bam.index
left_kept_reads.m2g_um_seg1.bam
left_kept_reads.m2g_um_seg1.bam.index
left_kept_reads.m2g_um_seg1.fq.z
left_kept_reads.m2g_um_seg1_unmapped.bam
left_kept_reads.m2g_um_seg1_unmapped.bam.index
left_kept_reads.m2g_um_seg2.bam
left_kept_reads.m2g_um_seg2.bam.index
left_kept_reads.m2g_um_seg2.fq.z
left_kept_reads.m2g_um_seg2_unmapped.bam
left_kept_reads.m2g_um_seg2_unmapped.bam.index
left_kept_reads.m2g_um_unmapped.bam
left_kept_reads.m2g_um_unmapped.bam.index
segment.deletions
segment.fusions
segment.insertions
segment.juncs
temp.samheader.sam

What else can I do?

-EDIT Please disregard I found the problem

ADD REPLY
1
Entering edit mode
9.3 years ago
bio_zhangxl ▴ 10

Now I know the reason. The memory is not enough,and the file can not be written into the folder.

ADD COMMENT
0
Entering edit mode
9.3 years ago

Use tophat2 with following command and post if you see any error/warnings

tophat2 -o tophat_out -p <int> -G genes.gtf genome_index sample1_R1.fq sample1_R2.fq

Regarding the insert size, you can leave it to defaults, but you can get it from the kit they have used to prepare libraries.

ADD COMMENT
0
Entering edit mode

Thank you so much.

Can you tell me why the first time I get no useful results?

Why did not generate accepted_hits.bam junctions.bed accepted_hits.bam files.

The parameter -o defaults in the tophat_out ,but there are only (logs, prep_reads.info, tmp)

And -r default value=50, seem not precise

What is the function of GTF file? If I want get small RNA (from 25bp to 30bp), not genes, is it necessary to use it?

ADD REPLY
0
Entering edit mode

What was the error you faced ?

GTF file will be used for constructing transcriptome. Read the top hat paper for detailed steps.

ADD REPLY
0
Entering edit mode

I get these files in the tmp folder,but there are not accepted_hits.bam junctions.bed file. There are no errors, I do not know why.

left_kept_reads.m2g_um_seg3.bam
right_kept_reads.bam
right_kept_reads.m2g.bam
left_kept_reads.m2g_um.bam
left_kept_reads.bam
left_kept_reads.m2g_um_seg2.bam
left_kept_reads.m2g_um_seg1.bam
right_kept_reads.m2g_um.bam
right_kept_reads.m2g_um.mapped.bam
left_kept_reads.m2g_um.mapped.bam
left_kept_reads.m2g_um_seg4.bam
genes.1.bt2
genes.2.bt2
genes.3.bt2
genes.4.bt2
genes.bwt.samheader.sam
genes.fa
genes.fa.tlst
genes.juncs
genes.rev.1.bt2
genes.rev.2.bt2
genes.ver
genome_genome.bwt.samheader.sam
left_kept_reads.bam.index
left_kept_reads.m2g.bam
left_kept_reads.m2g.bam.index
left_kept_reads.m2g_um.bam.index
left_kept_reads.m2g_um.mapped.bam.index
left_kept_reads.m2g_um_seg1.bam.index
left_kept_reads.m2g_um_seg1.fq.z
left_kept_reads.m2g_um_seg1_unmapped.bam
left_kept_reads.m2g_um_seg1_unmapped.bam.index
left_kept_reads.m2g_um_seg2.bam.index
left_kept_reads.m2g_um_seg2.fq.z
left_kept_reads.m2g_um_seg2_unmapped.bam
left_kept_reads.m2g_um_seg2_unmapped.bam.index
left_kept_reads.m2g_um_seg3.bam.index
left_kept_reads.m2g_um_seg3.fq.z
left_kept_reads.m2g_um_seg3_unmapped.bam
left_kept_reads.m2g_um_seg3_unmapped.bam.index
left_kept_reads.m2g_um_seg4.bam.index
left_kept_reads.m2g_um_seg4.fq.z
left_kept_reads.m2g_um_seg4_unmapped.bam
left_kept_reads.m2g_um_seg4_unmapped.bam.index
left_kept_reads.m2g_um_unmapped.bam
left_kept_reads.m2g_um_unmapped.bam.index
right_kept_reads.bam.index
right_kept_reads.m2g.bam.index
right_kept_reads.m2g_um.bam.index
right_kept_reads.m2g_um.mapped.bam.index
right_kept_reads.m2g_um_unmapped.bam
right_kept_reads.m2g_um_unmapped.bam.index
temp.samheader.sam
ADD REPLY
0
Entering edit mode

What does 'you can get it from the kit' mean? Where can get it?

ADD REPLY

Login before adding your answer.

Traffic: 1742 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6