RNA-SeQC error no output
7.4 years ago

I am running RNA-SeQC on RNA-seq data with the command below.

Command:

java -jar ~/SPRING-SUMMER_2014/Softwares/RNA-SeQC_v1.1.7.jar \
    -r Saccer3_genome.fa \
    -o ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/ \
    -s C1W8_8hr_PE_output_soap_BAM_sorted.bam \
    -t ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf

Output:

RNA-SeQC v1.1.7 05/14/12
Creating rRNA Interval List based on given GTF annotations
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics$MetricSample.readInSamples(RNASeqMetrics.java:1369)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.prepareFiles(RNASeqMetrics.java:182)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:165)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:135)
RNA-SeQC Total Runtime:    0 min

There's no output file.


I have the same issue. How did you resolve it?



7.4 years ago

Are your chromosomes named consistently (1, 2, 3, ...) in all reference files?


They're numbered in roman numerals.


Looks like some file is looking for a "1".

What is the output of:

head Saccer3_genome.fa ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf
samtools view -H C1W8_8hr_PE_output_soap_BAM_sorted.bam

This is the output:

head Saccer3_genome.fa ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf

==> Saccer3_genome.fa <==
>chrI
CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACC
CACACACACACATCCTAACACTACCCTAACACAGCCCTAATCTAACCCTG
GCCAACCTGTCTCTCAACTTACCCTCCATTACCCTGCCTCCACTCGTTAC
CCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTT
ACTACCACTCACCCACCGTTACCCTCCAATTACCCATATCCAACCCACTG
CCACTTACCCTACCATTACCCTACCATCCACCATGACCTACTCACCATAC
TGTTCTTCTACCCACCATATTGAAACGCTAACAAATGATCGTAAATAACA
CACACGTGCTTACCCTACCACTTTATACCACCACCACATGCCATACTCAC
CCTCACTTGTATACTGATTTTACGTACGCACACGGATGCTACAGTATATA

==> /home/bioratcliff/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf <==
I    protein_coding    CDS    335    646    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; protein_id "YAL069W"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    exon    335    649    .    +    .    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; seqedit "false"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    start_codon    335    337    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    CDS    538    789    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; protein_id "YAL068W-A"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    exon    538    792    .    +    .    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; seqedit "false"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    start_codon    538    540    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    stop_codon    647    649    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    stop_codon    790    792    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    exon    1807    2169    .    -    .    exon_number "1"; gene_id "YAL068C"; gene_name "PAU8"; p_id "P6023"; seqedit "false"; transcript_id "YAL068C"; transcript_name "PAU8"; tss_id "TSS249";
I    protein_coding    stop_codon    1807    1809    .    -    0    exon_number "1"; gene_id "YAL068C"; gene_name "PAU8"; p_id "P6023"; transcript_id "YAL068C"; transcript_name "PAU8"; tss_id "TSS249";

Output for

samtools view -H C1W8_8hr_PE_output_soap_BAM_sorted.bam

@HD    VN:1.3    SO:coordinate
@SQ    SN:Y55.chr10    LN:770597
@SQ    SN:Y55.chrm    LN:107061
@SQ    SN:Y55.chr01    LN:248261
@SQ    SN:Y55.chr11    LN:686124
@SQ    SN:Y55.scplasm1    LN:7602
@SQ    SN:Y55.chr02    LN:800992
@SQ    SN:Y55.chr12    LN:1067059
@SQ    SN:Y55.chr03    LN:321691
@SQ    SN:Y55.chr13    LN:923317
@SQ    SN:Y55.chr04    LN:1522688
@SQ    SN:Y55.chr14    LN:781629
@SQ    SN:Y55.chr05    LN:577152
@SQ    SN:Y55.chr15    LN:1105914
@SQ    SN:Y55.chr06    LN:273660
@SQ    SN:Y55.chr16    LN:946183
@SQ    SN:Y55.chr07    LN:1113452
@SQ    SN:Y55.chr08    LN:566494
@SQ    SN:Y55.chr09    LN:467776

See the difference?

chrI != I
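
To make the mismatch explicit, the sequence names used by each input can be listed and compared. In the output above, the FASTA uses chrI, the GTF uses I, and the BAM header uses names like Y55.chr01, so none of the three agree; the Y55 prefix may mean the reads were aligned against a different reference. The following is only a minimal sketch, assuming grep, sed, cut, sort, comm and samtools are on the PATH. The function names are made up for illustration.

```shell
# Hypothetical helpers: print the unique sequence names used by each input.

fasta_contigs() {
    # FASTA header lines start with '>'; keep the first word after it.
    grep '^>' "$1" | sed 's/^>//' | cut -d' ' -f1 | sort -u
}

gtf_contigs() {
    # Column 1 of every non-comment line of a tab-delimited GTF.
    grep -v '^#' "$1" | cut -f1 | sort -u
}

bam_contigs() {
    # SN: field of each @SQ line in the BAM header (requires samtools).
    samtools view -H "$1" | grep '^@SQ' | tr '\t' '\n' \
        | grep '^SN:' | sed 's/^SN://' | sort -u
}

missing_from() {
    # Names listed in file $1 that do not appear in file $2.
    # Both files must be sorted, as produced by the helpers above.
    comm -23 "$1" "$2"
}

# Example (file names taken from this thread):
#   fasta_contigs Saccer3_genome.fa  > fasta.names
#   gtf_contigs   Sac_cerevisiae.gtf > gtf.names
#   bam_contigs   C1W8_8hr_PE_output_soap_BAM_sorted.bam > bam.names
#   missing_from gtf.names fasta.names   # GTF names absent from the FASTA
#   missing_from bam.names fasta.names   # BAM names absent from the FASTA
```

If either missing_from call prints anything, the inputs do not share a naming scheme and will need to be renamed or regenerated against the same reference.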

7.1 years ago
Tim Amos

The stack trace shows the failure is in "readInSamples", which suggests the problem is with the -s (sample) argument.

According to http://www.broadinstitute.org/cancer/cga/rnaseqc_run, the correct use of the -s option is:

-s <arg>

Sample File: tab-delimited description of samples and their bams. This file header is:
Sample ID[tab]Bam File[tab]Notes
When running on just one sample, this argument can be a string of the form
"Sample ID|Bam File|Notes", where Bam File is the path to the input file.

i.e.:

-s "C1W8_8hr_PE|C1W8_8hr_PE_output_soap_BAM_sorted.bam|NA"

rather than:

-s C1W8_8hr_PE_output_soap_BAM_sorted.bam

I got this error by accidentally having '\t' in my tab-delimited samples file rather than a literal tab. My mistake was to use:

echo "Sample ID\tBam File\tNotes" > ${OUTDIR}/Samples.txt

rather than including the -e option:

echo -e "Sample ID\tBam File\tNotes" > ${OUTDIR}/Samples.txt
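
Whether plain echo expands \t depends on the shell (bash's builtin does not by default, while some sh implementations do), so printf is a more portable way to write the file. A minimal sketch follows; the function names are hypothetical, and the sample ID and BAM name in the example are simply reused from this thread.

```shell
write_sample_file() {
    # $1 = output path, $2 = sample ID, $3 = BAM path, $4 = notes
    # printf expands \t and \n in its format string in any POSIX shell.
    printf 'Sample ID\tBam File\tNotes\n' > "$1"
    printf '%s\t%s\t%s\n' "$2" "$3" "$4" >> "$1"
}

check_sample_fields() {
    # Report any line that does not split into exactly three
    # tab-separated fields; the exit status is non-zero if one is found.
    awk -F'\t' 'NF != 3 { print "line " NR " has " NF " field(s)"; bad = 1 }
                END { exit bad + 0 }' "$1"
}

# Example:
#   write_sample_file "${OUTDIR}/Samples.txt" \
#       "C1W8_8hr_PE" "C1W8_8hr_PE_output_soap_BAM_sorted.bam" "NA"
#   check_sample_fields "${OUTDIR}/Samples.txt"
```

A header written with a literal backslash-t would show up in the check as a line with one field.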

Hi Timothy,

It appears that I have a very similar (or the same) problem as the person above. I tried all the answers above and I still can't get my RNA-SeQC run to work. Would you be able to elaborate a bit more on the string passed to the -s flag?

My actual BAM file name is cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam, and this is how I'm passing it to RNA-SeQC:

-s "TestID|cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam|NA"

And I get this error:

The required transcript_id attribute was not found on line 1

Here is the full command and its output:

rna-seqc -t HomoSapiensH38.gtf -r HomoSapiensH38.fa -o outDir -s "TestID|cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam|NA"

RNA-SeQC v1.1.8.1 07/11/14
Creating rRNA Interval List based on given GTF annotations
Retriving contig names from reference
         contig names in reference: 194
Loading GTF for Read Counting
The required transcript_id attribute was not found on line 1    havana  gene    11869   14409   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";

Thanks,


I think your command line is OK. The problem is that line 1 of your GTF file, a gene-level record, has no transcript_id attribute. Try using a GTF from GENCODE.
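
If switching annotation is not an option, one possible workaround is to drop the records that lack transcript_id before running the tool. This is only a minimal sketch, assuming a tab-delimited GTF with the attributes in column 9. The function name and the output file name are hypothetical, and this thread does not confirm that RNA-SeQC accepts the filtered file.

```shell
filter_gtf_with_transcript_id() {
    # Keep comment lines, plus every record whose attribute column (9)
    # contains a transcript_id key; gene-level records are dropped.
    awk -F'\t' '/^#/ || $9 ~ /transcript_id "/' "$1"
}

# Example:
#   filter_gtf_with_transcript_id HomoSapiensH38.gtf > HomoSapiensH38.transcripts.gtf
```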

6.5 years ago

I got it all working by reverting to the GRCh37 assembly and its annotation. I sourced my FASTA and GTF files from GENCODE, but I don't think the source mattered in the end, as long as I used the older assembly rather than the latest one. I'm pretty sure RNA-SeQC breaks when the latest assembly, GRCh38, is used, because it was built when GRCh37 was current and the GTF format has changed since then. I raised an issue on GitHub ("RNA-SeQC error no output"), but it doesn't look like they check the page. I wanted to write to them directly, but couldn't find a contact email.

If anyone knows the best contact email for RNA-SeQC, please let me know. And if anyone has any thoughts on RNA-SeQC not working with GRCh38, please comment on the GitHub issue page or here.

Thanks,

Kirill 


I found it does work if you use the previous release (21) of the annotation file. It is not ideal, but it is still GRCh38.

4.2 years ago

I also resolved my issue by making sure my sample file was in the correct format and referred to .bam files that actually exist. According to the input spec:

-s <arg>

Sample File: tab-delimited description of samples and their bams.

This file header is: Sample ID[tab]Bam File[tab]Notes
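
A quick way to catch a sample file that points at missing BAMs is to check each path before launching the run. A minimal sketch, assuming the three-column tab-delimited layout quoted above with one header line; the function name is hypothetical.

```shell
check_bams_exist() {
    # $1 = tab-delimited sample file (Sample ID, Bam File, Notes).
    # Prints each missing BAM to stderr; returns non-zero if any is missing.
    tab=$(printf '\t')
    missing=0
    line_no=0
    while IFS="$tab" read -r sample bam notes || [ -n "$sample" ]; do
        line_no=$((line_no + 1))
        [ "$line_no" -eq 1 ] && continue   # skip the header line
        if [ ! -f "$bam" ]; then
            echo "missing BAM for sample '$sample': $bam" >&2
            missing=1
        fi
    done < "$1"
    return "$missing"
}

# Example:
#   check_bams_exist Samples.txt && echo "all BAM files found"
```

Relative BAM paths are resolved against the directory the check is run from, so it is best run from the same directory as the RNA-SeQC command.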
