Cuffdiff Says SAM Isn't Sorted, Although Cufflinks Handled It
12.0 years ago
Rubellus ▴ 10

Hi all, I'm having an issue with Cufflinks: I can get through cufflinks with a de novo assembly, but cuffdiff halts here:

cuffdiff -o diffout -b annolrubtran.fasta -p 20 -L 0mg,36mg,125mg -u cuffcmp.combined.gtf ./tophat008paired/accepted_hits.sam.sorted ./tophat009paired/accepted_hits.sam.sorted ./tophat010paired/accepted_hits.sam.sorted

You are using Cufflinks v1.3.0, which is the most recent release.
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./tophat008paired/accepted_hits.sam.sorted doesn't appear to be a valid BAM file, trying SAM...
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./tophat009paired/accepted_hits.sam.sorted doesn't appear to be a valid BAM file, trying SAM...
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./tophat010paired/accepted_hits.sam.sorted doesn't appear to be a valid BAM file, trying SAM...
[19:24:38] Loading reference annotation and sequence.
[19:25:40] Inspecting maps and determining fragment length distributions.

Error: this SAM file doesn't appear to be correctly sorted!
current hit is at Contig10010|nucleolin-:26, last one was at Contig1000|---NA---:198
Cufflinks requires that if your file has SQ records in the SAM header that they appear in the same order as the chromosomes names in the alignments. If there are no SQ records in the header, or if the header is missing, the alignments must be sorted lexicographically by chromsome name and by position.
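The ordering rule in that message can be checked before running cuffdiff. Here is a minimal sketch, assuming a plain-text SAM file and a POSIX shell with awk. The function name check_sam_order is hypothetical and is not part of Cufflinks or samtools. It stops at the first alignment whose reference name has not appeared in an @SQ line so far, or which goes backwards relative to the @SQ order or to the previous position.

```shell
#!/bin/sh
# check_sam_order FILE (hypothetical helper): exit 0 if the alignments follow
# the @SQ header order, with ascending positions within each reference;
# exit non-zero and print the first offending record otherwise.
check_sam_order() {
    awk -F '\t' '
        BEGIN { bad = 0; last_rank = 0; last_pos = 0 }
        # Remember the order in which reference names appear in @SQ lines.
        /^@SQ/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) rank[substr($i, 4)] = ++n
            next
        }
        /^@/ { next }          # skip the other header lines
        $3 == "*" { next }     # skip reads with no reference name
        !($3 in rank) {
            printf "%s is not in any @SQ line seen so far\n", $3
            bad = 1; exit
        }
        {
            r = rank[$3]; pos = $4 + 0
            if (r < last_rank || (r == last_rank && pos < last_pos)) {
                printf "out of order at %s:%d\n", $3, pos
                bad = 1; exit
            }
            last_rank = r; last_pos = pos
        }
        END { exit bad }
    ' "$1"
}
```

Calling check_sam_order on a SAM file whose header lines have been sorted into the body also fails, because the alignments then precede the @SQ lines that name their references.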

I previously converted the TopHat output to SAM before sorting:

samtools view -h -o accepted_hits.sam accepted_hits.bam

sort -k 3,3 -k 4,4n accepted_hits.sam > accepted_hits.sam.sorted

This feeds into cufflinks alright, but stops as above when shuttled into cuffdiff. If anyone could offer me a suggestion, I'd thoroughly appreciate it!!

cufflinks transcription gene bowtie tophat • 12k views
12.0 years ago
Vikas Bansal ★ 2.4k

You should sort your bam file (accepted_hits.bam) using samtools. Use

samtools sort accepted_hits.bam accepted_hits_sorted

This will give you an accepted_hits_sorted.bam file in your directory. Cufflinks accepts BAM files, so there is no need to convert it to SAM.

Also, if you have used TopHat, your file is already sorted, so it is ready to use with Cufflinks.

EDIT

If A.bam is the problem file, then

samtools view -H A.bam > header.sam
samtools reheader header.sam A.bam > B.bam

Source


Reheadering didn't work, but I've altered hits.cpp and will get our dude to run make install in the morning. Thanks for your help!

cufflinks -p 20 -o 36mg rehead.bam

[20:23:19] Inspecting reads and determining fragment length distribution.
Processed 0 loci. [*******] 100%
Map Properties:
Total Map Mass: 0.00
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80
[20:23:19] Assembling transcripts and estimating abundances.
Processed 0 loci. [*******] 100%


Did you solve the problem after you altered the hits.cpp file? I get the same error. Thanks.


Hi, I am getting the same error. What should I alter in the hits.cpp file?


When I put the .bam files in, it starts to give me the same hassles as described here: http://biostar.stackexchange.com/questions/14985/struggling-with-cufflinks-into-cuffmerge. Is there a faster workaround that doesn't involve changing hits.cpp? (I don't have permissions and am working off site.)


I just found something here (http://seqanswers.com/forums/showthread.php?t=15751). If you get an error about the BAM header being too big, try reheadering with samtools. See my edit.

12.0 years ago
Swbarnes2 ★ 1.6k

It doesn't think you have a .bam file. That's a bigger problem than the lack of sorting.
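A quick way to see which kind of file is being passed: BAM files are BGZF-compressed and therefore begin with the gzip magic bytes 1f 8b, while a SAM file is plain text. The sketch below assumes a POSIX shell with od. The name has_gzip_magic is hypothetical, and the check only looks at the first two bytes, so it cannot tell a BAM file from any other gzip file.

```shell
#!/bin/sh
# has_gzip_magic FILE (hypothetical helper): succeed if FILE starts with the
# two gzip magic bytes 0x1f 0x8b, as every BGZF-compressed BAM file does.
has_gzip_magic() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')   # first two bytes as hex
    [ "$magic" = "1f8b" ]
}
```

For example, `has_gzip_magic accepted_hits.bam && echo compressed` prints "compressed" for a real BAM file and nothing for a text file such as the .sam.sorted files in the question.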

12.0 years ago
Arun 2.4k

Your problem seems to be in the creation of the BAM file from the SAM file. The best way to convert a SAM file to BAM is to use the Picard tool SamFormatConverter:

java -jar SamFormatConverter.jar I=input.sam O=output.bam

If you want to sort a SAM file by coordinate and convert it to bam all at the same time, you could use MergeSamFiles with just 1 input as:

java -jar MergeSamFiles.jar I=input.sam O=output.bam SO=coordinate AS=false USE_THREADING=true TMP_DIR=<Path_to_TEMP_dir>
3.8 years ago
1152389206 • 0

I also ran into this problem and have solved it. Maybe you can delete the information about 'Contig1000|---NA---', and then it will work.
