Question: Samtools sort error
ARich wrote (4 months ago, United States):

Dear Biostar users,

I have been working with metagenomics data.

In order to perform binning, I need to create a depth file.

  • For this, I first ran bbwrap with the SPAdes contigs as the reference and the samples as input reads:
for i in ${finallist}
  do
    bbwrap.sh \
    ref=spades/contigs.fasta \
    in=${i},${i%1*}u.final.clean.fq \
    in2=${i%1*}2.merged.final.clean.fq,NULL \
    t=24 \
    out=${i%1*}sam.gz \
    covstats=${i%1*}coverage
  done
  • After this, I ran samtools view with the following command:
    samtools view \
    -S \
    -b \
    -u \
    ${i%1*}sam.gz > \
    ${i%1*}bam
  • I then sorted the generated BAM file using samtools sort with the following command:
    samtools sort \
    -m 24G \
    -@ 3 \
    ${i%1*}bam \
    -o  ${i%1*}sorted.bam
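
For readers following the file names in the commands above, here is a minimal sketch of what the `${i%1*}` expansion produces. The input name `sample_R1.final.clean.fq` is a hypothetical example, not one of the files from this post.

```shell
# ${i%1*} strips the shortest trailing match of the pattern "1*",
# i.e. everything from the last "1" in the name to the end.
derive_prefix() {
  i=$1
  printf '%s\n' "${i%1*}"
}

# Hypothetical forward-read file name (an assumption for illustration).
prefix=$(derive_prefix "sample_R1.final.clean.fq")
echo "prefix:  ${prefix}"
echo "reverse: ${prefix}2.merged.final.clean.fq"
echo "sam out: ${prefix}sam.gz"
```

Note that a "1" anywhere later in the file name would change where the name is cut, so it is worth echoing the derived names before running the real commands.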

During sorting I get error messages full of strange characters in the terminal, and the output file is empty at the end.
I tried to solve the problem by checking the headers of my reference and BAM file, and they look the same.
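
One way to narrow down a failure like this is to run the view and sort steps as a single pipeline, so that no intermediate BAM file is involved. The sketch below assumes a samtools 1.x build (which reads gzip-compressed SAM and accepts `-` for standard input); the function names, the default of 4G and the prefix argument are illustrative assumptions. It also shows why `-m 24G` with `-@ 3` can ask for far more memory than intended, since `-m` is a per-thread limit.

```shell
# Sketch: convert and sort in one pipeline (assumes samtools 1.x).
# $1: file prefix, $2: sort threads, $3: memory per thread.
sam_to_sorted_bam() {
  prefix=$1
  threads=${2:-3}
  mem_per_thread=${3:-4G}   # -m applies per thread, not in total
  samtools view -u "${prefix}sam.gz" |
    samtools sort -@ "$threads" -m "$mem_per_thread" -o "${prefix}sorted.bam" -
}

# Rough memory estimate: per-thread limit (in G) times number of threads.
approx_sort_mem_gb() {
  echo $(( $1 * $2 ))
}

# The settings from the post: -m 24G with -@ 3.
echo "approx. peak sort memory: $(approx_sort_mem_gb 24 3)G"
```

The function is only defined here, not called; it would be invoked with the real file prefix, for example `sam_to_sorted_bam "${i%1*}"`.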

Here are the first five lines of my BAM file:

NS500633:12:H7JF2BGXX:1:11101:4115:8221 83      NODE_179_length_41253_cov_14.682023     5385    44      121=    =       5235    -271    GTGGAAAATGAAAAATATATTGATATCCTATGGGAAGGCTATGGCAAGGGGGAAACGGTGACGATTGGCGAAGATAAGGAATTCCTCATGGCAGATGCAAAGTCAAAGGATGGAATGGAAC        FFFF<FFF.FF<F<F7<FFAFAAF<FFF.<.FF<FFFFAFFFFFF<FFF<FFF<FFFFFFFFFFFAF.FAFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFF<FFFFFFFFFFFFFAAAA7    NM:i:0  AM:i:44
NS500633:12:H7JF2BGXX:1:11101:4115:8221 163     NODE_179_length_41253_cov_14.682023     5235    45      148=    =       5385    271 GGATGGAAAAGAGGATGGGGGATAGATATTGCAGGATATAAAAAAATCAACATCCAGGCACAGTACGGCAATTGCGTGGTTATTTTGAAATATCCGGATGACATGACGGAATATGTGGTTGTGGATAAAGGGTTTGGTTTTTCCCTGC  AA<AAFAAFFFFFFFFFFAFFFFAFFFFFFFFAFFAF.F<AFFFAFFFF7FFFFFFF.FAFFFAFF<FFAFF<FFFF<AFA7FFFFF77FF<FF.FFF<<.F.F<.FFAF7F)7FFFFFFFFFAAF.7FFFFFFAF.A777A<7A<A< NM:i:0  AM:i:45
NS500633:12:H7JF2BGXX:1:11101:7872:8238 83      NODE_395_length_25884_cov_15.386465     4487    45      145=    =       4287    -345    CACTAAAACGTCAGGCTTTAGAAAAGGCCAGCCCGGTAAACCAACAGATTCAATATCCGGCCTGGACGGATGATGATGACTTAATGAGCTGGCTCAGCTCATTTGGCAAACGATAATTATGTCAGCAACTCACATAGTCCTCCAA     FFF<FFAFF7FF<FFA7F7AAAAF<AFFFFF..7FFFFFF<F<FFFFFFAFFF.FFFFFF.AFFFFAFFF<F<FFFFFFFFFFFAFFFF.FFFFFFFA<FAFF.FFFFFFFFFFFFFAFFFFFFFFFFFAFFFFAFFFFFAAAAA    NM:i:0  AM:i:45
NS500633:12:H7JF2BGXX:1:11101:7872:8238 163     NODE_395_length_25884_cov_15.386465     4287    43      78=     =       4487    345 GCATATATGAAGGATTACAAACTCGAACAGGCTCCCAGCACACACCCATTGTTCTGCAACCGCAGCGGCGCAAAATTC   AAAAAFAFFFFFFFFFAFFF.FAFFA.FFFFFFFFFFFF.FFFAFFFFFFFFFFFFFF)AAFF7<F<FFFFFFFAFFF       NM:i:0  AM:i:43
NS500633:12:H7JF2BGXX:1:11101:8333:8251 77      *       0       0       *       *       0       0       CCTACACCGTGGAATGTGACTTCATAAGTAGCGGAACCCTTTCTGGAGCGAACTAATTCGCCGCCTGCACGAGCAGCTTCGAATACGAGTGCGCGTTTGGACTTTGCTGCC      AAAFFFFAFFFFFFFFFFFFFFFFF.AFFAFFAFFAFFFFAFFF<FAFFF<FF..<7A<<AFFFFA<F<FF<AFFFFFFFAFFFFFFFFFFAFAFF<FFFFFF7FFAA<FF

And below are the first five header lines from my reference (extracted with grep):

>NODE_1_length_199624_cov_13.770641
>NODE_2_length_193066_cov_15.912808
>NODE_3_length_183817_cov_14.803240
>NODE_4_length_163959_cov_11.179160
>NODE_5_length_128783_cov_13.807804
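
The header comparison described above can also be scripted rather than checked by eye. The following is a minimal sketch that lists reference names used in SAM records but absent from the FASTA headers. The helper names and the inline sample records (including `NODE_9_hypothetical`) are assumptions for illustration; on real data the two helpers would read from `spades/contigs.fasta` and from `samtools view` output.

```shell
# Names from FASTA headers: drop ">" and anything after the first whitespace.
fasta_names() { grep '^>' | sed 's/^>//; s/[[:space:]].*$//' | sort -u; }

# Reference names (column 3) from SAM records, skipping header lines
# and unmapped reads (RNAME "*").
sam_refs() { awk '$1 !~ /^@/ && $3 != "*" {print $3}' | sort -u; }

tmp=$(mktemp -d)

# Tiny inline samples standing in for the real files.
printf '>NODE_1_length_199624_cov_13.770641\nACGT\n>NODE_2_length_193066_cov_15.912808\nACGT\n' |
  fasta_names > "$tmp/fa.txt"
printf 'r1\t83\tNODE_1_length_199624_cov_13.770641\t10\t44\t4=\nr2\t77\t*\t0\t0\t*\nr3\t83\tNODE_9_hypothetical\t5\t44\t4=\n' |
  sam_refs > "$tmp/sam.txt"

# comm -13 keeps lines that appear only in the second (SAM) list.
missing=$(comm -13 "$tmp/fa.txt" "$tmp/sam.txt")
echo "references in SAM but not in FASTA: ${missing:-none}"

rm -rf "$tmp"
```

An empty result on real data would suggest the reference names are consistent and the problem lies elsewhere, as turned out to be the case in this thread.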

I really cannot work out where the problem is coming from. It would be great if someone could help with this.

Thanks in advance!
Cheers,
AR


step 2: is it really ? ...}sam.gz) > \ ${AN...

written 4 months ago by Pierre Lindenbaum

Thanks, Pierre, for the reply. Yes, it is sam.gz. I also tried with the uncompressed version, and I still got the error message during sorting.

written 4 months ago by ARich

no, I meant that redirection > \

written 4 months ago by Pierre Lindenbaum

Yes, it's there. Isn't that correct?

written 4 months ago by ARich

ARich wrote (4 months ago, United States):

The issue is resolved. I had installed samtools through conda, and that installation did not include HTSlib, which was the cause of the problem. I have now installed samtools locally, and it runs normally.

Thank you for helping me think beyond the script.
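
For anyone hitting the same problem, a quick check of which samtools binary is being picked up, and which HTSlib it reports, may help. This is a minimal sketch: `check_tool` is a hypothetical helper name, and the version check assumes a samtools 1.x build, where `samtools --version` prints the samtools version followed by the HTSlib version in use.

```shell
# Report whether a command is on PATH and where it resolves to.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at: $(command -v "$1")"
  else
    echo "$1 not found on PATH"
    return 1
  fi
}

if check_tool samtools; then
  # First two lines: samtools version and the HTSlib version it uses.
  samtools --version | head -n 2
fi
```

If the path points into a conda environment, comparing it with the locally installed binary shows which one the shell actually runs. On Linux, `ldd "$(command -v samtools)"` can additionally show whether any shared libraries fail to resolve.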

Cheers, AR
