Has anyone used Socrates?
9.4 years ago
mangfu100 ▴ 800

Hi.

I am struggling to use Socrates.

I read its web page and followed the procedure described there.

However, when I ran the second step, the "realignment process", the program showed a warning message, and I suspect this warning might affect my final results.

The command I ran and the warning message are below.

./Socrates realignment filteredresults_long_sc_l25_q5_m5_i95.fastq.gz out3.bam --bowtie2_db filteredresults_long_sc_l25_q5_m5_i95.fasta

Bowtie2 alignment, DB=filteredresults_long_sc_l25_q5_m5_i95.fasta
115 reads; of these:
  115 (100.00%) were unpaired; of these:
    11 (9.57%) aligned 0 times
    83 (72.17%) aligned exactly 1 time
    21 (18.26%) aligned >1 times
90.43% overall alignment rate

Add anchor information into re-alignment BAM file
  input BAM file:       -
  output BAM file:      out3.bam
WARNING: 115 alignments with no matching reference chromosome name
in realignment BAM's sequence dictionary. Please ensure the same alignment indexes
are used for both raw read alignment and soft-clip realignment.

I believe Socrates was developed in 2014 and that not many people use it. Still, is there anyone who has used it and obtained final results?

Could you help and advise me on this problem?

alignment next-gen-sequencing • 2.2k views

SAM/BAM files have what's called a header. This header contains, among other things, the names of all the contigs in the FASTA file that was used as the reference for the alignment. Below is a (very short) example of a FASTA file. The contig name is on the first line, after the '>':

>reference_contig
ACGTCGATCGTAGCTAGCTCGAACAGCTAGCTACGATCGTACGTCGATCGATCGATCGATCGTACGATCCACCAGTA

You can view the header of a BAM file using samtools:

samtools view -H [my.bam]

I suspect the problem here is a disagreement between the FASTA reference and one or more of the output files. For starters, can you please tell us the output of the above command when run on each of the output BAM files, and the output of this command:

grep '^>' [reference.fasta]

It might also be helpful to add the results of this command:

samtools view -h [example.bam] | head

so we can see the first few lines of the BAM file.
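
The comparison described above can also be done directly in the shell. The following is only a minimal sketch: the function names `sq_names` and `fasta_names` and the file names in the usage comments are hypothetical, and it assumes the SAM header is tab-separated text as printed by `samtools view -H`.

```shell
#!/bin/sh
# Print the sequence names (SN: tags) found on @SQ lines of a SAM header
# read from stdin, sorted and de-duplicated.
sq_names() {
  awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) print substr($i, 4)
  }' | sort -u
}

# Print the sequence names of a FASTA file read from stdin:
# the text after ">" up to the first whitespace, sorted and de-duplicated.
fasta_names() {
  awk '/^>/ { sub(/^>/, ""); print $1 }' | sort -u
}

# Usage (hypothetical file names):
#   samtools view -H my.bam | sq_names > bam_names.txt
#   fasta_names < reference.fasta > fasta_names.txt
#   comm -3 bam_names.txt fasta_names.txt   # names present in only one of the two
```

If `comm -3` prints nothing, the BAM header and the FASTA file agree on the sequence names; any line it prints is a name that appears in only one of them.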


First, here is the command I used to run Socrates.

./Socrates realignment filteredresults_long_sc_l25_q5_m5_i95.fastq.gz out3.bam --bowtie2_db filteredresults_long_sc_l25_q5_m5_i95.fasta

input file: filteredresults_long_sc_l25_q5_m5_i95.fastq.gz

output file: out3.bam

I also specified the --bowtie2_db option. To prepare its input, I first converted filteredresults_long_sc_l25_q5_m5_i95.fastq.gz to filteredresults_long_sc_l25_q5_m5_i95.fasta using

gunzip -c  filteredresults_long_sc_l25_q5_m5_i95.fastq.gz | \
  awk '
    {
      if(NR%4==1) {
        printf(">%s\n",substr($0,2));
      }
      else if(NR%4==2)
        print;
    }
  ' > filteredresults_long_sc_l25_q5_m5_i95.fasta

and, second, I built the index databases with bowtie2-build.

I then ran Socrates as shown above. Below are the first few lines of my input and output files, as you requested.

1. grep '^>' filteredresults_long_sc_l25_q5_m5_i95.fasta

>DJG84KN1:246:C0VTVACXX:8:2216:14706:73242/2&chr14&19486038&-&1&TCAAAGCCTTGATCTCTTCTTTTTCAGGTACCTGATGCCT
>DJG84KN1:246:C0VTVACXX:8:1212:19548:4085/2&chr14&19489007&-&0&ATATTTGCTGACAGGTGTATCTGCGTTTCCCTCTGAGGACATCACTGAAATCATGGAACCCTTAT
>DJG84KN1:246:C0VTVACXX:8:1309:21210:96438/2&chr14&19489007&-&0&ATATTTNCNGACAGGTGTATCTGCGTTTCCCTCTGAGGACATCACTGAAA
>DJG84KN1:246:C0VTVACXX:8:1112:21154:15113/2&chr14&19489041&-&0&GAGGACATCACTGAAATCATGGAANCCTTATGTTCACTGACCG
>DJG84KN1:246:C0VTVACXX:8:1302:21219:46823/2&chr14&19490615&-&0&CGTGAAAGATTACCAAAGAGAANGNCATATATTTAGCAGATCAGTTAATAGACAAAAGTCAACTTT

2. /DATA2/sclee1/samtools-0.1.19/samtools view -h out3.bam

@HD     VN:1.0  SO:coordinate
@SQ     SN:DJG84KN1:246:C0VTVACXX:8:2216:14706:73242/2&chr14&19486038&-&1&TCAAAGCCTTGATCTCTTCTTTTTCAGGTACCTGATGCCT     LN:53
@SQ     SN:DJG84KN1:246:C0VTVACXX:8:1212:19548:4085/2&chr14&19489007&-&0&ATATTTGCTGACAGGTGTATCTGCGTTTCCCTCTGAGGACATCACTGAAATCATGGAACCCTTAT      LN:28
@SQ     SN:DJG84KN1:246:C0VTVACXX:8:1309:21210:96438/2&chr14&19489007&-&0&ATATTTNCNGACAGGTGTATCTGCGTTTCCCTCTGAGGACATCACTGAAA    LN:43
@SQ     SN:DJG84KN1:246:C0VTVACXX:8:1112:21154:15113/2&chr14&19489041&-&0&GAGGACATCACTGAAATCATGGAANCCTTATGTTCACTGACCG  LN:50
@SQ     SN:DJG84KN1:246:C0VTVACXX:8:1302:21219:46823/2&chr14&19490615&-&0&CGTGAAAGATTACCAAAGAGAANGNCATATATTTAGCAGATCAGTTAATAGACAAAAGTCAACTTT    LN:27

Finally, one thing I suspect: the input BAM file was made with the bwa aligner, whereas the output BAM file is produced with bowtie2. Could this mismatch be the cause of the warning message?

It is just a guess.
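
One way to check what the warning is complaining about is sketched below. It assumes that the second '&'-separated field of each soft-clip read name is the anchor chromosome (an assumption based only on the names listed above, for example `chr14`), and it is not part of Socrates. The function names `anchor_chroms` and `missing_chroms` and the file names `anchors.txt` and `sq.txt` are hypothetical.

```shell
#!/bin/sh
# Print the distinct anchor chromosomes encoded in FASTA read names on stdin.
# ASSUMPTION: names look like ">read_id&chromosome&position&...",
# so the chromosome is the second '&'-separated field.
anchor_chroms() {
  awk -F'&' '/^>/ { print $2 }' | sort -u
}

# Print every name in file $1 that does not appear as a whole line in file $2.
# Both files hold one name per line.
missing_chroms() {
  grep -vxF -f "$2" "$1"
}

# Usage (hypothetical file names; the awk call assumes SN: is the first tag
# on each @SQ line, as in the header shown above):
#   anchor_chroms < filteredresults_long_sc_l25_q5_m5_i95.fasta > anchors.txt
#   samtools view -H out3.bam | awk -F'\t' '$1 == "@SQ" { print substr($2, 4) }' > sq.txt
#   missing_chroms anchors.txt sq.txt
```

If a chromosome such as `chr14` is printed as missing, the sequence dictionary of the realignment BAM does not contain it, which would be consistent with the "no matching reference chromosome name" warning.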

Thank you for your help. I look forward to your answer :)
