Question: I can't see my bam file on igv
17 days ago, manubiomed20 wrote:

I assembled an exome using the following command:

bwa mem -t 12 -B 4 -O 6 -E 1 -M -R '@RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA' \
  /home/ims.santos06/reference/hg38.fa \
  /home/ims.santos06/fastq/SRR1517898_1.fastq.gz \
  /home/ims.santos06/fastq/SRR1517898_2.fastq.gz \
  | samtools view -1 - > /home/ims.santos06/bam/SRR1517898.bam

I received this message from bwa:

(base) ims.santos06@nodesgi4:~$ bwa mem -t 12 -B 4 -O 6 -E 1 -M -R  '@RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA'                   /home/ims.santos06/reference/hg38.fa /home/ims.santos06/fastq/SRR1517898_1.fastq.gz  /home/ims.santos06/fastq/SRR1517898_2.fastq.gz  | samtools view -1 - >     /home/ims.santos06/bam/SRR1517898.bam 
    [M::bwa_idx_load_from_disk] read 0 ALT contigs
    [W::bseq_read] the 1st file has fewer sequences.
    [W::bseq_read] the 1st file has fewer sequences.
    [main] Version: 0.7.17-r1188
    [main] CMD: bwa mem -t 12 -B 4 -O 6 -E 1 -M -R @RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA /home/ims.santos06/reference/hg38.fa /home/ims.santos06/fastq/SRR1517898_1.fastq.gz /home/ims.santos06/fastq/SRR1517898_2.fastq.gz
    [main] Real time: 10.202 sec; CPU: 10.160 sec

I followed the necessary steps to view it in IGV:

samtools sort /home/ims.santos06/bam/SRR1517898.bam > /home/ims.santos06/bam/SRR1517898.sorted.bam

I did this using two different commands and still couldn't see it in IGV:

samtools index /home/ims.santos06/bam/SRR1517898.sorted.bam /home/ims.santos06/bam/SRR1517898.bam.bai

An error message about the file compression appears. The file was already compressed when I downloaded it from the 1000 Genomes browser.

I do not know how to solve this problem. I don't know whether something is wrong with my assembly or with the formats of the generated files. Can someone help me?

alignment assembly genome • 188 views
modified 13 days ago • written 17 days ago by manubiomed20
[W::bseq_read] the 1st file has fewer sequences.
[W::bseq_read] the 1st file has fewer sequences.

There is a problem with your FASTQ files: R1 and R2 do not contain the same number of reads.
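
One way to check this directly is sketched below. It assumes plain four-line FASTQ records and reuses the file paths from the question; `count_reads` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Count FASTQ records in a gzip-compressed file.
# Assumption: every record is exactly 4 lines (no wrapped sequences).
count_reads() {
    gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Paths taken from the question.
r1=/home/ims.santos06/fastq/SRR1517898_1.fastq.gz
r2=/home/ims.santos06/fastq/SRR1517898_2.fastq.gz

# Only run the comparison when both files are readable.
if [ -r "$r1" ] && [ -r "$r2" ]; then
    n1=$(count_reads "$r1")
    n2=$(count_reads "$r2")
    echo "R1: $n1 reads, R2: $n2 reads"
    [ "$n1" = "$n2" ] || echo "Read counts differ: the pair is out of sync" >&2
fi
```

Because `gzip -cd` has to decompress the whole file, it should also report an error if a download was truncated, and a non-integer count hints at an incomplete last record.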

written 17 days ago by Pierre Lindenbaum

What is the output of:

samtools view /home/ims.santos06/bam/SRR1517898.sorted.bam | head
written 17 days ago by h.mon

I noticed that, but when I checked the FASTQ files in FastQC, R1 and R2 had the same number of sequences.

written 17 days ago by manubiomed20

Something went wrong during alignment, probably a memory shortage that corrupted the file parsing. Your input files are probably OK. How much RAM is available? Also, did you manipulate the files after checking them with FastQC, for example by adapter trimming in non-paired mode?

modified 16 days ago • written 16 days ago by ATpoint

I used FastQC only to evaluate base quality. I believe the problem happened during the assembly. I could not find the library of my FASTQ files because I took them from the 1000 Genomes browser, and only later realized that this could affect the assembly. How do I find out the library of my FASTQ files?

written 13 days ago by manubiomed20

The memory of my machine is 19.6. Could the problem be with my read group? I did not add this field: LB (DNA preparation library identifier).

written 13 days ago by manubiomed20

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

written 13 days ago by genomax

19.6 GB? That is an odd number. Anyway, try aligning with about 4 threads; the file is not big, so it should not take long. Then repeat the BAM generation and indexing. No, IGV does not need read groups, and neither does bwa.
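
A re-run along these lines could look like the following sketch. It assumes bwa and samtools 1.3 or later are on the PATH, reuses the paths, options, and read group from the question, and pipes straight into a sorted BAM; `run_alignment` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Paths and sample name taken from the question.
sample=SRR1517898
ref=/home/ims.santos06/reference/hg38.fa
fq_dir=/home/ims.santos06/fastq
bam=/home/ims.santos06/bam/$sample.sorted.bam

run_alignment() {
    # Align with 4 threads and sort in one pass; "-" makes samtools read stdin.
    bwa mem -t 4 -B 4 -O 6 -E 1 -M \
        -R "@RG\tID:$sample\tSM:HG00096\tPL:ILLUMINA" \
        "$ref" \
        "$fq_dir/${sample}_1.fastq.gz" \
        "$fq_dir/${sample}_2.fastq.gz" \
        | samtools sort -o "$bam" - &&
    # Fails if the BAM is truncated (for example, missing its EOF block).
    samtools quickcheck "$bam" &&
    # With no second argument the index is written to "$bam.bai".
    samtools index "$bam" &&
    samtools flagstat "$bam"
}

# Only run when the tools and the reference are actually available.
if command -v bwa >/dev/null 2>&1 &&
   command -v samtools >/dev/null 2>&1 &&
   [ -r "$ref" ]; then
    run_alignment
fi
```

Calling `samtools index` with only the BAM path writes `SRR1517898.sorted.bam.bai` next to the BAM, which is where IGV looks for the index by default.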

modified 13 days ago • written 13 days ago by ATpoint

By memory manubiomed20 likely means hard disk space. ATpoint meant RAM, typically something like 16 or 32 GB.

written 12 days ago by colindaven
13 days ago, ATpoint (Germany) wrote:

I am posting this as an answer to make it prominent:

It seems that your alignment (not assembly, that is the wrong term) went wrong somewhere. This could be because your FASTQ files are corrupted or because you ran out of memory. I quickly downloaded the SRR1517898 FASTQ files and aligned them to hg38 with bwa mem, without errors. The flagstat output:

5296205 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
3349 + 0 supplementary
0 + 0 duplicates
5241341 + 0 mapped (98.96% : N/A)
5292856 + 0 paired in sequencing
2646428 + 0 read1
2646428 + 0 read2
5165450 + 0 properly paired (97.59% : N/A)
5215672 + 0 with itself and mate mapped
22320 + 0 singletons (0.42% : N/A)
31342 + 0 with mate mapped to a different chr
26625 + 0 with mate mapped to a different chr (mapQ>=5)

The files are OK. Something may have gone wrong during your download; this is unlikely, but you can still re-download them.
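
To compare your own BAM against these numbers, here is a small sketch that pulls the read1 and read2 counts out of `samtools flagstat` output. `flagstat_pairs` is a hypothetical helper name, and the BAM path is the one from the question.

```shell
#!/bin/sh
# Print "<read1 count> <read2 count>" from `samtools flagstat` text on stdin.
# Relies on flagstat lines of the form "2646428 + 0 read1".
flagstat_pairs() {
    awk '$NF == "read1" { r1 = $1 }
         $NF == "read2" { r2 = $1 }
         END { print r1, r2 }'
}

# Path taken from the question.
bam=/home/ims.santos06/bam/SRR1517898.sorted.bam

# Only run when samtools and the BAM are actually available.
if command -v samtools >/dev/null 2>&1 && [ -r "$bam" ]; then
    samtools flagstat "$bam" | flagstat_pairs
fi
```

For an intact alignment of this run, both numbers should match the 2646428 reported above.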

What is the available RAM in GB in your machine?

modified 13 days ago • written 13 days ago by ATpoint