Question

I can't see my BAM file in IGV

4.7 years ago

manubiomed20 ▴ 10

I assembled an exome using the following command:

bwa mem -t 12 -B 4 -O 6 -E 1 -M -R '@RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA' \
  /home/ims.santos06/reference/hg38.fa /home/ims.santos06/fastq/SRR1517898_1.fastq.gz /home/ims.santos06/fastq/SRR1517898_2.fastq.gz \
  | samtools view -1 - > /home/ims.santos06/bam/SRR1517898.bam

I received this message from bwa:

(base) ims.santos06@nodesgi4:~$ bwa mem -t 12 -B 4 -O 6 -E 1 -M -R  '@RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA'                   /home/ims.santos06/reference/hg38.fa /home/ims.santos06/fastq/SRR1517898_1.fastq.gz  /home/ims.santos06/fastq/SRR1517898_2.fastq.gz  | samtools view -1 - >     /home/ims.santos06/bam/SRR1517898.bam 
    [M::bwa_idx_load_from_disk] read 0 ALT contigs
    [W::bseq_read] the 1st file has fewer sequences.
    [W::bseq_read] the 1st file has fewer sequences.
    [main] Version: 0.7.17-r1188
    [main] CMD: bwa mem -t 12 -B 4 -O 6 -E 1 -M -R @RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA /home/ims.santos06/reference/hg38.fa /home/ims.santos06/fastq/SRR1517898_1.fastq.gz /home/ims.santos06/fastq/SRR1517898_2.fastq.gz
    [main] Real time: 10.202 sec; CPU: 10.160 sec

I followed the necessary steps to view the file in IGV:

    samtools sort /home/ims.santos06/bam/SRR1517898.bam > /home/ims.santos06/bam/SRR1517898.sorted.bam

I tried this with two different commands and still couldn't see the file in IGV:

samtools index /home/ims.santos06/bam/SRR1517898.sorted.bam   /home/ims.santos06/bam/SRR1517898.bam.bai

An error message about the file compression appears, even though the file was already compressed when I downloaded it from the 1000 Genomes browser.

I do not know how to solve this problem. I don't know if something is wrong with my assembly or with the formats of the generated files. Can someone help me?

Assembly alignment genome • 1.9k views

ADD COMMENT • link 4.7 years ago by manubiomed20 ▴ 10

[W::bseq_read] the 1st file has fewer sequences.
[W::bseq_read] the 1st file has fewer sequences.

There is a problem with your FASTQ files: R1 and R2 do not contain the same number of reads.

ADD REPLY • link 4.7 years ago by Pierre Lindenbaum 161k
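One way to test the warning above is to count the reads in each FASTQ file directly. This is only a minimal sketch: the function names `count_reads` and `check_pairs` are made up for illustration, and it assumes standard four-line FASTQ records in gzip-compressed files.

```shell
# Count reads in a gzipped FASTQ file, assuming 4 lines per record.
count_reads() {
    gzip -dc "$1" | awk 'END { printf "%d\n", NR / 4 }'
}

# Compare read counts of an R1/R2 pair.
# Returns 0 if equal, 1 if different, 2 if either file fails the gzip integrity test.
check_pairs() {
    gzip -t "$1" && gzip -t "$2" || return 2
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    echo "R1: $r1 reads, R2: $r2 reads"
    [ "$r1" -eq "$r2" ]
}

# Usage (paths taken from the question):
# check_pairs /home/ims.santos06/fastq/SRR1517898_1.fastq.gz \
#             /home/ims.santos06/fastq/SRR1517898_2.fastq.gz
```

The `gzip -t` step also catches a truncated download, which would otherwise show up only as a smaller read count.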

What is the output of:

samtools view /home/ims.santos06/bam/SRR1517898.sorted.bam | head

ADD REPLY • link 4.7 years ago by h.mon 35k
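Along the same lines as looking at the first records, it may help to check whether the BAM is intact and whether it contains any records at all. A rough sketch, assuming a samtools version that provides `quickcheck` (1.3 or later); the function name `inspect_bam` is hypothetical.

```shell
# Check that a BAM file is intact and not empty.
inspect_bam() {
    bam=$1
    # quickcheck exits non-zero when the header or the BAM end-of-file block is missing
    if ! samtools quickcheck "$bam"; then
        echo "BAM looks truncated or corrupted: $bam" >&2
        return 1
    fi
    # view -c prints the number of alignment records
    n=$(samtools view -c "$bam")
    echo "records: $n"
    [ "$n" -gt 0 ]
}

# Usage:
# inspect_bam /home/ims.santos06/bam/SRR1517898.sorted.bam
```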

I noticed that, but when I checked the FASTQ files in FastQC, R1 and R2 had the same number of sequences.

ADD REPLY • link 4.7 years ago by manubiomed20 ▴ 10

Something went wrong during alignment, probably a memory shortage that corrupted the file parsing. Your input files are probably OK. How much RAM is available? Also, did you manipulate the files after checking them with FastQC, for example by adapter trimming in non-paired mode?

ADD REPLY • link 4.7 years ago by ATpoint 81k

I used FastQC only to evaluate the base quality. I believe the problem happened in the assembly. I could not find the library information for my FASTQ files because I downloaded them from the 1000 Genomes browser, and only later realized that this could affect the assembly. How do I find out my FASTQ library?

ADD REPLY • link 4.7 years ago by manubiomed20 ▴ 10

The memory of my machine is 19.6. Is there a chance the problem is with my read group? I did not add this field: LB = DNA preparation library identifier.

ADD REPLY • link 4.7 years ago by manubiomed20 ▴ 10

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

ADD REPLY • link 4.7 years ago by GenoMax 141k

19.6 GB? That is an odd number. Anyway, try aligning with about 4 threads; the file is not big, so it should not take long. Then repeat the BAM generation and indexing. No, IGV does not need read groups, and neither does bwa.

ADD REPLY • link 4.7 years ago by ATpoint 81k
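The suggested rerun could look roughly like the sketch below. It is not the exact command anyone in the thread ran: the function name `align_sort_index` and the output prefix are made up, the read group is copied from the question, and the SAM is written to a temporary file so that each step can be checked on its own.

```shell
# Align with 4 threads, sort, and index. Arguments: reference, R1, R2, output prefix.
align_sort_index() {
    ref=$1; r1=$2; r2=$3; prefix=$4

    # Alignment; stop here if bwa reports an error
    bwa mem -t 4 -M -R '@RG\tID:SRR1517898\tSM:HG00096\tPL:ILLUMINA' \
        "$ref" "$r1" "$r2" > "$prefix.sam" || return 1

    # Coordinate sort straight to BAM
    samtools sort -@ 4 -o "$prefix.sorted.bam" "$prefix.sam" || return 1

    # With no second argument the index is written as <prefix>.sorted.bam.bai,
    # next to the BAM it belongs to
    samtools index "$prefix.sorted.bam" || return 1

    rm -f "$prefix.sam"
}

# Usage (paths taken from the question):
# align_sort_index /home/ims.santos06/reference/hg38.fa \
#     /home/ims.santos06/fastq/SRR1517898_1.fastq.gz \
#     /home/ims.santos06/fastq/SRR1517898_2.fastq.gz \
#     /home/ims.santos06/bam/SRR1517898
```

Letting `samtools index` pick the default name keeps the index file name matched to the sorted BAM, which is what IGV looks for when loading the file.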

By memory manubiomed20 likely means hard disk space. ATpoint meant RAM, typically something like 16 or 32 GB.

ADD REPLY • link 4.7 years ago by colindaven 6.3k
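To tell the two numbers apart, RAM and disk space can be queried separately. A small sketch for Linux only, since it reads `/proc/meminfo`; the function names `ram_gib` and `disk_free` are hypothetical.

```shell
# Total RAM in GiB, taken from the MemTotal line of /proc/meminfo (value is in kB)
ram_gib() {
    awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Free disk space on the file system holding the given directory (default: current directory)
disk_free() {
    df -h "${1:-.}"
}

# Usage:
# ram_gib
# disk_free /home/ims.santos06/bam
```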

score 2 · Answer 1 · 2019-08-12

I am posting this as an answer to make it prominent:

It seems that your alignment (not assembly, that is the wrong term) went wrong somewhere. This could be because your FASTQ files are corrupted or because you ran out of memory. I quickly downloaded the SRR1517898 FASTQ files and aligned them to hg38 with bwa mem, without errors. The flagstat output:

5296205 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
3349 + 0 supplementary
0 + 0 duplicates
5241341 + 0 mapped (98.96% : N/A)
5292856 + 0 paired in sequencing
2646428 + 0 read1
2646428 + 0 read2
5165450 + 0 properly paired (97.59% : N/A)
5215672 + 0 with itself and mate mapped
22320 + 0 singletons (0.42% : N/A)
31342 + 0 with mate mapped to a different chr
26625 + 0 with mate mapped to a different chr (mapQ>=5)

The files are OK. It is unlikely that something went wrong during your download, but you can still re-download them to be sure.
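The read1 and read2 lines of the flagstat output above are the ones to compare against your own run. A minimal sketch that reads `samtools flagstat` text on standard input and checks that both counts are present and equal; the function name `flagstat_pairs_ok` is made up for illustration.

```shell
# Read flagstat output on stdin; succeed only if read1 and read2 counts are present and equal.
flagstat_pairs_ok() {
    awk '
        $NF == "read1" { r1 = $1 }
        $NF == "read2" { r2 = $1 }
        END {
            printf "read1=%s read2=%s\n", r1, r2
            exit !(r1 != "" && r1 == r2)
        }'
}

# Usage:
# samtools flagstat /home/ims.santos06/bam/SRR1517898.sorted.bam | flagstat_pairs_ok
```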

How much RAM (in GB) is available on your machine?