Question: Uploaded bam file into IGV but couldn't see anything
1
gravatar for dreamfeathers08
12 days ago by
dreamfeathers0810 wrote:

Hi,

I'm totally new to bioinformatics and using linux. So, I've run a 'read until' targeting on several genes and these are the steps that I've been taking:

1) porechop the fastq file 2) nanofilt it to exclude reads which are > 50k b.p 3) use minimap to align it to hg38 and sort it at the same time 4) Index the bamfile 5) load into IGV.

I couldn't see a thing. So, referring to old posts, that someone facing the same issue. It is recommended to pull up the header to see what's the first few output.

I use this script: samtools view -h bamfile | head -30

It gave me a whole list of this:

@HD VN:1.3 SO:coordinate
@SQ SN:chr1 LN: 248956422

etc

I can't see any indicators on which chromosome should i focus on in IGV...this makes me wonder did one of the steps gone wrong?

Your help is much appreciated. Thank you.

alignment • 126 views
ADD COMMENTlink modified 12 days ago • written 12 days ago by dreamfeathers0810

Hi,

Thank you for swift reply. May I know how to see the readname?

ADD REPLYlink written 12 days ago by dreamfeathers0810

If you look at your alignment file with just samtools view bamfile the read names will be in column 1. Make sure you select a read that shows alignment. SAM file format specification page 6 is what you need to check for explanation of SAM alignment lines.

ADD REPLYlink written 12 days ago by GenoMax96k

Hi,

Does this mean I need to convert my bamfile into samfile beforehand?

Tq.

ADD REPLYlink written 12 days ago by dreamfeathers0810

Use this script:

samtools view -h "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam" | head -30

I got this:

@HD     VN:1.3  SO:coordinate
@SQ     SN:chr1 LN:248956422
@SQ     SN:chr10        LN:133797422
@SQ     SN:chr11        LN:135086622
@SQ     SN:chr11_KI270721v1_random      LN:100316
@SQ     SN:chr12        LN:133275309
@SQ     SN:chr13        LN:114364328
@SQ     SN:chr14        LN:107043718
@SQ     SN:chr14_GL000009v2_random      LN:201709
@SQ     SN:chr14_GL000225v1_random      LN:211173
@SQ     SN:chr14_KI270722v1_random      LN:194050
@SQ     SN:chr14_GL000194v1_random      LN:191469
@SQ     SN:chr14_KI270723v1_random      LN:38115
@SQ     SN:chr14_KI270724v1_random      LN:39555
@SQ     SN:chr14_KI270725v1_random      LN:172810
@SQ     SN:chr14_KI270726v1_random      LN:43739
@SQ     SN:chr15        LN:101991189
@SQ     SN:chr15_KI270727v1_random      LN:448248
@SQ     SN:chr16        LN:90338345
@SQ     SN:chr16_KI270728v1_random      LN:1872759
@SQ     SN:chr17        LN:83257441
@SQ     SN:chr17_GL000205v2_random      LN:185591
@SQ     SN:chr17_KI270729v1_random      LN:280839
@SQ     SN:chr17_KI270730v1_random      LN:112551
@SQ     SN:chr18        LN:80373285
@SQ     SN:chr19        LN:58617616
@SQ     SN:chr1_KI270706v1_random       LN:175055
@SQ     SN:chr1_KI270707v1_random       LN:32032
@SQ     SN:chr1_KI270708v1_random       LN:127682
@SQ     SN:chr1_KI270709v1_random       LN:66860

Use this script: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam"

nothing happen? no new things appear...

ADD REPLYlink modified 12 days ago by GenoMax96k • written 12 days ago by dreamfeathers0810

Do you use -h in the command. This is just showing you the header of the BAM file which is more than 30 lines. You don't need to convert BAM file to SAM.

ADD REPLYlink written 12 days ago by GenoMax96k

Hi,

I did use use the '-h' for the first time and It showed me the @SQ lines. When I exclude the '-h', I see nothing..?

This script: samtools view -h "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam" | head -30 produced all the @SQ lines.

Meanwhile, this script: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam" and this: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam | head -30" did not produce any output. Nothing appear...

ADD REPLYlink written 11 days ago by dreamfeathers0810

When I exclude the '-h', I see nothing..?

So it looks like you have just the header in the file with no actual alignments. Are you sure the alignments worked? If you do samtools idxstats your.bam do you see reads aligned to chromosomes? My guess is there will be nothing.

ADD REPLYlink written 11 days ago by GenoMax96k

Okay, I've trialed on the script and this is what the output are:

chr1    248956422       0       0

chr10   133797422       0       0

chr11   135086622       0       0

chr11_KI270721v1_random 100316  0       0

chr12   133275309       0       0

chr13   114364328       0       0

chr14   107043718       0       0

chr14_GL000009v2_random 201709  0       0

chr14_GL000225v1_random 211173  0       0

I see the chromosome...but how would I know whether they are alligned..?

ADD REPLYlink modified 11 days ago by GenoMax96k • written 11 days ago by dreamfeathers0810

See the zeros in third column. That says there is nothing there. No reads have aligned to any of the references.

You should go back and work your way up the files to see where the issue lies. Perhaps sorting/indexing messed something up if the alignments had worked.

ADD REPLYlink modified 11 days ago • written 11 days ago by GenoMax96k

Hi,

Ah...., no alignment. That's disappointing. Would you be able to help oversee the pathway that I've done...it needs experience to actually see where could potentially gone wrong..

1) After getting the first 200 fastq files (each with 4000 reads = total would be 80000 reads), I combine them into one big fastq file using the cat * > script. 2) Then, I use porechop to trim the adaptors at start and end and some middle with this script: porechop -i input_reads.fastq -o output_reads.fastq 3) Next, I unzip the trimmed fastq file to run through nanofilt (to get rid of reads more than 50K) and then zip them again: gunzip -c input_trimmed.fastq.gz | Nanofilt -q 7 --maxlength 50000 | gzip > output_nanofiltered.fastq.gz 4) Then, I align them using minimap2 and sort them using samtools sort: minimap2 -ax map-ont -t 4 refseq.mmi -o input_nanofiltered.fastq.gz --secondary=no | samtools sort -@8 -o output_file.bam 5) Then I use samtools index : samtools index input_file.bam.

After getting the 'bai' file, I uploaded them into IGV..

I really can't see where could it gone wrong...would really appreciate your feedback.

Thank you.

ADD REPLYlink written 11 days ago by dreamfeathers0810

I assume you have obfuscated file names (which is OK).

Do you see alignments in the BAM file that came off minimap2 step? samtools view output_file.bam | head -20?

ADD REPLYlink written 11 days ago by GenoMax96k

HI,

Thank you for your reply.

1) What would be considered as an un-obfuscated file name?

2) Do you mean the bamfile which were not indexed?

ADD REPLYlink written 11 days ago by dreamfeathers0810

I'm confused, I only have one bamfile. All the samtools viewing done so far have been conducted on this one bamfile.

I've trialed the script: samtools view output_file.bam | head -20

I got no outputs.....

ADD REPLYlink written 11 days ago by dreamfeathers0810

You need to use a different file name to hold sorted output in samtools sort step.

If you literally did just

samtools sort -@8 -o output_file.bam

then you should do

samtools sort -@8 -o sorted_file.bam output_file.bam 
samtools index sorted_file.bam

If samtools view output_file.bam | head -10 is not producing any output then you may have damaged the alignment file and will have to re-run the minimap alignment. Then sort and index the new alignment using steps above.

ADD REPLYlink modified 11 days ago • written 11 days ago by GenoMax96k

Hi,

Sorry for late reply. The forum doesn't allow me to post more than 5 comments in 6 hours.

Back to your last suggestion: 1) Do you mean I should have 2 bamfiles after alignment? 2) my script should be like this?:

minimap2 -ax map-ont -t 4 refseq.mmi -o input_nanofiltered.fastq.gz --secondary=no | samtools sort -@8 -o sorted_file.bam output_file.bam

but means there will be two outputs?

Thank you.

ADD REPLYlink written 10 days ago by dreamfeathers0810

If you choose to pipe your minimap2 output directly to samtools then you should use something like

minimap2 -ax map-ont -t 4 refseq.mmi -o input_nanofiltered.fastq.gz --secondary=no | samtools sort -@8 -o sorted_file.bam -

In this case there should be only one output BAM file that will be sorted. You can then index that.

ADD REPLYlink written 10 days ago by GenoMax96k

Hi,

Okay! Understood!

Thank you for your help. Really appreciate it.

ADD REPLYlink written 10 days ago by dreamfeathers0810
0
gravatar for GenoMax
12 days ago by
GenoMax96k
United States
GenoMax96k wrote:

You probably have sparse data so it is not surprising you don't see something right away.

  1. You will likely need to zoom in to see things.
  2. Find a read name that is aligned in your BAM file and see the start location of where it is mapping. Now manually enter that coordinate location into the search box in IGV.
  3. You could also search for the read name in "search" function to find that read.
ADD COMMENTlink modified 12 days ago • written 12 days ago by GenoMax96k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1325 users visited in the last hour
_