Hi,
I'm totally new to bioinformatics and using linux. So, I've run a 'read until' targeting on several genes and these are the steps that I've been taking:
1) porechop the fastq file 2) nanofilt it to exclude reads which are > 50k b.p 3) use minimap to align it to hg38 and sort it at the same time 4) Index the bamfile 5) load into IGV.
I couldn't see a thing. So, referring to old posts, that someone facing the same issue. It is recommended to pull up the header to see what's the first few output.
I use this script: samtools view -h bamfile | head -30
It gave me a whole list of this:
@HD VN:1.3 SO:coordinate
@SQ SN:chr1 LN: 248956422
etc
I can't see any indicators on which chromosome should i focus on in IGV...this makes me wonder did one of the steps gone wrong?
Your help is much appreciated. Thank you.
Hi,
Thank you for swift reply. May I know how to see the readname?
If you look at your alignment file with just
samtools view bamfile
the read names will be in column 1. Make sure you select a read that shows alignment. SAM file format specification page 6 is what you need to check for explanation of SAM alignment lines.Hi,
Does this mean I need to convert my bamfile into samfile beforehand?
Tq.
Use this script:
I got this:
Use this script: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam"
nothing happen? no new things appear...
Do you use
-h
in the command. This is just showing you the header of the BAM file which is more than 30 lines. You don't need to convert BAM file to SAM.Hi,
I did use use the '-h' for the first time and It showed me the @SQ lines. When I exclude the '-h', I see nothing..?
This script: samtools view -h "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam" | head -30 produced all the @SQ lines.
Meanwhile, this script: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam" and this: samtools view "/WORKSPACE/Kee2/trialone23022021/bamfile/trialone23022021.bam | head -30" did not produce any output. Nothing appear...
So it looks like you have just the header in the file with no actual alignments. Are you sure the alignments worked? If you do
samtools idxstats your.bam
do you see reads aligned to chromosomes? My guess is there will be nothing.Okay, I've trialed on the script and this is what the output are:
I see the chromosome...but how would I know whether they are alligned..?
See the zeros in third column. That says there is nothing there. No reads have aligned to any of the references.
You should go back and work your way up the files to see where the issue lies. Perhaps sorting/indexing messed something up if the alignments had worked.
Hi,
Ah...., no alignment. That's disappointing. Would you be able to help oversee the pathway that I've done...it needs experience to actually see where could potentially gone wrong..
1) After getting the first 200 fastq files (each with 4000 reads = total would be 80000 reads), I combine them into one big fastq file using the cat * > script. 2) Then, I use porechop to trim the adaptors at start and end and some middle with this script: porechop -i input_reads.fastq -o output_reads.fastq 3) Next, I unzip the trimmed fastq file to run through nanofilt (to get rid of reads more than 50K) and then zip them again: gunzip -c input_trimmed.fastq.gz | Nanofilt -q 7 --maxlength 50000 | gzip > output_nanofiltered.fastq.gz 4) Then, I align them using minimap2 and sort them using samtools sort: minimap2 -ax map-ont -t 4 refseq.mmi -o input_nanofiltered.fastq.gz --secondary=no | samtools sort -@8 -o output_file.bam 5) Then I use samtools index : samtools index input_file.bam.
After getting the 'bai' file, I uploaded them into IGV..
I really can't see where could it gone wrong...would really appreciate your feedback.
Thank you.
I assume you have obfuscated file names (which is OK).
Do you see alignments in the BAM file that came off
minimap2
step?samtools view output_file.bam | head -20
?HI,
Thank you for your reply.
1) What would be considered as an un-obfuscated file name?
2) Do you mean the bamfile which were not indexed?
I'm confused, I only have one bamfile. All the samtools viewing done so far have been conducted on this one bamfile.
I've trialed the script: samtools view output_file.bam | head -20
I got no outputs.....
You need to use a different file name to hold sorted output in
samtools sort
step.If you literally did just
then you should do
If
samtools view output_file.bam | head -10
is not producing any output then you may have damaged the alignment file and will have to re-run theminimap
alignment. Then sort and index the new alignment using steps above.Hi,
Sorry for late reply. The forum doesn't allow me to post more than 5 comments in 6 hours.
Back to your last suggestion: 1) Do you mean I should have 2 bamfiles after alignment? 2) my script should be like this?:
minimap2 -ax map-ont -t 4 refseq.mmi -o input_nanofiltered.fastq.gz --secondary=no | samtools sort -@8 -o sorted_file.bam output_file.bam
but means there will be two outputs?
Thank you.
If you choose to pipe your
minimap2
output directly tosamtools
then you should use something likeIn this case there should be only one output BAM file that will be sorted. You can then index that.
Hi,
Okay! Understood!
Thank you for your help. Really appreciate it.