alignment containing sequences from position a to b
1
0
Entering edit mode
6.4 years ago

Hello everyone,

I'm looking for an alignment containing reads started or clipped at the first position and ended or clipped at the last position of my alignment. I don't want to have reads ended in the middle of the sequence.

For that I've done an alignment from fastq input with bwa on my entire gene, then selected a region between 2 positions (samtools view -b trialsorted.bam AA_AD:30-230 > trialsorted30-230.bam) and then realign these sequences with a reference containing only the first bases of my start position and the same for the last bases of my end position in order to keep only reads of interest. However I didn't get a selection of sequences starting from a position and ending to another position.

Does anyone have any idea how to solve my problem?

Thanks

alignment sequence sequencing • 2.6k views
0
Entering edit mode
java -jar picard.jar FilterSamReads I=align-sort.bam O=out.bam JS=clippFilter.js FILTER=includeJavascript


with clippFilter.js containing:

function accept(r)
{
}
accept(record);


Using this command, I got something, however this one

function accept(r)
{
r.getStart()==30 &&
r.getEnd()==130;
}
accept(record);


does not allow me to crop sequences from 30 to 130, nothing pass through the filter. Using igv, I saw my input contain sequences that overlap this region.

0
Entering edit mode
6.4 years ago
samtools view -b in.bam AA_AD:30-230 | java -jar dist/samjs.jar -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230'

0
Entering edit mode

thank you for your rapid answer! I can't however run it with 'FilterSamReads'

0
Entering edit mode

I still don't get it, how to run this command with FilterSamReads? I tried without success:

samtools view -b in.bam AA_AD:30-230 | java -jar FilterSamReads.jar -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230'

java -jar FilterSamReads.jar INPUT=in.bam -e 'record.getContig()=="AA_AD:30-230" && record.getStart()==30 && record.getEnd()==230' OUTPUT=out.bam

0
Entering edit mode

And what was the error displayed ? What did you see ?

0
Entering edit mode

ERROR: Invalid argument '-e'

impossible to find something equivalent in the options

0
Entering edit mode

"Invalid argument".

I will not give you the solution: the manual for this picard tool is here https://broadinstitute.github.io/picard/command-line-overview.html#FilterSamReads .

1
Entering edit mode

by the way, nowadays, there is no such file "FilterSamReads.jar ", there is only "picard.jar". Your version of picard looks very old. What is the version ? current is 2.7.1

1
Entering edit mode

Ok thank you! After having updated picard and carefully read of the manual, I ran this command: java -jar picard.jar FilterSamReads I="insorted.bam", O="outsorted.bam", READ_LIST_FILE="read_names.txt", FILTER="includeAligned"

with read_names.txt file containing: AA_AD:30-230 -e 'record.getContig()=="AA_AD:30-230" && record.getStart()==30 && record.getEnd()==230'

However, there is still something wrong, I don't understand what happen with insorted.bam (sorted and present in my folder): "ERROR 2016-11-22 08:54:11 FilterSamReads Failed to filter aa.bam, htsjdk.samtools.SAMException: Cannot read non-existent file: /home/admin/Bureau/insorted.bam, at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:338) at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:325) at picard.sam.FilterSamReads.doWork(FilterSamReads.java:210) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95) at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)"

2
Entering edit mode

The error is quite clear. The file doesn't exist in the path specified. There is no /home/admin/Bureau/insorted.bam. Where is insorted.bam?

And what's up with the comma's and "" in your command? I can't find those in the manual.

0
Entering edit mode

Thanks for opening my eyes!

0
Entering edit mode
java -jar picard.jar FilterSamReads I=insorted.bam O=outsorted.bam JS=clippFilter.js FILTER=includeJavascript


with -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230' in clippFilter.js file. I however got a syntax error after the first " ' "

0
Entering edit mode

You're mixing everything. There is no such '-e' . This option '-e' is for another tool, not for picard. Your script in picard should be only:

record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230

0
Entering edit mode

that's what I've finally done (and also with a script using 'function'), no problem with 'record.getContig' but when I add 'record.getStart' and/or 'record.getEnd' I get no record, as mentioned in my post above. I've tried with other files, other positions without success Here is the start of my sorted bam:

ZUXCF:05542:10628   0   AA_AD   1   34  33S62M1D5M  *   0   0   GCTTCGTAAGCTGTCAGCCTCTCACGGTCCGCTATGTTTTTCAACCCGTATCTGAGCGGCGGCGTGACCGGCGGTGCGGTCGCGGGTGGCCGGCGCAGCG    >8828<<<9=>>>===<<8<<=<<>>9==8<==>@=C>C=3=?9?A8>>=?<<<><<<78808:<===9=9::6:===:>>>666+68/:6<8<<<==<>    NM:i:1  MD:Z:62^T5  AS:i:62 XS:i:0
ZUXCF:03174:13233   16  AA_AD   1   60  68S98M  *   0   0   ATCTAGTGCTGCATGTGTATTTTCTTTGTGATTTTGCTTCGTAAGCTGTCAGCCTCTCACGGTCCGCTATGTTTTTCAACCCGTATCTGAGCGGCGGTGTGACCGGCGGTGCGGTCGCGGGTGGCCGGCGTCAGCGTTCGCAGCCCGGCTCCGCGCAGGGCTCGGG  0)++++..,,-..<<<:::4=<<;6;;8887)7777;7=<<<8>==;;::770788<<883=<7<;;8888)888886;3;;;@====<7777770::;;;94<7994=</0,><99;6<<<77282<<<;;;;880888;;;7==9=<=9<<<:;82::<0/.?A  NM:i:1  MD:Z:29C68  AS:i:93 XS:i:0

0
Entering edit mode

so you don't have any read starting at 30...

0
Entering edit mode

it's only the beginning of the file... My aim is not to select reads starting at a defined position and ending at a defined position but get an interval of sequences between two positions even if I have piece of reads

1
Entering edit mode

You may find a way to trim the aligned sequence at position 30, then do the same with 230 and remove all the part-aligned sequences using FilterSamReads record.getStart()==30 && record.getEnd()==230

0
Entering edit mode

how can I trim aligned sequences at position 30 and 230?