Question: PAML Batch File?
zgayk (United States) wrote, 4.0 years ago:

Hello,

Sorry to ask so many questions, but this is related to a problem I am trying to solve with PAML. I have no prior experience with PAML other than the few test analyses I've run, but I would like to use it to calculate pairwise dN/dS ratios for a large number (1000+) of orthologs or gene fragments. I was hoping I could submit a batch file to PAML consisting of the aligned sequences, with the number of sequences and the alignment length above each pair of sequences (PHYLIP format). Because each aligned pair would have its own header information, I thought PAML would give me an output file with omega values for each pair. So far, though, I have not been able to batch process these orthologs, and it seems I will have to split this massive infile into individual files, one per alignment.

The question I am left with is: how do large genome projects, which have far too many coding sequence alignments to read in by hand, usually calculate dN/dS ratios in PAML or other programs? Once my files are split into individual alignments, I assume I'll have to write some sort of loop, but PAML will require the control file to be updated for each of the 1000 orthologs I wish to analyze. So I am a bit confused about how to analyze such a large amount of data. If anyone has any tips, they would be very useful.

Thanks,

Zach

2 1011
seq11
CCCCGTCTGCTAAATGGGATGAGGATTGCAGCTGCGTAGCCTGATTTTTGTGGGCGAGATAATACATCAGTGAGCAAAGCCTGGCAACATGGTCCTTAAAAGCCAGGAGATCTTTGCT-GCTCGCGAGACACTCGCAGCACACC-GCAGCGTGAGGAAT--AAAAGCGGGCGTGTGGGACTTTCTAGGAGATTTTTCTTGGCAATGAGTCTTGCTTCAGACGTAAAACCGGTAGCTGTGCCACTGGTAGGGCTCCAGGTCGCTGTATGTAGAGCAACGCATGTCTGATGCTCTGACCTACAGGCAAGAAGGAGGAGAGGAACCAGGAAAACCACGTTTGTCTGTACCTTTGGGCCTGCACAGGGCCACATTGATGG-CAACACAG-ACCCGTGTTTCTGGCAGGGCAGAAGCCATGGGTGGGGATTTGCACGA-CGGGAGCTCCAGGCAGTTTAGGATTTGGCACAGCGATCTCAGAGAGGAGAACTCCTCTCCAAGAAGAGCAGCCCTTTGACAAGGTTACCCCTCATTTCCT-CAGTCGAACTGCCCTTTGCAAGGAGCATC-CCTCAGCTGCCACAGCCTATCGCATCTTCCTCCTGAAATCTGATGCTACCCTGCTTTGATGGTGATCAGTTGTTGACTGCAGAAAAGCTGAATCATGAAACAGAGGTTATAAACAAGGTTATTTTTAACGAAGGAAATAAATCCCACATCACGCAGAAGTC-GGAGCAAGGGACTGTCAGCTAGGGCTCGCAGATCTCACTCCTGCCCTGATATTCTCCATGTCCTGGAGCAAATTACAATCTCTCTAACCTTTAATTTACCTGCTTGTAACATGGACAAAGCAGCAGACTCCAGTTGTGCTTTCTTTGTACAGCAAAGCTCCTGGTGTGACAGTAAAGCGGTGCTGGGAGCCGTGCGATGCCA-GGG--G-CT-GC-CTGGC----G-C-C-TTGCAGAGCTATTTTCAGCCACACAAAATGATCGTACACGGTATTTGAACAATGTTCATAGACGTTTTGAATGCAGAGAGAGACTACCCAGTCTGATTGCC-CTCTCCCGTATTTCACAGTCTTACATTTCAATTCCCACATGCAGCCCTGTAACTTCTGCTTCATTTTGCTCACTGAAAACTCAAAACAGACAAGTTTTGTAGCCCAGAGTCTCACG-TCGTATCCTTCAGGCACTGACGTGCTGCAGGGTGTGTGTTGTAGATATCCAGCCCAGGGGGAGGGAAGATGAGTCTGGTACACCTACAGCTA-ACTGGGAAATTTCCTGATTGTGCCACTTCTTATCTCTCTCTCATACCTAAATTCTTCTCA-GAAAAGCGTATTTCATTTTCCCTGGGAAAGATGCCTCAACTCAGGGAAGGAGGCGATTGAGTCTGGGACTGGCCAGGCTTCCCTGTGCTGCCTGAGGAATACGAACAACCCCACAGAGAGACCCTTGCAAACCCATCTTCTCTGCT-CTACGCCCAAGTTAACAAGTGCTTTGGGTGTGTCT-CC-AACTTTACCCAGATAACGACTAAGGTTAAAGAACTCTAGCTTTTTTGAAA-GTATATTGTAAATAT-CTGTTTTTTTAAATATATATACATAAATGTACACACACACATAAATTTATGAAAACTTCAGGAGACTGGAAACTTTTCCTGCCGTACTTTATTTACAGCCCTGCTAA-G-GCTCTTT-GCTGCTGAAGGAGCTTACCTTCCTCGCTCAAGTTTTCTTTGAGCTAAATAGTGATTTCCACAAAGCAAATATCAAATAAAAACCAAGGCATCGATGGATTGGAT-CCAGAGTGGGTTCTGCTCCCTCTGTCCATTCACTGGTGCAAAGCCCTGGCCCACCTTCACCTGCCTGCTGTGGAAGCAGACACCAAACCTGCACACGCTCGGGACACACAGCAGCATT-TTGGCACAGGTGCCCGGGCACCTCCAGCGC-GGCCGTGAGAAGTGGAACAGGACCAACCTTGCAAGCTAGTTCC
GTCTTGCAGGTGCCTGCAAGGACACATGGCTGCGTGCCAGCGGATGGAATAAGCATCGCCCCGTGCTAGTTATGGGGTAGGGATGCGGCAAGGCT--GCGTGAACTAGTGCTTGAGACGCCTCTGTGATCTGTAGATGAAAGGTGATTTACAGCTGCAAATTATTGCTCTTCCT-CATGTGAAGTCTGTTATTTTCTGGTGTCCCTCATTTTGACTCATGCCACGTCCCGTATTTGGCGCGGGGCAGGCTCCTCTCTATG-GGTGGCTGTATAGGTTCTTCACATCTCCAGCTGGCTGCATGCTTTAGAAGCGTGTGATCTGCTCACAGCATCCATTCGAGGCGGCCCGGCAGAGCCCATGCTTGGGTAACTTGGGCTGGGATTTGCACAAAACTGCAGCACTT-AGCAGATTGCTTAGCCGGTGGGTTTAGAGGGCCTGTATTTGAG-GCTGGATACCCAGCTGGGTTGTGCACTCAACTCTGTGAACAGGCAGAGGACCAGCTGGGAGAACCTTAAATCCTGTGGCAGAGAGGAGAAACAGGAATATCCCGAGCCTCTGTGCTTGGGGCTGCCCCAGGGAAGGGGAGGAAGCCAGAGAGCACTAGACTGGGGGCC-CTCAGTCTGCAGCTTCTCATCTGCGTCGCAAACCTCCTGGGCAGGAATTACAAAGCAGGAGAGGTTTGTATCTTTGGCAGGCGTCTGGAGGGAAGGGGTATTAGGGTGGTTTTGTGAACAGCCCATGGACAGCAGAAGGAGCCGGTAGATTCGATGTCCTTCCCGCATTATGCACCATGGCATGCTCACCCCATCAGGCTGCAGAGAGCAAGCGACCTTTTGCTCTTTACCGCTAAATAAAACAGCAGAAAAT-AC-CATGTG-GTATAAGATAGTTATAACGATGGAGGAAGAAATATCCCTGATGCTC
seq12
CCCAGTCTGCTAAACGGCATGAGGATTGCAGCTGTGTAGCCTGATTTTTGCAGGCAACATAATGCAGCAGTTAGCAAACCCTCACAACATGGTCATTAGAAGCCTGGGGAGCTGTGCTGGCT-GTGAGACA--CG--TCACACCAG-AGTGCGAGGAATAAAAAAGTGGGCGTGTGGGACTTTCTAGGAGATTTTTCTTGGCAATGAGTCTTGCTTTAAATGTAGAACTGGTAGCTGTGCCACTGGTAGGGCCCCAGGTCGCCGCATGTAGAGCAGCGCATGTCTGATGCTCTGACCTACAGGCAAGAAGGAGGAGAGGAACCAGGAAAACCACGTTTGTCTGTAGCTTTGGGCCTGCACGGGGCCACATTCATGGTGATGA-GGCACTGGTGTTTCTGGCAGGGCAGAAGCCATGGGTGGGGATTTGCAC-AGCAGGAACTCCAGGTAGTGTAGGATTTGGCGCAGCAATTTCAGACAGGAGAACTCCTCTCCAAAAAGAGCAGCCCTTTGACAAGG-CATCCCTCATTTCCTCCA-TCAAACTGCCCTTTGCAAGGAGCA-CGCCTGAGCTGCAACAGCCTATTGCATCTTCCTCCTGAAATCTGATGCTGCCCTGCTTTGATGGCGATCAGTTGATGACTGCAGAAAAGCTGAATCATGAAAGAGAGGTTATAAACAAGGTTATTTTTAACAAAGGAAATGAATCCCACATCATTCAGAAGTCAGG-GCAAGGGACTGTCAGCTAGGGCTTGCAGGTCCCACTTCTGCCCCGACGTTCTTTATGTCCTGGAGCAAATTAAAATCTCTCTAGCCTTTAATTTACCTCCTTGTAGCATGGACAATGCAGCAGACTCCAGTTGTGCTTTCTTTGTACAGCAAAGCTCCTGGTGTGACAATAAAGCGATGCTGGGAGCTGTGTGATGCCAAGGGCTGGCTTGCACTGCCACCCGGCACCTTGCAGAGCTATTTTCAGCCATGCAAAACGATCGCACGCAGTATTTCAACAATGTTCATAGACATTTCAAATGCAGAGAGAGACTAACCAGTCTGATT-CCTCTCTCCCATATTTCACAGTCTTACATTTCAATTCCCACATGAAGCCCAGTAACTTCTGCTTCATTTTGCTCACTGTACACTCAAAATAGACAAGTTTTGCAACCCAGAGACTCATGCT-GTGTCCTTCAGGCACTGGGGTGCTGCTGA-TGTGA-TA-TAGGTAACCAGCCCAGGGGGAAGGAAGACGGGTCTGGTTTACCTACAGCTACA-TGGGAAATTTCCTGA--GT---A-TACT-A-CTTTCTCGCATT--TAAATTCTTC-CAAGAAAAGCATATTTCATTTTTCCTACAAAAGATGACTCCACTCAGAGAAGCAGGAGATTGAGGCTGGGGCTGGCCAGGCTTCCAAG---------A--AATACGAACAACTCCACATAGAGATCCTTGCAAACCTGTCTTCTCTGCTTC-ATGCCCAAGTTAACAAGTGTTTTGGGTGTGTCTTCCCAACATTACAGAGATAACTACCAAGGTTAAAGAACTCTAGATTTTTTTTTATGT-T-TTGTAAATATTC-GTATTTT-AA-TATATACACATG--TGTA----------T---TT-ACAAAAACTTTAGGAGATTGTAAACTTC-CCTGCCATACTT-ACTCAAAATGCTGTTAAAGAGGACTTTTGCTCCTGGAGGAGCTCACTTTGCTCACTCAAGTTTTCTTTAAG-TAG---G-GACTTCCACACAGAAAATATCAGATAGAACCCAAAGCAGAAATGGGTTGGATGCCA-AGTA--T---G-T---TCTGT---T-C-C-----C----CC-TGGCCCACCTTCACCTGCCTGCTGTGGAAGCAGACACCAAACCTGCACTGGCTTGGGGGACACAGCAG-AGCCTTAGCACAGGTGCCCAGGAACCTCCAGC-CTGGTTGGGAGAAGTGGAACAGGACCAACCCTGCAAGCCAGTTCC
CTCTTGCTGGTGCCTGCAAGGACATGTGGCTGTGTGCCAGCAGATGGAATAACCATTGCCCCGTGCTAGTTATGGGGCAGGGATGCTGCACGGCTCTGC-T-AATTAGTGCCTGAGACGCCTCTGTGATCTGTAGATGAAAGGTGATTTACAGCTGCTAATTATTGCTCT-CCTTCATGTGAAGTCTATTATTTCCTGGTGTCCCTCATTTTGACTCATGCCACATCCCGTATTTGGCACAGGGCAGGTTCCTCA-TA-GAGGTGG-TGC-T---TTG--CA--TCTCCAGCTGCCTGCACACTGAACAAGCCTGAGATCTGCTCTCAGCGTCCATTTGAGGCAGCCTGGCTGTGCCCATGCTTGGGAAACTTGGCCTGGAATTTGCATAAAGTTACAACATTTGAGCAGGT-GCTTAGCTAGTGGATTGAGGGGGC-TGCATTC-AGTGCTGGGTACTCAGCTGGGCTGTGCACTCAACTCTGTGA---G-CAGAGGACCAGCTGGGTGAACC--A------G----AGAG-G-AGAAACAGGAATATCCCGAACTTCTCTGCTCGGGACTGCCCCAGGGAAGGGGAAGAAGCCAGAGAGGACCAGACTAGGG-CCTCTTAGTCTATGGCTTCTCATCTGTGTCCCAAACCTCCTGGGCAGGAGTTACAAAGCAGGAGAGGTTCGTGTCTCTGGCAGGTGTCTGGAGGGAAGGGATATTAACGTGGTTTTCTGAACAGCCCATGGATAGCAGAAGGAACAAGTATTTTCAATGTCCTTCCCATGTTATGCCCCATGGCA--C--AT---A-CAGGTTGCAAA-AGCAAGTGACATTTTGCTCTT-ATTGCAAAATAAAAGAGCACAAAGTGACACATG-GCGTA-AGG-TAGTTGTAACGACAGAGGATGAAATATCCCTGATACTC

 

2 1015
seq21
GTCCCTGTAGCTTATAGCAAAGCATGGCACTGAAGATGCCAAGACGGTTGCCTTC-ATCATACCCAGGGACAAAAGACTTAGTCCTAACCTTACAGTTAATTCTTGCTAAACATATACATGCAAGTATCCGCGCACCAGTGTAAATGCCCTCAATCTCTTGCTTGCAAGACAAAGGAGCGGGTATCAGGCACACCTGTAATTGAACCGTAGCCCAAGACGCCTTGCTTAGCCACACCCCCACGGGTATTCAGCAGTAGTTAACATTAAGCAATAAGTGTAAACTTGACTTAGTTATAGCAACACTCAGGGTCGGTAAATCTTGTGCCAGCCACCGCGGTCACACAAGAGGCCCAAATTAACCGTATACACGGCGTAAAGAGTGGTACCATGCTATCCCATCAACTAGGATCAAAGTGCAACTGAGCTGTCGTAAGCCCAAGATGCATTAAAAGCCACCCTCAAGACGATCTTAGCACCCCCGATCAATTGAACCCCACGAAAGCTGGGACACAAACTGGGATTAGATACCCCACTATGCCCAGCCCTAAATCTTGATGCTTACCCCACTGAAGCATCCGCCTGAGAACTACGAGCACAAACGCTTAAAACTCTAAGGACTTGGCGGTGCCCCAAACCCACCTAGAGGAGCCTGTTCTGTAATCGATAACCCACGATACACCCAACCGTCCCTTGCCACAGCAGCCTACATACCGCCGTCGCCAGCTCACCTCTACCTGAGAGTGCA-A-CAGTGAGCACAATAGCCCTAC-G-C--CGCTAACAAGACAGGTCAAGGTATAGCTCATGGGGCGGAAGAAATGGGCTACATTTTCTAAG-ATAGAAAACACGAAAAGGGGTATGAAACTACCCCTGGAAGGCGGATTTAGCAGTAAAGCGGGACAATAAAGCCCCCTTTAAGTCGGCCCTGGAGCACGTACATACCGCCCGTCACCCTCCTCATAAGCCCCTATTGCTCATAACTAATACACCTACCAGCTGAAGATGAGGTAAGTCGTAACAAGGTAAGTGTACCGGAAGGTGCACTTAGCACACCAAGATGTAGCTAAACGTAAAGCATTCAGCTTACACCTGAAAGATATCTGCC-TCTTACCGGATCATCTTGAAG-CCAACTCTAGCCCAACCATATTACTAATAGAGCACACCA-AAAAAATCCACTCCACC-ACCAAATTAAAACATTTTTTCCACAACTTAGTATAGGCGATAGAAAAGATACTTTGGCGCTATAGAGATATTTGTACCGCAAGGGAAAGATGAAATAACAATGAAAAACTCAAGCAACAAATAGCAAAGATAAGCCCTTGTACCTTTTGCATCATGATTTAGCAAGAACCACCAAGCAAAATGAATTTTAGCTTGCCACCCCGAAACCTGAGCGAGCTACTTACAAGCAGCTATCCTAGAGCGAACCCGTCTCTGTTGCAAAAGAGTGGGAAGACTTGCCAGTAGAGGTGAAAAGCCTACCGAGCCAGGTGATAGCTGGTTGCCTGTGAAACGAATCTAAGTTCCCTCTTAATTTTCCTCTACGGACCCCACCCAACCCCCAACGTAGTGAATCAAGAGCTATTTAAAGGGGGTACAGCCCCTTTAAAGAAGGACACGCCTTCCCTAGCGGATAACTTACCCAACCCCACCCCCTAAACTTGTAGGCCCTTAAGCAGCCATCAGCAAAGAGTGCGTCAAAGCTCCACAC--CCCAAAAATCTGAAGACTGTACGACTCCCTTACCACCAACAGGCCAACCTATAACAATAGAAGGATTAATGCTAAAATAAGTAACTAGGGCCTCTCACCCTCTCAGGCGCAAGCTTACATGATTCCATTATTAACAGGCTAACTAATACCGCAACTTTGACAAGACAAAATATTGAACCCGTC-CTGTTAACCCAACTCAGGAGCGCCCATAAGAAAGATTTAAATCTGCAGAAGGAACTAGGCAAACCCAAGGCCCGACTGTTTACCAAAAACATAGCCT
TCAGCCAACCAAGTATTGAAGGTGATGCCTGCCCAGTGACCCCACGTTCAACGGCCGCGGTATCCTAACCGTGCGAAGGTAGCGCAATCAATTGTCCCATAAATCGAGACTTGTATGAATGGCTAAACGAGGTCTTAACTGTCTCCTGTAGATAATCAGTGAAATTGATCTTCCTGTGCAAAAGCAGGAATAGGCACATAAGACGAGAAGACCCTGTGGAACTTAAAAATCAGCGGCCACCACACATTTA-ACTCCTAAGCCTACTAGGCCCGCACACCCCC-TCCAAACACTGGCCCGCATTTTTCGGTTGGGGCGACCTTGGAGAAAAACGAATCCTCCAAAAATAAGACCACACCTCTTAACCAAGAGCAACATCTCAACGTACCAACAGTAACCAGACCCAGCACAAGCCTGACTAATGGACCAAGCTACCCCAGGGATAACAGCGCAATCTCCTTCAAGAGCCCATATCGACGAGGAGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAACAGTCCTACGTGATCTGAGTTCAGACCGGAGCAATCCAGGTCGGTTTCTATCTATGAC-GAACTTTTCCTAGTACGAAAGGACCGGAAAAGTAGAGCCAATACTACAAGCATGCCCTCCCTCTAAGCAGTGAATCCAACTAAACTGCCAAAAGGACACCCACAA-CCCC-TACATCCTAGAAAAGGACCGCTAGCGTGGCAGAGCTCGGCAAATGCAAAAGGCTTAAGCCCTTTACCCA
seq22
GTCCCTGTAGCTTACAGCAAAGCATGGCACTGAAGATGCCAAGACGGTTGTC-TCTATCATACCCAAGGACAAAAGACTTAGTCCTAACCTTACAGTTAATTCTTGCTAGACATATACATGCAAGTATCCGCGCACCAGTGTAAATGCCCTCAATCTCTTGCTTGCAAGACAAAGGAGCGGGCATCAGGCACACCCATGATTAAATCGTAGCCCAAGACGCCTTGCTTAGCCACACCCCCACGGGTATTCAGCAGTAATTAACATTAAGCAATAAGTGTAAACTTGACTTAGTTATAGCAGCCCTTAGGGTCGGTAAATCTTGTGCCAGCCACCGCGGTCACACAAGAGACCCAAATTAACTGTA-ATACGGCGTAAAGAGTGGCATCATGTTATCCCACCAACTAAGATCAAAGTGCAACTGAGCTGTCACAAGCCCAAGATGCATTAAAAACCACCCTCAAGACGGTCTTAGCACTCACGATCGATTGAATCCCACGAAAGCTGGGGCACAAACTGGGATTAGATACCCCACTATGCCCAGCCCTAAATCTTGATGCTTACCCTACTGAAGCATCCGCCTGAGAACTACGAGCACAAACGCTTAAAACTCTAAGGACTTGGCGGTGCCCCAAACCCACCTAGAGGAGCCTGTTCTATAATCGATAACCCACGATACACCCAACCATCCCTTGCCACAGCAGCCTACATACCGCCGTCGCCAGCTCACCTCTACCTGAGA--GCATAGCAGTGAGCGCAATAGCCCAACAGACATCGCTAACAAGACAGGTCAAGGTATAGCCCATGGGACGGAAGAAATGGGCTACATTTTCT-AGAATAGAAAACACGAAAAGGGGTGTGAAACTACCCCTGGAAGGCGGATTTAGCAGTAAAGCGGGACAATAAAGCCCCCTTTAAGTTGGCCCTGGGGCACGTACATACCGCCCGTCACCCTCCTCATAAGCCCCCATTACTTATAACTAATACATTTACAAGCTGAAGATGAGGTAAGTCGTAACAAGGTAAGTGTACCGGAAGGTGCACTTAGCACACCAAGATGTAGCTAAACATAAAGCATTCAGCTTACGCCTGAAAGATATCTACCATC-TATCGGATCATCTTGAAGCCCAACTCTAGCCCGACCATATCAATAA-CGAG-ACA-CACTAAGAAGCTACTCC-CCTACCAGATTAAACCA-TTTTTCCACAACTTAGTATAGGCGATAGAAAGGACACTTTGGCGCGATAGAGATATCTGTACCGCAAGGGAAAAATGAAATAATAATGAAAAACTCAAGCAACAAACAGCAAAGATAAACCCTTGTACCTTTCGCATCATGATTTAGCAAGAACAACCAAGCAAAATGAATTTTAGCCTGCCATCCCGAAACCTGAGCGAGCTACTTACAAGCAGCTACCCCAGAGCGAACCCGTCTCTGTTGCAAAAGAGTGGGAAGACTTGCCAGTAGAGGTGAAAAGCCTACCGAGCCAGGTGATAGCTGGTTGCCTGTGAAATGAATCTAAGTTCCCTCTTAATTTTCCTCTACGGAGCCCACCTAA-CCCCAACGTAGTGAATCAAGAGCTATTTAAAGGGGGTACAGCCCCTTTAAAAAAGGACACACCTCCCCTAGCGGATAA-TTACCCAACCTTACGTCCT-AACTTGTAGGCCCTTAAGCAGCCACCAGCAAAGAGTGCGTCAAAGCTCCACACATCAAAAAAATCTGAAAACCACATGACTCCCTTACCACTAACAGGCCAACCTATAACAATAGGAGAATCAATGCTAGAATAAGTAACTAGGGCCCCTCACCCTCTCAGGCGCAAGCTTACATCATTATATTATTAACAGACCAACTAATACCACAACTTTAACAAGATAGAATATTAAACCC-ACTCTGTTAACCCAACCCAGGAGCGCCCATAAGAAAGATTTAAATCTACAAAAGGAACTAGGCAAACCCAAGGCCCGACTGTTTACCAAAAACATAGCCT
TCAGCCAACCAAGTATTGAAGGTGATGCCTGCCCAGTGACCCCACGTTTAACGGCCGCGGTATCCTAACCGTGCGAAGGTAGCGCAATCAATTGTCCCATAAATCGAGACTTGTATGAATGGCTAAACGAGGTCTTAACTGTCTCCTGTAGATAATCAGTGAAATTGATCTTCCTGTGCAAAAGCAGGAATAAACACATAAGACGAGAAGACCCTGTGGAACTTAAAAATCAGCAGCCACCACACAAC-AGACTCCCAAGCCTACCAGGCCCACATACCCCCCTCCAAACACTGGCCTGCATTTTTCGGTTGGGGCGACCTTGGAGAAAAACGAATCCTCCAAAAACAAGACCACACCTCTTAACCAAGAGCAACACCTCGACGTACTAACAGTACCCAGACCCAGCACAAGTCTGACCAATGGACCAAGCTACCCCAGGGATAACAGCGCAATCTCCTTCAAGAGCCCATATCGACAAGGAGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAACAGTCCTACGTGATCTGAGTTCAGACCGGAGCAATCCAGGTCGGTTTCTATCTATGACAGA-CTTTTCCTAGTACGAAAGGACCGGAGAAGTAGGGCCAATGCTGCAGGTACGCCCTCCCCC-AAGCAATGAATCCAACTAAACCGCTAAAAGGACACACATAAACCCCGTACATCCTAGAAAAGGATCGCTAGCGTGGCAGAGCTCGGCAAATGCAAAAGGCTTAAGCCCTTTACCCA
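
A concatenated file like the one above could be split into one file per alignment with a short script. The following is only a minimal sketch: it assumes every alignment begins with a header line holding exactly two integers (number of sequences, alignment length), and the function names and the `aln_` filename prefix are made up for illustration.

```python
import re
from pathlib import Path

# A PHYLIP header line: two integers and nothing else (assumed layout).
HEADER = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")


def split_alignments(text):
    """Split concatenated PHYLIP text into a list of per-alignment strings."""
    blocks, current = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines between alignments
        if HEADER.match(line) and current:
            # A new header closes the alignment collected so far.
            blocks.append("\n".join(current) + "\n")
            current = []
        current.append(line.rstrip())
    if current:
        blocks.append("\n".join(current) + "\n")
    return blocks


def write_alignments(text, out_dir, prefix="aln"):
    """Write each alignment to its own numbered .phy file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, block in enumerate(split_alignments(text), start=1):
        path = out_dir / f"{prefix}_{i:05d}.phy"
        path.write_text(block)
        paths.append(path)
    return paths
```

Calling `write_alignments(Path("all_pairs.phy").read_text(), "alignments")` (the input filename is hypothetical) would leave one `.phy` file per ortholog pair in `alignments/`. A sequence name that itself consists of two integers would be mistaken for a header, so the names should be checked first.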

Brice Sarver (United States) wrote, 4.0 years ago:

I get around this by splitting the dataset into one file per gene or transcript of interest, then using sed (or any other find-and-replace tool) on a template control file to fill in the appropriate output filename, tree, and sequence file for each one. I then submit these control files either in batches on stand-alone systems or all at once to a distributed cluster. I've submitted up to 60k this way before (though how many you can submit simultaneously will depend on your cluster's settings or your stand-alone machine's resources). I make summary tables after the fact and delete any extraneous files so as not to bog down the filesystem.
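
The template approach described above can be sketched in Python instead of sed. This is an illustration under assumptions, not the exact workflow used in the answer: the control-file values are placeholder settings for a pairwise codeml run (`runmode = -2`) and should be checked against the PAML manual for your data, the function names are hypothetical, and `codeml` is assumed to be on your `PATH`.

```python
import subprocess
from pathlib import Path
from string import Template

# Template control file; only seqfile and outfile change per alignment.
# Lines after '*' are control-file comments. Values are illustrative only.
CTL_TEMPLATE = Template("""\
      seqfile = $seqfile
      outfile = $outfile
      runmode = -2   * pairwise comparison
      seqtype = 1    * codon sequences
    CodonFreq = 2
        model = 0
      NSsites = 0
        icode = 0    * universal genetic code
    fix_kappa = 0
        kappa = 2
    fix_omega = 0
        omega = 0.5
""")


def render_control(seqfile, outfile):
    """Fill the template for one alignment and return the control-file text."""
    return CTL_TEMPLATE.substitute(seqfile=seqfile, outfile=outfile)


def write_control_file(aln_path, work_root):
    """Create a per-alignment run directory holding its own codeml.ctl."""
    aln_path = Path(aln_path).resolve()
    run_dir = Path(work_root) / aln_path.stem
    run_dir.mkdir(parents=True, exist_ok=True)
    ctl_path = run_dir / "codeml.ctl"
    ctl_path.write_text(render_control(aln_path, f"{aln_path.stem}.out"))
    return ctl_path


def run_codeml(ctl_path, exe="codeml"):
    """Run codeml inside the run directory and return its exit code."""
    result = subprocess.run([exe, ctl_path.name], cwd=ctl_path.parent)
    return result.returncode
```

Looping `write_control_file` over the split alignment files gives one run directory per ortholog, each of which can be run locally with `run_codeml` or submitted as a separate cluster job. Separate directories are used because codeml writes auxiliary files with fixed names (such as `rst` and `rub`) into its working directory, so simultaneous runs in a single directory would overwrite each other.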

Hope this helps.
