The following statistics come from Ohloh or from a cloc.pl count.
Project       Language       Code  Comment    Blank  Date/Ver     Source    FTEs
Bioclipse     Java        578,095  349,515  154,338  04/02/2011   Ohloh     ?
Bioconductor  R/C/C++   1,248,634  276,358  218,222  03/30/2011   cloc+awk  ?
BioJava       Java        272,864  129,237   59,074  03/30/2011   Ohloh     ?
BioMart       Java/Perl    98,637   43,231   24,346  03/30/2011   Ohloh     ?
BioPerl       Perl        323,007  258,987  167,907  03/30/2011   Ohloh     ?
BioPython     Python      120,824   39,085   22,183  03/30/2011   Ohloh     ?
BioRuby       Ruby         68,390   27,032   15,636  03/30/2011   Ohloh     ?
EMBOSS        C           633,014  258,265  215,110  04/02/2011   Ohloh     ?
flystockdb    JS/Ruby       7,845        ?        ?  ?            ?         1
JKsrc         C           827,908  111,490  105,524  03/31/2011   Ohloh     ?
Jmol          Java        213,645   58,930   28,784  03/30/2011   Ohloh     ?
ncbi_cxx      C++/C     1,112,817  318,441  250,134  Jun_15_2010  cloc.pl   ?
OpenMS        C++         219,835   77,201   51,512  04/02/2011   Ohloh     ?
SeqAn         C++/C       250,390   89,885   55,212  03/30/2011   Ohloh     ?
SHOGUN        C++/C       128,232   53,367   33,488  04/02/2011   Ohloh     ?
A few caveats apply to this table. As others have argued, these numbers are not a good indication of how large a project is; they only give a very rough idea.
EDIT 03/31/2011: JKsrc now uses counts from Ohloh; the LOC figures are very similar to the cloc.pl results.
EDIT 04/02/2011: Updated EMBOSS with LOC figures from Ohloh (I modified its enlistment list because the old one pointed only to its documentation); added OpenMS (I modified its enlistment list because the old one included SVN tags and branches, whereas only trunk should be counted); added SHOGUN; added Ensembl to Ohloh, but Ohloh has problems analyzing its repository; updated Bioclipse now that Egon has updated its enlistment. Sorry for bumping this answer; I just want to keep it updated.
To further demonstrate cloc.pl: I downloaded Jim Kent's source code (jksrc.zip), unzipped it, and counted lines of code with the following command line:
find . -type f | grep -E '\.(c|h|cpp|cc|hpp|hh|java|py|pl|pm|rb|lua|html|htm|js|php|sql)$' > file.list; cloc.pl --list-file=file.list
The output is:
jksrc.zip is one of the largest collections of C source code (if not the largest). It is the basis of the UCSC Genome Browser and many other utilities, such as the famous BLAT.
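To collapse a per-language cloc report into a single Code/Comment/Blank row like those in the table, something along these lines should work. This is only a rough sketch: the function name sum_cloc_csv is made up for illustration, and it assumes cloc's CSV report uses the column order files,language,blank,comment,code. It is not necessarily the exact awk step behind the cloc+awk row above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloc): sum the blank, comment, and code
# columns of a cloc CSV report read from standard input.
# Assumption: columns are files,language,blank,comment,code.
sum_cloc_csv() {
    awk -F, '
        # Count only per-language data rows: the third field must be a number,
        # which skips the header line, blank lines, and any progress messages.
        # The SUM row that cloc appends is skipped to avoid double counting.
        $2 != "SUM" && $3 ~ /^[0-9]+$/ {
            blank   += $3
            comment += $4
            code    += $5
        }
        END { printf "code=%d comment=%d blank=%d\n", code, comment, blank }
    '
}
```

It could then be fed from the same file list, for example: cloc.pl --csv --quiet --list-file=file.list | sum_cloc_csv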
Please include FTE estimates when available.