Eric Schrock's Blog

Fun source code facts

June 25, 2005

A while ago, for my own amusement, I went through the Solaris source base and searched for the source files with the most lines. For some unknown reason this popped into my head yesterday, so I decided to try it again. Here are the top 10 longest files in OpenSolaris:

Length Source File
29944 usr/src/uts/common/io/scsi/targets/sd.c
25920 [closed]
25429 usr/src/uts/common/inet/tcp/tcp.c
22789 [closed]
16954 [closed]
16339 [closed]
15667 usr/src/uts/common/fs/nfs4_vnops.c
14550 usr/src/uts/sfmmu/vm/hat_sfmmu.c
13931 usr/src/uts/common/dtrace/dtrace.c
13027 usr/src/uts/sun4u/starfire/io/idn_proto.c

You can see that some of the largest files are still closed source. Note that the length of a file doesn’t necessarily indicate anything about the quality of the code; this is just idle curiosity. Knowing the quality of online journalism these days, I’m sure this will get turned into “Solaris source reveals completely unmaintainable code” …
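For anyone who wants to repeat the line-count ranking on their own source tree, here is a minimal Python sketch. It is not the method used to produce the table above; the function names `count_lines` and `longest_files` are made up for illustration, and restricting the search to `.c` files is an assumption.

```python
import heapq
import os


def count_lines(path):
    """Count lines in a file, reading bytes so odd encodings don't matter."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def longest_files(root, n=10, suffixes=(".c",)):
    """Return the n longest files under root as (line_count, path) pairs.

    Only files whose names end with one of `suffixes` are considered
    (an assumption; adjust to taste).
    """
    counts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                counts.append((count_lines(path), path))
    # nlargest returns the pairs sorted from longest to shortest
    return heapq.nlargest(n, counts)
```

Calling `longest_files("usr/src")` from the top of a checked-out tree would return the ten longest `.c` files, longest first.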

After looking at this, I decided a much more interesting question was “which source files are the most commented?” To answer this question, I ran every source file through a script I found that counts the number of commented lines in each file. I filtered out files that were less than 500 lines long, and ran the results through another script to calculate the percentage of lines that were commented. A line that has a comment alongside source counts as a commented line, so some of the ratios were quite high. I filtered out files that were mostly tables (like uwidth.c), as those comments didn’t really count. I also ignored header files, because they tend to be far more commented than the implementation itself. In the end I had the following list:

Percentage File
62.9% usr/src/cmd/cmd-inet/usr.lib/mipagent/snmp_stub.c
58.7% usr/src/cmd/sgs/libld/amd64/amd64unwind.c
58.4% usr/src/lib/libtecla/common/expand.c
56.7% usr/src/cmd/lvm/metassist/common/volume_nvpair.c
56.6% usr/src/lib/libtecla/common/cplfile.c
55.6% usr/src/lib/libc/port/gen/mon.c
55.4% usr/src/lib/libadm/common/devreserv.c
55.1% usr/src/lib/libtecla/common/getline.c
54.5% [closed]
54.3% usr/src/uts/common/io/ib/ibtl/ibtl_mem.c

Now, when I write code I tend to hover in the 20-30% comments range (my best of those in the gate is gfs.c, which with Dave’s help is 44% comments). Some of the above are rather over-commented (especially snmp_stub.c, which likes to repeat comments above and within functions).
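The comment-counting script isn't shown here, so the following is only a rough Python sketch of the idea under some stated assumptions: it recognizes `/* ... */` and `//` comments, counts any line containing part of a comment (including lines that also contain code), and ignores comment markers that happen to appear inside string literals. The names `comment_stats` and `comment_percentage` are hypothetical.

```python
def comment_stats(text):
    """Return (commented_lines, total_lines) for a string of C source."""
    in_block = False  # True while inside an unterminated /* ... */ comment
    commented = 0
    lines = text.splitlines()
    for line in lines:
        has_comment = in_block  # a line inside a block comment counts
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    break  # comment continues on the next line
                in_block = False
                i = end + 2
            else:
                start = line.find("/*", i)
                slash = line.find("//", i)
                if slash != -1 and (start == -1 or slash < start):
                    has_comment = True  # rest of the line is a // comment
                    break
                if start == -1:
                    break  # no more comments on this line
                has_comment = True
                in_block = True
                i = start + 2
        if has_comment:
            commented += 1
    return commented, len(lines)


def comment_percentage(text, min_lines=500):
    """Percentage of commented lines, or None for empty or short files."""
    commented, total = comment_stats(text)
    if total == 0 or total < min_lines:
        return None
    return 100.0 * commented / total
```

The `min_lines=500` default mirrors the cutoff described above; skipping table-heavy files and headers would still have to be done separately.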

I found this little experiment interesting, but please don’t base any conclusions on these results. They are for entertainment purposes only.
