In build 60 (Beta 5 or SX 7/04), I fixed a long-standing Solaris bug: the paths of mounted filesystems could not contain spaces. We would happily mount the filesystem, but then all consumers of /etc/mnttab would fail. This resulted in sad situations like:
# df -h
Filesystem size used avail capacity Mounted on
/dev/dsk/c0d0s0 36G 13G 22G 38% /
/devices 0K 0K 0K 0% /devices
/dev/dsk/c0d0p0:boot 11M 2.3M 8.4M 22% /boot
/proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
fd 0K 0K 0K 0% /dev/fd
swap 1002M 24K 1002M 1% /var/run
swap 1003M 1.3M 1002M 1% /tmp
# mount -F lofs /export/space\ dir /mnt/space\ mnt
/export/space dir /mnt/space mnt lofs dev=1980000 1090718041
# df -h
df: a line in /etc/mnttab has too many fields
#
Luckily you could unmount the filesystem, but it was quite annoying to say the least. The resulting fix was really an exploration into bad interface design.
/etc/mnttab
This file has been around since the early days of Unix (at least as far back as SVR3). Each line is a whitespace-delimited set of fields, including special device, mount point, filesystem type, mount options, and mount time (see mnttab(4) for more information). Historically, this was a plain text file. This meant that the user programs mount(1M) and umount(1M) were responsible for making sure its contents were kept up to date. This could be very problematic: imagine what would happen if the program died partway through adding an entry, or root accidentally removed an entry without actually unmounting it. Once the contents were corrupted, the admin usually had to resort to rebooting, rather than trying to guess what the proper contents should be. It also made mounting filesystems from within the kernel unnecessarily complicated.
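To make the failure concrete, here is a minimal sketch of what a naive consumer of a whitespace-delimited line does. It is not the actual libc parser; count_fields(), check_line(), and MNTTAB_FIELDS are names made up for this illustration, and the five-field count is taken from the field list above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* special, mount point, fstype, options, time (per the list above) */
#define MNTTAB_FIELDS 5

/*
 * Count the whitespace-delimited fields in one mnttab-style line.
 * Hypothetical helper: it mimics a naive consumer, not the libc code.
 */
static size_t
count_fields(const char *line)
{
	size_t nfields = 0;

	assert(line != NULL);

	while (*line != '\0') {
		/* Skip the delimiter run before the next field. */
		line += strspn(line, " \t\n");
		if (*line == '\0')
			break;
		nfields++;
		/* Skip over the field itself. */
		line += strcspn(line, " \t\n");
	}
	return (nfields);
}

/* Returns 0 for a well-formed line, -1 for too few or too many fields. */
static int
check_line(const char *line)
{
	return (count_fields(line) == MNTTAB_FIELDS ? 0 : -1);
}
```

Fed the lofs entry from the session above, count_fields() sees seven fields instead of five, because nothing distinguishes a space inside a path from a space between fields. That is the same kind of mismatch df was complaining about.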
In Solaris 8, we solved part of the problem by creating the mntfs pseudo filesystem. From this point onward, /etc/mnttab was no longer a regular text file, but a mounted filesystem. The contents are generated on-the-fly from the kernel data structures. This means that the contents are always in sync with the kernel[1], and that the user can’t accidentally change the contents. However, we still had the problem that the mount points could not contain spaces, because space was a delimiter with special meaning.
getmntent() and friends
On top of this broken interface, a C API was developed that had even worse problems. Consider getmntent(3c):
int getmntent(FILE *fp, struct mnttab *mp);
There are several problems with this interface:
- The user is responsible for opening and closing the file
There is only one mount state for the kernel; why should the user have to know that /etc/mnttab is the place where the entries are stored?
- The first parameter is a FILE *
If you’re developing a system interface, you should not enforce using the C stdio library. Every other system API takes a normal file descriptor instead.
- The memory is allocated by the function on demand
This causes all sorts of problems, including making multithreaded use difficult, and preventing the user from controlling the size of the buffer used to read in the data.
- There is no relationship between the memory and the open file
Because of this, a lazy programmer can close the file after the last call to getmntent() while still using the memory, so it must be kept around indefinitely.
By now, it should be obvious that this was an ill-conceived API built on top of a broken interface. Off the top of my head, if I were to re-design these interfaces I would come up with something more like:
mnttab_t *mnttab_init(void);
int mnttab_get(mnttab_t *mnttab, struct mntent *ent, void *scratch, size_t scratchlen);
void mnttab_fini(mnttab_t *mnttab);
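As a rough sketch of how those three prototypes could fit together, the following fills in bodies under some loud assumptions: the fields of struct mntent, the contents of the handle, the fake_mounts table standing in for kernel mount state, and the return-value convention (0, -1 at end of table, ENOSPC for a short buffer) are all invented for illustration. None of this is a real Solaris interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry layout; only a subset of the real fields. */
struct mntent {
	const char *mnt_special;
	const char *mnt_mountp;
	const char *mnt_fstype;
};

/* Opaque handle: the caller never learns where the data comes from. */
typedef struct mnttab_handle {
	size_t next;	/* index of the next entry to hand out */
} mnttab_t;

/* Stand-in for kernel mount state, for illustration only. */
static const struct mntent fake_mounts[] = {
	{ "/dev/dsk/c0d0s0", "/", "ufs" },
	{ "/export/space dir", "/mnt/space mnt", "lofs" },
};
#define	NMOUNTS	(sizeof (fake_mounts) / sizeof (fake_mounts[0]))

mnttab_t *
mnttab_init(void)
{
	return (calloc(1, sizeof (mnttab_t)));
}

/* Copy one string into the caller's scratch space, advancing the cursor. */
static const char *
scratch_copy(char **cur, size_t *left, const char *src)
{
	size_t len = strlen(src) + 1;
	char *dst = *cur;

	if (len > *left)
		return (NULL);
	memcpy(dst, src, len);
	*cur += len;
	*left -= len;
	return (dst);
}

/*
 * Returns 0 and fills *ent on success, -1 at the end of the table, or
 * ENOSPC if the scratch buffer is too small (the entry is not consumed).
 */
int
mnttab_get(mnttab_t *mt, struct mntent *ent, void *scratch, size_t scratchlen)
{
	const struct mntent *src;
	char *cur = scratch;

	assert(mt != NULL && ent != NULL && scratch != NULL);
	if (mt->next >= NMOUNTS)
		return (-1);
	src = &fake_mounts[mt->next];

	ent->mnt_special = scratch_copy(&cur, &scratchlen, src->mnt_special);
	ent->mnt_mountp = scratch_copy(&cur, &scratchlen, src->mnt_mountp);
	ent->mnt_fstype = scratch_copy(&cur, &scratchlen, src->mnt_fstype);
	if (ent->mnt_special == NULL || ent->mnt_mountp == NULL ||
	    ent->mnt_fstype == NULL)
		return (ENOSPC);
	mt->next++;
	return (0);
}

void
mnttab_fini(mnttab_t *mt)
{
	free(mt);
}
```

The point of the shape is that each complaint in the list above has an answer: the handle hides where the entries live, no FILE * is involved, and every string in the returned entry points into memory the caller owns, so its lifetime has nothing to do with when mnttab_fini() is called.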
The solution
Once /etc/mnttab became a filesystem, we could add ioctl(2) calls to do whatever we wanted. Once we’re in the kernel, we know exactly how long each field of the structure is. We create a set of NUL-terminated strings directly in user space, and simply return pointers to them. This was more complicated than it sounds for the reasons outlined above. We also had to maintain the ability to read the file directly. With this fix, all C consumers “just work”. Scripted programs will still choke on a mnttab entry with spaces, but these are a minority by far.
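The reason this sidesteps the delimiter problem can be shown with a small sketch. Assume, purely for illustration, that an entry arrives as five consecutive NUL-terminated strings in one buffer; the real ioctl and its layout are not described here, and unpack_entry() and NFIELDS are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	NFIELDS	5	/* assumed: special, mountp, fstype, options, time */

/*
 * Point fields[0..NFIELDS-1] at consecutive NUL-terminated strings in buf.
 * Returns 0 on success, -1 if the buffer ends before NFIELDS strings.
 */
static int
unpack_entry(const char *buf, size_t buflen, const char *fields[NFIELDS])
{
	size_t off = 0;
	int i;

	assert(buf != NULL && fields != NULL);
	for (i = 0; i < NFIELDS; i++) {
		const char *end;

		if (off >= buflen)
			return (-1);
		/* Find the terminator without running past the buffer. */
		end = memchr(buf + off, '\0', buflen - off);
		if (end == NULL)
			return (-1);
		fields[i] = buf + off;
		off = (size_t)(end - buf) + 1;
	}
	return (0);
}
```

Because field boundaries are marked by NUL bytes, which cannot appear in a path, a mount point such as "/mnt/space mnt" comes through as a single field with its space intact.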
Note that the files /etc/vfstab and /etc/dfs/sharetab still suffer from this problem. There has been some discussion about how to resolve these issues, with the new Service Management Facility being touted as a possible solution. And ZFS (Sun’s next generation filesystem) is avoiding /etc/vfstab altogether.
[1] There is always the possibility that the mounted filesystems change between the time the file is opened and the data is read.