Eric Schrock's Blog

Month: June 2005

So talking to Ben last night convinced me I needed to finish up the GDB to MDB reference that I started last month. So here’s part two.

GDB MDB Description

Program Stack

backtrace n ::stack
Display stack backtrace for the current thread
Display a stack for a given thread. In the kernel, thread
is the address of the kthread_t. In userland, it's the thread
info ... - Display information about the current frame. MDB doesn't support
the debugging data necessary to maintain the frame abstraction.

Execution Control

:c Continue target.
Step to the next machine instruction. MDB does not support
stepping by source lines.
Step over the next machine instruction, skipping any function
finish ::step
Continue until returning from the current frame.
jump *address address>reg Jump to the given location. In MDB, reg depends on your
platform. For SPARC it's 'pc', for i386 it's 'eip', and for amd64 it's


print expr addr::print
Print the given expression. In GDB you can specify variable
names as well as addresses. For MDB, you give a particular address and then
specify the type to display (which can include dereferencing of members,
print /f addr/f Print data in a precise format. See ::formats for a
list of MDB formats.
disassem addr addr::dis Disassemble text at the given address, or the current PC if no
address is specified.

This is just a primer. Both programs support a wide variety of additional
options. Running 'mdb -k', you can quickly see just how many commands are out there:

> ::dcmds ! wc -l
> ::walkers ! wc -l

One helpful trick is ::dcmds ! grep thing, which searches the
description of each command. Good luck, and join the discussion over at the
OpenSolaris MDB
if you have any questions or tips of your own.


When I first started in the Solaris group, I was faced with two equally
difficult tasks: learning the development model, and understanding the source
code. For both these tasks, the recommended method is usually picking a small
bug and working through the process. For the curious, the first bug I putback
to ON was 4912227
(ptree call returns zero on failure), a simple bug with near zero risk. It
was the first step down a very long road.

As another first step, someone suggested adding a very simple system call to the
kernel. This turned out to be a whole lot harder than one would expect, and has
so many subtle aspects that experienced Solaris engineers (myself included)
still miss some of the necessary changes. With that in mind, I thought a
reasonable first OpenSolaris blog would be describing exactly how to add a new
system call to the kernel.

For the purposes of this post, we will assume that it’s a simple system call
that lives in the generic kernel code, and we’ll put the code into an existing
file to avoid having to deal with Makefiles. The goal is to print an arbitrary
message to the console whenever the system call is issued.

1. Picking a syscall number

Before writing any real code, we first have to pick a number that will
represent our system call. The main source of documentation here is syscall.h,
which describes all the available system call numbers, as well as which ones are
reserved. The maximum number of syscalls is currently 256 (NSYSCALL), which
doesn’t leave much space for new ones. This could theoretically be extended – I
believe the hard limit is in the size of sysset_t, whose 16 integers
must be able to represent a complete bitmask of all system calls. This puts our
actual limit at 16*32, or 512, system calls. But for the purposes of our
tutorial, we’ll pick system call number 56, which is currently unused. For my
own amusement, we’ll name our (my?) system call ‘schrock’. So first we add the
following line to syscall.h

#define SYS_uadmin      55
#define SYS_schrock     56
#define SYS_utssys      57
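
The bitmask arithmetic described above can be sketched in ordinary userland C. This is only an illustration under stated assumptions: demo_sysset_t and the demo_* helpers are hypothetical stand-ins rather than the real Solaris definitions, and the sketch assumes 32-bit words, as the text does.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_WORDS	16			/* 32-bit words, as in the text */
#define DEMO_MAXSYS	(DEMO_WORDS * 32)	/* 16 * 32 = 512 bits */

/* Hypothetical stand-in for a system call set; not the real sysset_t. */
typedef struct demo_sysset {
	uint32_t word[DEMO_WORDS];
} demo_sysset_t;

/* Clear every bit in the set. */
void
demo_emptyset(demo_sysset_t *sp)
{
	(void) memset(sp, 0, sizeof (*sp));
}

/* Mark system call 'num' as a member; returns -1 if it cannot fit. */
int
demo_addset(demo_sysset_t *sp, int num)
{
	if (num < 0 || num >= DEMO_MAXSYS)
		return (-1);
	sp->word[num / 32] |= (uint32_t)1 << (num % 32);
	return (0);
}

/* Return 1 if system call 'num' is in the set, 0 otherwise. */
int
demo_ismember(const demo_sysset_t *sp, int num)
{
	if (num < 0 || num >= DEMO_MAXSYS)
		return (0);
	return ((sp->word[num / 32] >> (num % 32)) & 1);
}
```

With this layout, number 56 lands in word 1 at bit 24, and any number of 512 or more is rejected because there is no bit left to represent it.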

2. Writing the syscall handler

Next, we have to actually add the function that will get called when we
invoke the system call. What we should really do is add a new file
schrock.c to usr/src/uts/common/syscall,
but I’m trying to avoid Makefiles. Instead, we’ll just stick it in getpid.c:

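#include <sys/cmn_err.h>

int
schrock(void *arg)
{
	char	buf[1024];
	size_t	len;

	/* Copy the NUL-terminated string in from user space. */
	if (copyinstr(arg, buf, sizeof (buf), &len) != 0)
		return (set_errno(EFAULT));
	/* Log the message as a kernel warning. */
	cmn_err(CE_WARN, "%s", buf);
	return (0);
}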

Note that declaring a buffer of 1024 bytes on the stack is a very bad
thing to do in the kernel. We have limited stack space, and a stack overflow
will result in a panic. We also don’t check that the length of the string was
less than our scratch space. But this will suffice for illustrative purposes.
The cmn_err()
function is the simplest way to display messages from the kernel.
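
The missing length check mentioned above can be illustrated outside the kernel. The sketch below is userland code and does not call copyinstr(); demo_copystr is a hypothetical helper that copies a string into a fixed-size buffer and reports an error rather than silently truncating.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userland helper: copy 'src' into 'dst' (of size 'dstlen'),
 * storing the copied length, including the terminating NUL, in '*lenp'.
 * Returns 0 on success or ENAMETOOLONG if the string does not fit.
 */
int
demo_copystr(const char *src, char *dst, size_t dstlen, size_t *lenp)
{
	size_t n = strlen(src) + 1;	/* bytes needed, including the NUL */

	if (n > dstlen)
		return (ENAMETOOLONG);
	(void) memcpy(dst, src, n);
	if (lenp != NULL)
		*lenp = n;
	return (0);
}
```

A caller can then decide whether an oversized message is an error or should be truncated deliberately, instead of leaving that to chance.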

3. Adding an entry to the syscall table

We need to place an entry in the system call table. This table lives in sysent.c,
and makes heavy use of macros to simplify the source. Our system call takes a
single argument and returns an integer, so we’ll need to use the
SYSENT_CI macro. We need
to add a prototype for our syscall, and add an entry to the sysent and
sysent32 tables:

int     rename();
void    rexit();
int     schrock();
int     semsys();
int     setgid();
/* ... */
/* 54 */ SYSENT_CI("ioctl",             ioctl,          3),
/* 55 */ SYSENT_CI("uadmin",            uadmin,         3),
/* 56 */ SYSENT_CI("schrock",           schrock,        1),
/* 57 */ IF_LP64(
SYSENT_2CI("utssys",    utssys64,       4),
SYSENT_2CI("utssys",    utssys32,       4)),
/* ... */
/* 54 */ SYSENT_CI("ioctl",             ioctl,          3),
/* 55 */ SYSENT_CI("uadmin",            uadmin,         3),
/* 56 */ SYSENT_CI("schrock",           schrock,        1),
/* 57 */ SYSENT_2CI("utssys",           utssys32,       4),
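
To make the role of the table concrete, here is a small userland sketch of a dispatch table indexed by system call number. The DEMO_ENT macro, the demo_sysent structure, and the handler are all hypothetical; the real macros and structures in sysent.c are more involved.

```c
#include <stddef.h>

typedef int (*demo_handler_t)(void *);

/* Hypothetical table entry: a name, a handler, and an argument count. */
struct demo_sysent {
	const char	*name;
	demo_handler_t	handler;
	int		nargs;
};

#define DEMO_ENT(name, handler, nargs)	{ name, handler, nargs }
#define DEMO_NOSYS			{ "nosys", NULL, 0 }

/* Toy handler standing in for schrock(): returns 0 for a non-NULL arg. */
int
demo_schrock(void *arg)
{
	return (arg != NULL ? 0 : -1);
}

/* Three slots only; index 1 plays the role of our new entry. */
struct demo_sysent demo_table[] = {
	/* 0 */ DEMO_NOSYS,
	/* 1 */ DEMO_ENT("schrock", demo_schrock, 1),
	/* 2 */ DEMO_NOSYS,
};

#define DEMO_NENT	(sizeof (demo_table) / sizeof (demo_table[0]))

/* Look up 'num' and call its handler; -1 means "no such system call". */
int
demo_dispatch(size_t num, void *arg)
{
	if (num >= DEMO_NENT || demo_table[num].handler == NULL)
		return (-1);
	return (demo_table[num].handler(arg));
}
```

The point of the macro is the same as in the real table: each row stays on one line, so a new entry is easy to add and easy to review.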

4. /etc/name_to_sysnum

At this point, we could write a program to invoke our system call, but the
point here is to illustrate everything that needs to be done to integrate
a system call, so we can’t ignore the little things. One of these little things
is /etc/name_to_sysnum, which provides a mapping between system call
names and numbers, and is used by dtrace(1M), truss(1), and
friends. Of course, there is one version for x86 and one for SPARC, so you will
have to add the following lines to both versions:

ioctl                   54
uadmin                  55
schrock                 56
utssys                  57
fdsync                  58
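
As a rough illustration of how a tool might consume this format, the sketch below scans "name number" lines held in a string and looks up one name. The function demo_name_to_sysnum is hypothetical and is not how dtrace(1M) or truss(1) actually read the file.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan text in "name number" form, one entry per
 * line, and return the number for 'name', or -1 if it is not listed.
 */
int
demo_name_to_sysnum(const char *text, const char *name)
{
	const char *line = text;
	char entry[64];
	int num;

	while (line != NULL && *line != '\0') {
		/* %63s bounds the name so it cannot overflow 'entry'. */
		if (sscanf(line, "%63s %d", entry, &num) == 2 &&
		    strcmp(entry, name) == 0)
			return (num);
		line = strchr(line, '\n');
		if (line != NULL)
			line++;		/* step past the newline */
	}
	return (-1);
}
```

Given the five lines shown above as input, looking up "schrock" would yield 56, and an unknown name would yield -1.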

5. truss(1)

Truss does fancy decoding of system call arguments. In order to do this, we
need to maintain a table in truss that describes the type of each argument for
every syscall. This table is found in systable.c.
Since our syscall takes a single string, we add the following entry:

{"ioctl",       3, DEC, NOV, DEC, IOC, IOA},                    /*  54 */
{"uadmin",      3, DEC, NOV, DEC, DEC, DEC},                    /*  55 */
{"schrock",     1, DEC, NOV, STG},                              /*  56 */
{"utssys",      4, DEC, NOV, HEX, DEC, UTS, HEX},               /*  57 */
{"fdsync",      2, DEC, NOV, DEC, FFG},                         /*  58 */

Don’t worry too much about the different constants. But be sure to read up
on the truss source code if you’re adding a complicated system call.
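
The idea of a per-argument type driving the output can be sketched in a few lines of userland C. Everything here is hypothetical and only loosely modeled on the DEC/HEX/STG constants above; it is not the truss implementation.

```c
#include <stdio.h>

/* Hypothetical argument kinds: decimal, hexadecimal, and string. */
enum demo_argtype { DEMO_DEC, DEMO_HEX, DEMO_STG };

/*
 * Format one argument according to its kind. For DEMO_STG, 'str' is a
 * string in our own address space; a real tracer would have to read it
 * from the traced process instead. Returns the snprintf() result, or -1
 * for an unknown kind.
 */
int
demo_format_arg(char *out, size_t outlen, enum demo_argtype type,
    long val, const char *str)
{
	switch (type) {
	case DEMO_DEC:
		return (snprintf(out, outlen, "%ld", val));
	case DEMO_HEX:
		return (snprintf(out, outlen, "0x%lX", (unsigned long)val));
	case DEMO_STG:
		return (snprintf(out, outlen, "\"%s\"", str));
	}
	return (-1);
}
```

Declaring our single argument as a string is what lets the tracer show the message text rather than a raw pointer value.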

6. proc_names.c

This is the file that gets missed the most often when adding a new syscall.
Libproc uses the table in proc_names.c
to translate between system call numbers and names. Why it doesn’t make use of
/etc/name_to_sysnum is anybody’s guess, but for now you have to update
the systable array in this file:

"ioctl",                /* 54 */
"uadmin",               /* 55 */
"schrock",              /* 56 */
"utssys",               /* 57 */
"fdsync",               /* 58 */

7. Putting it all together

Finally, everything is in place. We can test our system call with a simple program:

#include <sys/syscall.h>

int
main(int argc, char **argv)
{
	syscall(SYS_schrock, "OpenSolaris Rules!");
	return (0);
}
If we run this on our system, we’ll see the following output on the console:

June 14 13:42:21 halcyon genunix: WARNING: OpenSolaris Rules!

Because we did all the extra work, we can actually observe the behavior using
truss(1), mdb(1), or dtrace(1M). As you can see,
adding a system call is not as easy as it should be. One of the ideas that has
been floating around for a while is the Grand Unified Syscall(tm) project, which
would centralize all this information as well as provide type information for
the DTrace syscall provider. But until that happens, we’ll have to deal with
this process.


The last day of FISL has come and gone, thankfully. I’m completely drained, both physically and mentally. As you can probably tell from the comments on yesterday’s blog entry, we had quite a night out last night in Porto Alegre. I didn’t stay out quite as late as some of the Brazil guys, but Ken and I made it back in time to catch about 4 hours of sleep before heading off to the conference. Thankfully I remembered to set my alarm, otherwise I probably would have ended up in bed until the early afternoon. The full details of the night are better told in person…

This last day was significantly quieter than previous days. With the conference winding down, I assume that many people took off early. Most of our presentations today were to an audience of 2 or 3 people, and we even had to cancel some of the early ones as no one was there. I managed to give presentations for Performance, Zones, and DTrace, despite my complete lack of sleep. The DTrace presentation was particularly rough because it’s primarily demo-driven, with no set plan. This turns out to be rather difficult after a night of no sleep and a few too many caipirinhas.

The highlight of the day was when a woman (stunningly beautiful, of course) came up to me while I was sitting in one of the chairs and asked to take a picture with me. We didn’t talk at all, and I didn’t know who she was, but she seemed psyched to be getting her picture taken with someone from Sun. I just keep telling myself that it was my stunning good looks that resulted in the picture, not my badge saying “Sun Microsystems”. I can dream, can’t I?

Tomorrow begins the 24 hours of travelling to get me back home. I can’t wait to get back to my own apartment and a normal lifestyle.

The exhaustion continues to increase. Today I did 3 presentations: DTrace, Zones, and FMA (which turned into OpenSolaris). Every one took up the full hour allotted. And tomorrow I’m going to add a Solaris performance presentation, to bring the grand total to 4 hours of presentations. Given how bad the acoustics are on the exposition floor, my goal is to lose my voice by the end of the night. So far, I’ve settled into a schedule: wake up around 7:00, check email, work on slides, eat breakfast, then get to the conference around 8:45. After a full day of talking and giving presentations, I get back to the hotel around 7:45 and do about an hour of work/email before going out to dinner. We get back from dinner around 11:30, at which point I get to blogging and finishing up some work. Eventually I get to sleep around 1:00, at which point I have to do the whole thing the next day. Thank god tomorrow is the end; I don’t know how much more I can take.

Today’s highlight was when Dimas (from Sun Brazil) began an impromptu Looking Glass demo towards the end of the day. He ended up overflowing our booth with at least 40 people for a solid hour before the commotion started to die down. Those of us sitting in the corner were worried we’d have to leave to make room. Our Solaris presentations hit 25 or so people, but never so many for so long. The combination of cool eye candy and a native Portuguese speaker really helped out (though most people probably couldn’t hear him anyway).

Other highlights included hanging out with the folks at CodeBreakers, who really seem to dig Solaris (Thiago had S10 installed on his laptop within half a day). We took some pictures with them (which Dave should post soon), and are going out for barbeque and drinks tonight with them and 100+ other open source Brazil folks. I also helped a few other people get Solaris 10 installed on their laptops (mostly just the “disable USB legacy support” problem). It’s unbelievably cool to see the results of handing out Solaris 10 DVDs before even leaving the conference. The top Solaris presentations were understandably DTrace and Zones, though the booth was pretty well packed all day.

Let’s hope the last day is as good as the rest. Here’s to Software Livre!

Another day at FISL, another day full of presentations. Today we did mini-presentations every hour on the hour, most of which were very well attended. When we overlapped with the major keynote sessions, turnout tended to be low, but other than that it was very successful. We covered OpenSolaris, DTrace, FMA, SMF, Security, as well as a Java presentation (by Charlie, not Dave or myself). As usual, lots of great questions from the highly technical audience.

The highlight today was a great conversation with a group of folks very interested in starting an OpenSolaris users group in Brazil. Extremely nice group of guys, very interested in technology and helping OpenSolaris build a greater presence in Brazil (both through user groups and Solaris attendance at conferences). I have to say that after experiencing this conference and seeing the enthusiasm that everyone has for exciting technology and open source, I have to agree that Brazil is a great place to focus our OpenSolaris presence. Hopefully we’ll see user groups pop up here as well as the rest of the world. We’ll be doing everything we can to help from within Sun.

The other, more amusing, highlight of the day was during my DTrace demonstration. I needed an interesting Java application to demonstrate the jstack() DTrace action, so I started up the only Java application (apart from some internal Sun tools) that I use on a regular basis: Yahoo! Sports Fantasy Baseball StatTracker (the classic version, not the new Flash one). I tried to explain that maybe I was trying to debug why the app was lying to me about Tejada going 0-2 so far in the Sox/Orioles game; really he should have hit two homers and I should be dominating this week’s scores1. I was rather amused, but I think the cultural divide was a little too wide. Not only baseball, but fantasy baseball: I don’t blame the audience at all.


1 This is clearly a lie. Despite any dreams of fantasy baseball domination, I would never root for my players in a game over the Red Sox. In the end, Ryan’s 40.5 ERA was worth the bottom of the ninth comeback capped by Ortiz’s 3-run shot.

So the first day of FISL has come to a close. I have to say it went better than expected, based on the quality of questions posed by the audience and visitors to the Sun booth. If today is any indication, my voice is going to be completely gone by the end of the conference. I started off the day with a technical overview of Solaris 10/OpenSolaris. You can find the slides for this presentation here. Before taking too much credit myself, the content of these slides is largely based on Dan’s USENIX presentation (thanks Dan!). This is a whirlwind tour of Solaris features – three slides per topic is nowhere near enough. Each of the major topics has been presented many times as a standalone 2-hour presentation, so you can imagine the corners I have to cut to cover them all.

My presentation was followed by a great OpenSolaris overview from Tom Goguen. His summary of the CDDL was one of the best I’ve ever seen – it was the first time I’ve seen an OpenSolaris presentation without a dozen questions about GPL, CDDL, and everybody’s favorite pet license. Dave followed up with a detailed description of how Solaris is developed today and where we see OpenSolaris development heading in the future. All in all, we managed to cram 10+ hours of presentations into a measly 3 1/2 hours. For those of you who still have lingering questions, please stop by the Sun booth and chat with us about anything and everything. We’ll be here all week.

After retiring to the booth, we had several great discussions with some of the attendees. The highlight of the day was when Dave was talking to an attendee about SMF (and the cool GUI he’s working on) and I was feeling particularly bored. Since my laptop was hooked up to the monitor in the “community theater”, I decided to play around with some DTrace scripts to come up with a cool demo. Within three minutes I had 4 or 5 people watching what I was doing, so I decided to start talking about all the wonders of DTrace. The 4 or 5 people quickly turned into 10 or 12, and pretty soon I found myself in the middle of a 3 hour mammoth DTrace demo, from which my voice is still recovering. This brings us to the major thing I learned today:

“If you DTrace it, they will come”

