Blog

My Dell T620

It's been quite a while since my last blog post, for the three people that actually read my blog I know I've promised new posts about debugging in virtual machines, east coast vs west

It’s been quite a while since my last blog post, for the three people that actually read my blog I know I’ve promised new posts about debugging in virtual machines, east coast vs west coast company culture, and Java support for DTrace. Rather then blog about any of those I decided I might as well continue on the trend of debugging systems that are hung!

We recently purchased a new Dell T620 server for our Boston office; I was really excited to install Delphix and run some benchmarks! I burned an ISO (a subject for a future blog post titled “why won’t my Dell T620 boot from a USB stick”), turned on the host, and waited. I then continued to wait, as shortly after we displayed the OS banner the installer hung. At first I assumed it was just taking a long time to probe devices, but after several minutes my hopes were sunk. With the system hung I turned to the most important tool for debugging physical machines:

paper clip

Using the paper retention device I tried to generate an NMI, which should cause a kernel panic, to see where we were stuck. My day got decidedly worse when I realized that we were hung before we registered the NMI handler, and my paperclip was useless (except for holding paper together). My next thought was to boot with -v to see if I could get anymore information about where we were hanging. This was rewarded with output that ended with:

...
x86_feature: vmx
mem = 201280048K (0x2ffd28c00)
...

 

Sadly this didn’t provide with much more insight. Delphix OS is derived from illumos, which has support for giving additional status at boot via the prom (from its root with SPARC hardware). While we don’t have a prom on the T620, enabling prom debugging does cause the messages to be logged to the console. Enabling prom debugging can be done with boot flags, but I find it’s easier to boot with -k to enable kmbd and -d to drop into the debugger on boot.

 
...
> prom_debug/W 1
> prom_debug:     0               =       0x1
> :c
...

 

Enabling prom debugging generates more data than booting with just -v. The last log I saw was:

startup_modules() starting...

 

So far so good. As I started digging in further I realized how badly I was missing the virtual world. A simple vprobe script could have given the backtrace of the CPU. A VMware suspend file could have been converted into a core that would have shown me the state of the system. Even vCPU samples could have given the last frame of the stack. Debugging on on physical hardware meant kmdb was my only hope.

I put a break point in startup_modules and began stepping through the code to determine where we were hanging. My first iteration led me into microfind, which it turns out is not a function you want to step through. Perturbing the timing in microfind can cause it to loop forever. I’ll leave it as an exercise to the reader to figure out why this comment only *seems* clever — hint: failed assumptions.

 
...
        while (1) {
                debug_dump("outer loop start", min, max, count);

                /*
                 * microdata is the loop count used in tenmicrosec. The first
                 * time around microdata is set to 1 to make tenmicrosec
                 * return quickly. The purpose of this while loop is to
                 * warm the cache for the next time around when the number
                 * of PIT counts are measured.
                 */
                microdata = 1;
...

 

After determining that my debugging was causing a new hang in microfind hang I put a breakpoint in startup_modules after microfind and continued searching. It took a number of tries, but eventually I was able to piece together a picture of what was happening. The call stack looked like:

startup_modules
    setup_ddi
        create_devinfo_tree
            impl_setup_ddi
                startup_bios_disk
                    read_firstblock
                        do_bsys_doint (HANG IN HERE)

 

read_firstblock is being called from startup_bios_disk with the first drive number, 0x80. read_firstblock does:

 
...
        bufp = (caddr_t)BIOS_RES_BUFFER_ADDR;


        rp.eax.byte.ah = 0x2;   /* Read disk */
        rp.eax.byte.al = 1;     /* nsect */
        rp.ecx.byte.ch = 0;     /* cyl & 0xff */
        rp.ecx.byte.cl = 1;     /* cyl >> 2 & 0xc0 (sector number) */
        rp.edx.byte.dh = 0;     /* head */
        rp.edx.byte.dl = drivenum;      /* drivenum */

        /* es:bx is buf address */
        rp.ebx.word.bx = (uint16_t)FP_OFF((uint_t)(uintptr_t)bufp);
        rp.es = FP_SEG((uint_t)(uintptr_t)bufp);

        BOP_DOINT(bootops, 0x13, &rp); <**** PROBLEM ***
...

 

It appeared that read_firstblock is trying to issue a realmode interrupt to the BIOS to read the first block of the disk device with drive number 0x80. Looking at the call it’s definitely setting up some realmode voodoo (bufp is a #define for 0x7000), but after reading through some x86 manuals it seems legit.

BOP_DOINT is a wrapper around a call to do_bsys_doint, which in turn does the actual BIOS call:

 
...
        DBG_MSG("Doing BIOS call...");
        DBG(br.ax);
        DBG(br.bx);
        DBG(br.dx);
        rp->eflags = bios_func(intnum, &br);
        DBG_MSG("donen");
...

 

The bios_func logic ends up resolving to:

 
do_bsys_doint+0x13d:            leaq   -0x50(%rbp),%rsi
do_bsys_doint+0x141:            movl   %ebx,%edi
do_bsys_doint+0x143:            xorl   %eax,%eax
do_bsys_doint+0x145:            call   *%r12

 

The value of %r12 when we make the call is 0x5000. As I traced through this I discovered there is actaully valid code at address 0x5000 that sets up the call to the BIOS. Further tracing shows that other calls to the BIOS via this interrupt worked fine, it was only reading the first block that triggered the hang. Looking at startup_bios_disk there was a variable we could set to disable the bios disk startup logic:

...
void
startup_bios_disk()
{
        uchar_t drivenum;
        int got_devparams = 0;
        int got_first_block = 0;
        uchar_t name[20];
        dev_info_t      *devi;
        int extensions;

#if defined(__xpv)
        if (!DOMAIN_IS_INITDOMAIN(xen_info))
                return;
#endif

        if (dobiosdev == 0)
                return;
...

 

I used kmdb to set dobiosdev to 0 (by default it’s 1), and verified that this allowed us to boot. My victory was short lived, however, as the installer was then not able to see any devices to install onto. So while I was able to confirm this was in fact the problem, I still couldn’t install.

Given the problem appeared to be in the BIOS I set about upgrading the BIOS of my T620. While the documentation told me that upgrading the BIOS was now easier than ever, the actual experience was still challenging. After several failed attempts I was able to get it upgraded to the latest version, and lo and behold my machine could boot and see its devices!

As much as I may have complained in the past about debugging hangs of virtual machines, debugging a hang on physical hardware left me missing my virtual environment. Thank goodness for kmdb!