www/ddb.html

211 lines
8.3 KiB
HTML

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset=utf-8>
<title>SecBSD: Crash Reports</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="How to report an SecBSD kernel crash">
<link rel="canonical" href="https://secbsd.org/">
<link rel="stylesheet" href="secbsd.css">
</head>
<body>
<header>
<nav>
<div>
<h1><a href="index.html">
<img src="/img/logo.png" alt="[SecBSD]"></a>
</h1>
</div>
<ul>
<li><a href="index.html">About</a></li>
<li><a href="faq.html">Faq</a></li>
<li><a href="docs.html">Docs</a></li>
<li><a href="team.html">Team</a></li>
<li><a href="sponsors.html">Sponsors</a></li>
</ul>
</nav>
</header>
<div class="box">
<b>&#x2588;</b>
<b>SecBSD: Crash Reports</b><br><br>
<b class="purple">Minimum information for kernel problems</b>
<p>Familiarize yourself with <a href="report.html">the general bug
reporting procedures</a> first.
All of that will apply.
When reporting a kernel panic or crash, please remember:
</p>
<ul>
<li>We need the console output on the screen.
Capture it and save it.
Serial consoles are best, but if you are on a VGA console you can
<a href="https://www.openbsd.org/faq/faq7.html">scroll the console back</a>
and take readable pictures with a phone or camera.
<li>If the kernel panicked we need the traceback.
It may be displayed on the screen.
If you are at a
<a href="https://man.openbsd.org/ddb.4">ddb</a>&#62;
prompt, type <kbd>trace</kbd>.
If you are running SMP, use the <kbd>mach ddbcpu N</kbd> command for each
of the <var>N</var> processors you have and repeat the <kbd>trace</kbd>
command for each processor.
<li>We need the process list.
Use the command <kbd>ps</kbd> to get that.
</ul>
<p>
Reports without the above information are useless.
This is the minimum we need to be able to track down the issue.
</p>
<b class="purple">Additional information you can send</b>
<p>
In some situations more information is desirable.
Below are outlined some additional steps you can take in certain situations:
<ul>
<li>
If your crash appears to involve filesystems.
The following additional things would be helpful
<ul>
<li>The output of the
<a href="https://man.openbsd.org/ddb.4">ddb</a>&#62; command
<kbd>show uvm</kbd>
<li>The output of the
<a href="https://man.openbsd.org/ddb.4">ddb</a>&#62;
command <kbd>show bcstats</kbd>
<li>The output of the <kbd>mount</kbd> command from your running machine, so
we know what filesystems are mounted and how.
</ul>
<li> ... XXX boot crash? XXX
<li> ... XXX show regs? XXX
</ul>
<b>Lost the panic message?</b>
<p>
Under some circumstances, you may lose the very first message of a panic,
stating the reason for the panic.
</p>
<pre class="cmdbox">
ddb&#62; <b>show panic</b>
0: kernel: page fault trap, code=0
ddb&#62;
</pre>
<b>Note for SMP systems</b>
<p>
You should get a trace from each processor as part of your report:
</p>
<pre class="cmdbox">
ddb{0}&#62; <b>trace</b>
pool_get(d05e7c20,0,dab19ef8,d0169414,80) at pool_get+0x226
fxp_add_rfabuf(d0a62000,d3c12b00,dab19f10,dab19f10) at fxp_add_rfabuf+0xa5
fxp_intr(d0a62000) at fxp_intr+0x1e7
Xintr_ioapic0() at Xintr_ioapic0+0x6d
--- interrupt ---
idle_loop+0x21:
ddb{0}&#62; <b>machine ddbcpu 1</b>
Stopped at Debugger+0x4: leave
ddb{1}&#62; <b>trace</b>
Debugger(d0319e28,d05ff5a0,dab1bee8,d031cc6e,d0a61800) at Debugger+0x4
i386_ipi_db(d0a61800,d05ff5a0,dab1bef8,d01eb997) at i386_ipi_db+0xb
i386_ipi_handler(b0,d05f0058,dab10010,d01d0010,dab10010) at i386_ipi_handler+0x
4a
Xintripi() at Xintripi+0x47
--- interrupt ---
i386_softintlock(0,58,dab10010,dab10010,d01e0010) at i386_softintlock+0x37
Xintrltimer() at Xintrltimer+0x47
--- interrupt ---
idle_loop+0x21:
ddb{1}&#62;
</pre>
<p>
Repeat the <code>machine ddbcpu x</code> followed by <code>trace</code> for each
processor in your machine.
</p>
<b>How do I gather further information from a kernel crash?</b>
<p>
A typical kernel crash on SecBSD might look like this:
<pre class="cmdbox">
kernel: page fault trap, code=0
Stopped at <b>pf_route+0x263</b>: mov 0x40(%edi),%edx
ddb&#62;
</pre>
<p>
This crash happened at offset <code>0x263</code> in the function <code>pf_route</code>.
</p>
<p>
The first command to run from the
<a href="https://man.openbsd.org/ddb">ddb(4)</a> prompt is <code>trace</code>:
<pre class="cmdbox">
ddb&#62; <b>trace</b>
<b>pf_route</b>(e28cb7e4,e28bc978,2,1fad,d0b8b120) at <b>pf_route+0x263</b>
pf_test(2,1f4ad,e28cb7e4,b4c1) at pf_test+0x706
pf_route(e28cbb00,e28bc978,2,d0a65440,d0b8b120) at pf_route+0x207
pf_test(2,d0a65440,e28cbb00,d023c282) at pf_test+0x706
ip_output(d0b6a200,0,0,0,0) at ip_output+0xb67
icmp_send(d0b6a200,0,1,a012) at icmp_send+0x57
icmp_reflect(d0b6a200,0,1,0,3) at icmp_reflect+0x26b
icmp_input(d0b6a200,14,0,0,d0b6a200) at icmp_input+0x42c
ipv4_input(d0b6a200,e289f140,d0a489e0,e289f140) at ipv4_input+0x6eb
ipintr(10,10,e289f140,e289f140,e28cbd38) at ipintr+0x8d
Bad frame pointer: 0xe28cbcac
ddb&#62;
</pre>
<p>
This tells us what function calls lead to the crash.
</p>
<p>
To find out the particular line of C code that caused the crash, you can
do the following:
</p>
<p>
Find the source file where the crashing function is defined.
In this example, that would be <code>pf_route()</code> in <code>/sys/net/pf.c</code>.
Use <a href="https://man.openbsd.org/objdump">objdump(1)</a> to get the
disassembly:
<pre class="cmdbox">
$ <b>cd /sys/arch/$(uname -m)/compile/GENERIC</b>
$ <b>objdump -dlr obj/pf.o >/tmp/pf.dis</b>
</pre>
<p>
In the output, grep for the function name:
<pre class="cmdbox">
$ <b>grep "&lt;pf_route&#62;:" /tmp/pf.dis</b>
0000<b>7d88</b> &lt;pf_route&#62;:
</pre>
<p>
Take this first hex number <code>7d88</code> and add the offset <code>0x263</code> from
the <code>Stopped at</code> line:
<pre class="cmdbox">
$ <b>printf '%x\n' $((0x7d88 + 0x263))</b>
7feb
</pre>
<p>
Scroll down to the line <code>7feb</code>.
The assembler instruction should match the one quoted in the <code>Stopped at</code>
line.
Then scroll up to the nearest C line number:
<pre class="cmdbox">
$ <b>more /tmp/pf.dis</b>
/sys/net/pf.c:<b>3872</b>
7fe7: 0f b7 43 02 movzwl 0x2(%ebx),%eax
<b>7feb</b>: 8b 57 40 <b>mov 0x40(%edi),%edx</b>
7fee: 39 d0 cmp %edx,%eax
7ff0: 0f 87 92 00 00 00 ja 8088 &lt;pf_route+0x300&#62;
</pre>
<p>
So, it's precisely line <code>3872</code> of <code>pf.c</code> that crashes:
<pre class="cmdbox">
$ <b>nl -ba /sys/net/pf.c | sed -n 3872p</b>
3872 if ((u_int16_t)ip-&#62;ip_len &lt;= ifp-&#62;if_mtu) {
</pre>
<p>
The kernel that produced the crash output and the object file for objdump must
be compiled from the exact same source file, otherwise the offsets won't match.
</p>
<p>
If you provide both the ddb trace output and the relevant objdump section,
that's very helpful.
</p>
</div>
</body>
</html>